Rollups in Elasticsearch

Introduction to Rollups

Rollups are an Elasticsearch feature that summarizes historical data and stores it in a more compact form. This is particularly useful for time-series data, where fine granularity matters less for older data than for recent data. A rollup job aggregates raw documents into coarser time buckets with precomputed metrics, so the rolled-up data needs less storage and can be aggregated faster than the raw documents.
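To make the idea concrete, here is a minimal in-memory sketch of what a daily rollup computes. It is not how Elasticsearch implements rollups; the sample documents and the roll_up_daily helper are hypothetical, and the field names simply mirror the sales example used later in this tutorial.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw documents; field names mirror the sales example.
raw_docs = [
    {"date": "2024-01-01T09:15:00", "price": 10.0},
    {"date": "2024-01-01T17:40:00", "price": 25.0},
    {"date": "2024-01-02T08:05:00", "price": 7.5},
    {"date": "2024-01-02T12:30:00", "price": 12.5},
    {"date": "2024-01-02T23:59:00", "price": 30.0},
]


def roll_up_daily(docs):
    """Group documents by calendar day and keep only summary metrics of price."""
    buckets = defaultdict(list)
    for doc in docs:
        day = datetime.fromisoformat(doc["date"]).date().isoformat()
        buckets[day].append(doc["price"])
    return {
        day: {
            "min": min(prices),
            "max": max(prices),
            "sum": sum(prices),
            "doc_count": len(prices),
        }
        for day, prices in sorted(buckets.items())
    }


summary = roll_up_daily(raw_docs)
for day, stats in summary.items():
    print(day, stats)
```

Five raw documents collapse into two summary entries, one per day. The individual prices are gone, which is the trade-off a rollup makes: less data to store and scan, at the cost of detail.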

Creating a Rollup Job

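To create a rollup job, you define a configuration that determines how the data is summarized: the index pattern to read from, the rollup index to write to, a cron schedule, the page size used when reading the source data, the time field and interval for the date histogram, and the metrics to compute for each field.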

Example of creating a rollup job:

PUT _rollup/job/sales_rollup
{
  "index_pattern": "sales-*",
  "rollup_index": "sales_rollup",
  "cron": "*/30 * * * * ?", 
  "page_size": 1000,
  "groups": {
    "date_histogram": {
      "field": "date",
      "fixed_interval": "1d"
    }
  },
  "metrics": [
    {
      "field": "price",
      "metrics": ["min", "max", "sum"]
    }
  ]
}
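If you generate job definitions from code, a small builder can catch common mistakes before the request is sent. The sketch below only builds the request body and does not contact a cluster. The build_rollup_job helper is hypothetical, and its two checks are assumptions based on the rollup documentation: that the supported metrics are min, max, sum, avg, and value_count, and that the index pattern must not also match the rollup index. Verify both against your Elasticsearch version.

```python
import json
from fnmatch import fnmatch

# Assumed set of metrics supported by rollup jobs; verify for your version.
SUPPORTED_METRICS = {"min", "max", "sum", "avg", "value_count"}


def build_rollup_job(index_pattern, rollup_index, time_field, interval,
                     metrics, cron="*/30 * * * * ?", page_size=1000):
    """Build a rollup job body; metrics maps field name -> list of metric names."""
    # The source pattern should not also match the destination index.
    if fnmatch(rollup_index, index_pattern):
        raise ValueError("rollup_index must not match index_pattern")
    for field, names in metrics.items():
        unknown = set(names) - SUPPORTED_METRICS
        if unknown:
            raise ValueError(f"unsupported metrics for {field!r}: {sorted(unknown)}")
    return {
        "index_pattern": index_pattern,
        "rollup_index": rollup_index,
        "cron": cron,
        "page_size": page_size,
        "groups": {
            "date_histogram": {"field": time_field, "fixed_interval": interval}
        },
        "metrics": [
            {"field": field, "metrics": list(names)}
            for field, names in metrics.items()
        ],
    }


job = build_rollup_job(
    "sales-*", "sales_rollup", "date", "1d", {"price": ["min", "max", "sum"]}
)
print(json.dumps(job, indent=2))
```

The printed JSON matches the body of the PUT request above, so it can be sent with any HTTP client.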

Understanding Rollup Indices

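Rollup indices store the summarized data. Their documents use an internal layout, so query them through the _rollup_search endpoint rather than the regular _search endpoint. _rollup_search accepts a subset of the standard query DSL and aggregations, with some limitations: size must be 0 or omitted because no hits are returned, only the fields, groups, and metrics configured in the rollup job are available, and a date histogram must use an interval at least as coarse as the job's interval.

Example of querying a rollup index:

GET sales_rollup/_rollup_search
{
  "size": 0,
  "aggs": {
    "sales_over_time": {
      "date_histogram": {
        "field": "date",
        "fixed_interval": "1d"
      },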
      "aggs": {
        "total_sales": {
          "sum": {
            "field": "price"
          }
        }
      }
    }
  }
}
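One limitation worth checking before you send a query is the date histogram interval. The sketch below assumes, based on the rollup search documentation, that a fixed query interval must be a whole multiple of the interval configured in the job. The can_query helper and its unit table are hypothetical, and the rule should be verified for your version.

```python
import re

# Milliseconds per fixed-interval unit.
UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}


def to_millis(interval):
    """Convert a fixed interval such as '90m' or '1d' to milliseconds."""
    match = re.fullmatch(r"(\d+)(ms|s|m|h|d)", interval)
    if not match:
        raise ValueError(f"not a fixed interval: {interval!r}")
    return int(match.group(1)) * UNIT_MS[match.group(2)]


def can_query(job_interval, query_interval):
    """True if query_interval is a whole multiple of job_interval (assumed rule)."""
    job_ms = to_millis(job_interval)
    query_ms = to_millis(query_interval)
    return query_ms >= job_ms and query_ms % job_ms == 0


print(can_query("1d", "7d"))   # coarser whole multiple of the job interval
print(can_query("1d", "1h"))   # finer than the rolled-up data
print(can_query("1h", "90m"))  # coarser, but not a whole multiple
```

With a job that rolls up at 1d, a 7d histogram can be built from the stored buckets, while a 1h histogram cannot, because the hourly detail was discarded when the data was rolled up.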

Managing Rollup Jobs

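A newly created rollup job is in the stopped state, so you must start it before it processes any data. You can start, stop, and delete jobs through the rollup API, and retrieve a job's configuration, status, and statistics to see, for example, whether it is started, stopped, or currently indexing. A job must be stopped before it can be deleted.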

Example of managing rollup jobs:

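# Start a rollup job
POST _rollup/job/sales_rollup/_start

# Check a rollup job's configuration, status, and statistics
GET _rollup/job/sales_rollup

# Stop a rollup job (a job must be stopped before it can be deleted)
POST _rollup/job/sales_rollup/_stop

# Delete a rollup job
DELETE _rollup/job/sales_rollup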
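When these calls are scripted, it helps to keep the endpoints in one place and to respect the rule that a job has to be stopped before it is deleted. The sketch below only computes the requests and does not send them. The rollup_request and requests_to_delete helpers are hypothetical, and the state name "stopped" is assumed to be what the job status reports.

```python
def rollup_request(action, job_id):
    """Return the (HTTP method, path) pair for a rollup job action."""
    routes = {
        "start": ("POST", "/_rollup/job/{}/_start"),
        "stop": ("POST", "/_rollup/job/{}/_stop"),
        "status": ("GET", "/_rollup/job/{}"),
        "delete": ("DELETE", "/_rollup/job/{}"),
    }
    method, template = routes[action]
    return method, template.format(job_id)


def requests_to_delete(job_id, job_state):
    """List the requests needed to delete a job, stopping it first if needed."""
    steps = []
    if job_state != "stopped":  # assumed state name from the job status
        steps.append(rollup_request("stop", job_id))
    steps.append(rollup_request("delete", job_id))
    return steps


for method, path in requests_to_delete("sales_rollup", "started"):
    print(method, path)
```

For a started job this yields a stop request followed by a delete request. For a job that is already stopped, only the delete request is produced.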

Best Practices

When using rollups, consider the following best practices:

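  • Choose a rollup interval that balances storage savings against the query granularity you still need; data cannot be queried at a finer interval than it was rolled up.
  • Monitor rollup jobs to ensure they are running as expected and keeping up with incoming data.
  • Regularly review and delete rollup jobs that are no longer needed to free up resources. Deleting a job stops further processing but leaves the data it already rolled up in the rollup index.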
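To compare candidate intervals, a rough upper bound on the number of time buckets is often enough. The sketch below is back-of-the-envelope arithmetic, not an Elasticsearch API. The max_buckets helper and the one-year retention period are hypothetical, and the estimate assumes a single series with data in every bucket and no additional groupings.

```python
MINUTES_PER_DAY = 24 * 60


def max_buckets(retention_days, interval_minutes):
    """Upper bound on date histogram buckets for one series over the retention period."""
    return retention_days * MINUTES_PER_DAY // interval_minutes


# Hypothetical candidate intervals (label -> minutes) for one year of data.
candidates = {"1m": 1, "1h": 60, "1d": MINUTES_PER_DAY}
estimates = {label: max_buckets(365, minutes) for label, minutes in candidates.items()}

for label, count in estimates.items():
    print(f"{label}: at most {count} buckets")
# 1m: at most 525600 buckets
# 1h: at most 8760 buckets
# 1d: at most 365 buckets
```

Moving from hourly to daily buckets cuts the count by a factor of 24, but hourly patterns can then no longer be recovered from the rolled-up data.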

Conclusion

Rollups in Elasticsearch provide a powerful way to manage and query large volumes of time-series data efficiently. By summarizing historical data, you can save on storage costs and improve query performance, making it easier to gain insights from your data.