Bucket Aggregations in Elasticsearch
Introduction
Bucket aggregations in Elasticsearch group documents into buckets. Each bucket is defined by a criterion, and every document that meets the criterion is counted in that bucket. These aggregations are useful for summarizing and analyzing large datasets because they let you categorize documents by field values, numeric ranges, date intervals, or filters.
Types of Bucket Aggregations
Elasticsearch supports various types of bucket aggregations. Here are a few commonly used ones:
- Terms Aggregation: Groups documents based on unique values of a specified field.
- Date Histogram Aggregation: Groups documents into buckets based on date intervals.
- Range Aggregation: Groups documents into buckets based on specified ranges.
- Filter Aggregation: Groups documents that match a specified filter.
Terms Aggregation
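The terms aggregation groups documents by the unique values of a specified field, producing one bucket per value. By default it returns the ten most frequent values, sorted by descending document count; set the size parameter inside the terms block to return more. The field should hold exact values (for example, a keyword field), since aggregating on an analyzed text field is disabled by default. This is useful for finding the most common values in a dataset. The top-level "size": 0 in the query suppresses the search hits so that only the aggregation results are returned.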
Example Query:
{
  "size": 0,
  "aggs": {
    "popular_colors": {
      "terms": { "field": "color" }
    }
  }
}
Example Output:
{
  "aggregations": {
    "popular_colors": {
      "buckets": [
        { "key": "red", "doc_count": 10 },
        { "key": "blue", "doc_count": 7 },
        { "key": "green", "doc_count": 5 }
      ]
    }
  }
}
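To make the bucketing rule concrete, here is a minimal Python sketch that mimics what the terms aggregation computes over a small in-memory list. It is a local illustration only, not how Elasticsearch implements the aggregation; the function name terms_buckets and the sample documents are hypothetical, and the tie-breaking by key is an assumption made to keep the output deterministic.

```python
from collections import Counter

def terms_buckets(docs, field, size=10):
    """Illustrative stand-in for a terms aggregation (not Elasticsearch code)."""
    # One bucket per unique value; documents missing the field are skipped.
    counts = Counter(doc[field] for doc in docs if field in doc)
    # Order by descending doc_count; break ties by key (assumption).
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    # Keep only the top `size` buckets, like the aggregation's size parameter.
    return [{"key": key, "doc_count": n} for key, n in ordered[:size]]

# Hypothetical sample data matching the example output above.
docs = [{"color": "red"}] * 10 + [{"color": "blue"}] * 7 + [{"color": "green"}] * 5
buckets = terms_buckets(docs, "color")
print(buckets)
```

With size=2 the same call would keep only the "red" and "blue" buckets, which mirrors how the size parameter truncates the bucket list.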
Date Histogram Aggregation
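The date histogram aggregation groups documents into buckets based on date intervals. With calendar_interval, bucket boundaries follow the calendar, so a "month" bucket covers the whole calendar month regardless of its length, and each bucket is keyed by the start of its interval. This is useful for analyzing trends over time.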
Example Query:
{
  "size": 0,
  "aggs": {
    "sales_over_time": {
      "date_histogram": {
        "field": "sale_date",
        "calendar_interval": "month"
      }
    }
  }
}
Example Output:
{
  "aggregations": {
    "sales_over_time": {
      "buckets": [
        { "key_as_string": "2023-01-01T00:00:00.000Z", "doc_count": 15 },
        { "key_as_string": "2023-02-01T00:00:00.000Z", "doc_count": 20 },
        { "key_as_string": "2023-03-01T00:00:00.000Z", "doc_count": 25 }
      ]
    }
  }
}
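The monthly bucketing above can be sketched locally in Python. This is only an illustration of the idea (truncate each timestamp to the first instant of its month in UTC, then count), not Elasticsearch's implementation; monthly_buckets and the sample timestamps are hypothetical. Unlike a real response, this sketch does not emit empty buckets for months that contain no documents.

```python
from collections import Counter
from datetime import datetime, timezone

def monthly_buckets(timestamps):
    """Illustrative stand-in for date_histogram with calendar_interval "month"."""
    counts = Counter()
    for ts in timestamps:
        # Parse an ISO 8601 timestamp with an explicit offset and normalize to UTC.
        dt = datetime.fromisoformat(ts).astimezone(timezone.utc)
        # The bucket key is the first instant of the calendar month.
        month_start = dt.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
        counts[month_start] += 1
    # Buckets are returned in chronological order.
    return [
        {"key_as_string": key.strftime("%Y-%m-%dT%H:%M:%S.000Z"), "doc_count": n}
        for key, n in sorted(counts.items())
    ]

# Hypothetical sale dates.
sale_dates = [
    "2023-01-05T10:00:00+00:00",
    "2023-01-20T08:30:00+00:00",
    "2023-02-14T12:00:00+00:00",
]
buckets = monthly_buckets(sale_dates)
print(buckets)
```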
Range Aggregation
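The range aggregation lets you define a set of ranges and groups documents into buckets according to which ranges the field value falls into. Each range includes its from value and excludes its to value, so in the example below a price of exactly 100 lands in the 100-200 bucket, not the first one. Omitting from or to leaves that side of the range open. This is useful for categorizing numeric data.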
Example Query:
{
  "size": 0,
  "aggs": {
    "price_ranges": {
      "range": {
        "field": "price",
        "ranges": [
          { "to": 100 },
          { "from": 100, "to": 200 },
          { "from": 200 }
        ]
      }
    }
  }
}
Example Output:
{
  "aggregations": {
    "price_ranges": {
      "buckets": [
        { "key": "*-100.0", "doc_count": 8 },
        { "key": "100.0-200.0", "doc_count": 12 },
        { "key": "200.0-*", "doc_count": 5 }
      ]
    }
  }
}
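The boundary behavior of ranges is easy to get wrong, so here is a small Python sketch of the rule: a value belongs to a range when it is greater than or equal to from and strictly less than to. The function range_buckets and the sample prices are hypothetical, and the code only imitates the aggregation's semantics on an in-memory list.

```python
def range_buckets(docs, field, ranges):
    """Illustrative stand-in for a range aggregation (not Elasticsearch code)."""
    buckets = []
    for r in ranges:
        lo, hi = r.get("from"), r.get("to")
        # "from" is inclusive, "to" is exclusive; a missing bound is open-ended.
        count = sum(
            1
            for doc in docs
            if field in doc
            and (lo is None or doc[field] >= lo)
            and (hi is None or doc[field] < hi)
        )
        # Build a key in the same style as the example output, e.g. "*-100.0".
        lo_label = "*" if lo is None else str(float(lo))
        hi_label = "*" if hi is None else str(float(hi))
        buckets.append({"key": f"{lo_label}-{hi_label}", "doc_count": count})
    return buckets

# Hypothetical prices, including values that sit exactly on the boundaries.
docs = [{"price": p} for p in (50, 99.99, 100, 150, 200, 350)]
ranges = [{"to": 100}, {"from": 100, "to": 200}, {"from": 200}]
buckets = range_buckets(docs, "price", ranges)
print(buckets)
```

Here the prices 100 and 200 fall into the second and third buckets respectively, because each lower bound is inclusive and each upper bound is exclusive.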
Filter Aggregation
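The filter aggregation creates a single bucket containing the documents that match a specified filter. Because there is only one bucket, the response carries a doc_count directly under the aggregation name instead of a buckets array. This is useful for isolating a subset of data based on certain criteria.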
Example Query:
{
  "size": 0,
  "aggs": {
    "red_products": {
      "filter": {
        "term": { "color": "red" }
      }
    }
  }
}
Example Output:
{
  "aggregations": {
    "red_products": { "doc_count": 10 }
  }
}
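When queries like this are generated from application code, the request body is often built as a plain dictionary. The sketch below assembles the same body as the example query and, separately, counts matching documents in an in-memory list to show what the single bucket represents. Both helper names (filter_agg_body and filter_bucket) and the sample documents are hypothetical; sending the body to a cluster is left out because it depends on your client and index.

```python
import json

def filter_agg_body(name, field, value):
    """Build a search body with one filter aggregation using a term query."""
    return {
        "size": 0,  # return aggregation results only, no search hits
        "aggs": {name: {"filter": {"term": {field: value}}}},
    }

def filter_bucket(docs, field, value):
    """Illustrative stand-in for the response: count docs whose field equals value."""
    return {"doc_count": sum(1 for doc in docs if doc.get(field) == value)}

body = filter_agg_body("red_products", "color", "red")
print(json.dumps(body, indent=2))

# Hypothetical documents used to illustrate the single-bucket result.
docs = [{"color": "red"}, {"color": "blue"}, {"color": "red"}]
bucket = filter_bucket(docs, "color", "red")
print(bucket)
```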