Field Data Types in Elasticsearch
Introduction
In Elasticsearch, every field in a document has a data type, which is set in the index mapping. Choosing the right type matters because it determines how a value is stored, how it is indexed, and which queries can run against it. This guide covers the main field data types available in Elasticsearch, with a short description and an example mapping for each.
1. String Data Types
Elasticsearch provides two main string data types:
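- Text: Used for full-text search. The value is analyzed, meaning it is split into tokens (typically lowercased words) before it is indexed.
- Keyword: Used for exact-value search, sorting, and aggregations. The value is not analyzed and is indexed as a single term, exactly as given.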
Example:
{ "mappings": { "properties": { "title": { "type": "text" }, "tags": { "type": "keyword" } } } }
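To make the difference concrete, the following Python sketch imitates what happens to a value at index time. The function names are hypothetical, and `simple_analyze` is a heavily simplified stand-in for an analyzer (lowercase, then split on non-alphanumeric characters); the analyzers in Elasticsearch do considerably more.

```python
import re


def simple_analyze(value):
    """Rough stand-in for text analysis: lowercase, then split on non-alphanumerics."""
    return [token for token in re.split(r"[^0-9a-z]+", value.lower()) if token]


def keyword_terms(value):
    """A keyword field indexes the whole value as a single, unchanged term."""
    return [value]


title = "Quick Brown Fox"
text_terms = simple_analyze(title)    # individual lowercase tokens
exact_terms = keyword_terms(title)    # one term, original casing kept

# A search for the single word "fox" can match one of the text terms,
# but only the full original string matches the keyword term.
print("fox" in text_terms, "fox" in exact_terms)
```

This is why a field such as tags, which is filtered by exact value, is usually mapped as keyword, while free-form text such as title is mapped as text.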
2. Numeric Data Types
Numeric data types are used to store numerical values. They include:
- Integer: A signed 32-bit whole number.
- Long: A signed 64-bit whole number, for values beyond the integer range.
- Float: A single-precision (32-bit) floating-point number.
- Double: A double-precision (64-bit) floating-point number.
- Short: A signed 16-bit whole number.
- Byte: A signed 8-bit whole number.
Example:
{ "mappings": { "properties": { "age": { "type": "integer" }, "price": { "type": "float" } } } }
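When choosing between the whole-number types, it can help to check which is the narrowest one that still covers the expected values. The sketch below does this in Python; the helper `smallest_integer_type` and the sample bounds are hypothetical and exist only for illustration.

```python
# Inclusive value ranges of the signed whole-number types, narrowest first.
INTEGER_TYPE_RANGES = [
    ("byte", -2**7, 2**7 - 1),
    ("short", -2**15, 2**15 - 1),
    ("integer", -2**31, 2**31 - 1),
    ("long", -2**63, 2**63 - 1),
]


def smallest_integer_type(min_value, max_value):
    """Return the narrowest type whose range covers [min_value, max_value]."""
    for type_name, low, high in INTEGER_TYPE_RANGES:
        if low <= min_value and max_value <= high:
            return type_name
    raise ValueError("values do not fit in a signed 64-bit integer")


age_type = smallest_integer_type(0, 150)               # 150 exceeds the byte maximum of 127
views_type = smallest_integer_type(0, 5_000_000_000)   # exceeds the integer maximum of 2147483647
print(age_type, views_type)
```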
3. Date Data Types
Date data types are used to store dates and times. Elasticsearch supports various date formats, including:
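- Strict date optional time: The default format (strict_date_optional_time, with epoch_millis as a fallback). The time part is optional; a full value follows the pattern yyyy-MM-dd'T'HH:mm:ss.SSSZ, e.g., 2024-01-15T09:30:00.000Z.
- Custom formats: You can define custom date formats using patterns.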
Example:
{ "mappings": { "properties": { "created_at": { "type": "date" } } } }
You can also specify a custom format:
{ "mappings": { "properties": { "created_at": { "type": "date", "format": "yyyy/MM/dd HH:mm:ss" } } } }
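A client that writes to a field with a custom format has to send strings in exactly that shape. As a minimal sketch, assuming a Python client, the pattern yyyy/MM/dd HH:mm:ss corresponds to the strftime pattern below. The helper name and the sample timestamp are hypothetical.

```python
from datetime import datetime

# Pattern from the mapping:              yyyy/MM/dd HH:mm:ss
# Equivalent strftime pattern in Python: %Y/%m/%d %H:%M:%S
CREATED_AT_FORMAT = "%Y/%m/%d %H:%M:%S"


def format_created_at(moment):
    """Render a datetime in the shape the custom-format mapping expects."""
    return moment.strftime(CREATED_AT_FORMAT)


# Document body ready to be indexed into the "created_at" field.
document = {"created_at": format_created_at(datetime(2024, 1, 15, 9, 30, 0))}
print(document)
```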
4. Boolean Data Type
The boolean data type is used to store true or false values.
Example:
{ "mappings": { "properties": { "is_active": { "type": "boolean" } } } }
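Besides JSON true and false, a boolean field also accepts the strings "true" and "false", and treats an empty string as false. The Python sketch below models that interpretation. The function name is hypothetical, and rejecting every other value is an assumption made here for illustration, not a statement about exact server behavior.

```python
def interpret_boolean(value):
    """Model how a boolean field interprets an incoming value."""
    if value is True or value == "true":
        return True
    if value is False or value in ("false", ""):
        return False
    # Assumption: anything else (e.g. "yes", 1) is treated as invalid input.
    raise ValueError(f"not a valid boolean value: {value!r}")


is_active = interpret_boolean("true")
is_archived = interpret_boolean("")
print(is_active, is_archived)
```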
5. Binary Data Type
The binary data type is used to store binary data encoded as a Base64 string. By default, the field is not stored and is not searchable.
Example:
{ "mappings": { "properties": { "file_data": { "type": "binary" } } } }
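Because the field expects a Base64 string, raw bytes have to be encoded on the client before indexing. A minimal Python sketch, using placeholder bytes in place of real file content:

```python
import base64

raw_bytes = b"hello"  # placeholder for real file content

# b64encode returns bytes without embedded newlines; decode to get a JSON-safe string.
encoded = base64.b64encode(raw_bytes).decode("ascii")

# Document body ready to be indexed into the "file_data" field.
document = {"file_data": encoded}

# Reading the value back requires the reverse step.
decoded = base64.b64decode(document["file_data"])
print(encoded)
```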
6. Range Data Types
Range data types are used to store ranges of values. They include:
- Integer Range: For ranges of integer values.
- Float Range: For ranges of float values.
- Long Range: For ranges of long values.
- Double Range: For ranges of double values.
- Date Range: For ranges of dates.
Example:
{ "mappings": { "properties": { "price_range": { "type": "float_range" } } } }
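A range field value is written as an object with lower and upper bounds, using the keys gt or gte and lt or lte. The Python sketch below shows such a document and a hypothetical helper, `range_contains`, that models the question "does this stored range contain a given value?". It is an illustration of the bound semantics, not Elasticsearch's implementation.

```python
import operator

# Each bound key maps to the comparison that (bound, value) must satisfy.
BOUND_CHECKS = {
    "gt": operator.lt,    # exclusive lower bound: bound < value
    "gte": operator.le,   # inclusive lower bound: bound <= value
    "lt": operator.gt,    # exclusive upper bound: bound > value
    "lte": operator.ge,   # inclusive upper bound: bound >= value
}


def range_contains(stored_range, value):
    """Return True if value lies within every bound of stored_range."""
    return all(BOUND_CHECKS[name](bound, value) for name, bound in stored_range.items())


# Document body for the "price_range" field defined in the mapping above.
document = {"price_range": {"gte": 10.0, "lte": 20.0}}

inside = range_contains(document["price_range"], 15.0)
outside = range_contains(document["price_range"], 20.5)
print(inside, outside)
```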
7. Object Data Type
The object data type is used to store JSON objects. Internally the object is flattened, so each inner field is indexed as a separate field under a dotted path such as address.city.
Example:
{ "mappings": { "properties": { "address": { "type": "object", "properties": { "street": { "type": "text" }, "city": { "type": "text" } } } } } }
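The way inner fields of an object end up as separate fields can be pictured as a flattening step that joins the keys with dots. The Python sketch below is a simplified model of that idea; the `flatten` helper and the sample address are hypothetical.

```python
def flatten(source, prefix=""):
    """Flatten nested dicts into a single dict keyed by dotted paths."""
    flat = {}
    for key, value in source.items():
        path = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into inner objects, extending the dotted path.
            flat.update(flatten(value, prefix=f"{path}."))
        else:
            flat[path] = value
    return flat


document = {"address": {"street": "1 Main St", "city": "Springfield"}}
flat_fields = flatten(document)
print(flat_fields)
```

Queries then refer to the inner fields by those dotted names, for example address.city.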
8. Nested Data Type
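The nested data type is a specialized version of the object data type for arrays of objects. Each object in the array is indexed as its own hidden document, so a nested query can match conditions against a single object rather than against values pooled from all objects in the array.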
Example:
{ "mappings": { "properties": { "comments": { "type": "nested", "properties": { "user": { "type": "text" }, "message": { "type": "text" } } } } } }
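The practical difference from the object type shows up with arrays. The Python sketch below models both behaviors on a small list of comments. The function names and sample data are hypothetical, and values are compared as exact strings for simplicity (no text analysis).

```python
comments = [
    {"user": "alice", "message": "great"},
    {"user": "bob", "message": "awful"},
]


def matches_as_object(items, user, message):
    """Object-style: values are pooled per field, so the user/message pairing is lost."""
    users = {item["user"] for item in items}
    messages = {item["message"] for item in items}
    return user in users and message in messages


def matches_as_nested(items, user, message):
    """Nested-style: both conditions must hold within the same object."""
    return any(item["user"] == user and item["message"] == message for item in items)


# alice never wrote "awful", yet the object-style check reports a match.
cross_match_object = matches_as_object(comments, "alice", "awful")
cross_match_nested = matches_as_nested(comments, "alice", "awful")
print(cross_match_object, cross_match_nested)
```

If queries need to respect which user wrote which message, nested is the appropriate type; if the fields are only ever queried independently, object is sufficient.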
9. Geo Data Types
Geo data types are used to store geographical data. They include:
- Geo-point: For storing latitude and longitude.
- Geo-shape: For storing complex shapes like polygons.
Example (Geo-point):
{ "mappings": { "properties": { "location": { "type": "geo_point" } } } }
Example (Geo-shape):
{ "mappings": { "properties": { "area": { "type": "geo_shape" } } } }
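A geo_point value can be supplied in several shapes, and the coordinate order differs between them: the object and string forms put latitude first, while the array form is [lon, lat]. The Python sketch below normalizes these three shapes to a (lat, lon) tuple to make the ordering explicit. The helper name and coordinates are hypothetical, and other accepted forms such as geohash or WKT strings are not handled here.

```python
def to_lat_lon(value):
    """Normalize a few geo_point input shapes to a (lat, lon) tuple."""
    if isinstance(value, dict):
        # Object form: {"lat": ..., "lon": ...}
        return float(value["lat"]), float(value["lon"])
    if isinstance(value, str):
        # String form: "lat,lon"
        lat, lon = value.split(",")
        return float(lat), float(lon)
    if isinstance(value, (list, tuple)):
        # Array form: [lon, lat] (longitude first)
        lon, lat = value
        return float(lat), float(lon)
    raise TypeError(f"unsupported geo_point shape: {value!r}")


as_object = to_lat_lon({"lat": 41.12, "lon": -71.34})
as_string = to_lat_lon("41.12,-71.34")
as_array = to_lat_lon([-71.34, 41.12])
print(as_object, as_string, as_array)
```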
10. IP Data Type
The IP data type is used to store IPv4 and IPv6 addresses.
Example:
{ "mappings": { "properties": { "ip_address": { "type": "ip" } } } }
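A common reason to use the ip type rather than keyword is that queries can target a whole subnet using CIDR notation. The Python sketch below builds such a query body and uses the standard ipaddress module to show which addresses the subnet covers. The helper name and the addresses are hypothetical, and the membership check is only a local illustration of the matching logic.

```python
import ipaddress

# Query body that matches documents whose "ip_address" falls inside the subnet.
subnet_query = {"query": {"term": {"ip_address": "192.168.0.0/16"}}}


def cidr_matches(cidr, address):
    """Return True if address belongs to the network written in CIDR notation."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


cidr = subnet_query["query"]["term"]["ip_address"]
in_subnet = cidr_matches(cidr, "192.168.1.10")
not_in_subnet = cidr_matches(cidr, "10.0.0.1")
print(in_subnet, not_in_subnet)
```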
Conclusion
Understanding and correctly using field data types in Elasticsearch is essential for designing efficient and effective search solutions. This guide covered the major data types and provided an example mapping for each. Mapping each field to the type that fits how it will be queried helps keep storage compact and query results accurate.