Quick links
- Overview and background
- What lookup runtime fields are used for
- How to implement lookup runtime fields
- Notes and good things to know
Overview and background
A runtime field is a field that is evaluated at query time rather than at index time, which lets you adjust your schema at the query stage.
Runtime fields allow you to:
- Define fields for a particular use without changing the basic schema.
- Add fields to existing documents without reindexing your data.
- Start working with your data without fully understanding how it is structured.
- Override the value brought back from an indexed field at query time.
Runtime fields are not indexed, so adding one does not increase the size of the index. In fact, they can speed up ingestion and lower storage costs. On the other hand, a runtime field slows queries down, since a script is executed at query time for every document in the result set.
Runtime fields are accessible from the search API in the same way as other fields, and Elasticsearch treats them the same way. They can be defined either in the index mapping or in the search request.
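To make the two definition points concrete, the following Python sketch builds both request bodies as plain dictionaries: one for an index mapping (the "runtime" section) and one for a search request (the "runtime_mappings" section). The field name full_name, the Painless script, and the assumption that first_name and last_name are keyword fields are all hypothetical and chosen only for illustration.

```python
import json

# Hypothetical Painless script: joins two keyword fields at query time.
# Assumes first_name and last_name exist as keyword fields on every document.
FULL_NAME_SCRIPT = "emit(doc['first_name'].value + ' ' + doc['last_name'].value)"


def scripted_runtime_field(field_type, script_source):
    """Return the definition of a scripted runtime field."""
    return {"type": field_type, "script": {"source": script_source}}


# Option 1: define the field in the index mapping, under "runtime".
index_body = {
    "mappings": {
        "runtime": {
            "full_name": scripted_runtime_field("keyword", FULL_NAME_SCRIPT)
        }
    }
}

# Option 2: define the field only for one search, under "runtime_mappings".
search_body = {
    "runtime_mappings": {
        "full_name": scripted_runtime_field("keyword", FULL_NAME_SCRIPT)
    },
    "fields": ["full_name"],
}

print(json.dumps(search_body, indent=2))
```

The definition itself is identical in both places; only the enclosing key differs, which is why a field can be prototyped in a search request and moved into the mapping later.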
A lookup runtime field is a field in an index whose value is retrieved from another index. The lookup runtime field gives you the ability to create a relationship between documents in different indices.
What lookup runtime fields are used for
Lookup runtime fields can be used to enrich data during the query phase by fetching fields from related indices as part of the same search request. In this way, you can easily enrich data that changes frequently and make informed decisions about when to update your primary index with additional data.
For many users, important data lives in separate indices. This is often due to the constantly changing nature of that data (daily metrics or security logs, for example). Lookup runtime fields let you connect this dynamic data to the more static data you already have, opening up further opportunities for analysis.
How to implement lookup runtime fields
Runtime fields with a type of lookup can retrieve field values from the associated indices using the fields parameter on the _search API.
First, define a runtime field with a type of lookup in the main search request, and specify the following parameters:
- type: must be "lookup".
- target_index: represents the index from which we want to retrieve field values, and against which the lookup query runs.
- input_field: represents the field on the primary index whose values are utilized as the lookup term query’s input values.
- target_field: represents the field that the lookup query searches against on the lookup target_index.
- fetch_fields: represents the fields that need to be retrieved from the lookup target_index.
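The parameters above can be assembled programmatically. The sketch below is a minimal Python helper, not part of any Elasticsearch client library; the function names lookup_runtime_field and search_body_with_lookup are hypothetical. It only builds the JSON body and does not send a request.

```python
import json


def lookup_runtime_field(target_index, input_field, target_field, fetch_fields):
    """Build the definition of a lookup runtime field from its four parameters."""
    if not fetch_fields:
        raise ValueError("fetch_fields must name at least one field")
    return {
        "type": "lookup",                  # always "lookup" for this field type
        "target_index": target_index,      # index the lookup query runs against
        "input_field": input_field,        # field on the main index supplying values
        "target_field": target_field,      # field on the target index to match
        "fetch_fields": list(fetch_fields),
    }


def search_body_with_lookup(name, definition, fields):
    """Wrap the definition in a search body that also requests the lookup field."""
    return {
        "runtime_mappings": {name: definition},
        "fields": list(fields) + [name],
        "_source": False,
    }


author_name = lookup_runtime_field(
    target_index="authors",
    input_field="id",
    target_field="book_id",
    fetch_fields=["first_name", "last_name"],
)
body = search_body_with_lookup("author_name", author_name, ["id", "title"])
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of a _search request with whichever HTTP or Elasticsearch client you already use.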
The example below uses a lookup runtime field to join two indices, where the target index stands in for data that changes frequently (though in this example it is, of course, unlikely that authors would change their names often).
We have two indices. The authors index has three fields: the author's first name, last name, and a book ID. The books index has two fields: the book ID and the book title.
For each book, we want to retrieve an author_name value made up of the first_name and last_name fields of the document in the authors index that has the same book ID. As a result, each hit will contain the book ID and title along with the full name of the book's author.
POST authors/_doc?refresh
{
  "book_id": "113606",
  "first_name": "Mark",
  "last_name": "Kim"
}

PUT books/_doc/1?refresh
{
  "id": "113606",
  "title": "machine learning"
}

PUT books/_doc/2?refresh
{
  "id": "142480",
  "title": "deep learning"
}

POST books/_search
{
  "runtime_mappings": {
    "author_name": {
      "type": "lookup",
      "target_index": "authors",
      "input_field": "id",
      "target_field": "book_id",
      "fetch_fields": ["first_name", "last_name"]
    }
  },
  "fields": ["id", "title", "author_name"],
  "_source": false
}
In the example, we defined a runtime field called author_name in the main search request with a type of lookup. It retrieves the first_name and last_name fields from the target index (authors) using term queries.
The target index, where the lookup query executes, is the authors index, which holds the fields to be retrieved. The input field, id, is the field on the main index whose values are used as the input values of the lookup term query. The target field, book_id, is the field on the lookup index that the lookup query searches against. The fetch fields, first_name and last_name, are the fields to be retrieved from the lookup index.
The search above returns the following hits:
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 2,
      "relation": "eq"
    },
    "max_score": 1.0,
    "hits": [
      {
        "_index": "books",
        "_id": "1",
        "_score": 1.0,
        "fields": {
          "id": ["113606"],
          "author_name": [
            {
              "first_name": ["Mark"],
              "last_name": ["Kim"]
            }
          ],
          "title": ["machine learning"]
        }
      },
      {
        "_index": "books",
        "_id": "2",
        "_score": 1.0,
        "fields": {
          "id": ["142480"],
          "title": ["deep learning"]
        }
      }
    ]
  }
}
As the returned hits show, the search returns first_name and last_name from the authors index for each book ID that has a matching author. The second book has no matching document in the authors index, so its hit contains no author_name field. To keep each document independent from the lookup index, the responses of lookup fields are grouped under the lookup field of each hit. The lookup query for each input value is expected to match at most one document in the lookup index. If more than one document matches, an arbitrary one is chosen.
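The matching behaviour described above can be mimicked in a few lines of Python. This is only an in-memory simulation under simplifying assumptions (exact equality stands in for the term query, and the first match stands in for the arbitrary pick); it is not how Elasticsearch implements the feature, and the function name resolve_lookup is hypothetical.

```python
def resolve_lookup(hit_fields, name, lookup, target_docs):
    """Return a copy of one hit's fields, adding the lookup field if anything matched."""
    result = dict(hit_fields)
    looked_up = []
    for value in hit_fields.get(lookup["input_field"], []):
        # Stand-in for a term query on target_field in the target index.
        matches = [d for d in target_docs if d.get(lookup["target_field"]) == value]
        if not matches:
            continue  # no match: this input value contributes nothing
        doc = matches[0]  # Elasticsearch would pick an arbitrary match here
        looked_up.append({f: [doc[f]] for f in lookup["fetch_fields"] if f in doc})
    if looked_up:
        result[name] = looked_up
    return result


authors = [{"book_id": "113606", "first_name": "Mark", "last_name": "Kim"}]
books = [
    {"id": ["113606"], "title": ["machine learning"]},
    {"id": ["142480"], "title": ["deep learning"]},
]
lookup = {
    "input_field": "id",
    "target_field": "book_id",
    "fetch_fields": ["first_name", "last_name"],
}

enriched = [resolve_lookup(book, "author_name", lookup, authors) for book in books]
for fields in enriched:
    print(fields)
```

Running the simulation on the sample data gives the first book an author_name entry and leaves the second book unchanged, mirroring the two hits in the response.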
Notes and good things to know
- The capability to add fields to documents after they have been ingested is the main advantage of runtime fields.
- Runtime fields save disk space and give you more flexibility in the way you access your data, but depending on the calculations made in the runtime script, they may have an adverse effect on search performance.
- If the target index doesn't change much, a better option is to use the enrich processor instead of runtime fields for enrichment.
- Lookup runtime fields are only available from Elasticsearch 8.2 onwards, and there is currently no equivalent in OpenSearch.
- The size of the index is not increased by adding a runtime field because runtime fields are not indexed. You can reduce storage costs and speed up ingestion by defining fields as runtime fields directly in the index mapping instead of indexing them.
- When a dependent query is ongoing, updating or removing a runtime field could cause inconsistent results. Depending on when the mapping update takes place, each shard can have access to different copies of the script.
- If a runtime field is deleted or updated, it may break existing searches or visualizations in Kibana.
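For the enrich processor alternative mentioned in the notes above, the following Python sketch shows roughly what the two request bodies could look like for the books and authors example: an enrich policy (sent to the _enrich/policy API) and an ingest pipeline that uses it. The policy name authors-policy and the target field author are hypothetical, and the sketch only builds the bodies without sending them.

```python
import json

# Hypothetical policy name, used only for illustration.
POLICY_NAME = "authors-policy"

# Body for creating a match enrich policy: copy first_name and last_name
# from the authors index, matching on its book_id field.
enrich_policy = {
    "match": {
        "indices": "authors",
        "match_field": "book_id",
        "enrich_fields": ["first_name", "last_name"],
    }
}

# Body for an ingest pipeline that applies the policy to incoming book
# documents, matching on the book's id field.
ingest_pipeline = {
    "processors": [
        {
            "enrich": {
                "policy_name": POLICY_NAME,
                "field": "id",             # field on the incoming document
                "target_field": "author",  # hypothetical field that receives the match
                "max_matches": 1,
            }
        }
    ]
}

print(json.dumps({"policy": enrich_policy, "pipeline": ingest_pipeline}, indent=2))
```

Unlike a lookup runtime field, this approach writes the author data into each book document at ingest time, so later changes in the authors index are not reflected until the policy is executed again and the affected documents are reindexed.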