Briefly, this error occurs when Elasticsearch fails to generate a mapping source due to incorrect syntax or invalid data types. This could be due to a misconfiguration in the mapping settings or a mismatch between the data type and the field type. To resolve this issue, you should first check the syntax of your mapping source. Ensure that all fields have the correct data types and that there are no missing or extra commas or brackets. If the error persists, try to simplify your mapping source by removing unnecessary fields or reducing its complexity.
This guide will help you check for common problems that cause the log "Failed to generate [" + mappingSource + "]" to appear. To understand the issues related to this log, read the explanation below about the following Elasticsearch concepts: admin, mapping, source, indices.
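As a first check, the syntax of a mapping body can be validated locally before it is sent to the cluster. The following is a minimal sketch in Python using only the standard library; the function name check_mapping_json and the sample strings are illustrative and not part of Elasticsearch. It only detects JSON syntax problems such as stray commas or unbalanced brackets, not invalid field types.

```python
import json


def check_mapping_json(raw: str) -> str:
    """Return 'ok', or a message locating the first JSON syntax error."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError as err:
        return f"syntax error at line {err.lineno}, column {err.colno}: {err.msg}"
    if not isinstance(parsed, dict):
        return "mapping body must be a JSON object"
    return "ok"


good = '{"properties": {"age": {"type": "integer"}}}'
bad = '{"properties": {"age": {"type": "integer"},}}'  # stray trailing comma

print(check_mapping_json(good))
print(check_mapping_json(bad))
```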
Overview
A mapping is similar to a database schema: it defines the properties of each field in the index. These properties include the data type of each field and how the field is tokenized and indexed. The mapping may also contain advanced properties for each field that configure options exposed by Lucene and Elasticsearch.
You can create the mapping of an index using the _mapping REST endpoint. The first time Elasticsearch encounters a field whose mapping is not predefined in the index, it guesses the data type and analyzer of that field and adds the field to the mapping with default settings. For example, if you index an integer value without predefining the mapping, Elasticsearch maps that field as long.
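The guessing step can be pictured with a simplified Python sketch. It is an approximation for illustration only, not the Elasticsearch implementation: real dynamic mapping also applies date detection to strings, inspects array elements, and honours dynamic templates. The function name guess_dynamic_type is hypothetical.

```python
def guess_dynamic_type(value):
    """Roughly mimic the default type chosen for a new, unmapped JSON value."""
    # bool must be tested before int because bool is a subclass of int in Python
    if isinstance(value, bool):
        return {"type": "boolean"}
    if isinstance(value, int):
        return {"type": "long"}
    if isinstance(value, float):
        return {"type": "float"}
    if isinstance(value, str):
        # Default for strings: a text field plus a keyword sub-field
        return {
            "type": "text",
            "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
        }
    if isinstance(value, dict):
        return {"properties": {k: guess_dynamic_type(v) for k, v in value.items()}}
    raise ValueError(f"value of type {type(value).__name__} is not covered by this sketch")


document = {"name": "Alice", "age": 30, "active": True}
for field, value in document.items():
    print(field, "->", guess_dynamic_type(value))
```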
Examples
Create an index with predefined mapping:
PUT /my_index?pretty
{
  "settings": {
    "number_of_shards": 1
  },
  "mappings": {
    "properties": {
      "name": { "type": "text" },
      "age": { "type": "integer" }
    }
  }
}
Create mapping in an existing index:
PUT /my_index/_mapping?pretty
{
  "properties": {
    "email": { "type": "keyword" }
  }
}
View the mapping of an existing index:
GET my_index/_mapping?pretty
View the mapping of an existing field:
GET /my_index/_mapping/field/name?pretty
Notes
- It is not possible to change the data type of an existing field. If a field is mapped with the wrong type, the only option is to create a new index with the corrected mapping and reindex the data into it.
- Elasticsearch 7.0 deprecated document types and made _doc the default document type. Version 8.0 removed document types completely.
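The re-create and reindex workflow from the first note comes down to two requests: create a new index with the corrected mapping, then copy the documents with the _reindex API. The Python sketch below only builds and prints those request bodies; it does not contact a cluster. The target index name my_index_v2 and the helper reindex_plan are hypothetical.

```python
import json


def reindex_plan(old_index, new_index, new_mappings):
    """Return the (method, path, body) requests needed to change a field type."""
    return [
        # 1. Create the new index with the corrected mapping
        ("PUT", f"/{new_index}", {"mappings": new_mappings}),
        # 2. Copy all documents from the old index into the new one
        ("POST", "/_reindex", {"source": {"index": old_index}, "dest": {"index": new_index}}),
    ]


steps = reindex_plan(
    "my_index",
    "my_index_v2",
    {"properties": {"name": {"type": "text"}, "age": {"type": "long"}}},
)
for method, path, body in steps:
    print(method, path)
    print(json.dumps(body, indent=2))
```

After the reindex completes, clients can be pointed at the new index, for example through an alias.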
Common problems
- The most common problem is an incorrectly defined mapping that limits the functionality of a field. For example, if a string field is mapped as text, you cannot use that field for aggregations, sorting or exact-match filters unless fielddata is enabled on it. Similarly, if a string field is indexed dynamically without a predefined mapping, Elasticsearch creates two fields internally: a text field for full-text search and a keyword sub-field for exact matches. In most cases only one of the two is needed, so the other wastes space.
- In versions before 6.0, Elasticsearch automatically creates an _all field inside the mapping and copies the values of every field of a document into it. This field is used to search text without specifying a field name. On those versions, disable the _all field in production environments to avoid wasting space. The _all field was deprecated in 6.0, and support for it was removed in 7.0.
- In versions lower than 6.0, it was possible to create multiple document types inside an index, similar to creating multiple tables inside a database. In those versions, data type conflicts were more likely when different document types contained the same field name with different data types.
- The mapping of each index is part of the cluster state and is managed by master nodes. If the mapping is too big, meaning there are thousands of fields in the index, the cluster state grows too large to handle efficiently. This problem is known as mapping explosion and it slows down the whole cluster.
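Two of the problems above, oversized mappings and text fields that carry an unneeded keyword sub-field, can be spotted by walking the output of GET my_index/_mapping. The following Python sketch assumes the "properties" section of that response is already loaded as a dictionary. The helper audit_mapping and the sample mapping are hypothetical, and the count only approximates what Elasticsearch counts toward the index.mapping.total_fields.limit setting (default 1000).

```python
def audit_mapping(properties, prefix=""):
    """Return (field_count, paths of text fields that also have a keyword sub-field)."""
    count = 0
    text_plus_keyword = []
    for name, spec in properties.items():
        path = prefix + name
        sub_fields = spec.get("fields", {})
        count += 1 + len(sub_fields)  # the field itself plus its multi-fields
        if spec.get("type") == "text" and any(
            sub.get("type") == "keyword" for sub in sub_fields.values()
        ):
            text_plus_keyword.append(path)
        if "properties" in spec:  # object field: recurse into its children
            nested_count, nested_hits = audit_mapping(spec["properties"], path + ".")
            count += nested_count
            text_plus_keyword.extend(nested_hits)
    return count, text_plus_keyword


sample = {
    "name": {"type": "text", "fields": {"keyword": {"type": "keyword", "ignore_above": 256}}},
    "age": {"type": "integer"},
    "address": {
        "properties": {
            "city": {"type": "text", "fields": {"keyword": {"type": "keyword"}}}
        }
    },
}

field_count, duplicated = audit_mapping(sample)
print("fields:", field_count)
print("text fields with a keyword sub-field:", duplicated)
```

Each field reported in the second list is a candidate for an explicit mapping as either text or keyword, depending on how it is queried.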
Overview
When a document is sent for indexing, Elasticsearch indexes all the fields in the format of an inverted index, but it also keeps the original JSON document in a special field called _source.
Examples
Disabling source field in the index:
PUT /api-logs?pretty
{
  "mappings": {
    "_source": {
      "enabled": false
    }
  }
}
Store only selected fields as a part of _source field:
PUT api-logs
{
  "mappings": {
    "_source": {
      "includes": [ "*.count", "error_info.*" ],
      "excludes": [ "error_info.traceback_message" ]
    }
  }
}
Including only selected fields using source filtering:
GET api-logs/_search
{
  "query": {
    "match_all": {}
  },
  "_source": {
    "includes": [ "api_name", "status_code", "*id" ]
  }
}
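To get a feel for how includes and excludes patterns select fields, the Python sketch below applies wildcard patterns to the dotted paths of a document. It is a rough approximation built on the standard fnmatch module, not the matching logic Elasticsearch actually uses, and the function names and sample document are hypothetical.

```python
from fnmatch import fnmatchcase


def _matches(path, patterns):
    """True if the path, or any parent object path, matches one of the patterns."""
    parts = path.split(".")
    prefixes = [".".join(parts[: i + 1]) for i in range(len(parts))]
    return any(fnmatchcase(p, pattern) for p in prefixes for pattern in patterns)


def filter_source(doc, includes=(), excludes=(), prefix=""):
    """Keep leaf fields that match includes (if any are given) and no excludes."""
    result = {}
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            child = filter_source(value, includes, excludes, path + ".")
            if child:  # drop objects that end up empty
                result[key] = child
        elif (not includes or _matches(path, includes)) and not _matches(path, excludes):
            result[key] = value
    return result


doc = {
    "api_name": "login",
    "requests": {"count": 10, "latency_ms": 35},
    "error_info": {"code": 500, "traceback_message": "stack trace text"},
}

filtered = filter_source(
    doc,
    includes=["*.count", "error_info.*"],
    excludes=["error_info.traceback_message"],
)
print(filtered)
```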
Notes
The _source field adds storage overhead but serves several purposes:
- It is returned as part of the response when a search query is executed.
- It is used by the reindex, update and update_by_query operations.
- It is used for highlighting when a field is not stored, that is, when the field is not mapped with store set to true.
- It allows selecting which fields are returned.
The main cost of the _source field is extra disk usage. This can be reduced by setting the index.codec parameter to best_compression, which gives a higher compression ratio at the expense of slower stored-field reads.
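Because index.codec is a static index setting, it is normally supplied when the index is created (or changed while the index is closed). The Python sketch below only builds and prints such a creation body. The helper name is hypothetical and the index name api-logs is reused from the examples above.

```python
import json


def index_body_with_best_compression(mappings):
    """Build an index-creation body that enables the best_compression codec."""
    return {
        "settings": {"index": {"codec": "best_compression"}},
        "mappings": mappings,
    }


body = index_body_with_best_compression(
    {"properties": {"api_name": {"type": "keyword"}}}
)
print("PUT /api-logs")
print(json.dumps(body, indent=2))
```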
Log Context
The log "Failed to generate [" + mappingSource + "]" is emitted by the class PutMappingRequest.java. We extracted the following from the Elasticsearch source code for those seeking in-depth context:
try {
    // Serialize the mapping source map to JSON
    XContentBuilder builder = XContentFactory.contentBuilder(XContentType.JSON);
    builder.map(mappingSource);
    return source(BytesReference.bytes(builder), builder.contentType());
} catch (IOException e) {
    // Serialization failed: wrap the cause in the exception that produces this log
    throw new ElasticsearchGenerationException("Failed to generate [" + mappingSource + "]", e);
}
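The pattern in this excerpt is: serialize the mapping source map to JSON, and wrap any serialization failure in a dedicated exception whose message embeds the map. A loose Python analogue is sketched below to show the kind of input that cannot be serialized. It is not the Elasticsearch implementation; MappingGenerationError is a hypothetical stand-in for ElasticsearchGenerationException.

```python
import json


class MappingGenerationError(Exception):
    """Hypothetical stand-in for ElasticsearchGenerationException."""


def mapping_source_to_json(mapping_source):
    """Serialize a mapping dictionary, wrapping failures like the Java code does."""
    try:
        return json.dumps(mapping_source)
    except (TypeError, ValueError) as err:
        raise MappingGenerationError(f"Failed to generate [{mapping_source}]") from err


valid = {"properties": {"age": {"type": "integer"}}}
# A set is not a JSON-serializable value, so this mapping cannot be generated
invalid = {"properties": {"tags": {"type": {"keyword"}}}}

print(mapping_source_to_json(valid))
try:
    mapping_source_to_json(invalid)
except MappingGenerationError as err:
    print(err)
```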