Field types in Elasticsearch mappings

Elasticsearch (ES) supports most of the data types found in Java:

(I) Core data types:

(1) string: analyzed (tokenized) by default. A complete example follows:

"status": {
    "type": "string", // string type
    "index": "analyzed", // analyzed (tokenized); use not_analyzed to index the value as a single term, or no to leave the field unindexed
    "analyzer": "ik", // the analyzer to use
    "boost": 1.23, // field-level score boost
    "doc_values": false, // enabled by default for not_analyzed fields and unavailable for analyzed fields; speeds up sorting and aggregations and saves heap memory
    "fielddata": { "format": "disabled" }, // fielddata is what analyzed fields use for sorting or aggregations; for not_analyzed fields, doc_values is recommended instead
    "fields": { "raw": { "type": "string", "index": "not_analyzed" } }, // index the same value in several ways, e.g. once analyzed and once not analyzed
    "ignore_above": 100, // strings longer than 100 characters are ignored and not indexed
    "include_in_all": true, // whether the field is included in the _all field; defaults to true unless index is set to no
    "index_options": "docs", // four options: docs (document number), freqs (document number + term frequency), positions (document number + term frequency + position, typically used for proximity queries), offsets (document number + term frequency + position + offset, typically used for highlighting); analyzed fields default to positions, all others to docs
    "norms": { "enabled": true, "loading": "lazy" }, // the default for analyzed fields; not_analyzed fields default to {"enabled": false}. Norms store the length factor and the index-time boost; enable them only on fields used for scoring, because they increase memory consumption
    "null_value": "NULL", // value substituted for a missing (null) field; must be a string here, and on an analyzed field the null value is analyzed too
    "position_increment_gap": 0, // affects phrase and proximity queries; can be set on analyzed fields that hold multiple values, and a slop can be given at query time; the default is 100
    "store": false, // whether the field is stored separately from _source; the default is false, so the field can be searched but its value cannot be fetched on its own
    "search_analyzer": "ik", // the analyzer used at search time; defaults to the value of analyzer. For example, index with standard + ngram and search with standard to implement autocomplete
    "similarity": "BM25", // per-field scoring algorithm; the default is TF/IDF. Only applies to analyzed string fields
    "term_vector": "no" // term vectors are not stored by default; options are yes (terms), with_positions (terms + positions), with_offsets (terms + offsets), with_positions_offsets (terms + positions + offsets). The fast vector highlighter uses them to speed up highlighting, but they enlarge the index and are a poor fit for large data volumes
}
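The listing above is annotated with // comments, which strict JSON does not allow. As a minimal sketch, the same kind of mapping can be built as a Python dict and serialized to valid JSON. The choice of parameters is illustrative, and the "ik" analyzer is assumed to be installed as a plugin, as in the listing.

```python
import json

# ES 2.x-style mapping for an analyzed string field with a not_analyzed
# sub-field. Only a few of the parameters from the listing are shown.
# Assumption: the "ik" analyzer plugin is installed on the cluster.
status_mapping = {
    "status": {
        "type": "string",
        "index": "analyzed",
        "analyzer": "ik",
        "search_analyzer": "ik",
        "fields": {
            "raw": {
                "type": "string",
                "index": "not_analyzed",
                "ignore_above": 100,
            }
        },
    }
}

# Serialize to strict JSON, ready to be used as a mapping body.
body = json.dumps({"properties": status_mapping}, indent=2)
print(body)
```

Keeping the mapping in code rather than in a commented listing means the result always parses as JSON.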




(2) Numeric types. The main ones are:
long: 64-bit storage 
integer: 32-bit storage 
short: 16-bit storage 
byte: 8-bit storage 
double: 64-bit double-precision storage 
float: 32-bit single-precision storage 
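The bit widths above determine which values each integer type can hold. The following is a small sketch, not part of ES, that picks the narrowest integer type for a known value range; the helper name smallest_integer_type is made up for this illustration, and the ranges assume signed two's-complement storage as in Java.

```python
# Integer types from the list above with their storage width in bits.
INTEGER_TYPES = [("byte", 8), ("short", 16), ("integer", 32), ("long", 64)]


def smallest_integer_type(lo, hi):
    """Return the narrowest type name whose signed range covers [lo, hi]."""
    for name, bits in INTEGER_TYPES:
        if -(2 ** (bits - 1)) <= lo and hi <= 2 ** (bits - 1) - 1:
            return name
    raise ValueError("range does not fit in 64 bits")


print(smallest_integer_type(0, 100))      # byte
print(smallest_integer_type(0, 40000))    # integer
print(smallest_integer_type(-1, 2 ** 40)) # long
```

Choosing the narrowest type that fits is a common way to keep the index small.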

Supported parameters:

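  coerce: true/false. When the data is not clean, ES tries to convert it to a suitable numeric type: strings are coerced to numbers, floating-point values are truncated to integers, and latitude/longitude values are converted to the standard form
  boost: index-time boost factor
  doc_values: whether doc_values is enabled
  ignore_malformed: false (a malformed number raises an exception) or true (the malformed value is ignored)
  include_in_all: whether the field is included in the _all field
  index: not_analyzed (the default; numbers are not analyzed)
  null_value: the number substituted for a null value
  precision_step: 16. Indexes extra terms for each value to speed up range queries on numeric fields, at the cost of a larger index
  store: whether the actual value is stored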
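To make the interaction of coerce and ignore_malformed concrete, here is a toy model in Python of how an integer field might treat an incoming value. It only mirrors the behavior described in the list above; it is not ES code, and the function name index_integer is invented for this sketch.

```python
def index_integer(value, coerce=True, ignore_malformed=False):
    """Toy model of an integer field receiving a value.

    Returns the integer that would be indexed, or None when the value is
    skipped. Raises ValueError when the document would be rejected.
    """
    try:
        if isinstance(value, bool):
            raise ValueError("booleans are not numbers")
        if isinstance(value, int):
            return value
        if not coerce:
            raise ValueError("coerce is off: %r is not an integer" % (value,))
        if isinstance(value, str):
            value = float(value)  # "5" -> 5.0, "5.7" -> 5.7
        return int(value)         # drops the fraction: 5.7 -> 5
    except (TypeError, ValueError, OverflowError):
        if ignore_malformed:
            return None           # malformed value is silently skipped
        raise


print(index_integer("5"))                           # 5
print(index_integer(5.7))                           # 5
print(index_integer("abc", ignore_malformed=True))  # None
```

With both flags at the values shown in the signature, "abc" raises an error, which corresponds to the document being rejected.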



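(3) Complex types

Array type: there is no dedicated array field type. Any field can hold zero or more values, provided that all of the values have the same type.
Object type: stores hierarchical, JSON-like data
Nested type: supports arrays of objects (Array[Object]), which can be nested to several levels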
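As a sketch of how the three complex types look side by side, the ES 2.x-style mapping below uses a hypothetical blog-post document: "tags" is an array (which needs no special type), "author" is an object, and "comments" is a nested array of objects. All field names are invented for the illustration.

```python
# Hypothetical mapping; field names are made up for this example.
mapping = {
    "properties": {
        # An array field is declared with the type of its elements.
        "tags": {"type": "string", "index": "not_analyzed"},
        # An object field holds sub-fields under "properties".
        "author": {
            "type": "object",
            "properties": {"name": {"type": "string"}},
        },
        # A nested field holds an array of objects.
        "comments": {
            "type": "nested",
            "properties": {
                "user": {"type": "string", "index": "not_analyzed"},
                "stars": {"type": "integer"},
            },
        },
    }
}

doc = {
    "tags": ["search", "lucene"],
    "author": {"name": "alice"},
    "comments": [{"user": "bob", "stars": 5}, {"user": "carol", "stars": 3}],
}


def same_type(values):
    """Check the array rule: all values must share one type."""
    return len({type(v) for v in values}) <= 1


print(same_type(doc["tags"]))   # True
print(same_type(["a", 1]))      # False
```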

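(4) Geo types

geo-point type (geo_point): stores latitude/longitude pairs and supports distance and range searches
geo-shape type (geo_shape): supports searches over arbitrary shapes, such as rectangles and planar polygons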
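A distance search on a geo_point field could look roughly like the following, written as Python dicts in the ES 2.x query style. The field name "location", the coordinates, and the 10km radius are assumptions made for the example.

```python
import json

# Mapping for a hypothetical "location" field.
mapping = {"properties": {"location": {"type": "geo_point"}}}

# A document stores the point as latitude and longitude.
doc = {"location": {"lat": 39.9, "lon": 116.4}}

# Find documents within 10km of a given point.
search_body = {
    "query": {
        "bool": {
            "must": {"match_all": {}},
            "filter": {
                "geo_distance": {
                    "distance": "10km",
                    "location": {"lat": 40.0, "lon": 116.3},
                }
            },
        }
    }
}

print(json.dumps(search_body, indent=2))
```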

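(5) Specialized types
ipv4 type (ip): stores IP addresses; ES converts them to long internally
completion type: uses an FST (finite state transducer) to provide prefix suggestions
token_count type: counts the number of tokens in a field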
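Plugin-provided types: the mapper-murmur3 plugin adds a murmur3 field type that stores a hash of the field value; the mapper-size plugin (sudo bin/plugin install mapper-size) adds the _size field, which records the size of the _source data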
Attachment type: requires the open-source ES plugin at https://github.com/elastic/elasticsearch-mapper-attachments; it can store Office documents, HTML, and similar formats
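The idea of storing an IPv4 address as a long can be illustrated with Python's standard ipaddress module. This shows only the numeric encoding, not how ES implements it internally; the helper names are made up for the sketch.

```python
import ipaddress


def ipv4_to_long(address):
    """Convert a dotted-quad IPv4 string to its integer value."""
    return int(ipaddress.IPv4Address(address))


def long_to_ipv4(number):
    """Convert an integer back to a dotted-quad IPv4 string."""
    return str(ipaddress.IPv4Address(number))


print(ipv4_to_long("192.168.1.1"))   # 3232235777
print(long_to_ipv4(3232235777))      # 192.168.1.1

# Once addresses are numbers, a range check is a plain comparison.
low, high = ipv4_to_long("192.168.0.0"), ipv4_to_long("192.168.255.255")
print(low <= ipv4_to_long("192.168.1.1") <= high)  # True
```

This is why range queries over IP addresses can be served like any other numeric range query.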

(6) Multi-fields:
The value of a single field can be indexed with several analyzers by using the fields parameter, which most ES data types support
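As a sketch of why multi-fields are useful, the request body below runs a full-text match against an analyzed field while sorting and aggregating on its not_analyzed sub-field. It assumes a "status" field mapped with a "raw" sub-field, as in the string example earlier; the aggregation name "by_status" and the query text are made up.

```python
import json

# Assumption: "status" is analyzed and has a not_analyzed sub-field "raw".
FIELD = "status"
RAW = FIELD + ".raw"  # sub-fields are addressed as <field>.<sub-field>

search_body = {
    # Full-text search uses the analyzed version of the value.
    "query": {"match": {FIELD: "order shipped"}},
    # Sorting and aggregating use the exact, not_analyzed version.
    "sort": [{RAW: {"order": "asc"}}],
    "aggs": {"by_status": {"terms": {"field": RAW}}},
}

print(json.dumps(search_body, indent=2))
```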


(II) Mapping parameter list (parameters already covered above are not explained again):

No.  Name  Description
1  copy_to  Works like copyField in Solr: copies the value of a field into a combined field
2  properties  Mapping types, object fields, and nested fields can contain sub-fields, which are declared under properties
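A minimal sketch of both parameters, again as an ES 2.x-style mapping in a Python dict: "title" and "body" are copied into a combined "all_text" field through copy_to, and "author" declares its sub-fields under properties. The field names and the helper field_paths are hypothetical and exist only for this illustration.

```python
# Hypothetical mapping; field names are made up for this example.
mapping = {
    "properties": {
        "title": {"type": "string", "copy_to": "all_text"},
        "body": {"type": "string", "copy_to": "all_text"},
        "all_text": {"type": "string"},
        "author": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "age": {"type": "integer"},
            },
        },
    }
}


def field_paths(properties, prefix=""):
    """List the full dotted path of every leaf field in a mapping."""
    paths = []
    for name, spec in properties.items():
        path = prefix + name
        if "properties" in spec:
            # Descend into sub-fields declared under "properties".
            paths.extend(field_paths(spec["properties"], path + "."))
        else:
            paths.append(path)
    return paths


print(field_paths(mapping["properties"]))
# ['title', 'body', 'all_text', 'author.name', 'author.age']
```

The dotted paths are the names used to address sub-fields in queries.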




 


Official website documentation: https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html#_multi_fields_2 

 

http://qindongliang.iteye.com/blog/2259541
