Elasticsearch (3): mapping types and analyzers

1. Create some data

POST my_index/my_type/1
{
  "group":"高富帅",
  "id": "1234",
  "sex": 1,
  "attribute": "eating,car,girls",
  "birthday": "1900-12-10",
  "lat_lng": "31.2427760000,121.4903420000"
}

Person 1, the 高富帅 ("tall, rich, handsome" guy), is from the 南京西路东路 metro station

POST my_index/my_type/2
{
  "group":"白富美",
  "id": "2234",
  "sex": 2,
  "attribute": "eat,dog,boys",
  "birthday": "1989-12-10",
  "lat_lng": "31.2433470000,121.5087220000"
}

Person 2, the 白富美 ("fair, rich, beautiful" woman), is from Lujiazui (陆家嘴)

PUT my_index/my_type/3
{
  "group":"小妹妹",
  "id": "3234",
  "sex": 2,
  "attribute": "eat dog boy flowers",
  "birthday": "2010-12-10",
  "lat_lng": "31.2257000000,121.5508340000"
}

Person 3, the 小妹妹 (little girl), is from Century Avenue (世纪大道)

PUT my_index/my_type/4
{
  "group":"空姐",
  "id": "4234",
  "sex": 2,
  "attribute": "eat,dog,girl",
  "birthday": "1995-12-10",
  "lat_lng": "31.1573860000,121.8150200000"
}

Person 4, the 空姐 (flight attendant), is from Pudong Airport (浦东机场)

2. What is a mapping?

Let's see what the mapping of the index we just created looks like:

GET my_index/_mapping/my_type

response:

{
  "my_index": {
    "mappings": {
      "my_type": {
        "properties": {
          "attribute": { "type": "string" },
          "birthday": { "type": "date", "format": "strict_date_optional_time||epoch_millis" },
          "group": { "type": "string" },
          "id": { "type": "string" },
          "lat_lng": { "type": "string" },
          "sex": { "type": "long" }
        }
      }
    }
  }
}

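Clearly, a mapping defines the data type of each field (that is one of its uses). ES is smart enough to determine the types automatically:

"1900-12-10" -> date
1 -> long

Still, expecting it to guess exactly what we mean for every field is asking a bit much. For example:

"1234" -> string
"31.2427760000,121.4903420000" -> string

What I actually want is:

"1234" -> int
"31.2427760000,121.4903420000" -> (latitude, longitude)

How to change the type comes later; before that, let's see which types ES has.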
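The guessing behaviour described above can be mimicked with a few lines of Python. This is only a rough sketch, not Elasticsearch's actual logic: the `guess_type` helper and the `ISO_DATE` pattern are made up for illustration, and real date detection accepts configurable formats.

```python
import re

# Assumption: only ISO-style dates such as "1900-12-10" or
# "1900-12-10T08:30:00Z" count as dates in this sketch.
ISO_DATE = re.compile(
    r"^\d{4}-\d{2}-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?)?$"
)


def guess_type(value):
    """Return the field type a dynamic mapping would roughly pick."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "date" if ISO_DATE.match(value) else "string"
    return "object"  # everything else is lumped together in this sketch


doc = {
    "group": "高富帅",
    "id": "1234",
    "sex": 1,
    "attribute": "eating,car,girls",
    "birthday": "1900-12-10",
    "lat_lng": "31.2427760000,121.4903420000",
}

guessed = {field: guess_type(value) for field, value in doc.items()}
for field, field_type in sorted(guessed.items()):
    print(field, "->", field_type)
```

The point is that a quoted number or a "lat,lon" string is just a string to a guesser that only looks at the JSON value, which is why those fields need an explicit mapping.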

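3. Field Types

• String: string
• Whole number: byte, short, integer, long (default)
• Floating-point: float, double
• Boolean: boolean
• Date: date
• lat/lon points: geo_point

For all types, see: Field datatypes

Specifying a type

The type of an existing field cannot be changed in place, so the old index has to be deleted first!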

DELETE /my_index

Then create it again:

PUT /my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "id": {
          "type": "string" },
        "birthday": {
          "type": "date" },
        "sex": {
          "type": "short" },
        "attribute": {
          "type": "string" },
        "lat_lng": {
          "type": "geo_point" }
      }
    }
  }
}

Check it:

GET my_index/_mapping/my_type
{
  "my_index": {
    "mappings": {
      "my_type": {
        "properties": {
          "attribute": { "type": "string" },
          "birthday": { "type": "date", "format": "strict_date_optional_time||epoch_millis" },
          "id": { "type": "string" },
          "lat_lng": { "type": "geo_point" },
          "sex": { "type": "short" }
        }
      }
    }
  }
}

Now put the four docs from before back in!
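Re-sending four requests by hand works, but the documents could also go in as one request. Below is a small sketch that builds a newline-delimited body of the kind the `_bulk` endpoint accepts; the helper name `build_bulk_body` is made up here, and actually sending the body (for example with `POST /_bulk`) is left out.

```python
import json

docs = {
    "1": {"group": "高富帅", "id": "1234", "sex": 1,
          "attribute": "eating,car,girls", "birthday": "1900-12-10",
          "lat_lng": "31.2427760000,121.4903420000"},
    "2": {"group": "白富美", "id": "2234", "sex": 2,
          "attribute": "eat,dog,boys", "birthday": "1989-12-10",
          "lat_lng": "31.2433470000,121.5087220000"},
    "3": {"group": "小妹妹", "id": "3234", "sex": 2,
          "attribute": "eat dog boy flowers", "birthday": "2010-12-10",
          "lat_lng": "31.2257000000,121.5508340000"},
    "4": {"group": "空姐", "id": "4234", "sex": 2,
          "attribute": "eat,dog,girl", "birthday": "1995-12-10",
          "lat_lng": "31.1573860000,121.8150200000"},
}


def build_bulk_body(index, doc_type, docs):
    """Build an action line plus a source line for every document."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source, ensure_ascii=False))
    # The bulk format expects every line, including the last, to end in "\n".
    return "\n".join(lines) + "\n"


bulk_body = build_bulk_body("my_index", "my_type", docs)
print(bulk_body)
```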
Good, the types are now what we wanted. What is that good for? Many apps today offer a "nearby" search, so let's try one ourselves.

Location search

GET my_index/my_type/_search
{
  "query": {
    "geo_distance": {
      "distance": "4km",
      "lat_lng": "31.2393950000,121.4837130000"
    }
  }
}

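lat_lng: the center of the search; I am using People's Square (人民广场) here
distance: the search radius, set to 4km here; the unit can also be m. If you are curious, read up on geohash to get a rough idea of why this can be done so quickly.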
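To get a feel for what the query above should match, the distances can be checked by hand. The sketch below uses the haversine formula with a mean Earth radius of 6371 km. That is an assumption for illustration: Elasticsearch has its own distance calculation, so its results may differ by a small amount. The helper names are made up.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def parse_lat_lon(text):
    """Parse a "lat,lon" string, the format used for lat_lng in the docs."""
    lat, lon = (float(part) for part in text.split(","))
    return lat, lon


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


points = {
    "1": "31.2427760000,121.4903420000",
    "2": "31.2433470000,121.5087220000",
    "3": "31.2257000000,121.5508340000",
    "4": "31.1573860000,121.8150200000",
}

center = parse_lat_lon("31.2393950000,121.4837130000")  # People's Square
nearby = sorted(
    doc_id for doc_id, point in points.items()
    if haversine_km(center, parse_lat_lon(point)) <= 4.0
)
print(nearby)
```

With these coordinates, docs 1 and 2 are well inside the 4km radius, while docs 3 and 4 are outside it.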

You can of course also search within a specified area. For more, see: Geo Distance Query, Geo Location and Search
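For readers who want to follow up on the geohash hint, here is a minimal encoder. It is a sketch of the geohash scheme itself, not of how Elasticsearch uses it internally; the function name `geohash_encode` is made up here.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # geohash alphabet (no a, i, l, o)


def geohash_encode(lat, lon, precision=9):
    """Encode a point by repeatedly halving the lon/lat ranges."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    value = 0        # the 5 bits collected for the current character
    bit_count = 0
    use_lon = True   # bits alternate, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            bit = 1 if lon >= mid else 0
            if bit:
                lon_lo = mid
            else:
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            bit = 1 if lat >= mid else 0
            if bit:
                lat_lo = mid
            else:
                lat_hi = mid
        value = (value << 1) | bit
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:
            chars.append(BASE32[value])
            value = 0
            bit_count = 0
    return "".join(chars)


center_hash = geohash_encode(31.239395, 121.483713, 4)   # People's Square
doc1_hash = geohash_encode(31.242776, 121.490342, 4)     # doc 1
doc4_hash = geohash_encode(31.157386, 121.815020, 4)     # doc 4 (airport)
print(center_hash, doc1_hash, doc4_hash)
```

Points that are close together usually share a hash prefix (though not always, near cell borders), so a prefix lookup can narrow down the candidates before any exact distance is computed.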

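4. Analyzers

ref: Analysis and Analyzers
The previous post touched on analyzers. In short: at index time, ES runs each field's value through the analyzer configured for that field and uses the resulting terms to build the inverted index; at search time, the query value is analyzed with the searched field's analyzer as well.
The approach is the same as with types: first see which analyzers ES has, then specify one.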
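The index-time and search-time steps described above can be sketched with a toy inverted index over the `attribute` values of the four docs. Everything here is illustrative: the analyzer is a plain whitespace split, and the helper names are made up.

```python
from collections import defaultdict

attributes = {
    "1": "eating,car,girls",
    "2": "eat,dog,boys",
    "3": "eat dog boy flowers",
    "4": "eat,dog,girl",
}


def whitespace_analyze(text):
    """Toy analyzer: split on whitespace only, keep everything else as-is."""
    return text.split()


def build_inverted_index(docs, analyze):
    """Map each term to the set of doc ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index, query, analyze):
    """Analyze the query the same way and collect docs matching any term."""
    hits = set()
    for term in analyze(query):
        hits |= index.get(term, set())
    return sorted(hits)


index = build_inverted_index(attributes, whitespace_analyze)
print(search(index, "dog", whitespace_analyze))
```

Because a whitespace split leaves "eat,dog,boys" as a single term, a search for "dog" only finds doc 3 in this sketch. That is the kind of effect the choice of analyzer has on what is searchable.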

Analyzers

Here are two simple ones; run them yourself to get a feel for them.
• whitespace (splits on whitespace only)

GET /_analyze?analyzer=whitespace
{
  "text":"full-text books, tired  sleeping"
}

• english

GET /_analyze?analyzer=english
{
  "text":"full-text books, tired  sleeping"
}

After tokenization and stemming, the remaining terms are: full, text, book, tire, sleep
For the other built-in analyzers and for custom analyzers, see: Analyzers
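As a rough offline illustration of why the analyzer matters, the sketch below runs the same sample text through two stand-ins. `whitespace_analyzer` mirrors a plain whitespace split; `standard_like_analyzer` is only an approximation (lowercase, then split on non-word characters) and does no stemming, so it does not reproduce the english analyzer's output.

```python
import re


def whitespace_analyzer(text):
    """Split on runs of whitespace; punctuation and case are kept."""
    return text.split()


def standard_like_analyzer(text):
    """Approximation: lowercase, then keep runs of word characters."""
    return re.findall(r"\w+", text.lower())


text = "full-text books, tired  sleeping"

whitespace_terms = whitespace_analyzer(text)
standard_like_terms = standard_like_analyzer(text)

print(whitespace_terms)
print(standard_like_terms)
```

The whitespace version keeps "full-text" and "books," as they are, while the second one breaks on the hyphen and drops the comma. Stemming ("books" to "book", "sleeping" to "sleep") would be a further step on top of that.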

Specifying an analyzer

Again, delete the previous index first:

DELETE /my_index
PUT /my_index
{
  "mappings": {
    "my_type": {
      "properties": {
        "id": {
          "index": "no",
          "type": "string" },
        "birthday": {
          "index": "not_analyzed",
          "type": "date" },
        "sex": {
          "index": "not_analyzed",
          "type": "short" },
        "attribute": {
          "index": "analyzed",
          "analyzer": "whitespace",
          "type": "string" },
        "lat_lng": {
          "type": "geo_point" }
      }
    }
  }
}

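index: controls whether the field is indexed and whether it is analyzed

no: the field is not indexed, so it cannot be searched
analyzed: the field value is analyzed before it is indexed (the default for string fields when omitted)
not_analyzed: the field value is indexed as-is, as a single term, without analysis

analyzer: controls which analyzer is applied when the field is indexed (defaults to standard when omitted)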
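The three `index` values can be pictured as three ways of turning a field value into index terms. The sketch below is a simplified model, not Elasticsearch code: `terms_for_field` is a made-up helper, and a whitespace split stands in for the real analyzer.

```python
def terms_for_field(value, index="analyzed", analyzer=str.split):
    """Return the terms that would end up in the inverted index."""
    if index == "no":
        return []                 # nothing is indexed, so nothing can match
    if index == "not_analyzed":
        return [value]            # the whole value becomes one exact term
    if index == "analyzed":
        return analyzer(value)    # the analyzer decides the terms
    raise ValueError("unknown index option: %r" % index)


value = "eat dog boy flowers"
print(terms_for_field(value, index="no"))
print(terms_for_field(value, index="not_analyzed"))
print(terms_for_field(value, index="analyzed"))
```

With `not_analyzed`, only a query for the exact full value matches; with `analyzed`, each individual term does.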

_all field

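One more thing: the _all field lets you search the content of all fields at once. I have to go apartment hunting now o(╯□╰)o, so look into it yourself. The setting below turns it off: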

 "_all": {
      "enabled": false
    }
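Conceptually, `_all` behaves like one extra field holding all the other values joined together. The sketch below models that with a space-joined string and a whitespace split; it ignores per-field settings such as `include_in_all` and uses made-up helper names.

```python
def build_all_field(doc):
    """Concatenate every field value into one space-separated string."""
    return " ".join(str(value) for value in doc.values())


def matches_all(doc, term):
    """True if the term appears as a whole token in the combined text."""
    return term in build_all_field(doc).split()


docs = [
    {"group": "高富帅", "id": "1234", "sex": 1,
     "attribute": "eating,car,girls", "birthday": "1900-12-10"},
    {"group": "白富美", "id": "2234", "sex": 2,
     "attribute": "eat,dog,boys", "birthday": "1989-12-10"},
]

# Search without naming a field: the term is looked up in the combined text.
hits = [doc["id"] for doc in docs if matches_all(doc, "1989-12-10")]
print(hits)
```

Setting `"enabled": false` removes this combined field, so every search then has to name the fields it targets.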


Reposted from blog.csdn.net/soidnhp/article/details/53444060