Elasticsearch Learning (3): Kibana Advanced Queries and Tokenizers

1. Kibana Advanced Queries

1.1 Simple Queries

In real development ES is usually accessed through the Spring Boot integration, which is very simple, but you still need to understand the underlying query syntax.

(1) Query by ID

GET /myJob/user/12

(2) Query all documents of the current type

GET /myJob/user/_search

(3) Batch query by multiple IDs

Query multiple documents, with IDs 1 and 2:

GET /myJob/user/_mget
{
  "ids": ["1", "2"]
}

(4) Complex query conditions

Query for documents where age is 21:

GET /myJob/user/_search?q=age:21

Query for ages between 30 and 60:

GET /myJob/user/_search?q=age:[30 TO 60]

Note: TO must be uppercase.

Query for ages between 30 and 60, sorted by age descending, returning 1 document starting at offset 0:

GET /myJob/user/_search?q=age:[30 TO 60]&sort=age:desc&from=0&size=1

The same query, returning only the name and age fields:

GET /myJob/user/_search?q=age:[30 TO 60]&sort=age:desc&from=0&size=1&_source=name,age

1.2 DSL Queries and Filters (most commonly used)

What is the DSL:

ES supports two kinds of query requests: the simple query-string style shown above, and a full JSON request body called the structured query DSL (Domain Specific Language), which is used far more often.

A DSL query is sent as a JSON request body; because the body is JSON, it is very flexible and can take many forms:

(1) Exact query by name

A term query is an exact match: the search text is not analyzed, so the field must contain the entire search term.

GET /myJob/user/_search
{
  "query": {
    "term": {
      "name": "xiaoming"
    }
  }
}

(2) Fuzzy query by car name

A match query is roughly a fuzzy match: the document only needs to contain some of the analyzed keywords.

GET /myJob/user/_search
{
  "from": 0,
  "size": 2,
  "query": {
    "match": {
      "car": "奥迪"
    }
  }
}

(3) Difference between term and match

A term query does not analyze the search text; it matches the field value exactly.

A match query first analyzes the search text with the field's tokenizer, then queries on the resulting terms.
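
A hedged sketch of the difference (it assumes name is a text field whose analyzer splits "xiaoming li" into the tokens xiaoming and li):

# term: the search text is not analyzed, so ES looks for the single
# token "xiaoming li" in the inverted index -> no hit.
GET /myJob/user/_search
{
  "query": {
    "term": {
      "name": "xiaoming li"
    }
  }
}

# match: the search text is analyzed into ["xiaoming", "li"] first,
# so any document containing either token is a hit.
GET /myJob/user/_search
{
  "query": {
    "match": {
      "name": "xiaoming li"
    }
  }
}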

 

(4) Using filter to filter by age

GET /myJob/user/_search
{
  "query": {
    "bool": {
      "must": [{
        "match_all": {}
      }],
      "filter": {
        "range": {
          "age": {
            "gt": 21,
            "lte": 51
          }
        }
      }
    }
  },
  "from": 0,
  "size": 10,
  "_source": ["name", "age"]
}

 

2. Tokenizers

What is a tokenizer:

ES's default standard analyzer is not friendly to Chinese: it splits Chinese text into individual characters. For proper Chinese word segmentation we introduce the es-ik plugin (elasticsearch-analysis-ik).

Download es-ik from GitHub: https://github.com/medcl/elasticsearch-analysis-ik/releases

Standard analyzer demonstration:

POST http://192.168.212.181:9200/_analyze
{
  "analyzer": "standard",
  "text": "奥迪a4l"
}

{
    "tokens": [
        {
            "token": "奥",
            "start_offset": 0,
            "end_offset": 1,
            "type": "<IDEOGRAPHIC>",
            "position": 0
        },
        {
            "token": "迪",
            "start_offset": 1,
            "end_offset": 2,
            "type": "<IDEOGRAPHIC>",
            "position": 1
        },
        {
            "token": "a4l",
            "start_offset": 2,
            "end_offset": 5,
            "type": "<ALPHANUM>",
            "position": 2
        }
    ]
}

 

Note: the es-ik plugin version must match the installed ES version.

Installation under Linux:

Step 1: Download the es-ik plugin and rename the extracted directory to ik.

Step 2: Upload it to /usr/local/elasticsearch-6.4.3/plugins.

Step 3: Restart Elasticsearch.
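
After the restart, rerun the analysis request with IK's analyzer to verify the installation. A hedged sketch (IK ships two analyzers, coarse-grained ik_smart and fine-grained ik_max_word; the exact tokens depend on the IK version):

POST http://192.168.212.181:9200/_analyze
{
  "analyzer": "ik_smart",
  "text": "奥迪a4l"
}

Instead of the single characters 奥 and 迪, the response should now contain 奥迪 as one token.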

 

Custom extension dictionary:

In the /usr/local/elasticsearch-6.4.3/plugins/ik/config directory:

vi custom/new_word.dic

Add the custom words, one per line, for example:

老铁
王者荣耀
洪荒之力
共有产权房
一带一路
哈哈哈

vi IKAnalyzer.cfg.xml

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration</comment>
    <!-- Users can configure their own extension dictionary here -->
    <entry key="ext_dict">custom/new_word.dic</entry>
    <!-- Users can configure their own extension stopword dictionary here -->
    <entry key="ext_stopwords"></entry>
    <!-- Users can configure a remote extension dictionary here -->
    <!-- <entry key="remote_ext_dict">words_location</entry> -->
    <!-- Users can configure a remote extension stopword dictionary here -->
    <!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
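
After saving the dictionary and the configuration, restart Elasticsearch so IK reloads the word list, then test one of the new entries. A hedged check (assuming the words above were saved to custom/new_word.dic):

POST http://192.168.212.181:9200/_analyze
{
  "analyzer": "ik_smart",
  "text": "老铁"
}

If the extension dictionary was loaded, 老铁 comes back as a single token instead of two separate characters.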

 

3. Document Mapping

An index corresponds to a database, a type corresponds to a table, and a mapping corresponds to the table's schema.

A document mapping specifies the type and the tokenizer for each field of a document.

You can view an existing mapping with: GET /hello/user/_mapping

1.1 Mapping Categories

(1) Dynamic mapping

In a relational database you must first create the database and its tables before you can insert data.

In ES, however, no predefined mapping is required: when a document is written, ES automatically recognizes the types of its fields. This mechanism is called dynamic mapping.

Numeric fields default to the long type.
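
A minimal sketch of dynamic mapping in action (the demo index name is made up): write a document into a brand-new index and read back the mapping ES inferred.

PUT /demo/user/1
{
  "name": "xiaoming",
  "age": 25
}

GET /demo/user/_mapping

In the returned mapping, age is inferred as long, and name as text with a keyword sub-field.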

(2) Static mapping

In ES you can also define the mapping up front, declaring the document's fields and their types. By contrast, this mechanism is called static mapping (see the example in 1.3).

1.2 Types Supported by ES

(1) Basic types

String: string (covers text and keyword);

               text: used to index long text. The text is analyzed into terms before the index is created, and ES can retrieve those terms; however, a text field cannot be used for sorting or aggregations.

               keyword: not analyzed; it can be used for retrieval, filtering, sorting, and aggregations.

(Note: keyword fields are not tokenized; text fields are tokenized for queries.)

(Note: newer versions of ES have removed the string type; string fields are automatically recognized as text or keyword.)
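
A hedged illustration of the sorting restriction (it assumes the mapping from 1.3 below, where car is keyword):

GET /myJob/user/_search
{
  "sort": [
    { "car": { "order": "desc" } }
  ]
}

With car mapped as keyword this works; if car were a plain text field, ES would reject the sort and suggest enabling fielddata or using a keyword sub-field.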

Numeric types: long, integer, short, byte, double, float

Date type: date

Boolean type: boolean

Binary type: binary

Array type (Array datatype)
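
A hedged mapping sketch using several of these basic types (index, type, and field names are made up; ES 6.x syntax):

PUT /demo2
{
  "mappings": {
    "user": {
      "properties": {
        "birthday": { "type": "date" },
        "active":   { "type": "boolean" },
        "avatar":   { "type": "binary" },
        "score":    { "type": "double" }
      }
    }
  }
}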

(2) Complex types

Geo datatypes:

  • Geo-point datatype: geo_point, for latitude/longitude coordinates
  • Geo-shape datatype: geo_shape, for complex shapes such as polygons

Specialized datatypes:

  • IPv4 datatype: ip, for IPv4 addresses
  • Completion datatype: completion, provides auto-complete suggestions
  • Token count datatype: token_count, counts the number of tokens in a field after analysis; the count always includes all tokens and is not reduced by filter conditions
  • mapper-murmur3 datatype: via a plugin, a murmur3 hash of the value can be computed and indexed
  • Attachment datatype: via the mapper-attachments plugin, supports indexing attachments such as Microsoft Office formats, Open Document format, ePub, HTML, etc.
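
A small hedged sketch for the geo_point type from the list above (index and field names are made up):

PUT /demo3
{
  "mappings": {
    "place": {
      "properties": {
        "location": { "type": "geo_point" }
      }
    }
  }
}

PUT /demo3/place/1
{
  "location": { "lat": 39.9, "lon": 116.4 }
}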

1.3 Example: Creating a document mapping and specifying field types

POST /myJob/_mapping/user
{
  "user": {
    "properties": {
      "age": {
        "type": "integer"
      },
      "sex": {
        "type": "integer"
      },
      "name": {
        "type": "text",
        "analyzer": "ik_smart",
        "search_analyzer": "ik_smart"
      },
      "car": {
        "type": "keyword"
      }
    }
  }
}
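
To confirm the mapping took effect, read it back; the name field should show ik_smart as its analyzer:

GET /myJob/user/_mapping

Note the design choice: car is keyword, so car values are matched exactly and never tokenized, while name is text analyzed with IK for Chinese-aware search.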

 

 


Origin: blog.csdn.net/RuiKe1400360107/article/details/103882339