Elasticsearch installation and download tutorial

The role of Elasticsearch

  • Redis can load data into memory to achieve fast data access.
  • MongoDB stores object-like data in memory, again to achieve fast data access; the pursuit of speed in enterprise development is never-ending.
  • The technology discussed below is also a NoSQL solution, but its purpose is not to speed up data reads and writes directly; it accelerates data queries. This technology is called ES.

ES (Elasticsearch) is a distributed full-text search engine, with full-text search as its core capability.

So what is full-text search?

  • For example, suppose a user wants to buy a book and searches with Java as the keyword. Whether "java" appears in the book's title, in its description, or even in the author's name, the book is returned to the user as a query result. This process uses full-text search technology.
  • The search condition is no longer compared against a single field; within one piece of data it is compared against many fields, and any match puts that data into the query results. That is the goal of full-text search, and ES is a technology that achieves this effect.

To achieve full-text search, it is not practical to compare values with the database's like operator; that is far too slow.
ES implements full-text search with a completely different approach. The process is as follows:

  1. Split all the text in the queried fields of each piece of data into individual words

    • For example, "People's Republic of China" will be split into three words, namely "China", "People", and "Republic". There is a professional term for this process called word segmentation. Different word segmentation strategies have different separation effects. Different word segmentation strategies are called word segmentation devices.
  2. Store the segmentation results, each mapped to the id of the piece of data it came from

    • For example, if the value of the name field in the data with id 1 is "People's Republic of China", then after segmentation "China" corresponds to id 1, "People" corresponds to id 1, and "Republic" corresponds to id 1

    • If the value of the name field in the data with id 2 is "People's Congress", then after segmentation "People" corresponds to id 2, "Representative" corresponds to id 2, and "Congress" corresponds to id 2

    • This produces the mapping shown below. All documents can be segmented in this way. Note that segmentation is not applied to only one field; it is applied to every field that participates in queries, and the results are merged into a single table

      Keyword (segmentation result)    Corresponding id
      China                            1
      People                           1, 2
      Republic                         1
      Representative                   2
      Congress                         2
  3. When a query is performed with "People" as the condition, comparing it against the table above yields the id values 1 and 2, and the actual data can then be retrieved using those ids.

  • In the process above, the segmented keywords act somewhat like an index in a database: they speed up data queries.
  • However, a database index is built on a whole field, whereas a segmented keyword is not a complete field value but only part of a field's content. In addition, when a database index is used the lookup returns the entire row, while a keyword lookup in full-text search returns only the id of the matching data; to obtain the actual data, another query is needed. For this reason this keyword table is given a new name: the inverted index. A toy sketch of the idea follows.
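
For illustration only, here is a minimal Python sketch of the inverted-index idea described above (Python, the naive whitespace "tokenizer", and the sample data are assumptions made for the example, not how ES itself is implemented):

    from collections import defaultdict

    # Toy documents keyed by id, mirroring the example above
    docs = {
        1: "People's Republic of China",
        2: "People's Congress",
    }

    # Build the inverted index: keyword -> set of document ids.
    # A real tokenizer is far smarter; splitting on whitespace stands in for it here.
    inverted_index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in text.lower().split():
            inverted_index[word].add(doc_id)

    # Query: look up the keyword to get ids, then fetch the documents by id.
    ids = sorted(inverted_index.get("people's", set()))
    print(ids)                      # [1, 2]
    print([docs[i] for i in ids])   # the actual data, fetched in a second step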

Install

1. Download path

Download Elasticsearch from the official Elastic website.

2. File directory

  • bin directory: contains all executable commands

  • config directory: contains the configuration files used by the ES server

  • jdk directory: contains a complete JDK, version 17. ES bundles an up-to-date JDK so that upgrades do not run into insufficient JDK version support.

  • lib directory: contains the jar files that ES depends on at runtime

  • logs directory: contains all log files generated while ES runs

  • modules directory: contains all of the functional modules of the ES software, also packaged as jar files. Unlike the lib directory, whose jars are runtime dependencies, modules holds the jars of ES's own features

  • plugins directory: contains plugins installed into ES; empty by default

Start the server

elasticsearch.bat

Double-click the elasticsearch.bat file to start the ES server. The default service port is 9200. Open http://localhost:9200 in a browser; if you see the following information, the ES server has started normally. A small scripted check follows the response below.

{
  "name" : "CZBK-**********",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "j137DSswTPG8U4Yb-0T1Mg",
  "version" : {
    "number" : "7.16.2",
    "build_flavor" : "default",
    "build_type" : "zip",
    "build_hash" : "2b937c44140b6559905130a8650c64dbd0879cfb",
    "build_date" : "2021-12-18T19:42:46.604893745Z",
    "build_snapshot" : false,
    "lucene_version" : "8.10.1",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
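
The same check can also be done from code. A minimal sketch, assuming Python with the third-party requests library and the local, security-disabled setup shown above (neither is part of the tutorial itself):

    import requests

    # Ask the root endpoint for the node info shown above
    resp = requests.get("http://localhost:9200")
    info = resp.json()
    print(info["name"], info["version"]["number"], info["tagline"])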

Basic operations

  • The data we want to query is stored in ES, but in a format different from how a database stores data.

  • In ES we must first create an inverted index; this index plays a role similar to a table in a database. Data is then added to the inverted index, and each added piece of data is called a document. So for ES the order of operations is: create the index first, then add documents; only then can queries be performed.

  • ES is operated through REST-style requests, which means every operation is carried out by sending a request. For example, creating an index and deleting an index are both done by sending requests.

  • Create an index (books is the index name, the same below). A scripted version of this request is sketched after the error response below.

    PUT request		http://localhost:9200/books
    

    After sending the request, if you see the following information, the index was created successfully

    {
        "acknowledged": true,
        "shards_acknowledged": true,
        "index": "books"
    }
    

    Creating an index that already exists returns an error message; the cause of the error is described in the reason attribute

    {
        "error": {
            "root_cause": [
                {
                    "type": "resource_already_exists_exception",
                    "reason": "index [books/VgC_XMVAQmedaiBNSgO2-w] already exists",
                    "index_uuid": "VgC_XMVAQmedaiBNSgO2-w",
                    "index": "books"
                }
            ],
            "type": "resource_already_exists_exception",
            "reason": "index [books/VgC_XMVAQmedaiBNSgO2-w] already exists",	# the books index already exists
            "index_uuid": "VgC_XMVAQmedaiBNSgO2-w",
            "index": "books"
        },
        "status": 400
    }
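
    The same request in a short Python sketch, assuming the requests library and the local setup above (an illustration, not part of the tutorial's toolchain):

        import requests

        # Create the "books" index; a second run returns HTTP 400 with
        # resource_already_exists_exception, matching the error shown above.
        resp = requests.put("http://localhost:9200/books")
        if resp.ok:
            print("created:", resp.json()["index"])
        else:
            print("error:", resp.json()["error"]["reason"])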
    
  • Query an index

    GET request		http://localhost:9200/books
    

    Querying the index returns its information, as follows

    {
        "books": {
            "aliases": {},
            "mappings": {},
            "settings": {
                "index": {
                    "routing": {
                        "allocation": {
                            "include": {
                                "_tier_preference": "data_content"
                            }
                        }
                    },
                    "number_of_shards": "1",
                    "provided_name": "books",
                    "creation_date": "1645768584849",
                    "number_of_replicas": "1",
                    "uuid": "VgC_XMVAQmedaiBNSgO2-w",
                    "version": {
                        "created": "7160299"
                    }
                }
            }
        }
    }
    

    If you query an index that does not exist, an error message is returned. For example, querying an index named book gives the following; a scripted version of the index query is sketched after it

    {
        "error": {
            "root_cause": [
                {
                    "type": "index_not_found_exception",
                    "reason": "no such index [book]",
                    "resource.type": "index_or_alias",
                    "resource.id": "book",
                    "index_uuid": "_na_",
                    "index": "book"
                }
            ],
            "type": "index_not_found_exception",
            "reason": "no such index [book]",		# there is no index named book
            "resource.type": "index_or_alias",
            "resource.id": "book",
            "index_uuid": "_na_",
            "index": "book"
        },
        "status": 404
    }
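
    A Python sketch of the index query, again assuming the requests library and the local server above:

        import requests

        # Read back the "books" index information
        resp = requests.get("http://localhost:9200/books")
        if resp.status_code == 404:
            # index_not_found_exception, as in the error payload above
            print(resp.json()["error"]["reason"])
        else:
            index_settings = resp.json()["books"]["settings"]["index"]
            print("shards:", index_settings["number_of_shards"],
                  "replicas:", index_settings["number_of_replicas"])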
    
  • Delete an index

    DELETE request	http://localhost:9200/books
    

    After the deletion, the result is returned

    {
        "acknowledged": true
    }
    

    If the same index is deleted again, an error message is returned, and the specific cause is again described in the reason attribute; a scripted version of the delete request is sketched after the response

    {
        "error": {
            "root_cause": [
                {
                    "type": "index_not_found_exception",
                    "reason": "no such index [books]",
                    "resource.type": "index_or_alias",
                    "resource.id": "books",
                    "index_uuid": "_na_",
                    "index": "books"
                }
            ],
            "type": "index_not_found_exception",
            "reason": "no such index [books]",		# there is no books index
            "resource.type": "index_or_alias",
            "resource.id": "books",
            "index_uuid": "_na_",
            "index": "books"
        },
        "status": 404
    }
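
    The delete request as a Python sketch (same assumptions as the earlier snippets):

        import requests

        # Delete the "books" index; a second run returns 404 with
        # index_not_found_exception, matching the error shown above.
        resp = requests.delete("http://localhost:9200/books")
        print(resp.status_code, resp.json())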
    
  • Create an index and specify a tokenizer

  • The index created earlier did not specify a tokenizer; a tokenizer can be set by passing request parameters when creating the index.

  • The most popular tokenizer for Chinese at the moment is the IK tokenizer. It must be downloaded before it can be used. IK tokenizer download address: https://github.com/medcl/elasticsearch-analysis-ik/releases

  • After downloading the tokenizer, unzip it into the plugins directory of the ES installation. After installing the tokenizer, restart the ES server. The format for creating an index with the IK tokenizer is shown below; a scripted version is sketched after the resulting index information.

    PUT request		http://localhost:9200/books
    
    The request parameters are as follows (note that they are in JSON format)
    {
        "mappings":{							#define the mappings attribute, replacing the mappings that would otherwise be created with the index
            "properties":{						#define the attributes (fields) contained in the index
                "id":{							#the index contains an id attribute
                    "type":"keyword"			#this attribute can be searched directly
                },
                "name":{						#the index contains a name attribute
                    "type":"text",              #this attribute is text and participates in word segmentation
                    "analyzer":"ik_max_word",   #use the IK tokenizer for segmentation
                    "copy_to":"all"				#copy the segmentation result into the all attribute
                },
                "type":{
                    "type":"keyword"
                },
                "description":{
                    "type":"text",
                    "analyzer":"ik_max_word",
                    "copy_to":"all"
                },
                "all":{							#an attribute holding the combined segmentation results of several fields; it can be queried
                    "type":"text",
                    "analyzer":"ik_max_word"
                }
            }
        }
    }


    After the creation completes, the returned result is the same as when creating an index without a tokenizer. If you now view the index information, you can see that the mappings passed as request parameters have become part of the index.

    {
        "books": {
            "aliases": {},
            "mappings": {						#the mappings attribute has been replaced
                "properties": {
                    "all": {
                        "type": "text",
                        "analyzer": "ik_max_word"
                    },
                    "description": {
                        "type": "text",
                        "copy_to": [
                            "all"
                        ],
                        "analyzer": "ik_max_word"
                    },
                    "id": {
                        "type": "keyword"
                    },
                    "name": {
                        "type": "text",
                        "copy_to": [
                            "all"
                        ],
                        "analyzer": "ik_max_word"
                    },
                    "type": {
                        "type": "keyword"
                    }
                }
            },
            "settings": {
                "index": {
                    "routing": {
                        "allocation": {
                            "include": {
                                "_tier_preference": "data_content"
                            }
                        }
                    },
                    "number_of_shards": "1",
                    "provided_name": "books",
                    "creation_date": "1645769809521",
                    "number_of_replicas": "1",
                    "uuid": "DohYKvr_SZO4KRGmbZYmTQ",
                    "version": {
                        "created": "7160299"
                    }
                }
            }
        }
    }
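
    A Python sketch of creating the index together with these mappings, assuming the requests library, the local server above, and an installed IK plugin (otherwise ES rejects the ik_max_word analyzer):

        import requests

        # Same mappings as in the request parameters above
        mappings = {
            "mappings": {
                "properties": {
                    "id":          {"type": "keyword"},
                    "name":        {"type": "text", "analyzer": "ik_max_word", "copy_to": "all"},
                    "type":        {"type": "keyword"},
                    "description": {"type": "text", "analyzer": "ik_max_word", "copy_to": "all"},
                    "all":         {"type": "text", "analyzer": "ik_max_word"},
                }
            }
        }
        resp = requests.put("http://localhost:9200/books", json=mappings)
        print(resp.status_code, resp.json())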
    

We now have an index, but it contains no data, so data needs to be added first. In ES a piece of data is called a document; document operations are covered below.

  • Add a document (three ways; a scripted sketch follows the request formats below)

    POST request	http://localhost:9200/books/_doc		#use a system-generated id
    POST request	http://localhost:9200/books/_create/1	#use the specified id
    POST request	http://localhost:9200/books/_doc/1		#use the specified id: create if absent, update if present (version incremented)
    
    The document is passed as the request parameter, in JSON format
    {
        "name":"springboot",
        "type":"springboot",
        "description":"springboot"
    }
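
    The three ways in a Python sketch (requests library and local server assumed, as before):

        import requests

        book = {"name": "springboot", "type": "springboot", "description": "springboot"}

        # 1) Let ES generate the document id
        r1 = requests.post("http://localhost:9200/books/_doc", json=book)
        # 2) Use an explicit id; fails with 409 if document 1 already exists
        r2 = requests.post("http://localhost:9200/books/_create/1", json=book)
        # 3) Explicit id; creates the document or overwrites it, incrementing the version
        r3 = requests.post("http://localhost:9200/books/_doc/1", json=book)
        print(r1.json()["_id"], r2.status_code, r3.json()["_version"])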
    
  • Query documents

    GET request	http://localhost:9200/books/_doc/1		 #query a single document
    GET request	http://localhost:9200/books/_search		 #query all documents
    
  • Conditional query (a scripted sketch follows)

    GET request	http://localhost:9200/books/_search?q=name:springboot	# q=<attribute name>:<attribute value>
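    
    The conditional query as a Python sketch (requests library and local server assumed):

        import requests

        # q=<attribute name>:<attribute value>, here matching "springboot" in the name field
        resp = requests.get("http://localhost:9200/books/_search",
                            params={"q": "name:springboot"})
        for hit in resp.json()["hits"]["hits"]:
            print(hit["_id"], hit["_source"]["name"])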
    
  • Delete a document

    DELETE request	http://localhost:9200/books/_doc/1
    
  • Modify a document (full update)

    PUT request	http://localhost:9200/books/_doc/1
    
    The document is passed as the request parameter, in JSON format
    {
        "name":"springboot",
        "type":"springboot",
        "description":"springboot"
    }
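
    A Python sketch of the full update (requests library and local server assumed). PUT replaces the whole document, so any field left out of the body is dropped:

        import requests

        doc = {"name": "springboot", "type": "springboot", "description": "springboot"}
        resp = requests.put("http://localhost:9200/books/_doc/1", json=doc)
        print(resp.json()["result"], resp.json()["_version"])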
    
  • Modify a document (partial update)

    POST request	http://localhost:9200/books/_update/1
    
    The document is passed as the request parameter, in JSON format
    {
        "doc":{						#a partial update does not replace the whole original document; only the attributes listed under doc are updated
            "name":"springboot"		#only the attributes provided are updated; attributes not provided keep their values
        }
    }
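
    And the partial update as a Python sketch (same assumptions):

        import requests

        # Only the fields listed under "doc" are touched; other fields keep their values
        body = {"doc": {"name": "springboot"}}
        resp = requests.post("http://localhost:9200/books/_update/1", json=body)
        print(resp.json()["result"])   # "updated", or "noop" if nothing changed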
    

Origin blog.csdn.net/weixin_45428910/article/details/127707716