Elasticsearch: adding data and querying

1 Article summary

This article is the second part of the series "Building an FAQ question-answering system". It explains how to use Kibana to create an index in ES, add documents to it one at a time and in bulk, and run keyword-based queries and searches.
Environment requirements:
Ubuntu 16, Elasticsearch 7.1.0, Kibana 7.1.0, elasticsearch-head

2 ES operations

Everything in this article is done in Kibana's Dev Tools, which is more convenient and faster than working from the command line.

2.1 Creating and deleting indexes

① Creating an index, the simplest way:
Send a PUT request with just the index name. Enter the command in the left pane of Dev Tools; the JSON response from ES appears in the right pane. An index name must not contain uppercase letters, and an index whose name already exists cannot be created again.

② Creating an index with specified parameters:
number_of_shards and number_of_replicas set the number of primary shards and the number of replicas.
③ Deleting an index:
Send a DELETE request with the index name.
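
The three operations above can also be sketched outside Kibana. The following is a minimal Python sketch, not the article's own method: it only builds the HTTP requests, and the address http://localhost:9200, the helper names, and the example shard and replica counts are assumptions made for illustration.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address of the local ES node


def create_index_request(name, shards=None, replicas=None):
    """Build (but do not send) a PUT request that creates an index."""
    if name != name.lower():
        # ES rejects index names that contain uppercase letters.
        raise ValueError("index names must not contain uppercase letters")
    settings = {}
    if shards is not None:
        settings["number_of_shards"] = shards
    if replicas is not None:
        settings["number_of_replicas"] = replicas
    if not settings:
        # Simplest form: PUT <name> with no body.
        return urllib.request.Request(f"{ES_URL}/{name}", method="PUT")
    return urllib.request.Request(
        f"{ES_URL}/{name}",
        data=json.dumps({"settings": settings}).encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


def delete_index_request(name):
    """Build (but do not send) a DELETE request that removes an index."""
    return urllib.request.Request(f"{ES_URL}/{name}", method="DELETE")


if __name__ == "__main__":
    req = create_index_request("test", shards=5, replicas=1)
    print(req.get_method(), req.full_url, req.data.decode("utf-8"))
    # To actually send a request: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect exactly what would be sent before touching a real cluster.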

2.2 Adding data to the index

Add one test document to the index. Since the goal of this article is to build an FAQ answering system, each document has only two fields: question and anwser.
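
As a rough illustration of adding a single document, the sketch below builds a POST request to the index's _doc endpoint, which lets ES generate the document id. The address, the helper name, and the sample values are assumptions; the field names follow the ones used in this article.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address of the local ES node


def index_document_request(index, question, anwser):
    """Build (but do not send) a request that adds one FAQ document."""
    doc = {"question": question, "anwser": anwser}  # field names as used in this article
    return urllib.request.Request(
        f"{ES_URL}/{index}/_doc",  # POST to _doc: ES assigns the id
        data=json.dumps(doc, ensure_ascii=False).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    req = index_document_request("test", "安装的时候提示没有pdo扩展", "anwser5")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```
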
Bulk add data:
First, build a test.json file in the following format:

{"index": {"_index": "test"}}
{"question": "通过源码安装进行到第四步的时候空白", "anwser":"anwser1"}
{"index": {"_index": "test"}}
{"question": "为什么windows一键安装包apache无法启动?", "anwser":"anwser2"}
{"index": {"_index": "test"}}
{"question": "windows一键安装包默认的用户名和密码是什么?", "anwser":"anwser3"}
{"index": {"_index": "test"}}
{"question": "windows一键安装包无法开机自动启动", "anwser":"anwser4"}
{"index": {"_index": "test"}}
{"question": "安装的时候提示没有pdo扩展", "anwser":"anwser5"}
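
Writing this file by hand is error-prone, because every action line and every document line must be a complete JSON object on its own line. A minimal sketch of generating the same format from a list of question and answer pairs might look like this; the function name and the sample pairs are illustrative assumptions.

```python
import json


def build_bulk_payload(index, pairs):
    """Return NDJSON text for the _bulk API: an action line, then a document line."""
    lines = []
    for question, anwser in pairs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps({"question": question, "anwser": anwser},
                                ensure_ascii=False))
    # Every line, including the last one, must end with a newline for _bulk.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    faq = [
        ("windows一键安装包无法开机自动启动", "anwser4"),
        ("安装的时候提示没有pdo扩展", "anwser5"),
    ]
    print(build_bulk_payload("test", faq), end="")
```

Saving the returned text as test.json with UTF-8 encoding gives a file in the format shown above.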

Open a terminal, change to the directory that contains test.json, and run the following command:

curl -XPOST "http://localhost:9200/_bulk?pretty" -H "Content-Type: application/json;charset=UTF-8" --data-binary @test.json
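
The _bulk response reports success per action: a top-level errors flag, plus an items list with one entry per action. A small sketch for spotting failed actions in the parsed JSON response is given below; the function name and the sample response are made up for illustration and are not output from the article's cluster.

```python
def failed_bulk_items(response):
    """Return (position, action, reason) for each failed action in a _bulk response."""
    if not response.get("errors"):
        return []
    failures = []
    for position, item in enumerate(response.get("items", [])):
        # Each item has a single key naming its action ("index", "create", ...).
        action, result = next(iter(item.items()))
        if "error" in result:
            reason = result["error"].get("reason", "unknown")
            failures.append((position, action, reason))
    return failures


if __name__ == "__main__":
    # Hypothetical response in which the second action failed.
    sample = {
        "errors": True,
        "items": [
            {"index": {"_index": "test", "status": 201}},
            {"index": {"_index": "test", "status": 400,
                       "error": {"type": "mapper_parsing_exception",
                                 "reason": "failed to parse"}}},
        ],
    }
    print(failed_bulk_items(sample))
```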

Open the elasticsearch-head plugin page in a browser and click the index name test to see the documents that were added in bulk.

2.3 Querying and retrieval

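Before running any ES queries, this section introduces the IK analyzer plugin. ES's built-in analyzers do not segment Chinese text into words, and this article focuses on Chinese-language applications, so the IK analysis plugin has to be installed. Download the IK zip package that matches your ES version from
https://github.com/medcl/elasticsearch-analysis-ik/releases
This article uses the 7.1.0 package.
Unzip the downloaded package into a renamed directory and move it into elasticsearch-7.1.0/plugins: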

unzip elasticsearch-analysis-ik-7.1.0.zip -d analysis-ik
mv analysis-ik ******/elasticsearch-7.1.0/plugins

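Restart ES, then run an analyze command in Kibana Dev Tools to tokenize some text.
The IK analyzer splits 王者荣耀 into two words, "王者" and "荣耀". As we understand it, however, "王者荣耀" is the name of a game and should be kept as a single word. Like jieba, the IK analyzer supports custom dictionaries. Create a file named new_word.dic under elasticsearch-7.1.0/config/analysis-ik/custom, add "王者荣耀" on a line of its own, restart ES, and run the tokenization again; the name should now come back as one word. If the new dictionary is not picked up, check that it is listed under the ext_dict entry of IK's IKAnalyzer.cfg.xml.
Similarly, the IK analyzer supports stop words: add them to elasticsearch-7.1.0/config/analysis-ik/stopword.dic. For other uses of the IK analyzer, consult its documentation.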
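
The tokenization check described above goes through the _analyze API, which takes an analyzer name and a text and returns a list of tokens. The sketch below builds such a request body and extracts the token strings from a response; the helper names are assumptions, and the sample response is only shaped like an ES reply to match the split described in the text, not captured from a real run.

```python
import json


def analyze_body(text, analyzer="ik_max_word"):
    """Return the JSON body for a POST /_analyze request."""
    return json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)


def tokens_from(response):
    """Pull the token strings out of a parsed _analyze response."""
    return [entry["token"] for entry in response["tokens"]]


if __name__ == "__main__":
    print(analyze_body("王者荣耀"))
    # Illustrative response before the custom dictionary is added.
    sample = {
        "tokens": [
            {"token": "王者", "start_offset": 0, "end_offset": 2, "position": 0},
            {"token": "荣耀", "start_offset": 2, "end_offset": 4, "position": 1},
        ]
    }
    print(tokens_from(sample))
```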

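Queries:
Back to index creation. The earlier sections only covered the simplest ways to create an index. FAQ question answering relies on Chinese tokenization throughout, so the analyzer and the field types have to be specified when the index is created, as shown below: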

PUT test 
{
  "settings": {
    "number_of_replicas": 1, 
    "number_of_shards": 5 
  }, 
  "mappings": { 
      "properties": { 
        "question":{ 
          "type":"text", 
          "analyzer": "ik_max_word",
          "search_analyzer": "ik_max_word" 
        }, 
        "anwser":{ 
          "type":"text" 
        }
      } 
  } 
}

After creating the index, bulk-add the FAQ data as described above, and then run a query.
Enter the question to search for, for example "windows一键安装包无法开机自动启动" (the Windows one-click installer does not start automatically at boot). By default ES returns the 10 highest-scoring documents. More complex query types such as match, term, and bool are left for the reader to explore. Once ES has returned the ten most similar questions, semantic matching techniques are applied to them to find the exact match and return the correct answer.
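
As a rough sketch of the recall step, the code below builds a match query on the question field, to be sent to the index's _search endpoint, and turns a response into a candidate list for the later semantic matching stage. The helper names and the sample response are assumptions for illustration; the score in the sample is invented.

```python
import json


def match_query(question, size=10):
    """Return a _search body that matches `question` against the question field."""
    return {"query": {"match": {"question": question}}, "size": size}


def top_candidates(response):
    """Return (score, question, anwser) for each hit, in the order ES ranked them."""
    return [
        (hit["_score"], hit["_source"]["question"], hit["_source"]["anwser"])
        for hit in response["hits"]["hits"]
    ]


if __name__ == "__main__":
    print(json.dumps(match_query("windows一键安装包无法开机自动启动"), ensure_ascii=False))
    # Hypothetical response with a single hit.
    sample = {"hits": {"hits": [
        {"_score": 2.5,
         "_source": {"question": "windows一键安装包无法开机自动启动",
                     "anwser": "anwser4"}},
    ]}}
    print(top_candidates(sample))
```

The size parameter mirrors the default of 10 results mentioned above and can be raised if the semantic matching stage needs more candidates.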

In the FAQ question-answering pipeline, therefore, ES only plays the role of recall. The semantic matching algorithms that follow it will be analyzed in detail later.

Origin blog.csdn.net/JiKaTogether/article/details/90769747