Basic operations of Elasticsearch
1. Overview
Elasticsearch (ES) is an open-source, highly scalable, distributed full-text search engine with a RESTful API. It stores and retrieves data in near real time, and it can scale out to hundreds of servers and handle petabytes of data.
The full-text search engine referred to here is the kind in mainstream use today. An indexing program scans every word in each article and builds an index entry for each word, recording that the word appears in the article. When a user submits a query, the search program looks it up in the pre-built index and returns the matching results. The process is similar to finding a word through the index of a dictionary.
Elasticsearch is a document-oriented database: one piece of data is one document. Compared with MySQL, an index roughly corresponds to a database, a document to a row, and a field to a column.
Note: The concept of Type is deprecated in Elasticsearch 7.x and removed in 8.x.
1.1 Forward and Inverted Indexes
Forward index:
MySQL uses a forward index: given a primary key, it locates the record and reads the values of its fields. For a fuzzy query (matching only part of a field's value), however, the index can only locate complete field values, so it may not be usable.
Inverted index:
ES uses an inverted index: it splits each article into words (word segmentation) and builds an index entry for each word. Given a word, the index returns the ids of the documents that contain it, and the full article content can then be fetched by those ids. This mapping from words to documents is the inverted index.
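The idea can be sketched in a few lines of Python. This is only an illustration of the data structure, not how Elasticsearch implements it: the sample documents are hypothetical, and the segmentation is a naive whitespace split.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        # Naive segmentation: lowercase and split on whitespace.
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def search(index, docs, word):
    """Look up a word in the index, then fetch the full documents by id."""
    doc_ids = sorted(index.get(word.lower(), set()))
    return [docs[doc_id] for doc_id in doc_ids]


# Hypothetical sample documents, keyed by id (the "forward" direction).
docs = {
    1: "Xiaomi phone",
    2: "Huawei phone",
    3: "Huawei laptop",
}
index = build_inverted_index(docs)
print(search(index, docs, "phone"))   # documents 1 and 2
print(search(index, docs, "huawei"))  # documents 2 and 3
```

The lookup goes from a word to document ids and only then to the document, which is the reverse of a primary-key lookup.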
2. Installation
- Download address: Past Releases of Elastic Stack Software | Elastic (this article uses the Windows package of version 7.8.0).
- Unzip the archive to get the following directory structure:
- After unzipping, go into the bin directory and double-click elasticsearch.bat to start the ES service:
  Note: Port 9300 is used for communication between the components of an Elasticsearch cluster, and port 9200 is the HTTP (RESTful) port accessed by browsers.
- Open a browser and go to http://localhost:9200/. If the following page appears, the startup succeeded:
- To send RESTful requests conveniently, install Postman. Download address: https://www.getpostman.com/apps
3. HTTP operations
3.1 Index operations
3.1.1 Creating an index
Compared with a relational database, creating an index is equivalent to creating a database.
Send a PUT request to create an index: http://127.0.0.1:9200/<index name>
Response:
{
  # Result of the request; true means success
  "acknowledged": true,
  # The shard operation succeeded
  "shards_acknowledged": true,
  # Name of the created index
  "index": "shopping"
}
Creating an index that already exists returns an error (HTTP 400, resource_already_exists_exception).
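For readers who prefer a script to Postman, the create-index request can be built with Python's standard library. This is a minimal sketch: the helper name `create_index_request` is made up for illustration, and the request is only constructed here, since sending it needs a running node.

```python
import urllib.request

BASE_URL = "http://127.0.0.1:9200"  # address used throughout this article


def create_index_request(index_name):
    """Build (but do not send) the PUT request that creates an index."""
    return urllib.request.Request(f"{BASE_URL}/{index_name}", method="PUT")


req = create_index_request("shopping")
print(req.get_method(), req.full_url)

# To actually send it, a running Elasticsearch node is required:
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

When the request is sent, `urlopen` raises `urllib.error.HTTPError` for a 4xx response, which is how a repeated creation would surface in a script.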
3.1.2 Viewing indexes
View all indexes:
Send a GET request to list all indexes: http://127.0.0.1:9200/_cat/indices?v
Where:
- _cat: the cat API, used to view information
- indices: all indexes
- ?v: verbose output, which adds column headers so the result reads as a table
The details of the response result are as follows:
View a single index:
Send a GET request to view a single index: http://127.0.0.1:9200/<index name>
Fields in the response:
# Index name
shopping
# Aliases
aliases
# Mappings
mappings
# Settings
settings
# Settings: index
settings-index
# Settings: index: creation time
settings-index-creation_date
# Settings: index: number of primary shards
settings-index-number_of_shards
# Settings: index: number of replica shards
settings-index-number_of_replicas
# Settings: index: unique identifier
settings-index-uuid
# Settings: index: version
settings-index-version
# Settings: index: name
settings-index-provided_name
3.1.3 Deleting an index
Send a DELETE request to delete an index: http://127.0.0.1:9200/<index name>
Requesting the index again returns an error saying the index does not exist (HTTP 404, index_not_found_exception).
3.2 Document Operations
3.2.1 Creating documents
With a randomly generated id:
Creating a document is equivalent to creating a row in a table; the data is in JSON format.
Send a POST request to create a document: http://127.0.0.1:9200/<index name>/_doc, with a JSON request body
Where:
- _doc: indicates a document
Note: When no id is given, Elasticsearch generates a different random _id for every request, so sending the same request several times creates a new document each time. The operation is therefore not idempotent and must be sent with POST. PUT is idempotent (a repeated request overwrites the previous result) and is generally used to modify a resource.
With a specified id:
Send a POST request to create a document: http://127.0.0.1:9200/<index name>/_doc/<id>, with a JSON request body
Note: If the document id is specified when the document is created, the request can also use PUT.
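The choice between POST and PUT can be made explicit in a small helper. The sketch below only builds the request; the function name and the sample document are assumptions for illustration (only the title and price fields appear later in this article).

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9200"


def create_document_request(index_name, document, doc_id=None):
    """Build the request that creates a document.

    Without an id the request must be POST (Elasticsearch generates the _id);
    with an id, PUT is also valid and is used here.
    """
    if doc_id is None:
        method, url = "POST", f"{BASE_URL}/{index_name}/_doc"
    else:
        method, url = "PUT", f"{BASE_URL}/{index_name}/_doc/{doc_id}"
    body = json.dumps(document, ensure_ascii=False).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method=method)


# Hypothetical document used only for this example.
phone = {"title": "小米手机", "price": 3999.00}
auto_id = create_document_request("shopping", phone)
fixed_id = create_document_request("shopping", phone, doc_id=1001)
print(auto_id.get_method(), auto_id.full_url)
print(fixed_id.get_method(), fixed_id.full_url)
```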
3.2.2 Viewing a document
To view a document, you need to specify its unique id, similar to a primary-key query in MySQL.
Send a GET request to query a document: http://127.0.0.1:9200/<index name>/_doc/<document id>
3.2.3 Modifying a document
Modify the entire document
Send a POST request to replace the entire document: http://127.0.0.1:9200/<index name>/_doc/<document id>, with a JSON request body
Modify some fields of the document
Send a POST request to modify some fields of the document: http://127.0.0.1:9200/<index name>/_update/<document id>, with a JSON request body
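For a partial update, the changed fields are wrapped in a "doc" object in the request body. The sketch below builds that body and, purely as a local illustration for flat documents, shows the merge effect; the helper names and the sample values are assumptions.

```python
import json


def partial_update_body(changes):
    """Wrap the changed fields in the "doc" object expected by _update."""
    return {"doc": changes}


def apply_partial_update(document, changes):
    """Local illustration of the effect on a flat document: changed fields
    are merged in, all other fields are kept."""
    return {**document, **changes}


stored = {"title": "小米手机", "price": 3999.00}  # hypothetical stored document
changes = {"price": 4999.00}

# Body for POST /<index name>/_update/<document id>
print(json.dumps(partial_update_body(changes)))
# What the stored document would look like afterwards
print(apply_partial_update(stored, changes))
```

Replacing the whole document through _doc, by contrast, would drop any field that is missing from the new body.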
3.2.4 Deleting documents
Deleting a document does not remove it from disk immediately; the document is only marked as deleted (a tombstone).
Send a DELETE request to delete a document: http://127.0.0.1:9200/<index name>/_doc/<document id>
3.3 Multiple query methods
3.3.1 Conditional query
Return only the documents whose field matches the specified value.
Method one:
Send a GET request: http://127.0.0.1:9200/<index name>/_search?q=<field name>:<field value>
Method two:
Send a GET request: http://127.0.0.1:9200/<index name>/_search, with a JSON request body
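The two methods carry the same condition in different places. As a sketch, assuming a `match` query is what the request body contains (the helper names and the sample field value are made up for illustration):

```python
import json
from urllib.parse import quote

BASE_URL = "http://127.0.0.1:9200"


def uri_search_url(index_name, field, value):
    """Method one: put the condition in the q= query-string parameter."""
    # Percent-encode the condition so non-ASCII values are safe in a URL.
    return f"{BASE_URL}/{index_name}/_search?q={quote(f'{field}:{value}')}"


def match_query_body(field, value):
    """Method two: put the condition in a JSON request body."""
    return {"query": {"match": {field: value}}}


print(uri_search_url("shopping", "title", "phone"))
print(json.dumps(match_query_body("title", "phone")))
```

The request-body form is usually easier to work with, because values do not need URL encoding and further options can be added alongside "query".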
3.3.2 Full query
Query all documents under an index.
Method one:
Send a GET request: http://127.0.0.1:9200/<index name>/_search
Method two:
Send a GET request: http://127.0.0.1:9200/<index name>/_search, with a JSON request body
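For method two, the request body is most likely a `match_all` query, which is an assumption here since the original body is not shown. A minimal sketch:

```python
import json

# Request body for GET /<index name>/_search that matches every document.
match_all_body = {"query": {"match_all": {}}}
print(json.dumps(match_all_body))
```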
3.3.3 Paging query
Paginate query results.
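Pagination is controlled with "from" (offset of the first hit) and "size" (hits per page). The sketch below builds a body that would be consistent with the result shown next (two hits, only the title field, sorted by price in descending order); the exact body behind that result is not given, so the parameters are assumptions and the helper name is hypothetical.

```python
import json


def paged_query_body(page, page_size, fields, sort_field, order="desc"):
    """Build a paginated match_all query.

    page is 1-based; Elasticsearch expects "from" as the offset of the
    first hit, i.e. (page - 1) * page_size.
    """
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * page_size,
        "size": page_size,
        "_source": fields,  # return only these fields
        "sort": [{sort_field: {"order": order}}],
    }


body = paged_query_body(page=1, page_size=2, fields=["title"], sort_field="price")
print(json.dumps(body, indent=2))
```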
Paginated results:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    "hits": [
      {
        "_index": "shopping",
        "_type": "_doc",
        "_id": "86XdrX0BhPod2Sb5T3VE",
        "_score": null,
        "_source": {
          // Only the requested title field is returned
          "title": "小米手机"
        },
        "sort": [
          3999.0
        ]
      },
      {
        "_index": "shopping",
        "_type": "_doc",
        "_id": "XndgzH0BZ5WKFcQMlsuM",
        "_score": null,
        "_source": {
          "title": "华为手机"
        },
        "sort": [
          2503.36
        ]
      }
    ]
  }
}
3.3.4 Multi-condition query
Case 1: AND condition
Case 2: OR condition
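Both cases can be expressed with a `bool` query: "must" clauses are combined with AND, "should" clauses with OR. The following sketch builds such bodies; the helper name and the sample conditions are hypothetical.

```python
import json


def bool_query_body(conditions, operator="and"):
    """Combine several (field, value) conditions with AND or OR."""
    clauses = [{"match": {field: value}} for field, value in conditions]
    if operator == "and":
        bool_query = {"must": clauses}
    else:
        # With only "should" clauses, at least one must match by default;
        # the minimum is spelled out here to make that explicit.
        bool_query = {"should": clauses, "minimum_should_match": 1}
    return {"query": {"bool": bool_query}}


# Hypothetical conditions on the title and price fields.
and_body = bool_query_body([("title", "小米"), ("price", 3999.00)], "and")
or_body = bool_query_body([("title", "小米"), ("title", "华为")], "or")
print(json.dumps(and_body, ensure_ascii=False))
print(json.dumps(or_body, ensure_ascii=False))
```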
3.3.5 Range query
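A range condition is commonly placed in the "filter" part of a `bool` query, using the bounds gt, gte, lt, and lte. A minimal sketch, with a hypothetical helper name and threshold:

```python
import json


def range_filter_body(field, gt=None, gte=None, lt=None, lte=None):
    """Build a bool query that filters documents by a numeric range."""
    bounds = {"gt": gt, "gte": gte, "lt": lt, "lte": lte}
    # Keep only the bounds that were actually supplied.
    range_clause = {name: value for name, value in bounds.items() if value is not None}
    return {"query": {"bool": {"filter": {"range": {field: range_clause}}}}}


# Documents whose price is greater than 3000 (hypothetical threshold).
body = range_filter_body("price", gt=3000)
print(json.dumps(body))
```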
Notice:
3.3.6 Full-text search
Case 1:
Sometimes a query that contains only part of a field's value still returns the document, for example:
Reason: When a document is saved, ES splits its text into words (word segmentation) and stores the resulting words in the inverted index, so a query that uses only part of the text can still find the document.
Case 2:
Enter a value that combines parts of the field values of two different documents. The value itself does not exist in any document, yet both documents are returned, as follows:
Reason: ES also applies word segmentation to the query text, splitting it into two parts, "小" and "华". Each part matches the inverted index as in Case 1, so both documents are returned.
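The two cases can be reproduced locally with a toy model. The sketch assumes one token per character, which approximates how the default analyzer treats Chinese text; the function names are hypothetical and no Elasticsearch API is involved.

```python
def analyze(text):
    """Simplified stand-in for the analyzer: one token per character.

    Assumption: this approximates the treatment of Chinese text only;
    English text is segmented into words instead.
    """
    return [ch for ch in text if not ch.isspace()]


def match(query, documents):
    """Return documents that share at least one token with the query (OR)."""
    query_tokens = set(analyze(query))
    return [doc for doc in documents if query_tokens & set(analyze(doc))]


titles = ["小米手机", "华为手机"]
print(match("小米", titles))  # Case 1: part of one title
print(match("小华", titles))  # Case 2: "小" hits the first title, "华" the second
```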
3.3.7 Exact match
If you don't want the query text to be split and matched part by part as in full-text search, but want it matched as a whole phrase, use the match_phrase query.
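As a sketch, the request body only swaps `match` for `match_phrase`. The local check below illustrates the effect under the same one-token-per-character assumption for Chinese text; both helper names are hypothetical.

```python
import json


def phrase_query_body(field, phrase):
    """Build a match_phrase query body."""
    return {"query": {"match_phrase": {field: phrase}}}


def phrase_matches(phrase, text):
    """Local illustration: with one token per character, the phrase's
    characters must appear adjacent and in the same order."""
    return phrase in text


print(json.dumps(phrase_query_body("title", "小米"), ensure_ascii=False))
print(phrase_matches("小米", "小米手机"))  # adjacent and in order
print(phrase_matches("小华", "小米手机"))  # "华" does not follow "小"
```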
3.3.8 Aggregate query
Query results can be aggregated, for example grouped, averaged, or reduced to a maximum value.
Example of a grouping operation:
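A request body for grouping by price could look like the sketch below, which uses a `terms` aggregation named price_group. The "size": 0 setting (suppress the document hits) and the sample prices are assumptions; the local `group_counts` helper only illustrates what the buckets contain.

```python
import json
from collections import Counter


def terms_agg_body(agg_name, field):
    """Group documents by the distinct values of a field."""
    return {"aggs": {agg_name: {"terms": {"field": field}}}, "size": 0}


def group_counts(documents, field):
    """Local illustration of the buckets: value -> number of documents."""
    return Counter(doc[field] for doc in documents)


print(json.dumps(terms_agg_body("price_group", "price")))

# Hypothetical documents with prices chosen for illustration.
docs = [{"price": 2503.3}, {"price": 2503.3}, {"price": 3999.0}]
print(group_counts(docs, "price"))
```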
Search result:
"aggregations": {
  // Custom name of the grouping result
  "price_group": {
    "doc_count_error_upper_bound": 0,
    "sum_other_doc_count": 0,
    "buckets": [
      // 2 documents have a price of 2503.3
      {
        "key": 2503.3,
        "doc_count": 2
      },
      // 1 document has a price of 3999.0
      {
        "key": 3999.0,
        "doc_count": 1
      }
    ]
  }
}
Example of an averaging operation:
Request body:
{
  // aggs means an aggregation operation
  "aggs" : {
    // Custom name, used as the name of the result
    "price_avg" : {
      // avg means compute the average
      "avg" : {
        // Average the price field over all documents
        "field" : "price"
      }
    }
  },
  // Show only the average, not the content of the documents
  "size" : 0
}
Search result:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": {
      "value": 3,
      "relation": "eq"
    },
    "max_score": null,
    // No document content is shown; without size=0, the content of the documents would appear here
    "hits": []
  },
  "aggregations": {
    // Custom name
    "price_avg": {
      // The average value
      "value": 3001.90673828125
    }
  }
}
3.3.9 Mapping relationship
A text field is analyzed, so it supports partial matching in full-text search, while a keyword field can only be matched by its complete value.
1. Create the user index
Send a PUT request: http://127.0.0.1:9200/user
2. Define the mapping type of each field
Send the request with the following request body:
{
  // Define the mappings
  "properties" : {
    // name is a text field and is indexed (searchable)
    "name" : {
      "type" : "text",
      "index" : true
    },
    // sex is a keyword field and is indexed (searchable)
    "sex" : {
      "type" : "keyword",
      "index" : true
    },
    // tel is a keyword field and is not indexed (not searchable)
    "tel" : {
      "type" : "keyword",
      "index" : false
    }
  }
}
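The URL for this step is not shown above. In Elasticsearch 7.x a mapping is normally sent with PUT to /<index name>/_mapping, so the sketch below assumes that endpoint. The helper names are hypothetical, and the request is only built, not sent.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:9200"

# The same mapping as in the request body above, as a Python dict.
mapping = {
    "properties": {
        "name": {"type": "text", "index": True},
        "sex": {"type": "keyword", "index": True},
        "tel": {"type": "keyword", "index": False},
    }
}


def put_mapping_request(index_name, mapping):
    """Build the PUT request for the mapping (endpoint is an assumption)."""
    return urllib.request.Request(
        f"{BASE_URL}/{index_name}/_mapping",
        data=json.dumps(mapping).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def searchable_fields(mapping):
    """Fields that can be used in queries: those not marked "index": false."""
    return [name for name, cfg in mapping["properties"].items() if cfg.get("index", True)]


req = put_mapping_request("user", mapping)
print(req.get_method(), req.full_url)
print(searchable_fields(mapping))
```

With this mapping, name supports partial matching, sex must be matched by its complete value, and tel cannot be queried at all because it is not indexed.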
3. Create a document
4. Query by field value
Case 1:
Case 2:
Case 3: