Table of contents
- Foreword
- 1. What is Elasticsearch?
- 2. Installation
- 3. Core concepts
- 3.1 Simple interaction between Kibana and ES
- 3.2 Advanced queries
- 3.2.1 Query all [match_all]
- 3.2.2 Keyword query [term]
- 3.2.3 Range query [range]
- 3.2.4 Prefix query [prefix]
- 3.2.5 Wildcard query [wildcard]
- 3.2.6 Multi-id query [ids]
- 3.2.7 Fuzzy query [fuzzy]
- 3.2.8 Boolean query [bool]
- 3.2.9 Multi-field query [multi_match]
- 3.2.10 Default field word segmentation query [query_string]
- 3.2.11 Highlight query [highlight]
- 3.2.12 Return the specified number [size]
- 3.2.13 Paging query [from]
- 3.2.14 Specify field sorting [sort]
- 3.2.15 Return the specified field [_source]
- 3.3 Index principle
- 3.4 Filter query [filter]
- 4. Integrating with Spring Boot
- Summary
Foreword
Elasticsearch is a real-time distributed search and analytics engine that lets you explore your data at unprecedented speed and scale.
Before anything else: the companion code repository is linked at the end of this article.
1. What is Elasticsearch?
Elasticsearch (ES for short) is an open-source search engine built on Apache Lucene, and is currently the most popular enterprise search engine. Lucene itself is arguably the best-performing open-source search toolkit to date, but its API is complex and demands a deep background in information retrieval, which makes it hard to integrate into real applications. ES is written in Java and exposes a simple, easy-to-use RESTful API; developers can build their search features against that API and avoid Lucene's complexity altogether.
2. Installation
Download address
https://www.elastic.co/cn/downloads/elasticsearch
2.1 Installing ES with Docker
1. Create the folders
# -p also creates any missing parent directories
mkdir -p ~/es/data ~/es/plugins
Grant permissions:
chmod 777 ~/es/data ~/es/plugins
2. Start the container
docker run -d --name es -p 9200:9200 -p 9300:9300 -v ~/es/data:/usr/share/elasticsearch/data -v ~/es/plugins:/usr/share/elasticsearch/plugins -e ES_JAVA_OPTS="-Xms256m -Xmx256m" -e "discovery.type=single-node" elasticsearch:7.14.0
3. Visit ES
http://127.0.0.1:9200/
2.2 Kibana
Kibana is to Elasticsearch roughly what Navicat is to MySQL: an open-source analytics and visualization platform. With Kibana you can query, view, and interact with the data stored in ES indices, perform advanced data analysis, and visualize the data as charts, tables, and maps.
Download address
https://www.elastic.co/cn/downloads/kibana
2.2.1 docker installation Kibana
1. Create the folders
mkdir -p ~/kibana/data ~/kibana/plugins ~/kibana/config
Grant permissions:
chmod 777 ~/kibana/data ~/kibana/plugins ~/kibana/config
2. Modify the configuration file
vim ~/kibana/config/kibana.yml
Add the following snippet:
server.name: kibana
server.host: "0"
# address of the ES instance to connect to
elasticsearch.hosts: [ "http://<es-host-ip>:9200" ]
xpack.monitoring.ui.container.elasticsearch.enabled: true
3. Start the container
docker run -d --privileged=true --name kibana -p 5601:5601 -v ~/kibana/config/kibana.yml:/usr/share/kibana/config/kibana.yml -v ~/kibana/data:/usr/share/kibana/data -v ~/kibana/plugins:/usr/share/kibana/plugins kibana:7.14.0
4. Visit Kibana
http://127.0.0.1:5601/
3. Core concepts
Index
An index is a collection of documents that share somewhat similar characteristics. For example, you could have an index for product data, an index for order data, and an index for user data. An index is identified by a name (which must be all lowercase), and that name is used whenever we index, search, update, or delete the documents it contains.
Mapping
A mapping defines how a document and the fields it contains are stored and indexed. With the default configuration, ES can create the mapping automatically from the inserted data; a mapping can also be created manually. A mapping mainly consists of field names, field types, and so on.
Document
A document is a single record stored in an index; it is the smallest unit that can be indexed. Documents in ES are represented as lightweight JSON.
3.1 Simple interaction between kibana and es
3.1.1 Index
Create
Syntax:
PUT /<index-name>     e.g. PUT /student_info
By default, creating an index creates one primary shard and one replica shard for it.
# create the index and configure its shards
PUT /student_info
{
"settings": {
"number_of_shards": 1, # number of primary shards
"number_of_replicas": 0 # number of replica shards
}
}
Query
Syntax:
GET /_cat/indices?v
Delete
Syntax:
DELETE /<index-name>
DELETE /*
* is a wildcard matching all indices.
3.1.2 Mapping
Common field types:
- String types: keyword (an exact-value keyword), text (a piece of analyzed text)
- Integer types: integer, long
- Decimal types: float, double
- Boolean type: boolean
- Date type: date
For more types, see the official documentation: https://www.elastic.co/guide/en/elasticsearch/reference/7.15/mapping-types.html
Create
PUT /student_info
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 0
},
"mappings": {
"properties": {
"id":{
"type": "text"
},
"name":{
"type": "text"
},
"school":{
"type": "text"
},
"age":{
"type": "integer"
},
"createTime":{
"type": "date"
},
"updateTime":{
"type": "date"
}
}
}
}
Query: view the mapping of an index
Syntax:
GET /<index-name>/_mapping
3.1.3 Documents
Add a document
Syntax:
POST /<index-name>/_doc/
The document id is generated automatically.
Syntax: POST /<index-name>/_doc/<doc-id>
This uses the given document id (e.g. id=1). Auto-generated IDs are 20-character, URL-safe, Base64-encoded random UUIDs.
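As a sketch of that ID shape (an illustration of the format only, not ES's actual ID-generation algorithm), 15 random bytes Base64-encode to exactly 20 URL-safe characters:

```python
import base64
import os

def make_auto_id() -> str:
    """Produce a 20-character URL-safe Base64 string from 15 random
    bytes, mimicking the shape of ES auto-generated document IDs."""
    return base64.urlsafe_b64encode(os.urandom(15)).decode("ascii")

print(len(make_auto_id()))  # 15 bytes -> 20 Base64 characters, no padding
```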
Query a document:
Syntax:
GET /<index-name>/_doc/<doc-id>
Update a document:
Syntax 1:
PUT /<index-name>/_doc/<doc-id> { "field": "value" }
Note: this form first deletes the original document, then inserts the updated content as a new document (a full replacement).
Syntax 2:
POST /<index-name>/_doc/<doc-id>/_update {"doc" : { "field" : "value" }}
Note: this form keeps the original content and updates only the given fields (a partial update).
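The two update semantics can be sketched with plain Python dicts (hypothetical helper names, for illustration only):

```python
def put_replace(original: dict, new_body: dict) -> dict:
    """PUT /<index>/_doc/<id> semantics: the old document is discarded
    and replaced wholesale by the new body."""
    return dict(new_body)

def post_partial_update(original: dict, patch: dict) -> dict:
    """POST /<index>/_doc/<id>/_update {"doc": ...} semantics: existing
    fields are kept and only the patched fields are overwritten."""
    merged = dict(original)
    merged.update(patch)
    return merged

doc = {"name": "zhangsan", "age": 20}
print(put_replace(doc, {"age": 21}))          # {'age': 21} -- 'name' is lost
print(post_partial_update(doc, {"age": 21}))  # {'name': 'zhangsan', 'age': 21}
```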
Delete a document:
Syntax:
DELETE /<index-name>/_doc/<doc-id>
Batch operations
- Bulk create, update, and delete
Note: the _bulk request body must be newline-delimited JSON; each action and its optional source sit on single lines of their own.
# bulk create / update / delete
POST /student_info/_doc/_bulk
{"create": {"_index": "student_info", "_type": "_doc", "_id": "8"}}
{"id": "8", "nickname": "王者荣耀"}
{"update": {"_id": "7"}}
{"doc": {"nickname": "赵琪1"}}
{"delete": {"_id": "5"}}
- Bulk fetch (_mget)
GET /_mget
{
"docs" : [
{
"_index" : "student_info",
"_type" : "_doc",
"_id" : "5"
},
{
"_index" : "student_info",
"_type" : "_doc",
"_id" : "6"
},
{
"_index" : "student_info",
"_type" : "_doc",
"_id" : "7"
}
]
}
3.2 Advanced query
ES provides a powerful retrieval language called the Query DSL. The Query DSL interacts with ES by sending a JSON request body over the REST API. Its rich query syntax makes retrieval in ES simpler, more powerful, and more concise.
Test data
# 1. create the index and mapping
PUT /products/
{
"mappings": {
"properties": {
"title":{
"type": "keyword"
},
"price":{
"type": "double"
},
"created_at":{
"type":"date"
},
"description":{
"type":"text"
}
}
}
}
# 2. insert test data (NDJSON: one action line, then one source line)
PUT /products/_doc/_bulk
{"index":{}}
{"title":"iphone12 pro","price":8999,"created_at":"2020-10-23","description":"iPhone 12 Pro采用超瓷晶面板和亚光质感玻璃背板,搭配不锈钢边框,有银色、石墨色、金色、海蓝色四种颜色。宽度:71.5毫米,高度:146.7毫米,厚度:7.4毫米,重量:187克"}
{"index":{}}
{"title":"iphone12","price":4999,"created_at":"2020-10-23","description":"iPhone 12 高度:146.7毫米;宽度:71.5毫米;厚度:7.4毫米;重量:162克(5.73盎司)。iPhone 12设计采用了离子玻璃,以及7000系列铝金属外壳。"}
{"index":{}}
{"title":"iphone13","price":6000,"created_at":"2021-09-15","description":"iPhone 13屏幕采用6.1英寸OLED屏幕;高度约146.7毫米,宽度约71.5毫米,厚度约7.65毫米,重量约173克。"}
{"index":{}}
{"title":"iphone13 pro","price":8999,"created_at":"2021-09-15","description":"iPhone 13Pro搭载A15 Bionic芯片,拥有四种配色,支持5G。有128G、256G、512G、1T可选,售价为999美元起。"}
3.2.1 Query all [match_all]
match_all keyword: returns all documents in the index
GET /products/_search
{
"query": {
"match_all": {
}
}
}
3.2.2 Keyword query (term)
term keyword: queries by an exact keyword (the query term is not analyzed)
GET /products/_search
{
"query": {
"term": {
"price": {
"value": 4999
}
}
}
}
3.2.3 Range query [range]
range keyword: queries documents whose field value falls within the specified range
GET /products/_search
{
"query": {
"range": {
"price": {
"gte": 5000,
"lte": 9999
}
}
}
}
3.2.4 Prefix query [prefix]
prefix keyword: retrieves documents containing a keyword that starts with the specified prefix
GET /products/_search
{
"query": {
"prefix": {
"title": {
"value": "iph"
}
}
}
}
3.2.5 Wildcard query [wildcard]
wildcard keyword: wildcard query
? matches exactly one character; * matches zero or more characters.
GET /products/_search
{
"query": {
"wildcard": {
"title": {
"value": "iphone1?"
}
}
}
}
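These wildcard semantics match Python's fnmatch, which can be used to sanity-check a pattern against the sample titles before sending the query (a local illustration only):

```python
from fnmatch import fnmatch

titles = ["iphone12", "iphone12 pro", "iphone13", "iphone13 pro"]

# '?' matches exactly one character, '*' matches any run of characters
print([t for t in titles if fnmatch(t, "iphone1?")])  # ['iphone12', 'iphone13']
print([t for t in titles if fnmatch(t, "iphone1*")])  # all four titles
```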
3.2.6 Multi-id query [ids]
ids keyword : the value is an array type, used to obtain multiple corresponding documents according to a set of ids
GET /products/_search
{
"query": {
"ids": {
"values": ["pAJg84YBl2A7w00GciqN","pQJg84YBl2A7w00GciqN"]
}
}
}
3.2.7 Fuzzy query [fuzzy]
fuzzy keyword: fuzzily matches documents containing terms close to the specified keyword. See the official fuzziness documentation for details.
Note: the maximum edit distance for a fuzzy query is between 0 and 2, and depends on the term length:
- keyword length 0 to 2: no edits allowed
- keyword length 3 to 5: one edit allowed
- keyword length greater than 5: at most two edits allowed
GET /products/_search
{
"query": {
"fuzzy": {
"description": "phone"
}
}
}
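The length-based rule above can be written down as a small function (a sketch of the rule as stated, mirroring ES's AUTO fuzziness behavior):

```python
def max_edits(term: str) -> int:
    """Maximum edit distance a fuzzy query allows for a term:
    length 0-2 -> 0 edits, 3-5 -> 1 edit, longer -> 2 edits."""
    n = len(term)
    if n <= 2:
        return 0
    if n <= 5:
        return 1
    return 2

print(max_edits("ip"))       # 0
print(max_edits("phone"))    # 1
print(max_edits("iphones"))  # 2
```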
3.2.8 Boolean query [bool]
bool keyword: combines multiple conditions into a complex query
1. must: all clauses must match (logical AND)
2. should: at least one clause should match (logical OR)
GET /products/_search
{
"query": {
"bool": {
"should": [
{
"term": {
"title": {
"value": "iphone12"
}
}
},
{
"term": {
"price": {
"value": 8999
}
}
}
]
}
}
}
3.2.9 Multi-field query [multi_match]
Note: the query string is analyzed first and the resulting terms are matched against each field; for fields that are not analyzed (such as keyword fields), the whole query string is matched as one term.
GET /products/_search
{
"query": {
"multi_match": {
"query": "iphone13 OLED屏幕",
"fields": ["title","description"]
}
}
}
3.2.10 Default field word segmentation query [query_string]
GET /products/_search
{
"query": {
"query_string": {
"default_field": "description",
"query": "大屏幕银色边"
}
}
}
3.2.11 Highlight query [highlight]
highlight keyword: highlights the matching keywords in the returned documents
1. Custom highlight HTML tags: set pre_tags and post_tags inside highlight.
2. Multi-field highlighting: set require_field_match to false to enable highlighting across multiple fields.
GET /products/_search
{
"query": {
"term": {
"description": {
"value": "iphone"
}
}
},
"highlight": {
"require_field_match": "false",
"post_tags": ["</span>"],
"pre_tags": ["<span style='color:red'>"],
"fields": {
"*":{
}
}
}
}
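Conceptually, the highlighter wraps each matched term in the configured pre/post tags. A local sketch of that effect (a simplification: real highlighting operates on analyzed tokens, not raw substrings):

```python
import re

def highlight(text: str, term: str,
              pre: str = "<span style='color:red'>",
              post: str = "</span>") -> str:
    """Wrap every case-insensitive occurrence of `term` in pre/post tags."""
    return re.sub(re.escape(term),
                  lambda m: pre + m.group(0) + post,
                  text, flags=re.IGNORECASE)

print(highlight("iPhone 13 is an iphone", "iphone"))
```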
3.2.12 Return the specified number [size]
size keyword: limits how many hits the query returns; the default is 10
GET /products/_search
{
"query": {
"match_all": {
}
},
"size": 5
}
3.2.13 Paging query [from]
from keyword: specifies the starting offset of the returned hits; combined with the size keyword it implements paging
GET /products/_search
{
"query": {
"match_all": {
}
},
"size": 5,
"from": 0
}
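The paging arithmetic is simple but worth spelling out: for 1-based page numbers, the offset is (page - 1) * size (hypothetical helper name):

```python
def page_to_from(page: int, size: int) -> int:
    """Convert a 1-based page number and page size to the ES `from` offset."""
    return (page - 1) * size

print(page_to_from(1, 5))  # 0  -> page 1 starts at offset 0
print(page_to_from(3, 5))  # 10 -> page 3 starts at offset 10
```

Note that by default from + size may not exceed 10,000 (the index.max_result_window setting); deeper paging calls for search_after or the scroll API instead.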
3.2.14 Specify field sorting [sort]
GET /products/_search
{
"query": {
"match_all": {
}
},
"sort": [
{
"price": {
"order": "desc"
}
}
]
}
3.2.15 Return the specified field [_source]
_source keyword: an array listing which fields to include in the returned documents
GET /products/_search
{
"query": {
"match_all": {
}
},
"_source": ["title","description"]
}
3.3 Index Principle
3.3.1 Inverted Index
An inverted index is also called a reverse index; where there is a reverse index there is also a forward index. In plain terms, a forward index looks up values by key, while an inverted index looks up keys by value. Under the hood, ES uses an inverted index for retrieval.
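The value-to-key inversion can be sketched in a few lines: starting from a forward index (document id to text), the inverted index maps each term back to the ids of the documents containing it (illustrative data, not the product index used elsewhere):

```python
from collections import defaultdict

# forward index: document id -> text
docs = {
    1: "blue moon laundry detergent",
    2: "nice phone",
    3: "raccoon crisp noodles",
}

# inverted index: term -> document ids containing the term
inverted = defaultdict(list)
for doc_id, text in docs.items():
    for term in text.split():
        inverted[term].append(doc_id)

print(inverted["moon"])  # [1] -- a term lookup returns document ids directly
```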
3.3.2 Index Model
The existing indexes and mappings are as follows:
{
"products" : {
"mappings" : {
"properties" : {
"description" : {
"type" : "text"
},
"price" : {
"type" : "float"
},
"title" : {
"type" : "keyword"
}
}
}
}
}
First insert the following data; each document has the fields title, price, and description:
_id | title | price | description
---|---|---|---
1 | Blue Moon Laundry Detergent | 19.9 | Blue Moon laundry detergent is very efficient
2 | iphone13 | 19.9 | a very nice phone
3 | Little raccoon crisp noodles | 1.5 | raccoon is very delicious
In ES, only text fields are analyzed (split into terms); all other types are indexed whole. A separate index is therefore built per field, as follows:
- title field (keyword, indexed whole):
term | _id (document id)
---|---
Blue Moon Laundry Detergent | 1
iphone13 | 2
Little raccoon crisp noodles | 3
- price field:
term | _id (document id)
---|---
19.9 | [1,2]
1.5 | 3
- description field (text, analyzed into individual terms):
term | _id
---|---
blue | 1
moon | 1
wash | 1
clothes | 1
liquid | 1
nice | 2
phone | 2
raccoon | 3
delicious | 3
very | [1:1:9, 2:1:6, 3:1:6]
Note: Elasticsearch builds a separate inverted index for each field. At query time, looking up the query term in a field's index yields the matching document IDs, so the documents can be located quickly.
3.3.3 Tokenizer
Built-in analyzers:
- standard analyzer: the default; splits English on word boundaries and lowercases
- simple analyzer: splits on non-letter characters (symbols are filtered out) and lowercases
- stop analyzer: lowercases and filters stop words (the, a, is, ...)
- whitespace analyzer: splits on whitespace; does not lowercase
- keyword analyzer: does not split at all; treats the whole input as a single token
POST /_analyze
{
"analyzer": "standard",
"text": "this is a , good Man 中华人民共和国"
}
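For the English part of that sample text, the contrast between the standard and whitespace analyzers can be approximated locally (a rough sketch only; real analyzers also handle Chinese, stop words, and more):

```python
import re

def standard_like(text: str) -> list:
    """Rough approximation of the standard analyzer on English text:
    split on non-alphanumeric characters and lowercase every token."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

def whitespace_like(text: str) -> list:
    """Sketch of the whitespace analyzer: split on spaces, keep case."""
    return text.split()

text = "this is a , good Man"
print(standard_like(text))    # ['this', 'is', 'a', 'good', 'man']
print(whitespace_like(text))  # ['this', 'is', 'a', ',', 'good', 'Man']
```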
3.3.3.2 Creating an index with a specified analyzer
PUT /<index-name>
{
"settings": {
},
"mappings": {
"properties": {
"title":{
"type": "text",
"analyzer": "standard" // explicitly specify the analyzer
}
}
}
}
3.3.3.3 Chinese analyzers
ES supports many Chinese analyzers, such as smartCN and IK; the IK analyzer is the recommended one.
Installing the IK analyzer
Open-source repository: IK analyzer on GitHub
Open-source repository: IK analyzer on Gitee
Notes:
- The IK analyzer version must match the version of ES you installed.
- For ES running in a Docker container, the plugin directory is /usr/share/elasticsearch/plugins.
- If you download the Maven source layout, build it with Maven and move the jar files from target/ into the plugin directory.
- Usage instructions are available in the repository.
# 1. download the version matching your ES
wget https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v7.14.0/elasticsearch-analysis-ik-7.14.0.zip
# 2. unzip (install unzip first if needed: yum install -y unzip)
unzip elasticsearch-analysis-ik-7.14.0.zip
# 3. move the extracted files into the ES mount directory, e.g. ~/es/plugins
# 4. restart ES for the plugin to take effect
# 5. for a local (non-Docker) install, the IK config file is
#    <es-install-dir>/plugins/analysis-ik/config/IKAnalyzer.cfg.xml
Using IK
IK offers two splitting granularities:
ik_smart: performs the coarsest-grained split
ik_max_word: splits the text at the finest granularity
Extension and stop-word dictionaries
IK supports custom extension dictionaries and stop-word dictionaries.
Extension dictionary: words that are not recognized as terms, but that you want ES to treat as searchable keywords, can be added here.
Stop-word dictionary: words that are terms, but that you do not want to be searchable for business reasons, can be placed here.
Both dictionaries are configured by editing the IKAnalyzer.cfg.xml file in the config directory of the IK analyzer.
1. Edit the configuration file: vim IKAnalyzer.cfg.xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- configure your own extension dictionary here -->
<entry key="ext_dict">ext_dict.dic</entry>
<!-- configure your own stop-word dictionary here -->
<entry key="ext_stopwords">ext_stopword.dic</entry>
</properties>
2. Create an ext_dict.dic file in the analyzer's config directory; the file must be UTF-8 encoded to take effect.
vim ext_dict.dic and add your extension words.
3. Create an ext_stopword.dic file in the same config directory.
vim ext_stopword.dic and add your stop words.
4. Restart ES for the changes to take effect.
3.4 Filter query "Filter Query"
Query operations in ES come in two kinds: queries (query) and filters (filter). By default a query computes a relevance score for every returned document and sorts the results by that score. A filter, in contrast, only selects the matching documents: it computes no scores, and its results can be cached. From a performance standpoint, filtering is therefore faster than querying. Filters are suited to coarse, large-scale narrowing of the data, while queries are suited to precise matching; in practice, filter first to narrow the data set, then query to match within it.
Usage
GET /products/_search
{
"query": {
"bool": {
"must": [
{ "match_all": {} } // query conditions
],
"filter": {
.... // filter conditions
}
}
}
}
Note:
- When both a filter and a query are present, the filter is executed first, then the query.
- Elasticsearch automatically caches frequently used filters to speed up performance.
Types
Common filter types include: term, terms, range, exists, ids, and so on.
term and terms filters (exact condition)
# using the term filter
GET /products/_search
{
"query": {
"bool": {
"must": [{
"match_all": {
}}],
"filter": {
"term": {
"description":"iphone"
}
}
}
}
}
# using the terms filter
GET /products/_search
{
"query": {
"bool": {
"must": [{
"match_all": {
}}],
"filter": {
"terms": {
"description": [
"13",
"宽度"
]
}
}
}
}
}
range filter (range of values)
GET /products/_search
{
"query": {
"bool": {
"must": [{
"match_all": {
}}],
"filter": {
"range": {
"price": {
"gte": 1000,
"lte": 6666
}
}
}
}
}
}
exists filter (field exists)
Filters out documents where the specified field is null; only documents that have a value for the field are returned.
GET /products/_search
{
"query": {
"bool": {
"must": [{
"match_all": {
}}],
"filter": {
"exists": {
"field": "description"
}
}
}
}
}
ids filter
Filters documents whose _id is in the given set of IDs.
GET /products/_search
{
"query": {
"bool": {
"must": [{
"match_all": {
}}],
"filter": {
"ids": {
"values": ["SsAW94YB8GbnoR-aVwio","TMAW94YB8GbnoR-aVwiw"]
}
}
}
}
}
4. Integrating with Spring Boot
4.1 Introducing dependencies
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
4.2 Configure client
@Configuration
public class RestClientConfig extends AbstractElasticsearchConfiguration {
@Override
@Bean
public RestHighLevelClient elasticsearchClient() {
final ClientConfiguration clientConfiguration = ClientConfiguration.builder()
.connectedTo("172.16.91.10:9200")
.build();
return RestClients.create(clientConfiguration).rest();
}
}
4.3 Client objects
- ElasticsearchOperations
- RestHighLevelClient (recommended)
Related annotations
@Document(indexName = "es_product") // name of the index to create
public class ESProduct {
@Id // placed on a field; maps the object's id field to the document _id in ES
@Field(type = FieldType.Text)
private String id;
@Field(type = FieldType.Text, analyzer = "ik_max_word") // type: field type; analyzer: which analyzer to use
private String title;
@Field(type = FieldType.Double)
private Double price;
@Field(type=FieldType.Text,analyzer="ik_max_word")
private String description;
// format the date/time fields
@Field( type = FieldType.Date,format = DateFormat.custom, pattern = "yyyy-MM-dd HH:mm:ss")
@JsonFormat(shape = JsonFormat.Shape.STRING, pattern ="yyyy-MM-dd HH:mm:ss")
private Date createTime;
@Field( type = FieldType.Date,format = DateFormat.custom, pattern = "yyyy-MM-dd HH:mm:ss")
@JsonFormat(shape = JsonFormat.Shape.STRING, pattern ="yyyy-MM-dd HH:mm:ss")
private Date updateTime;
}
Integrated query example
public Map<String, Object> searchProduct(QueryReq queryReq) throws IOException {
Map<String, Object> result = new HashMap<>();
// restrict the search to specific indices; multiple may be listed, comma-separated (the default is all indices)
SearchRequest searchRequest = new SearchRequest("es_product");
SearchSourceBuilder sourceBuilder = new SearchSourceBuilder();
sourceBuilder.timeout(new TimeValue(60, TimeUnit.SECONDS)); // set the query timeout
String[] includeFields = new String[] {
"id","title","price","description","createTime"};
String[] excludeFields = new String[] {
""};
// multi-field highlighting
HighlightBuilder highlightBuilder = new HighlightBuilder();
HighlightBuilder.Field highlightTitle = new HighlightBuilder.Field("title");
highlightBuilder.field(highlightTitle);
HighlightBuilder.Field highlightDescription = new HighlightBuilder.Field("description");
highlightBuilder.field(highlightDescription);
highlightBuilder.requireFieldMatch(false).preTags("<span style='color:red;'>").postTags("</span>");
sourceBuilder.fetchSource(includeFields, excludeFields);
sourceBuilder
// paging
.from((queryReq.getPage() - 1) * queryReq.getLimit())
.size(queryReq.getLimit())
.sort("price", SortOrder.DESC)
.fetchSource(includeFields, excludeFields)
.highlighter(highlightBuilder);
BoolQueryBuilder all = QueryBuilders.boolQuery()
.must(QueryBuilders.matchAllQuery());
// search the title and description fields
if(!StringUtils.isEmpty(queryReq.getKeyword())){
all.filter(QueryBuilders.multiMatchQuery(queryReq.getKeyword(), "description", "title"));
}
// price range
if(queryReq.getMin_price() != null){
all.filter(QueryBuilders.rangeQuery("price").gte(queryReq.getMin_price()));
}
if(queryReq.getMax_price() != null){
all.filter(QueryBuilders.rangeQuery("price").lte(queryReq.getMax_price()));
}
sourceBuilder.query(all);
searchRequest.source(sourceBuilder);
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
// process the results
SearchHit[] hits = searchResponse.getHits().getHits();
List<ESProduct> list = new ArrayList<>();
ObjectMapper objectMapper = new ObjectMapper();
for (SearchHit hit : hits) {
ESProduct esProduct = objectMapper.readValue(hit.getSourceAsString(), ESProduct.class);
Map<String, HighlightField> highlightFields = hit.getHighlightFields();
if (highlightFields.containsKey("title")) {
esProduct.setTitle(highlightFields.get("title").getFragments()[0].toString());
}
if (highlightFields.containsKey("description")) {
esProduct.setDescription(highlightFields.get("description").getFragments()[0].toString());
}
list.add(esProduct);
}
long totalHits = searchResponse.getHits().getTotalHits().value;
result.put("data",list);
result.put("count",totalHits);
result.put("code",0);
return result;
}
Aggregation queries (similar to GROUP BY in SQL)
Official documentation: Aggregations
Aggregation is the statistical-analysis capability ES provides in addition to search: it computes summary data over the documents matched by a search query. Aggregation is an important feature of databases in general; as both a search engine and a database, ES offers powerful aggregation and analysis capabilities.
Bucket aggregation (Bucket Aggregation)
Bucket aggregation is an aggregation method that divides documents into multiple buckets (Bucket) for statistics. For example, documents can be grouped according to a certain field , and statistical results such as the number of documents, maximum value, minimum value, and average value in each group can be returned. Bucket aggregation can be nested to achieve more complex statistics and analysis.
Metric Aggregation (Metric Aggregation)
Metric Aggregation is an aggregation method for performing numerical calculations on document collections. For example, you can sum, average, calculate the maximum, minimum, etc. on numeric fields in a collection of documents.
Pipeline Aggregation (Pipeline Aggregation)
pipeline aggregation is an aggregation method for reprocessing the aggregation results. For example, operations such as sorting, filtering, calculating percentiles, and calculating moving averages can be performed on the results of a certain bucket aggregation, so as to conduct deeper analysis and understanding of the data based on the aggregation.
# sum
GET /es_product/_search
{
"size":0,
"aggs":{
"sum_price":{
"sum":{
"field":"price"
}
}
}
}
# maximum (max)
GET /es_product/_search
{
"size":0,
"aggs":{
"max_price":{
"max":{
"field":"price"
}
}
}
}
# minimum (min)
GET /es_product/_search
{
"size":0,
"aggs":{
"min_price":{
"min":{
"field":"price"
}
}
}
}
# average (avg)
GET /es_product/_search
{
"size":0,
"aggs":{
"avg_price":{
"avg":{
"field":"price"
}
}
}
}
# distinct count (cardinality): number of distinct product prices
GET /es_product/_search
{
"size":0,
"aggs":{
"price_count":{
"cardinality": {
"field": "price"
}
}
}
}
# multiple aggregations in one request: max, min, and avg
GET /es_product/_search
{
"size":0,
"aggs":{
"max_price":{
"max":{
"field":"price"
}
},
"min_price":{
"min":{
"field":"price"
}
},
"avg_price":{
"avg":{
"field":"price"
}
}
}
}
# return multiple aggregate values at once (stats): total count, min, max, avg, and sum
GET /es_product/_search
{
"size":0,
"aggs":{
"price_stats":{
"stats":{
"field":"price"
}
}
}
}
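What stats reports can be reproduced locally over the sample product prices from the test data in section 3.2 (illustrative values; the actual result depends on the indexed data):

```python
# the four sample product prices from the test data
prices = [8999.0, 4999.0, 6000.0, 8999.0]

# the five values a `stats` aggregation reports in one shot
stats = {
    "count": len(prices),
    "min": min(prices),
    "max": max(prices),
    "avg": sum(prices) / len(prices),
    "sum": sum(prices),
}
print(stats)  # count=4, min=4999.0, max=8999.0, avg=7249.25, sum=28997.0
```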
Application integration: returning multiple aggregated values
public Map<String, Object> aggregation() throws IOException {
SearchRequest searchRequest = new SearchRequest("es_product");
SearchSourceBuilder aggregationBuilder = new SearchSourceBuilder();
aggregationBuilder.aggregation(AggregationBuilders.stats("priceStatsAgg").field("price"));
searchRequest.source(aggregationBuilder);
SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);
Aggregations aggregations = searchResponse.getAggregations();
ParsedStats statsAgg = aggregations.get("priceStatsAgg");
Map<String, Object> result = new HashMap<>();
List<Map<String, Object>> data = new ArrayList<>();
Map<String, Object> dataMap = new HashMap<>();
// read the aggregated values
dataMap.put("min",statsAgg.getMin());
dataMap.put("max",statsAgg.getMax());
dataMap.put("avg",statsAgg.getAvg());
dataMap.put("sum",statsAgg.getSum());
dataMap.put("count",statsAgg.getCount());
data.add(dataMap);
result.put("data",data);
result.put("code",0);
return result;
}
Aggregations sit at the same level as the query, so they can be combined with a query in the same search request; note that each aggregation in the returned aggs must be retrieved as its concrete result type (here ParsedStats), otherwise a type-conversion error occurs.
Warehouse address: https://gitcode.net/chendi/springboot_elasticsearch_demo
Summary
Elasticsearch is a distributed search engine based on Lucene that provides full-text search, data analysis, data mining, and more.
- An index is similar to the concept of a database in MySQL
- A mapping is similar to a MySQL table schema: field names and their types
- A document is similar to a row of records in MySQL
- Analyzers and the different analyzer types
- Advanced search
- Filter queries
- Client objects: RestHighLevelClient is the commonly used one
- Aggregation queries are similar to MySQL's GROUP BY and aggregate functions