Introduction to Elasticsearch and its use in Spring Boot

What happens if we implement search directly on a database?

  1. If the table holds tens of millions of records, a fuzzy match already performs poorly; if a long text field also has to be fuzzy-matched, the performance problem becomes serious.
  2. The search terms still cannot be split apart. For example, a prefix match on "Zhang San" can only find employees whose names start with "Zhang San"; an employee named "Zhang Xiaosan" will not be found.

Generally speaking, implementing search on a database is not very reliable, and performance is usually poor.
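The second limitation can be sketched in a few lines. This is only an illustration: the names are assumed to be the Chinese originals of "Zhang San" (张三) and "Zhang Xiaosan" (张小三), `prefixMatch` stands in for a SQL prefix match, and the per-character split in `termMatch` is a crude substitute for a real analyzer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrefixVersusTermMatch {
    // Mimics a SQL prefix match: the query must appear unbroken at the start of the name.
    public static List<String> prefixMatch(List<String> names, String query) {
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            if (name.startsWith(query)) {
                hits.add(name);
            }
        }
        return hits;
    }

    // Splits the query into single characters (a stand-in for real tokenization)
    // and keeps the names that contain every one of them, in any position.
    public static List<String> termMatch(List<String> names, String query) {
        List<String> hits = new ArrayList<>();
        for (String name : names) {
            boolean containsAll = true;
            for (int i = 0; i < query.length(); i++) {
                if (name.indexOf(query.charAt(i)) < 0) {
                    containsAll = false;
                    break;
                }
            }
            if (containsAll) {
                hits.add(name);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from the article
        List<String> names = Arrays.asList("张三丰", "张小三", "李四");
        System.out.println(prefixMatch(names, "张三")); // prints [张三丰]
        System.out.println(termMatch(names, "张三"));   // prints [张三丰, 张小三]
    }
}
```

Once the query is split into terms, "张小三" becomes reachable, which is the behaviour a search engine provides and a plain prefix match does not.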

Elasticsearch (ES) is an open source, distributed search and analytics engine built on Apache Lucene.

  • A distributed real-time document store in which every field can be indexed and searched
  • A distributed real-time analytical search engine
  • It can scale out to hundreds of server nodes and supports PB-level structured or unstructured data

Elasticsearch is a modern search engine that also provides functions such as persistent storage and statistics.

It is both a search engine and a non-relational database. It does not support transactions or complex relationships (1.x has no support; 2.x improved matters, but the support is still weak).

Full-text search: the full-text database is the main component of a full-text search system. A full-text database is a data collection formed by converting the entire content of an information source into units that a computer can recognize and process. Besides storing information, it can edit and process the words, characters, and paragraphs of the full-text data at a deeper level, and full-text databases are typically massive.
For example, when we enter "full disintegration", the input is split into the two words "full" and "disintegration". Both words are looked up in the inverted index, and the matching data is returned. This whole process is called full-text search.

Inverted index: records are found from the value of an attribute. Because the attribute value is not determined by the record, but the location of the record is determined by the attribute value, the structure is called an inverted index.
Each entry in such an index table consists of an attribute value and the addresses of the records that have this value.
A file with an inverted index is called an inverted index file, or an inverted file for short.
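The two ideas above, splitting a query into words and looking each word up in an inverted index, can be sketched as follows. This is a minimal in-memory illustration and not how Lucene stores its index; the class name, the whitespace-and-punctuation tokenizer, and the sample documents are all assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class InvertedIndexSketch {
    // term -> sorted ids of the documents that contain the term
    private final Map<String, TreeSet<Integer>> postings = new HashMap<>();

    // Naive analyzer: lowercase, then split on anything that is not a letter or digit.
    public static List<String> tokenize(String text) {
        List<String> terms = new ArrayList<>();
        for (String term : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+")) {
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    // Indexing: record the document id under every term of the text.
    public void add(int docId, String text) {
        for (String term : tokenize(text)) {
            postings.computeIfAbsent(term, k -> new TreeSet<>()).add(docId);
        }
    }

    // Searching: return the ids of documents that contain at least one query term.
    public Set<Integer> search(String query) {
        Set<Integer> result = new TreeSet<>();
        for (String term : tokenize(query)) {
            result.addAll(postings.getOrDefault(term, new TreeSet<>()));
        }
        return result;
    }

    public static void main(String[] args) {
        InvertedIndexSketch index = new InvertedIndexSketch();
        index.add(1, "full text search");
        index.add(2, "disintegration of the index");
        index.add(3, "relational database");
        System.out.println(index.search("full disintegration")); // prints [1, 2]
    }
}
```

The lookup cost depends on the number of query terms and matching documents rather than on scanning every record, which is why the approach scales better than a fuzzy match over a table.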

Lucene is an open source search engine toolkit (a jar package containing ready-made code for building inverted indexes and searching, including the various algorithms involved). Solr is also an open source distributed search engine based on Lucene, and it has many similarities with Elasticsearch.
Elasticsearch started with full-text search. It turned the Lucene toolkit into a data product, hiding Lucene's many complex settings behind a developer-friendly interface. Many traditional relational databases also provide full-text search, some by embedding Lucene and some self-developed, but compared with Elasticsearch their functionality is limited, their performance is mediocre, and their scalability is almost non-existent.

ES solves these problems

1. It automatically distributes data and index building across multiple nodes, and distributes the execution of search requests across those nodes.
2. It automatically maintains redundant copies of the data, so that data is not lost when a machine goes down.
3. It encapsulates more advanced functions, such as aggregation analysis and geolocation-based search.
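The first two points can be pictured with a small routing sketch. Elasticsearch's documentation describes shard selection roughly as hash(_routing) % number_of_primary_shards; the code below only imitates that idea, using `String.hashCode` as a stand-in for the real hash function, and `replicaNodeFor` is a purely hypothetical placement rule, not Elasticsearch's allocator.

```java
public class ShardRoutingSketch {
    // Picks a primary shard for a document id. Math.floorMod keeps the result
    // non-negative even when hashCode() is negative.
    public static int shardFor(String docId, int primaryShards) {
        return Math.floorMod(docId.hashCode(), primaryShards);
    }

    // Hypothetical rule: keep the replica on the next node, so that losing one
    // node never removes both copies of a shard (needs at least two nodes).
    public static int replicaNodeFor(int primaryNode, int nodeCount) {
        return (primaryNode + 1) % nodeCount;
    }

    public static void main(String[] args) {
        int primaryShards = 3;
        for (String id : new String[] {"a", "b", "c", "d"}) {
            System.out.println(id + " -> shard " + shardFor(id, primaryShards));
        }
    }
}
```

Because the shard is derived from the id alone, any node can work out where a document lives without consulting a central lookup table.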

Features of Elasticsearch

  1. It can run as a large distributed cluster (hundreds of servers) that processes PB-level data for large companies, or on a single machine for small companies.
  2. Elasticsearch is not a new technology; it mainly combines full-text search, data analysis, and distributed technology.
  3. For users it works out of the box and is very simple: a small or medium-sized application can deploy ES in about 3 minutes.
  4. Elasticsearch complements traditional databases with capabilities such as full-text search, synonym handling, relevance ranking, complex data analysis, and near-real-time processing of massive data.

Usage scenarios

Scenario 1: Use Elasticsearch as the only data store
If you are starting a new project, using Elasticsearch as the only data store helps keep your design as simple as possible. However, this scenario does not suit workloads that involve frequent updates or transactions.

For example, create a new blog system that uses ES as its storage:
1) Submit new blog posts to ES;
2) Use ES to retrieve, search, and compute statistics over the data.

Scenario 2: Add Elasticsearch to the existing system
Since ES cannot provide every storage function, some scenarios require adding ES alongside the existing system's data store.
If you use both a SQL database and ES as stores, you need a way to keep the two stores synchronized in real time. Choose the synchronization plugin according to the shape of the data and the database. Available plugins include:

1) For MySQL or Oracle, use the logstash-input-jdbc plugin (Canal also works).

2) For MongoDB, use the mongo-connector tool.

Application query
Elasticsearch is best at querying. Because its core algorithm is the inverted index, its query performance is stronger than that of B-Tree-based data products, relational databases in particular. Once the data volume exceeds tens or hundreds of millions of records, the difference in retrieval efficiency becomes very noticeable.

Elasticsearch is used for general query scenarios. A relational database is limited by the leftmost-prefix rule of its indexes: a composite index is only used when the query fields follow the index's column order. With few query fields, a small number of indexes is enough to improve query performance; with many query fields in arbitrary combinations, the indexes lose their value. Elasticsearch, by contrast, indexes every field by default, and a query may combine fields in any order. For this reason we use Elasticsearch instead of a relational database for general queries in a large number of business systems. Database queries are largely avoided: apart from the simplest lookups, all complex conditional queries go to Elasticsearch.
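To illustrate why field order stops mattering when every field has its own index, here is a minimal sketch. It keeps one exact-match lookup table per field and intersects the results; the class, the helper `fields`, and the sample data are hypothetical, and real Elasticsearch fields are analyzed and stored very differently.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class PerFieldIndexSketch {
    // field -> value -> ids of the documents where the field has that value
    private final Map<String, Map<String, Set<Integer>>> indexes = new HashMap<>();

    // Small helper that builds a field map from alternating keys and values.
    public static Map<String, String> fields(String... keyValues) {
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i + 1 < keyValues.length; i += 2) {
            map.put(keyValues[i], keyValues[i + 1]);
        }
        return map;
    }

    // Every field of the document gets an index entry, without any configuration.
    public void add(int docId, Map<String, String> doc) {
        for (Map.Entry<String, String> e : doc.entrySet()) {
            indexes.computeIfAbsent(e.getKey(), k -> new HashMap<>())
                   .computeIfAbsent(e.getValue(), k -> new TreeSet<>())
                   .add(docId);
        }
    }

    // Intersects the per-field results; the order of the conditions is irrelevant.
    public Set<Integer> query(Map<String, String> conditions) {
        Set<Integer> result = null;
        for (Map.Entry<String, String> c : conditions.entrySet()) {
            Set<Integer> ids = indexes.getOrDefault(c.getKey(), new HashMap<>())
                                      .getOrDefault(c.getValue(), new TreeSet<>());
            if (result == null) {
                result = new TreeSet<>(ids);
            } else {
                result.retainAll(ids);
            }
        }
        return result == null ? new TreeSet<>() : result;
    }

    public static void main(String[] args) {
        PerFieldIndexSketch index = new PerFieldIndexSketch();
        index.add(1, fields("address", "Lhasa", "age", "23"));
        index.add(2, fields("address", "Lhasa", "age", "30"));
        index.add(3, fields("address", "Chengdu", "age", "23"));
        System.out.println(index.query(fields("age", "23", "address", "Lhasa"))); // prints [1]
        System.out.println(index.query(fields("address", "Lhasa", "age", "23"))); // prints [1]
    }
}
```

A composite B-Tree index on (address, age) could serve the second query shape but not a query on age alone, whereas the per-field layout answers both.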
Big data
Elasticsearch has become one of the important components through which a big data platform serves external queries. The platform iteratively computes over the raw data and then writes the results to a database for querying, especially large batches of detail data.

Log retrieval
The famous trio ELK (Elasticsearch, Logstash, Kibana) is a product portfolio designed specifically for log collection, storage, and querying.

Other application areas include monitoring and machine learning.

Basic introduction to ES components

Document Metadata
A document contains not only its data but also metadata, that is, information about the document. The three required metadata elements are:
  _index   Where the document is stored
  _type    The object category the document represents (mapping types are deprecated since Elasticsearch 7.0 and removed in 8.0)
  _id      The unique identifier of the document
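As a small illustration of how this metadata addresses a document, the sketch below builds the REST path of a single document from its `_index` and `_id`. It assumes the typeless `/{index}/_doc/{id}` endpoint form used by Elasticsearch 7.x and later; the class name is hypothetical, and the index name and id are the ones that appear in this article's examples.

```java
public class DocumentAddress {
    private final String index; // value of _index
    private final String id;    // value of _id

    public DocumentAddress(String index, String id) {
        this.index = index;
        this.id = id;
    }

    // Path of the single-document endpoint, assuming the typeless 7.x+ form.
    public String restPath() {
        return "/" + index + "/_doc/" + id;
    }

    public static void main(String[] args) {
        DocumentAddress address = new DocumentAddress("shen", "pYAtG3kBRz-Sn-2fMFjj");
        System.out.println(address.restPath()); // prints /shen/_doc/pYAtG3kBRz-Sn-2fMFjj
    }
}
```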


# Using ES in Spring Boot
  • Introduce dependencies

In pom.xml, add spring-boot-starter-data-elasticsearch:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-data-elasticsearch</artifactId>
</dependency>
  • Write a configuration file
server:
  port: 9001
es:
  schema: http
  address: 192.168.0.1:9200
  connectTimeout: 10000
  socketTimeout: 20000
  connectionRequestTimeout: 50000
  maxConnectNum: 16
  maxConnectPerRoute: 20
  • Entity class:
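import java.util.Date;

import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.Document;
import org.springframework.data.elasticsearch.annotations.Field;
import org.springframework.data.elasticsearch.annotations.FieldType;

@Data
@AllArgsConstructor
@NoArgsConstructor
@Document(indexName = "shen")
public class User {

    @Id
    private String id;

    // @Field marks a property as one field of the index mapping.
    // type sets the field type; analyzer sets the tokenizer
    // (ik_max_word requires the IK analysis plugin on the ES server).
    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String name;

    @Field(type = FieldType.Integer)
    private Integer age;

    // A date-valued property is mapped as a date field rather than as text
    @Field(type = FieldType.Date)
    private Date bir;

    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String introduce;

    @Field(type = FieldType.Text, analyzer = "ik_max_word")
    private String address;
}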
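  • Insert a document into ES
    // Imports used by the service methods in this section (Elasticsearch 7.x high-level REST client).
    // JSONObject is assumed to be fastjson's; ResponseInfo is the project's own response wrapper.
    // In newer 7.x clients XContentType lives in org.elasticsearch.xcontent instead.
    import java.io.IOException;
    import com.alibaba.fastjson.JSONObject;
    import org.elasticsearch.action.bulk.BulkItemResponse;
    import org.elasticsearch.action.bulk.BulkRequest;
    import org.elasticsearch.action.bulk.BulkResponse;
    import org.elasticsearch.action.delete.DeleteRequest;
    import org.elasticsearch.action.delete.DeleteResponse;
    import org.elasticsearch.action.index.IndexRequest;
    import org.elasticsearch.action.index.IndexResponse;
    import org.elasticsearch.action.update.UpdateRequest;
    import org.elasticsearch.action.update.UpdateResponse;
    import org.elasticsearch.client.RequestOptions;
    import org.elasticsearch.client.RestHighLevelClient;
    import org.elasticsearch.common.xcontent.XContentType;
    import org.springframework.beans.factory.annotation.Autowired;
    import org.springframework.data.elasticsearch.annotations.Document;

    /**
     * Elasticsearch REST client operations
     *
     * RestHighLevelClient is more powerful and flexible, but not convenient for working with objects.
     * ElasticsearchRepository is convenient for working with objects.
     *
     * Here we use the REST client, mainly to test document operations.
     **/
    // Use it for complex queries, such as highlighted queries
    @Autowired
    RestHighLevelClient restHighLevelClient;

    @Override
    public ResponseInfo addUsers(User user) throws IOException {
        // No id is set on the request, so Elasticsearch generates the document _id
        IndexRequest indexRequest = new IndexRequest("shen");
        indexRequest.source(JSONObject.toJSONString(user), XContentType.JSON);
        IndexResponse indexResponse = restHighLevelClient.index(indexRequest, RequestOptions.DEFAULT);
        return ResponseInfo.ok(indexResponse);
    }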
  • Call the interface
  • Use Kibana to view the results in ES
  • Update, delete, and bulk operations
    /**
     * Update
     */
    @Override
    public ResponseInfo updateDoc(User user) throws IOException {
        // Read the index name from the @Document annotation on the entity class
        Document document = user.getClass().getAnnotation(Document.class);
        UpdateRequest updateRequest = new UpdateRequest(document.indexName(), user.getId());
        // Partial update: the fields present in the JSON are merged into the stored document
        updateRequest.doc(JSONObject.toJSONString(user), XContentType.JSON);
        UpdateResponse updateResponse = restHighLevelClient.update(updateRequest, RequestOptions.DEFAULT);
        return ResponseInfo.ok(updateResponse);
    }

    /**
     * Delete
     */
    @Override
    public ResponseInfo deleteDoc(User user) throws IOException {
        // Read the index name from the @Document annotation on the entity class
        Document document = user.getClass().getAnnotation(Document.class);
        DeleteRequest deleteRequest = new DeleteRequest(document.indexName(), user.getId());
        DeleteResponse deleteResponse = restHighLevelClient.delete(deleteRequest, RequestOptions.DEFAULT);
        return ResponseInfo.ok(deleteResponse);
    }

    /**
     * Bulk operations
     */
    @Override
    public void bulkUpdate() throws IOException {
        BulkRequest bulkRequest = new BulkRequest();
        // Add
        IndexRequest indexRequest = new IndexRequest("shen");
        indexRequest.source("{\"name\":\"张三\",\"age\":23,\"bir\":\"1991-01-01\",\"introduce\":\"西藏\",\"address\":\"拉萨\"}", XContentType.JSON);
        bulkRequest.add(indexRequest);
        // Delete
        DeleteRequest deleteRequest01 = new DeleteRequest("shen", "pYAtG3kBRz-Sn-2fMFjj");
        DeleteRequest deleteRequest02 = new DeleteRequest("shen", "uhTyGHkBExaVQsl4F9Lj");
        DeleteRequest deleteRequest03 = new DeleteRequest("shen", "C8zCGHkB5KgTrUTeLyE_");
        bulkRequest.add(deleteRequest01);
        bulkRequest.add(deleteRequest02);
        bulkRequest.add(deleteRequest03);
        // Update
        // Note: this id is also deleted earlier in the same request, so this item
        // is likely to report NOT_FOUND; use an id that still exists to see an update succeed.
        UpdateRequest updateRequest = new UpdateRequest("shen", "pYAtG3kBRz-Sn-2fMFjj");
        updateRequest.doc("{\"name\":\"曹操\"}", XContentType.JSON);
        bulkRequest.add(updateRequest);

        BulkResponse bulkResponse = restHighLevelClient.bulk(bulkRequest, RequestOptions.DEFAULT);
        // Each item carries its own status; one failed item does not fail the whole request
        BulkItemResponse[] items = bulkResponse.getItems();
        for (BulkItemResponse item : items) {
            System.out.println(item.status());
        }
    }
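  • Search documents
    // Elasticsearch 7.x imports for this test; @Test comes from JUnit
    // (org.junit.Test or org.junit.jupiter.api.Test, depending on the project).
    import org.elasticsearch.action.search.SearchRequest;
    import org.elasticsearch.action.search.SearchResponse;
    import org.elasticsearch.index.query.QueryBuilders;
    import org.elasticsearch.search.SearchHit;
    import org.elasticsearch.search.builder.SearchSourceBuilder;
    import org.elasticsearch.search.sort.SortOrder;

    /**
     * Search
     * @throws IOException
     */
    @Test
    public void testSearch() throws IOException {
        // Create the search request for the index
        SearchRequest searchRequest = new SearchRequest("shen");
        // Builder for the search body
        SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();

        searchSourceBuilder.query(QueryBuilders.matchAllQuery()) // query condition
                .from(0) // offset of the first record
                .size(10) // records per page
                .postFilter(QueryBuilders.matchAllQuery()) // filter condition
                .sort("age", SortOrder.DESC); // sort order

        // Attach the search body to the request
        searchRequest.source(searchSourceBuilder);

        SearchResponse searchResponse = restHighLevelClient.search(searchRequest, RequestOptions.DEFAULT);

        // In the 7.x client getTotalHits() returns a TotalHits object; value holds the count
        System.out.println("Total number of matching documents: " + searchResponse.getHits().getTotalHits().value);
        SearchHit[] hits = searchResponse.getHits().getHits();
        for (SearchHit hit : hits) {
            System.out.println(hit.getSourceAsMap());
        }
    }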
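The `from` and `size` values in the search above are hard-coded for the first page. A minimal sketch of deriving them from a page number follows; the class and method names are hypothetical. The 10,000 limit reflects Elasticsearch's default `index.max_result_window` setting, beyond which `from + size` requests are rejected unless the setting is changed.

```java
public class PageWindow {
    // Default value of Elasticsearch's index.max_result_window setting
    public static final int DEFAULT_MAX_RESULT_WINDOW = 10_000;

    // Converts a 1-based page number into the zero-based offset passed to from(...).
    public static int from(int page, int size) {
        if (page < 1 || size < 1) {
            throw new IllegalArgumentException("page and size must be at least 1");
        }
        return Math.multiplyExact(page - 1, size);
    }

    // True when from + size stays within the default result window.
    public static boolean fitsDefaultWindow(int page, int size) {
        return (long) from(page, size) + size <= DEFAULT_MAX_RESULT_WINDOW;
    }

    public static void main(String[] args) {
        System.out.println(from(1, 10));                // prints 0
        System.out.println(from(3, 10));                // prints 20
        System.out.println(fitsDefaultWindow(1001, 10)); // prints false
    }
}
```

With such a helper the builder call becomes `.from(PageWindow.from(page, size)).size(size)`; for paging far beyond the window, Elasticsearch offers other mechanisms such as `search_after`.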





Origin blog.csdn.net/JemeryShen/article/details/126488385