(1) Understanding and installing Elasticsearch

1. Understanding ES

Elasticsearch is a search engine based on the Lucene library. It provides a distributed, multi-tenant-capable full-text search engine with an HTTP web interface and schema-less JSON documents. Elasticsearch is developed in Java. Versions up to 7.10 were released under the Apache License 2.0; since 7.11 it has been distributed under the SSPL and the Elastic License, which are source-available licenses rather than OSI-approved open source licenses. Official clients are available in Java, .NET (C#), PHP, Python, Apache Groovy, Ruby, and many other languages.

(1) Basic characteristics

  1. ES is distributed, but it can run on a single node as well as on many. As a large-scale distributed cluster (hundreds of servers) it can process PB-level data for large companies; it can also run on a single machine to serve small companies. Documents are hashed to shards, and the shards are spread across nodes, which provides high availability, load balancing, and distributed search.
  2. ES implements full-text retrieval. MySQL is slower at this because it has to match the search text against the title and content fields row by row (for example with LIKE '%...%'), whereas ES looks terms up in an inverted index. As a supplement to traditional databases, Elasticsearch provides many functions that databases cannot.
  3. ES is near real-time and fast.
  4. ES exposes a RESTful interface. Compared with Lucene, it is friendlier to programmers, and it is simple to use out of the box. For a small or medium-sized application, where the data volume is modest and the operations are not too complicated, you can deploy ES in about 3 minutes and use it as a production system.
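
The sharding idea in point 1 can be illustrated with a minimal Python sketch. Elasticsearch routes a document to a primary shard roughly by hashing its routing value (the document ID by default) and taking the remainder modulo the number of primary shards. The real implementation uses a Murmur3 hash; the sketch substitutes zlib.crc32 as a stand-in, so the shard numbers it prints will not match a real cluster. The function name, the shard count, and the document IDs are assumptions made for illustration.

```python
import zlib
from collections import Counter

NUM_PRIMARY_SHARDS = 3  # hypothetical value; fixed when an index is created


def route_to_shard(doc_id: str, num_shards: int = NUM_PRIMARY_SHARDS) -> int:
    """Pick a shard for a document: hash(routing value) mod number of shards.

    zlib.crc32 is only a stand-in for the Murmur3 hash Elasticsearch uses.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


# Spread 1000 made-up document IDs over the shards and count them per shard.
distribution = Counter(route_to_shard(f"doc-{i}") for i in range(1000))
print(dict(distribution))
```

Because the shard is derived from the document ID alone, any node can work out where a document lives without consulting a central lookup table.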

(2) Comparison with Lucene

1. Lucene

Lucene is a Java-based full-text information retrieval toolkit. It is not a complete search application, but provides indexing and search functions for your application.

The usage process is roughly as follows:

  • Indexing process: collect data -> build document objects -> analyze documents (word segmentation) -> create the index.
  • Search process: the user enters a search through the search interface -> a query is created -> the searcher runs the query against the index -> the search results are rendered.
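
The two processes above can be sketched in a few lines of Python. This is a toy inverted index, not Lucene's actual data structures: the whitespace-based analyze function, the sample documents, and all names are assumptions made for illustration.

```python
from collections import defaultdict


def analyze(text: str) -> list[str]:
    """Toy analyzer: lowercase and split on whitespace (real analyzers do much more)."""
    return text.lower().split()


def build_index(docs: dict[int, str]) -> dict[str, set[int]]:
    """Indexing process: analyze each document and map every term to the IDs containing it."""
    index: dict[str, set[int]] = defaultdict(set)
    for doc_id, text in docs.items():
        for term in analyze(text):
            index[term].add(doc_id)
    return index


def search(index: dict[str, set[int]], query: str) -> list[int]:
    """Search process: analyze the query, then intersect the posting sets of its terms."""
    terms = analyze(query)
    if not terms:
        return []
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return sorted(result)


docs = {
    1: "Mint toothpaste for sensitive teeth",
    2: "Electric toothbrush",
    3: "Whitening toothpaste",
}
index = build_index(docs)
print(search(index, "toothpaste"))            # [1, 3]
print(search(index, "whitening toothpaste"))  # [3]
```

A lookup touches only the entries for the query terms instead of scanning every document, which is the main reason an index-based search scales better than a substring scan.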

2. ES

Elasticsearch is an open source search engine based on Apache Lucene™.

Lucene is just a library. To use it, you have to use Java as the development language and integrate it directly into your application. To make matters worse, Lucene is very complex, and you need in-depth knowledge of information retrieval to understand how it works.
Elasticsearch is also developed in Java and uses Lucene as its core to implement all indexing and search functions, but its purpose is to hide the complexity of Lucene through a simple RESTful API to make full-text search simple.
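
As a rough illustration of that RESTful API, the sketch below builds two HTTP requests with Python's standard library: one that stores a JSON document and one that runs a full-text query. The endpoints PUT /<index>/_doc/<id> and POST /<index>/_search are standard Elasticsearch endpoints; the index name products, the field product_name, and the helper names are hypothetical. The requests are only constructed here, not sent, so the code runs without a live node.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # default local address; adjust for your setup


def index_request(index: str, doc_id: str, doc: dict) -> urllib.request.Request:
    """Build PUT /<index>/_doc/<id>, which stores one JSON document."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_doc/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def search_request(index: str, field: str, text: str) -> urllib.request.Request:
    """Build POST /<index>/_search with a full-text match query."""
    body = {"query": {"match": {field: text}}}
    return urllib.request.Request(
        url=f"{BASE_URL}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a request to a running node and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.loads(response.read().decode("utf-8"))


# "products" and "product_name" are hypothetical names used only for this example.
put = index_request("products", "1", {"product_name": "mint toothpaste"})
query = search_request("products", "product_name", "toothpaste")
print(put.get_method(), put.full_url)
print(query.get_method(), query.full_url)
```

With a node running and security disabled, calling send(put) followed by send(query) would perform the two calls. A secured ES 8 node would additionally need HTTPS and credentials, which this sketch leaves out.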

Lucene is an information retrieval toolkit, not a complete search engine system. It includes the index structure, tools for reading and writing indexes, relevance scoring, sorting, and other functions. Therefore, when using Lucene you still have to take care of the rest of the search engine system yourself, such as data acquisition, parsing, and word segmentation. Elasticsearch wraps all of this on top of the toolkit.

(3) ES functions

Elasticsearch is not a new technology. It mainly combines full-text retrieval, data analysis, and distributed technologies into a single product.

(1) Distributed search engine and data analysis engine

Search: Baidu, site search, retrieval inside IT systems
Data analysis: for an e-commerce website, what are the top 10 sellers among toothpaste products in the past 7 days; for a news website, which 3 sections had the most visits in the past month

Distributed, search, data analysis

(2) Full-text search, structured search, data analysis

Full-text search: I want to search for products whose names include toothpaste: select * from products where product_name like "%toothpaste%"
Structured search: I want to search for products classified as daily chemicals: select * from products where category_id = 'daily chemicals'

Partial matching, auto-complete, search error correction, and search recommendation are also covered.
Data analysis: we count how many products there are under each product category: select category_id, count(*) from products group by category_id
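
For comparison, the three SQL statements above map roughly onto the following Elasticsearch request bodies, written here as Python dictionaries. The match query, the term query, and the terms aggregation are part of the Elasticsearch query DSL; the aggregation name products_per_category is made up, and the sketch assumes that category_id is mapped as a keyword field.

```python
import json

# Full-text search: roughly "where product_name like '%toothpaste%'",
# but matched against analyzed terms rather than raw substrings.
full_text_query = {"query": {"match": {"product_name": "toothpaste"}}}

# Structured search: exact match, like "where category_id = 'daily chemicals'".
structured_query = {"query": {"term": {"category_id": "daily chemicals"}}}

# Data analysis: like "select category_id, count(*) ... group by category_id".
# "size": 0 suppresses the matching documents and returns only the buckets.
count_per_category = {
    "size": 0,
    "aggs": {"products_per_category": {"terms": {"field": "category_id"}}},
}

for body in (full_text_query, structured_query, count_per_category):
    print(json.dumps(body))
```

Each body would be sent to the _search endpoint of the index that holds the products.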

(3) Process massive amounts of data in near real-time

Distributed: ES can automatically distribute massive data across multiple servers for storage and retrieval.
Massive data processing: once the data is distributed, a large number of servers can store and retrieve it, so processing massive data follows naturally.

Near real-time: if retrieving data takes 1 hour, that is not near real-time but offline batch processing; near real-time means that data can be searched and analyzed within seconds.
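
The following toy model illustrates what near real-time means in practice. In Elasticsearch, newly indexed documents become searchable only after a refresh, which by default runs about once per second. The class and method names below are invented for the illustration and do not correspond to any Elasticsearch API.

```python
class NearRealTimeIndex:
    """Toy model: writes land in a buffer and become searchable after refresh()."""

    def __init__(self) -> None:
        self._buffer: list[str] = []      # indexed but not yet searchable
        self._searchable: list[str] = []  # visible to search

    def index(self, doc: str) -> None:
        """Accept a document; it is not visible to search yet."""
        self._buffer.append(doc)

    def refresh(self) -> None:
        """Make buffered documents searchable (Elasticsearch does this periodically)."""
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def search(self, term: str) -> list[str]:
        """Return the searchable documents that contain the term."""
        return [doc for doc in self._searchable if term in doc]


idx = NearRealTimeIndex()
idx.index("mint toothpaste")
before = idx.search("toothpaste")  # nothing yet: the refresh has not happened
idx.refresh()
after = idx.search("toothpaste")   # the document is now visible
print(before, after)
```

The short gap between index and refresh is the "near" in near real-time.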

The opposite of distributed/massive data is Lucene: as a stand-alone library it runs on a single server and can handle at most as much data as that one server can handle.

(4) Current applications

[Image: current applications of ES]

2. Applicable scenarios

1. Build enterprise-level search services, such as e-commerce systems, knowledge systems, log analysis, etc.

2. When you need to search and analyze a large amount of data, ES is worth considering first.

3. Elasticsearch has a wide range of application scenarios, including full-text search, log analysis, operation and maintenance monitoring, security analysis, etc.

3. ES vs. MySQL

From ES 6.x on, an index should contain only one type: indices created in 6.x are limited to a single type, types are deprecated in 7.x, and they are removed in 8.x.

4. ES installation and deployment

ES is implemented in Java, so the server needs a Java runtime environment. I am currently using ES 8.4.1. ES depends on the JDK; versions 7 and above ship with a bundled JDK, which you can also see in the downloaded installation package.

 Official website link: Free and open search: developers of Elasticsearch, ELK and Kibana | Elastic

After unzipping, double-click bin\elasticsearch.bat to start ES.

However, after I clicked, it crashed.

I could not figure out this problem. I deleted the Android SDK and still got the error, so I simply downgraded ES to version 7.17.6.

After changing the version, I ran into another problem:

SecurityNetty4HttpServerTransport] [DESKTOP-0QU7RUU] received plaintext http traffic on an https channel, closing connection Netty4HttpChannel{localAddress=/127.0.0.1:9200

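This message typically means that a client sent plain HTTP to a port on which HTTPS is enabled. Either access the node with https://, or, for local learning only, set xpack.security.http.ssl.enabled: false in config/elasticsearch.yml and restart ES. After this is resolved, visit: http://localhost:9200/

Note that ES 8 requires a username and password. The username is elastic, and the password can be found in the startup log. I am currently on version 7, so there is no password step.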

If you can access it, it means the startup is successful.
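
The same check can be scripted. The sketch below uses only Python's standard library: root_request builds the GET request (with optional Basic authentication, as needed on ES 8), and summarize pulls the node name, cluster name, and version out of the JSON that the root endpoint returns. The sample response is abbreviated and its values are made up; the function names are illustrative.

```python
import base64
import json
import urllib.request


def root_request(url="http://localhost:9200/", user=None, password=None):
    """Build a GET request for the root endpoint, adding Basic auth if credentials are given."""
    request = urllib.request.Request(url, method="GET")
    if user is not None and password is not None:
        token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
        request.add_header("Authorization", f"Basic {token}")
    return request


def summarize(body: str) -> str:
    """Extract node name, cluster name, and version from the root response."""
    info = json.loads(body)
    return f'{info["name"]} / {info["cluster_name"]} / {info["version"]["number"]}'


# Abbreviated, made-up sample of what the root endpoint returns.
sample = json.dumps({
    "name": "my-node",
    "cluster_name": "elasticsearch",
    "version": {"number": "7.17.6"},
    "tagline": "You Know, for Search",
})
print(summarize(sample))

# Against a running node (not executed here):
# with urllib.request.urlopen(root_request(), timeout=5) as response:
#     print(summarize(response.read().decode("utf-8")))
```

On ES 8 with default security settings the URL would start with https:// and the self-signed certificate would have to be trusted as well; that part is left out of the sketch.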


To make learning ES easier, you can also install Postman and Kibana.

Postman: an API testing tool that every developer knows.

Kibana: a member of the ELK stack. It lets you perform advanced data analysis easily and visualize query results with rich charts.

The following mainly covers installing Kibana:

Download address: Download Kibana Free | Get Started Now | Elastic

Download a Kibana version that matches your ES version; at the very least, the major versions must match.
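
A quick way to sanity-check a version pair before downloading is to compare the version strings. The helper below is a hypothetical convenience function, not part of any Elastic tooling; it only compares dotted version numbers according to the rule of thumb above.

```python
def parse_version(version: str) -> tuple[int, ...]:
    """Turn '7.17.6' into (7, 17, 6)."""
    return tuple(int(part) for part in version.split("."))


def compatibility(es_version: str, kibana_version: str) -> str:
    """Classify a Kibana/ES version pair following the rule of thumb in the text."""
    es, kibana = parse_version(es_version), parse_version(kibana_version)
    if es == kibana:
        return "same version"
    if es[0] == kibana[0]:
        return "same major version"
    return "major version mismatch"


print(compatibility("7.17.6", "7.17.6"))  # same version
print(compatibility("7.17.6", "7.10.2"))  # same major version
print(compatibility("7.17.6", "8.4.1"))   # major version mismatch
```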

Note that ES must be started before Kibana.

Kibana access address:

Origin blog.csdn.net/heni6560/article/details/126806586