ElasticSearch search engine download and install

Table of contents

1. Introduction to Solr & Lucene

1.1. Solr overview

1.2. Overview of Lucene

1.3. Comparison between ElasticSearch and Solr

2. Overview of ElasticSearch

2.1. What is ElasticSearch?

2.2. Use cases of ElasticSearch

3. Download and install ElasticSearch

3.1. ElasticSearch download & installation

3.2. Start ElasticSearch

4. ElasticSearch plug-in installation

4.1. Installation dependencies

4.2. Install the Taobao mirror

4.3. Head plug-in

4.4. Set cross-domain

4.5. Kibana plugin

4.6. Switching Kibana to Chinese

4.7. Possible problems and solutions


1. Introduction to Solr & Lucene

1.1. Solr overview

Solr is a top-level open-source project under Apache, developed in Java. It is a full-text search server built on Lucene. Solr provides a richer query language than Lucene, is configurable and scalable, and optimizes indexing and search performance. Solr can run standalone inside Servlet containers such as Jetty and Tomcat. Indexing with Solr is simple: send an XML document describing the fields and their content to the Solr server via HTTP POST, and Solr will add, delete, or update the index according to that document. Searching only requires an HTTP GET request; the query results returned by Solr in XML, JSON, or other formats are then parsed to build the page. Solr does not provide functionality for building a UI, but it does offer a management interface through which you can view Solr's configuration and running status.
Solr is an enterprise-level search server developed on top of Lucene; in essence, it is a wrapper around Lucene.
Solr is a standalone enterprise-level search application server that exposes a Web-service-like API. Users can submit documents in a specific format to the server via HTTP requests to build the index, and they can also send search requests and receive the results.
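
As a rough sketch of the workflow described above, assuming a local Solr on its default port 8983 and a placeholder core named mycore (neither is set up in this article), indexing and searching could look like this:

-- Index a document by POSTing an XML description of its fields
curl "http://localhost:8983/solr/mycore/update?commit=true" -H "Content-Type: text/xml" --data-binary "<add><doc><field name='id'>1</field><field name='title'>hello solr</field></doc></add>"

-- Search with an HTTP GET and ask for JSON-formatted results
curl "http://localhost:8983/solr/mycore/select?q=title:hello&wt=json"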

1.2. Overview of Lucene

Lucene is a sub-project of the Apache Software Foundation's Jakarta project. It is an open-source full-text search engine toolkit, but it is not a complete full-text search engine; rather, it is a full-text search engine architecture that provides a complete query engine and index engine plus a partial text-analysis engine (for two Western languages, English and German). The purpose of Lucene is to give software developers an easy-to-use toolkit for adding full-text search to a target system, or a foundation on which to build a complete full-text search engine. (In short, it is an information-retrieval toolkit.)

Lucene is an open-source library for full-text retrieval and search, supported and provided by the Apache Software Foundation. It offers a simple but powerful API for full-text indexing and searching, and it is a mature, free open-source tool in the Java ecosystem. In recent years it has been the most popular free Java information-retrieval (IR) library. IR libraries are related to search engines, but they should not be confused with them: Lucene is a full-text search engine architecture, not a search engine. So what is a full-text search engine?

Full-text search engines are search engines in the true sense. Representative examples abroad include Google, Fast/AllTheWeb, AltaVista, Inktomi, Teoma, and WiseNut; in China the best-known example is Baidu. They extract information (mainly webpage text) from websites across the Internet to build a database, retrieve the records that match the user's query conditions from that database, and return the results to the user in a certain order, which is why they are real search engines.

Looking at where the search results come from, full-text search engines can be divided into two types. One type has its own retrieval program (indexer), commonly known as a "spider" or "robot", and builds its own webpage database; search results are served directly from that database, as with the seven engines mentioned above. The other type rents the database of another engine and presents the search results in a custom format, such as the Lycos engine.

1.3. Comparison between ElasticSearch and Solr

1. ES is basically ready to use out of the box and is very simple to set up, while Solr installation is slightly more complicated.

2. Solr relies on Zookeeper for distributed management, while Elasticsearch has built-in distributed coordination and management.

3. Solr supports more data formats, such as JSON, XML, and CSV, while Elasticsearch only supports JSON.

4. Solr officially provides more functionality, while Elasticsearch focuses on the core features and leaves most advanced functionality to third-party plug-ins; for example, a graphical interface relies on Kibana.

5. Solr is fast at querying but slow at updating the index (that is, inserts and deletes are slow), so it suits query-heavy applications such as e-commerce. ES is fast at building the index (its queries can be slower than Solr's on static data), which makes real-time search fast, so it is used for search at Facebook, Sina, and the like. Solr is a powerful solution for traditional search applications, but Elasticsearch is better suited to emerging real-time search applications.

6. Solr is more mature, with a larger and more established community of users, developers, and contributors, while Elasticsearch has relatively fewer developers and maintainers and evolves very quickly, so the cost of learning and using it is relatively high.

2. Overview of ElasticSearch

2.1. What is ElasticSearch?

Elasticsearch, ES for short, is an open-source, distributed, RESTful search and data analytics engine built on top of the open-source library Apache Lucene. ES is essentially a search server, that is, a server for queries. So why not just query a MySQL database? For example, select * from user where name like '%zhangsan%'; cannot use an index and ends up scanning the whole table, so its performance cannot be guaranteed.
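
For comparison, the same kind of lookup in ES is a full-text query against an inverted index rather than a table scan. A minimal sketch follows; the user index and the name field are hypothetical examples, not something created elsewhere in this article:

-- Full-text search for "zhangsan" in the name field of a hypothetical user index
curl -X GET "http://localhost:9200/user/_search" -H "Content-Type: application/json" -d "{\"query\": {\"match\": {\"name\": \"zhangsan\"}}}"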

The difference between MySQL and ElasticSearch:

MySQL supports transactions, while ElasticSearch does not: data deleted in ES cannot be recovered.
MySQL and ElasticSearch have a different division of labor: MySQL is responsible for storing data, while ElasticSearch is responsible for searching it.
Data is synchronized from MySQL to ElasticSearch; afterwards, queries go to ElasticSearch instead of the database.

Features of ElasticSearch: 

1. A distributed, real-time document store in which every field can be indexed and searched.
2. A distributed, real-time analytics search engine.
3. Able to scale out to hundreds of server nodes and to handle PB-level structured or unstructured data.
4. Elasticsearch is document-oriented storage: a document database in which each piece of data is a document (see the sketch below).
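
A minimal sketch of what "a piece of data is a document" means in practice; the user index, the document id, and the fields below are hypothetical examples:

-- Store a JSON document with id 1 in a hypothetical user index
curl -X PUT "http://localhost:9200/user/_doc/1" -H "Content-Type: application/json" -d "{\"name\": \"zhangsan\", \"age\": 25}"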

2.2. Use cases of ElasticSearch

1. Wikipedia, similar to Baidu Baike: full-text search, highlighting, and search suggestions.

2. The Guardian (a foreign news website), similar to Sohu News: user behavior logs (clicks, views, favorites, comments) plus social-network data (opinions on a given news item) are analyzed, and the results are fed back to each article's author so that they can see how the public received the piece (good, bad, popular, trash, despised, adored).

3. Stack Overflow (a foreign programming Q&A forum): IT problems and program error reports are submitted for others to discuss and answer. Full-text search finds related questions and answers, so when a program throws an error you can paste the message in and search for a matching answer.

4. GitHub (open-source code hosting): searching hundreds of billions of lines of code.

5. E-commerce websites: product search.

6. Log analysis: Logstash collects the logs and ES performs complex analysis on them (the ELK stack: Elasticsearch + Logstash + Kibana).

7. Price-monitoring websites: a user sets a price threshold for a product and receives a notification when the price drops below it.

8. BI (Business Intelligence) systems. For example, a large shopping-mall group analyzes the trend of consumer spending in a certain area over the last three years and the composition of the customer base, and produces related reports: say spending has grown 100% and 85% of the customers are senior white-collar workers, so a new mall is opened there. ES does the data analysis and mining, and Kibana handles the data visualization.

9. In China: site search (e-commerce, recruitment, portals, etc.), internal IT system search (OA, CRM, ERP, etc.), and data analysis (a popular use case for ES).

3. Download and install ElasticSearch

ElasticSearch is developed in Java and requires JDK 1.8 or above, so before installing ElasticSearch make sure that JDK 1.8+ is installed and that the JDK environment variables are correctly configured; otherwise ElasticSearch will fail to start.
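
You can confirm this from the command line before continuing, for example:

-- Check that JDK 1.8+ is installed and on the PATH
java -version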

3.1. ElasticSearch download & installation

ElasticSearch is available for Linux and Windows. Since we will mainly be working with ElasticSearch's Java client, we use the Windows version here, which is relatively easy to install. Once a project goes live, the company's operations staff will install the Linux version of ES for us to connect to.

Official Website: Elasticsearch: The Official Distributed Search and Analysis Engine | Elastic

 

If the network speed is too slow, the following is the Baidu network disk link of relevant information

Link: https://pan.baidu.com/s/1iAy2ODPoCjTEFUAzAeidiQ 
Extraction code: e6qg

Installing ElasticSearch is very simple: just unzip the downloaded archive. Note: the installation path must not contain Chinese characters!

3.2. Start ElasticSearch

By default, ElasticSearch occupies 1 GB of memory when it starts. If the computer does not have much memory, modify the [jvm.options] configuration file under the [config] directory; if the machine has plenty of memory, say 8 GB or more, no change is needed.
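
As a sketch, the heap size is controlled by the -Xms and -Xmx lines in [config/jvm.options]; the 512 MB values below are only an example for a low-memory machine:

-- In config/jvm.options, lower the default 1g heap, for example:
-Xms512m
-Xmx512m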

After modification, enter the bin directory and double-click elasticsearch.bat

Wait a moment; if the console reports that the node has started, the startup was successful.

Visit http://localhost:9200 in the browser.

If a JSON response with the node and version information appears, ES was installed successfully.
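
You can also check from the command line; the exact fields vary between versions, but a healthy node returns a small JSON document:

-- Expect JSON containing the node name, cluster_name and version number
curl http://localhost:9200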

4. ElasticSearch plug-in installation

Auxiliary tools for working with ES:
The Postman tool, the Head plug-in, and the Kibana plug-in all provide a friendly graphical interface for ES.

4.1. Installation dependencies

Node.js is installed mainly so that we can install the ES GUI plug-in (the Head client), which depends on this front-end environment.

Node.js download link: Download | Node.js

The Baidu Netdisk link above also includes a Node.js installation package.

Installing Node.js is very simple: just keep clicking Next. After installation, open cmd and enter

node -v 

If a version number is printed, Node.js is installed.

Install grunt as a global command. Grunt is a Node.js-based project build tool.

-- Enter the following in cmd to install 
npm install -g grunt-cli

-- Check if the installation is successful
grunt --version

 

4.2. Install the Taobao mirror

Next we will download some plug-ins needed by ES. These packages are hosted abroad and download slowly, so we use the domestic Taobao mirror to speed things up, similar to configuring the Alibaba Cloud mirror for a Maven project. This step is optional; set it up only if your network is slow.

--npm's remote server is abroad, so sometimes it is inevitable that the access is too slow or even inaccessible. 
--Use Ali's customized cnpm command line tool instead of the default npm
npm install -g cnpm --registry=https://registry.npm.taobao.org

Enter the following command to verify that the installation succeeded:

cnpm -v

If a version number is printed, the installation succeeded.

 

4.3. Head plug-in

The Head plug-in is a GUI client for ES and is also an officially recommended plug-in.

Download address: GitHub - mobz/elasticsearch-head: A web front end for an elastic search cluster

It is also available from the Baidu Netdisk link above; use whichever source you prefer.

Extract it into the es directory, at the same level as elasticsearch.

-- Note: run cmd from the E:\es\elasticsearch-head-master directory
-- If the Taobao mirror is installed, use cnpm:
cnpm install
-- Otherwise use npm:
npm install

If a node_modules directory appears in the Head directory, the installation succeeded.

Start the Head plug-in. Note: make sure ElasticSearch is already running, because the Head plug-in depends on it.

-- Note: run cmd from the head directory
Command: npm run start (or grunt server)

When you visit localhost:9100, the page automatically sends requests to port 9200, and the connection fails because of a cross-origin problem.

The ES process and the Head client run on different ports, so the browser treats the requests as cross-origin; we need to allow cross-origin requests in the ES configuration file.

4.4. Set cross-domain

In the config directory under the ES installation directory, open the elasticsearch.yml configuration file and append the following cross-origin settings at the end:

http.cors.enabled: true 
http.cors.allow-origin: "*"

After setting, restart ES and visit Head again

4.5. Kibana plugin

Kibana displays the data in ElasticSearch through a friendly web page and provides real-time analysis features.

Note: if your computer has little memory, install it with caution, because Kibana uses a lot of memory when it starts. At least 8 GB of RAM is recommended, otherwise the machine may freeze.

Official website: https://www.elastic.co/cn/kibana/

The Baidu Netdisk link also provides it. If you download from the official site, make sure the version matches your ES version.

No installation is needed; just decompress and use it. Because the archive is fairly large, decompression takes some time.

Go to the bin directory and double-click [kibana.bat] to start the service (you need to wait for the startup to complete). ELK is basically ready to use out of the box.

  

Visit localhost:5601. Kibana automatically connects to port 9200, ElasticSearch's port (ElasticSearch must of course be running), and then you can start using Kibana.
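
Once Kibana is running, its Dev Tools console is a convenient way to send requests to ES. A minimal sketch, assuming an index named user already exists (it is only an example):

GET /user/_search
{
  "query": { "match_all": {} }
}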

4.6. Switching Kibana to Chinese

The home page we just saw is in English, which can be hard to read. Kibana is mainly used for big-data work and contains many unfamiliar terms, so we switch it to Chinese.

Find the kibana.yml configuration file in the config directory and add the following setting:

i18n.locale: "zh-CN"

Once configured, restart the service and open the home page again.

 

4.7. Possible problems and solutions

 
