Build a Solr search engine in 30 minutes

Solr is an open-source search engine implemented in Java. It offers strong performance, flexible configuration, a rich set of APIs, and a visual admin panel, and it is easy to debug. The installation process, however, is fiddly, and I ran into quite a few pitfalls along the way. What follows is a summary of the steps that worked.

 

Installation environment: CentOS 7

 

 

1. Download the Solr installation package

wget http://mirrors.shuosc.org/apache/lucene/solr/7.2.1/solr-7.2.1.zip

 

 

2. Unzip the package

unzip solr-7.2.1.zip

 

 

3. Install the java environment

3.1 Check whether Java is already installed

java -version

If an old version is installed, remove it first (replace x.x.x with the installed version)

yum remove java-x.x.x-openjdk

 

3.2 Installation

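List the installable packages (the pattern is quoted so the shell does not expand it against local file names)

yum -y list 'java*'

Pick a version and install all of its packages

yum install 'java-1.8.0-openjdk*'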

 

3.3 Verify that the installation succeeded

java -version

The output should now report an OpenJDK 1.8.0 build.

 

 

 

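4. Start Solr

mv solr-7.2.1 solr7.2.1

cd solr7.2.1

bin/solr start -force

The -force flag is needed when Solr is started as root, which it otherwise refuses to do. The paths in the later steps assume the directory is located at /usr/local/solr/solr7.2.1.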

 

 

5. Open port 8983 in the firewall

firewall-cmd --zone=public --add-port=8983/tcp --permanent

systemctl restart firewalld

 

 

6. Access port 8983

Open http://<server-ip>:8983/solr/ in a browser; the Solr admin panel should load.

 

If the page does not respond, first check whether the port is actually open. On Alibaba Cloud and other cloud servers, port 8983 must also be added to the inbound rules of the security group.
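A quick way to tell a closed port from a Solr problem is to test the TCP connection from your own machine. The following is a minimal sketch using only the Python standard library; the function name port_is_open and the example address in the usage note are illustrative assumptions, not part of Solr.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, port_is_open("203.0.113.10", 8983) returning False from outside the server, while the same check succeeds on the server itself against 127.0.0.1, points to the firewall or the security group rather than to Solr.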

At this point, Solr is installed successfully.

 

 

7. Install the IK word segmenter

Solr's default text field types do not segment Chinese text into words, so the IK word segmentation package is needed to handle segmentation at both query time and index time.

7.1 Download IK package

Link: https://pan.baidu.com/s/1smrpBOx 

Password: irdx

 

7.2 Unzip the package

unzip ikanalyzer-solr6.5.zip

 

7.3 Copy to the specified directory

7.3.1 Copy the two jar packages to the WEB-INF/lib directory of the Solr webapp

cp *.jar /usr/local/solr/solr7.2.1/server/solr-webapp/webapp/WEB-INF/lib/

 

7.3.2 Create a new classes directory

mkdir -p /usr/local/solr/solr7.2.1/server/solr-webapp/webapp/WEB-INF/classes

 

7.3.3 Copy the xml file to classes

cp IKAnalyzer.cfg.xml /usr/local/solr/solr7.2.1/server/solr-webapp/webapp/WEB-INF/classes/

 

7.3.4 Open IKAnalyzer.cfg.xml to see which dictionaries it references, then place the ext.dic (extension words) and stopword.dic (stop words) dictionary files in the classes directory
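If the dictionaries are generated from a word list rather than edited by hand, a small script keeps the files consistent. The sketch below assumes that IK expects one term per line in UTF-8 without a byte-order mark; the helper name write_dictionary and the sample words are hypothetical.

```python
from typing import Iterable


def write_dictionary(path: str, words: Iterable[str]) -> int:
    """Write unique, non-empty terms to `path`, one per line, and return how many were written."""
    seen = set()
    unique = []
    for word in words:
        term = word.strip()
        if term and term not in seen:
            seen.add(term)
            unique.append(term)
    # encoding="utf-8" (not "utf-8-sig") writes no byte-order mark;
    # newline="\n" keeps Unix line endings on every platform.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for term in unique:
            handle.write(term + "\n")
    return len(unique)
```

For example, write_dictionary("ext.dic", ["Samsung", "Huawei"]) creates a two-line file that can then be copied into the classes directory. Solr has to be restarted before a changed dictionary is picked up.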

 

 

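7.4 Configure IK word segmentation

vim /usr/local/solr/solr7.2.1/server/solr/seo/conf/managed-schema

This file belongs to the core named seo, which is created in step 8.1; if the core does not exist yet, create it first.

A brief explanation: in a field definition, name is the name of the indexed field and type is its field type, which here is text_ik, the type provided by the IK package installed above. indexed controls whether the field is indexed and therefore searchable, and stored controls whether the original value is kept so that it can be returned in search results (not whether it is sorted). The id field already exists in the schema, around line 113, with stored set to true. Suppose your database holds records with two fields, title and content: you can then configure them as shown below. With stored="false" these fields can be searched, but their text is not returned in the results. The field definitions are followed by the configuration of the IK field type, which you can copy as is.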

 

<!-- custom fields -->
<field name="title" type="text_ik" indexed="true" stored="false"/>
<field name="content" type="text_ik" indexed="true" stored="false"/>

<!-- IK tokenizer -->
<fieldType name="text_ik" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
  </analyzer>
</fieldType>

 

7.5 Restart Solr (a restart is needed whenever you add jar packages or change the configuration)

/usr/local/solr/solr7.2.1/bin/solr restart -force

 

 

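8. Get started with Solr

8.1 Create a core

A core can be thought of as a project: each core has its own index and configuration. Run the command from the Solr installation directory.

bin/solr create -c seo -force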

 

8.2 Check in the admin panel that IK word segmentation works

8.2.1 Select the core

 

8.2.2 Click Analysis and select text_ik

If text_ik appears in the field type list, IK was installed successfully; if it does not, the installation failed.

 

8.2.3 Check word segmentation

 

 

8.2.4 Check dictionary loading

 

In the analysis result, "Samsung" is a brand name and is kept as a single token, "star" is not split out as a separate token, and "?" has been removed. This shows that the dictionaries were loaded.
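The same check can be scripted against Solr's field analysis endpoint instead of the panel. The sketch below only builds the request URL; it assumes Solr is reachable at localhost:8983, that the core is named seo, and that the implicit /analysis/field handler is enabled. The function name build_analysis_url is hypothetical.

```python
from urllib.parse import urlencode


def build_analysis_url(
    text: str,
    field_type: str = "text_ik",
    base_url: str = "http://localhost:8983/solr",  # assumed host and port
    core: str = "seo",                             # core created in step 8.1
) -> str:
    """Return a URL asking Solr to analyze `text` with the given field type."""
    params = {
        "analysis.fieldtype": field_type,
        "analysis.fieldvalue": text,
        "wt": "json",
    }
    # urlencode percent-encodes non-ASCII text such as Chinese as UTF-8.
    return f"{base_url}/{core}/analysis/field?{urlencode(params)}"
```

Fetching the resulting URL with curl or a browser returns the tokens produced at each analysis stage, which is convenient for re-checking segmentation after every dictionary change.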

 

 

8.3 Add documents to the index

8.3.1 There are three ways to add data to Solr

1. Import data through Dataimport in the panel

2. Add documents manually through Documents

3. Add documents via the API

I will not cover the first method. In my experience it is tedious to configure and inflexible to use.

For the third method, I used the Python API directly. For details, see: http://blog.csdn.net/sinat_33455447/article/details/56848791

Avoid submitting documents one at a time. It is more efficient to add data in batches, for example 10,000 documents per request.
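Batched submission can be sketched with the Python standard library alone. This is not the client from the linked article; it is a minimal illustration that assumes Solr runs on localhost:8983, that the core is named seo, and that documents are dicts with id, title and content keys. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Dict, Iterable, Iterator, List

# Assumed endpoint: the JSON update handler of the "seo" core, committing after each batch.
SOLR_UPDATE_URL = "http://localhost:8983/solr/seo/update?commit=true"


def chunked(docs: Iterable[Dict], size: int) -> Iterator[List[Dict]]:
    """Yield successive lists of at most `size` documents."""
    batch: List[Dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def build_update_request(batch: List[Dict], url: str = SOLR_UPDATE_URL) -> urllib.request.Request:
    """Build a POST request that sends one batch as a JSON array of documents."""
    body = json.dumps(batch).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_in_batches(docs: Iterable[Dict], size: int = 10000) -> int:
    """Send all documents to Solr in batches and return the number of requests made."""
    sent = 0
    for batch in chunked(docs, size):
        # urlopen raises HTTPError if Solr answers with a 4xx or 5xx status.
        with urllib.request.urlopen(build_update_request(batch)) as response:
            response.read()
        sent += 1
    return sent
```

Calling index_in_batches(rows) with 25,000 rows would issue three requests instead of 25,000. Committing once at the very end instead of per batch is a further option when the import is large.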

 

8.3.2 Add documents manually via Documents

This is generally used for testing. Click Documents, select the document format you want to submit (JSON in this example), enter the document, and submit it.

 

 

8.4 Search test

Click Query, enter the query in q, and specify the default search field in df.

 

The request URL that the panel displays above the results (marked with a red box in the original screenshot) can be requested directly to get the search result data, and the wt option specifies the format of the returned data.
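The URL has the same shape whether the panel generates it or you assemble it yourself. A minimal sketch, assuming Solr on localhost:8983 and the seo core; the function name build_select_url and the rows default are illustrative choices.

```python
from urllib.parse import urlencode


def build_select_url(
    query: str,
    default_field: str,
    base_url: str = "http://localhost:8983/solr",  # assumed host and port
    core: str = "seo",
    wt: str = "json",
    rows: int = 10,
) -> str:
    """Return a /select URL with the q, df, wt and rows parameters set."""
    params = {"q": query, "df": default_field, "wt": wt, "rows": rows}
    # urlencode escapes spaces and non-ASCII characters in the query text.
    return f"{base_url}/{core}/select?{urlencode(params)}"
```

For example, build_select_url("phone", "title") searches the title field for "phone" and asks for JSON; passing wt="xml" requests XML instead.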

 

There are many more details and key parameters that I do not have the energy to cover here. Feel free to share this post or to ask me questions. Your support is what keeps me sharing :)

