Solr 6.6 search environment setup: IK Chinese word segmentation, synonyms, pinyin, and using SolrJ

 
 
On 2017-06-20 the Apache official website published the download for Solr 6.6.
Download address: https://mirrors.tuna.tsinghua.edu.cn/apache/lucene/solr/6.6.0/

 

Deployment environment: Tomcat 8

Solr version: 6.6

JDK: 1.8

Solr configuration:

Unzip Solr and copy the webapp folder from \solr-6.6.0\server\solr-webapp into Tomcat's webapps directory, then rename it to solr.
Copy all jars from \solr-6.6.0\server\lib\ext to \webapps\solr\WEB-INF\lib, copy noggit-0.6.jar from \solr-6.6.0\dist\solrj-lib to \webapps\solr\WEB-INF\lib, and copy all jars whose names start with metrics- from \solr-6.6.0\server\lib to \webapps\solr\WEB-INF\lib. Create a new classes folder under WEB-INF, then copy log4j.properties from \solr-6.6.0\server\resources into classes.

 
Copy the solr folder from the \solr-6.6.0\server directory to any location and rename it solrhome (in my case it ends up as G:\solrhome). It serves as the index storage location.

 

In the solrhome directory, create a new collection1 folder and copy the conf folder from \solr-6.6.0\server\solr\configsets\basic_configs into it. Create a new data folder in the collection1 directory.
Then create the file core.properties in collection1 with the following content:
name=collection1
config=solrconfig.xml
schema=schema.xml
dataDir=data

 

 

Go back to the Tomcat directory and open the \webapps\solr\WEB-INF\web.xml file. Uncomment the <env-entry> element
and change its value to the path of solrhome:

 

    <env-entry>
       <env-entry-name>solr/home</env-entry-name>
       <env-entry-value>G:\solrhome</env-entry-value>
       <env-entry-type>java.lang.String</env-entry-type>
    </env-entry>
Then comment out the <security-constraint> elements at the bottom of the same file.
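
As a rough sketch of this step, the disabled section at the bottom of web.xml might look like the following. The element contents are written from memory of the stock Solr 6.6 web.xml and should be treated as an assumption; keep whatever your file actually contains and only wrap it in a comment.

```xml
<!-- Disabled for the Tomcat deployment: both constraints wrapped in one comment.
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Disable TRACE</web-resource-name>
    <url-pattern>/</url-pattern>
    <http-method>TRACE</http-method>
  </web-resource-collection>
  <auth-constraint/>
</security-constraint>
<security-constraint>
  <web-resource-collection>
    <web-resource-name>Enable everything but TRACE</web-resource-name>
    <url-pattern>/</url-pattern>
    <http-method-omission>TRACE</http-method-omission>
  </web-resource-collection>
</security-constraint>
-->
```

XML comments cannot be nested, so if the section in your file already contains a comment, leave that comment outside the wrapper.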


 

The next step is to configure the IK Chinese analyzer. Solr ships with its own Chinese tokenizer, but it does not support adding your own vocabulary, so the IK tokenizer is used instead. IK stopped being updated in 2012, but the attachment ikanalyzer-solr6.6.zip provides an up-to-date IK build. From it, copy ik-analyzer-solr5-5.x.jar and solr-analyzer-ik-5.1.0.jar to \solr\WEB-INF\lib, and copy the three files IKAnalyzer.cfg.xml, ext.dic, and stopword.dic to \solr\WEB-INF\classes. ext.dic holds the custom extension vocabulary and stopword.dic holds the custom stop words.

Next, define the fieldType and field entries in the managed-schema file in the solrhome\collection1\conf\ directory. This configuration also includes the synonym setup (synonyms are discussed later).

Then use the client provided by Solr to connect to the database and import data in batches. Register the data import handler in solrconfig.xml:
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
    <lst name="defaults">
        <str name="config">data-config.xml</str>
    </lst>
</requestHandler>
Create a data-config.xml file in the same directory. I use MySQL here. Each field element configures one field mapping: column is the database column name, and name is the name of the custom field defined earlier. Note: configure this according to your actual situation.
<?xml version="1.0" encoding="UTF-8" ?>  
<dataConfig>   
<dataSource type="JdbcDataSource"   
		  driver="com.mysql.jdbc.Driver"   
		  url="jdbc:mysql://localhost:3306/solrDemo"   
		  user="root"   
		  password="root"/>   
<document>   
	<entity name="product" query="SELECT pid,name,catalog_name,price,description,picture FROM products ">
		 <field column="pid" name="id"/>
		 <field column="name" name="product_name"/>
		 <field column="catalog_name" name="product_catalog_name"/>
		 <field column="price" name="product_price"/>
		 <field column="description" name="product_description"/>
		 <field column="picture" name="product_picture"/>
	</entity>   
</document>   
</dataConfig>
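
The name attributes in the mapping above have to match fields that exist in managed-schema. A minimal sketch of such definitions follows. The field names come from data-config.xml, but text_ik is a hypothetical name for the IK-backed fieldType, and the string and float types are assumed to be defined in your schema. The id field should already be present in the basic_configs schema.

```xml
<!-- Sketch only: field names come from data-config.xml; the types are assumptions. -->
<field name="product_name" type="text_ik" indexed="true" stored="true"/>
<field name="product_catalog_name" type="string" indexed="true" stored="true"/>
<field name="product_price" type="float" indexed="true" stored="true"/>
<field name="product_description" type="text_ik" indexed="true" stored="true"/>
<!-- The picture path is only returned with results, so it is stored but not indexed. -->
<field name="product_picture" type="string" indexed="false" stored="true"/>
```

Using the tokenized type only for the name and description keeps the catalog name an exact-match field, which is one possible design choice rather than a requirement.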
Copy solr-dataimporthandler-6.6.0.jar and solr-dataimporthandler-extras-6.6.0.jar from the solr-6.6.0\dist directory to solr\WEB-INF\lib. The MySQL JDBC driver jar that provides com.mysql.jdbc.Driver must also be on the classpath. In solrhome\collection1\conf\solrconfig.xml, the <lib dir="..."> entries that point at Solr's lib directories are relative paths in the stock Solr 6.6 file. This is convenient for management, because the jars that get picked up do not depend on the absolute installation path.
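
For reference, a relative-path lib entry in solrconfig.xml has roughly the following shape. This is a sketch: the line is an assumption about what you would use if you preferred loading the data import jars through solrconfig.xml instead of copying them into WEB-INF\lib, and the relative default only resolves if the Solr distribution actually sits at that position relative to the core directory.

```xml
<!-- ${solr.install.dir:../../../..} uses the solr.install.dir system property if it is set,
     otherwise it falls back to the relative path after the colon. -->
<lib dir="${solr.install.dir:../../../..}/dist/" regex="solr-dataimporthandler-.*\.jar" />
```

With solrhome moved to a separate location such as G:\solrhome, the fallback path will most likely not reach the dist folder, which is one reason to copy the jars into WEB-INF\lib as described above.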


 
Then open the client and click Execute.
Note: If Auto-Refresh Status is not checked, the page will not report whether all indexes were built successfully. Click Refresh on the right to refresh the status manually. The entity is the one defined in data-config.xml.

 
 
Then you can click Query to run queries. No demonstration is given here.

Note: Because of Chinese word segmentation, text is split into words before it is stored, so a query for a single character may return no results.
Solution 1 (not recommended): add every single character as an extension word. Disadvantage: the segmentation becomes too fine-grained, which affects the search results and produces unwanted matches.
Solution 2: use the wildcard *.

Synonym configuration: the IK field type was already configured with synonyms in the earlier step, so a synonym index is created. Now add the synonyms themselves to synonyms.txt, in the same directory as managed-schema. For example, if three terms that all mean "real estate" are listed as synonyms, searching for any one of them returns all three results.
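
To make the synonym setup concrete, here is a sketch of an IK field type with a synonym filter. The type name text_ik and the tokenizer factory class name are assumptions (the class name depends on the IK build, so check the package inside the jars you copied). Placing the filter on the index side is also an assumption. solr.SynonymFilterFactory and its synonyms, ignoreCase, and expand attributes are standard Solr.

```xml
<fieldType name="text_ik" class="solr.TextField">
  <analyzer type="index">
    <!-- Assumed class name; verify it against your IK jar. -->
    <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="false"/>
    <!-- With expand="true", each indexed term is stored together with every term in its group. -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.apache.lucene.analysis.ik.IKTokenizerFactory" useSmart="true"/>
  </analyzer>
</fieldType>
<!-- synonyms.txt: one comma-separated group per line. The terms below are illustrative only:
     房产,房地产,不动产
-->
```

With index-side synonyms, changes to synonyms.txt only take effect after the core is reloaded and the data is re-imported.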

Pinyin configuration:

Use of SolrJ: copy solr-solrj-6.6.0.jar and solr-core-6.6.0.jar from the \solr-6.6.0\dist directory, together with the jars in the solrj-lib folder, into your project. I will not go into much detail about SolrJ here, since there is plenty of material online. One point from the query example: the constructor call HttpSolrClient httpSolrClient = new HttpSolrClient(URL); turned out to be deprecated in this version.
