07 Solr: configuring the tokenizer, extension dictionary, and stop-word dictionary

The previous sections covered basic Solr usage. This section looks at configuring Chinese word segmentation in Solr, together with an extension dictionary and a stop-word dictionary.

1. Prerequisites

2. Steps

  • Test before configuring anything
    Every character is split out as its own token, with no regard to meaning.

2.1 Configuration

  • Upload the IKAnalyzer2012FF_u1.jar package
    Upload IKAnalyzer2012FF_u1.jar to /root/apache-tomcat-8.0.33/webapps/solr/WEB-INF/lib. This jar provides the Chinese word segmentation.
  • Modify /root/solr-4.10.3/example/solr/collection1/conf/schema.xml, adding the following:
    <fieldType name="text_ik" class="solr.TextField">
        <analyzer class="org.wltea.analyzer.lucene.IKAnalyzer"/>
    </fieldType>
    <field name="companyname" type="text_ik" indexed="true" stored="true"/>
    <field name="companydesc" type="text_ik" indexed="true" stored="true"/>
    <field name="item_keywords" type="text_ik"  indexed="true" stored="true" multiValued="true" />
    <copyField source="companyname" dest="item_keywords"/>
    <copyField source="companydesc" dest="item_keywords"/>
  • Create the folder /root/apache-tomcat-8.0.33/webapps/solr/WEB-INF/classes, then create three files in it: IKAnalyzer.cfg.xml, ext.dic, and stopword.dic.
    The content of IKAnalyzer.cfg.xml is as follows:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd"> 
<properties>  
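    <comment>IK Analyzer extension configuration</comment>
    <!-- Users can configure their own extension dictionaries here -->
    <entry key="ext_dict">ext.dic;</entry>
    <!-- Users can configure their own extension stop-word dictionaries here -->
    <entry key="ext_stopwords">stopword.dic;</entry>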
</properties>

ext.dic is the extension dictionary. Its content is as follows; note that the first line is left empty:


万和
江苏万和

stopword.dic is the stop-word dictionary. Its content is as follows; note that the first line is left empty:


的
是
一个
  • Restart Tomcat
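
The dictionary entries in IKAnalyzer.cfg.xml can list more than one file, separated by semicolons. The variant below is only a sketch of that: company.dic is a hypothetical second dictionary that is not part of the setup described here. It is assumed to sit next to ext.dic in WEB-INF/classes and to follow the same one-term-per-line layout.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
    <comment>IK Analyzer extension configuration (sketch)</comment>
    <!-- Two extension dictionaries, separated by a semicolon.
         company.dic is a hypothetical file name used for illustration only. -->
    <entry key="ext_dict">ext.dic;company.dic;</entry>
    <!-- Stop-word dictionary, unchanged from the configuration above. -->
    <entry key="ext_stopwords">stopword.dic;</entry>
</properties>
```

As with the original files, Tomcat would need a restart before a changed dictionary list is picked up.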

2.2 Test

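  • Test the extension dictionary
    Terms listed in ext.dic, such as 万和 and 江苏万和, should now come out as single tokens.
  • Test the stop-word dictionary
    Words listed in stopword.dic (的, 是, 一个) should no longer appear in the analysis output.
    That covers Solr's word segmentation, stop words, and extension dictionary.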
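
One way to exercise the schema end to end is to index a small document and then search the copied field. The update message below is a sketch under stated assumptions: it assumes the core's unique key field is named id, as in the stock collection1 example schema, and the field values are invented for illustration. It would be posted to the core's /update handler as text/xml and followed by a commit.

```xml
<!-- Sample Solr XML update message (illustrative values only).
     "id" is assumed to be the uniqueKey field of the core. -->
<add>
  <doc>
    <field name="id">test-1</field>
    <field name="companyname">江苏万和</field>
    <field name="companydesc">万和是一个测试公司</field>
  </doc>
</add>
```

Because of the two copyField rules, both values should also land in item_keywords, so a query such as item_keywords:万和 would be expected to match this document once the commit has gone through.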

Source: www.cnblogs.com/alichengxuyuan/p/12577251.html