Making IKAnalyzer compatible with solr-4.9.0

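solr-4.9.0 ships with lucene-analyzers-smartcn-4.9.0.jar for Chinese word segmentation. Unfortunately, its dictionary files are stored in binary form, and there is no ready-made Java implementation for adding a custom dictionary.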

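IKAnalyzer, on the other hand, supports plain-text dictionaries, but it has not been maintained since 2012. So the only option was to write a new adapter, using smartcn and the old IKAnalyzer implementation as references.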
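For context, IKAnalyzer 2012 normally registers plain-text dictionaries through an IKAnalyzer.cfg.xml file on the classpath. The fragment below is a sketch that assumes the adapter keeps this stock mechanism. The file names ext.dic and ext_stopword.dic are placeholders, and each dictionary is assumed to be a UTF-8 text file with one term per line.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <comment>IK Analyzer extension configuration</comment>
  <!-- placeholder file name: custom terms, one per line -->
  <entry key="ext_dict">ext.dic;</entry>
  <!-- placeholder file name: extra stop words, one per line -->
  <entry key="ext_stopwords">ext_stopword.dic;</entry>
</properties>
```

In a Solr web application the classpath location is usually WEB-INF/classes, but this depends on how Solr is deployed.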

The code is simple: two classes are enough for the adaptation.

solr-ik-adapter.jar: if you just want it to work, use this jar directly.

IKAnalyzerFactory.rar: download this if you want to read the code and compile it yourself.
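One way to make the jar visible to a Solr core is a lib directive in solrconfig.xml. This is a sketch only: the directory ../../contrib/ik/lib is a hypothetical location, resolved relative to the core's instance directory. If the adapter jar does not bundle IKAnalyzer itself, the IKAnalyzer jar presumably has to be placed in the same directory.

```xml
<!-- solrconfig.xml: load every jar from a hypothetical directory that holds solr-ik-adapter.jar -->
<lib dir="../../contrib/ik/lib" regex=".*\.jar" />
```

Alternatively, jars dropped into the core's own lib directory are picked up without any directive.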

Usage: when defining the fieldType, write it like this.

<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory"/>

    <!-- the filters below are optional -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <!-- end of optional filters -->
  </analyzer>

  <analyzer type="query">
    <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory"/>

    <!-- the filters below are optional -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.KeywordMarkerFilterFactory" protected="protwords.txt"/>
    <!-- end of optional filters -->
  </analyzer>
</fieldType>
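To put the type to use, fields in schema.xml reference it by name. The field names title and content below are hypothetical examples, not part of the original setup.

```xml
<!-- schema.xml: hypothetical fields analyzed with the IK-based text_general type -->
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="content" type="text_general" indexed="true" stored="true"/>
```

After reloading the core, the Analysis page of the Solr admin UI is a convenient place to check how a sample Chinese sentence is tokenized for these fields.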
