Solr 3.5: configuring the dictionaries of three analyzers (paoding, IKAnalyzer, mmseg4j) side by side

This post configures three Chinese analyzers in the same Solr schema.xml.
The key point for each of them is where its dictionary goes.


First download the three analyzers, in these versions:
mmseg4j-1.8.5.zip
IKAnalyzer3.2.8 bin.zip
paoding-analysis-2.0.4-beta.zip
Solr version: 3.5
Web server: Tomcat 6

Now add them to schema.xml.
(1) mmseg4j-1.8.5.zip
The key point is the dicPath attribute: it must point to the dic folder from the zip, wherever you have put it.

<fieldType name="any_name_you_like" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="max-word" dicPath="path to the dic folder from the zip"/>
    <!-- filters go here: SynonymFilterFactory, StopFilterFactory, WordDelimiterFilterFactory, ... as many as you need -->
  </analyzer>
  <analyzer type="query">
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="max-word" dicPath="path to the dic folder from the zip"/>
    <!-- filters go here: SynonymFilterFactory, StopFilterFactory, WordDelimiterFilterFactory, ... as many as you need -->
  </analyzer>
</fieldType>
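
To make the placeholders concrete, here is a minimal sketch of the mmseg4j type with everything filled in. The type name text_mmseg, the dicPath value dic, and the choice of filters with their stopwords.txt and synonyms.txt files are assumptions for illustration (the filter attributes follow the stock Solr example schema), not the setup used in this post. As far as I know, mmseg4j resolves a relative dicPath against solr.home; use an absolute path if in doubt.

```xml
<!-- Hypothetical example: names, paths and filter choices are assumptions -->
<fieldType name="text_mmseg" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <!-- "dic" is assumed to be the dic folder copied from the mmseg4j zip -->
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="max-word" dicPath="dic"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="com.chenlb.mmseg4j.solr.MMSegTokenizerFactory" mode="max-word" dicPath="dic"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" enablePositionIncrements="true"/>
    <!-- synonyms expanded at query time only, as in the stock example schema -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The same filter chain can be dropped into the paoding and IKAnalyzer types; only the tokenizer line differs.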

(2) paoding-analysis-2.0.4-beta.zip
The dictionaries are in the dic folder of the zip. Copy the whole folder into TOMCAT_HOME/webapps/solr/WEB-INF/classes.
Copy all the properties files under src in the zip to the same location.
One important point: every time you modify a dic file in the dic folder, you must delete the .compiled folder. It is regenerated on the next restart.

<fieldType name="any_name_you_like" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="net.paoding.analysis.analyzer.solr.ChineseTokenizerFactory" mode="most-words"/>
    <!-- filters go here: SynonymFilterFactory, StopFilterFactory, WordDelimiterFilterFactory, ... as many as you need -->
  </analyzer>
  <analyzer type="query">
    <tokenizer class="net.paoding.analysis.analyzer.solr.ChineseTokenizerFactory" mode="most-words"/>
    <!-- filters go here: SynonymFilterFactory, StopFilterFactory, WordDelimiterFilterFactory, ... as many as you need -->
  </analyzer>
</fieldType>

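(3) IKAnalyzer3.2.8 bin.zip
Copy IKAnalyzer.cfg.xml from the zip into TOMCAT_HOME/webapps/solr/WEB-INF/classes.
Copy ext_stopword.dic from the zip to the same location. You can use any *.dic dictionary as the extension dictionary, but it must be renamed to mydict.dic and placed in the same location.
Then edit IKAnalyzer.cfg.xml: just uncomment the entries. The file is self-explanatory once you open it.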
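
After uncommenting, IKAnalyzer.cfg.xml should look roughly like the sketch below. This is reconstructed from memory of the file shipped with IKAnalyzer 3.2.x (the comments are paraphrased in English), so treat it as an assumption and compare it with your own copy. The leading / refers to the classpath root, which is why both files go into WEB-INF/classes and why the extension dictionary has to be called mydict.dic.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
  <comment>IK Analyzer extension configuration</comment>
  <!-- user-defined extension dictionary (commented out in the shipped file) -->
  <entry key="ext_dict">/mydict.dic;</entry>
  <!-- user-defined extension stopword dictionary -->
  <entry key="ext_stopwords">/ext_stopword.dic</entry>
</properties>
```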

<fieldType name="any_name_you_like" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory" isMaxWordLength="false"/>
    <!-- filters go here: SynonymFilterFactory, StopFilterFactory, WordDelimiterFilterFactory, ... as many as you need -->
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory" isMaxWordLength="false"/>
    <!-- filters go here: SynonymFilterFactory, StopFilterFactory, WordDelimiterFilterFactory, ... as many as you need -->
  </analyzer>
</fieldType>
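
With all three types defined, fields can reference them by name. A minimal sketch, assuming the three fieldTypes were named text_mmseg, text_paoding and text_ik (hypothetical names, as are the field names below; use whatever you chose for the name attribute above):

```xml
<!-- Hypothetical field declarations, placed inside <fields> in schema.xml -->
<field name="title_mmseg"   type="text_mmseg"   indexed="true" stored="true"/>
<field name="title_paoding" type="text_paoding" indexed="true" stored="true"/>
<field name="title_ik"      type="text_ik"      indexed="true" stored="true"/>
```

Declaring one field per analyzer makes it easy to compare their tokenization of the same text on the Solr admin analysis page.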



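If this was of some help, please follow the blog. Worked examples for popular search terms, search suggestions and similar features are coming soon.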

Reposted from ren00317574.iteye.com/blog/1880673