Using IKAnalyzer

1. Analyzers: all analyzer classes ultimately inherit from Analyzer
1.1 The default analyzer: StandardAnalyzer
When we create an index we use an IndexWriterConfig object. While the index is being built, every document goes through an analysis step, that is, word segmentation. If no analyzer is specified, the default StandardAnalyzer segments the text automatically.


1.2 Viewing the analyzer's output


import java.io.IOException;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public static void main(String[] args) throws IOException {
    // 1. Create an Analyzer object.
    Analyzer analyzer = new StandardAnalyzer();
    // 2. Call the Analyzer's tokenStream method to get a TokenStream that holds the whole segmentation result.
    TokenStream tokenStream = analyzer.tokenStream("", "The Spring Framework provides a comprehensive programming and configuration model.");
    // 3. Add a CharTermAttribute to the TokenStream; it acts as a pointer to the token the stream is currently on.
    CharTermAttribute charTermAttribute = tokenStream.addAttribute(CharTermAttribute.class);
    // 4. Call reset() to move the pointer to the start; skipping this call throws an exception.
    tokenStream.reset();
    // 5. Loop over the tokens. incrementToken() returns true when a token was read and false once the stream is exhausted.
    while (tokenStream.incrementToken()) {
        System.out.println(charTermAttribute.toString());
    }
    // 6. Close the stream.
    tokenStream.close();
}


The default StandardAnalyzer handles English without problems, but how does it handle Chinese?
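For Chinese input, StandardAnalyzer emits every Han character as its own token, so multi-character words are never recognized as words. The sketch below is only a rough imitation of that effect, written with plain JDK classes so it runs without Lucene; it is not Lucene's implementation. The class name UnigramSketch and the sample string are made up for illustration, and the Chinese characters are written as Unicode escapes so the file compiles under any source encoding.

```java
import java.util.ArrayList;
import java.util.List;

public class UnigramSketch {

    // Imitates the effect only: each Han character becomes one token,
    // while a run of other letters or digits stays together (lowercased).
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder word = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean han = Character.UnicodeScript.of(cp) == Character.UnicodeScript.HAN;
            if (!han && Character.isLetterOrDigit(cp)) {
                word.appendCodePoint(Character.toLowerCase(cp));
                continue;
            }
            // A Han character or a separator ends the current letter/digit run.
            if (word.length() > 0) {
                tokens.add(word.toString());
                word.setLength(0);
            }
            if (han) {
                tokens.add(new String(Character.toChars(cp)));
            }
        }
        if (word.length() > 0) {
            tokens.add(word.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "an zhuang" (install) + "mysql" + "shu ju ku" (database)
        String text = "\u5b89\u88c5mysql\u6570\u636e\u5e93";
        // Six tokens: five single Han characters plus "mysql".
        System.out.println(tokenize(text));
    }
}
```

The three-character word for "database" comes out as three unrelated tokens, which is the gap a dictionary-based Chinese analyzer is meant to close.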


1.3 Chinese analyzer
Use a third-party Chinese analyzer: IKAnalyzer
Steps for using IKAnalyzer:
1. Import the dependency
<!-- https://mvnrepository.com/artifact/com.jianggujin/IKAnalyzer-lucene -->
<dependency>
    <groupId>com.jianggujin</groupId>
    <artifactId>IKAnalyzer-lucene</artifactId>
    <version>8.0.0</version>
</dependency>
2. Configure IKAnalyzer: import the configuration files
hotword.dic: the extension dictionary. Newly popular words from the internet can be added here, and the analyzer will then segment them as whole words.
stopword.dic: the stop-word dictionary. Meaningless words and sensitive words can be added here, and the analyzer will then ignore them when it analyzes content.
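To make the effect of the two dictionaries concrete, here is a minimal sketch in plain Java. It does not use IKAnalyzer and does not reproduce its algorithm; it swaps in simple greedy forward maximum matching, and the two sets merely stand in for hotword.dic and stopword.dic. The class name DictionarySketch, the sample words, and the sample text are all assumptions chosen for illustration, with Chinese written as Unicode escapes so the file compiles under any source encoding.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class DictionarySketch {

    // Greedy forward maximum matching: at each position take the longest
    // dictionary word starting there, otherwise fall back to one character.
    // Tokens found in stopWords are dropped. Works on BMP characters only.
    public static List<String> segment(String text, Set<String> dictionary, Set<String> stopWords) {
        int maxLen = 1;
        for (String word : dictionary) {
            maxLen = Math.max(maxLen, word.length());
        }
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            String token = text.substring(i, i + 1);
            int end = Math.min(text.length(), i + maxLen);
            for (int j = end; j > i + 1; j--) {
                String candidate = text.substring(i, j);
                if (dictionary.contains(candidate)) {
                    token = candidate;
                    break;
                }
            }
            i += token.length();
            if (!stopWords.contains(token)) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String database = "\u6570\u636e\u5e93"; // "shu ju ku", database
        String service = "\u670d\u52a1";        // "fu wu", service
        String wudaokou = "\u4e94\u9053\u53e3"; // "Wudaokou", a place name
        String de = "\u7684";                   // "de", a particle with no search value

        Set<String> dictionary = new HashSet<>(Arrays.asList(database, service));
        Set<String> stopWords = new HashSet<>(Arrays.asList(de));
        String text = wudaokou + de + database + service;

        // The place name is unknown, so it falls apart into single characters.
        System.out.println(segment(text, dictionary, stopWords));

        // Equivalent of adding one line to the extension dictionary.
        dictionary.add(wudaokou);
        System.out.println(segment(text, dictionary, stopWords));
    }
}
```

The first call yields five tokens, with the place name split into three characters and the particle removed. After the dictionary gains the place name, the second call yields three tokens.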

When editing the extension dictionary and the stop-word dictionary, do not use Windows Notepad, because Notepad saves files as UTF-8 with a BOM. The dictionaries should be UTF-8 without a BOM; otherwise the BOM bytes can end up as part of the first entry.
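If a dictionary file has already been saved with a BOM, it can be checked and repaired with a few lines of plain Java. The following is a minimal sketch, not part of IKAnalyzer: the class name BomStripper is hypothetical, and main works on a temporary file so that it runs anywhere. Point it at your real hotword.dic or stopword.dic instead.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class BomStripper {

    // The UTF-8 byte order mark is the three bytes EF BB BF.
    public static boolean hasUtf8Bom(byte[] bytes) {
        return bytes.length >= 3
                && (bytes[0] & 0xFF) == 0xEF
                && (bytes[1] & 0xFF) == 0xBB
                && (bytes[2] & 0xFF) == 0xBF;
    }

    // Returns the content without a leading BOM, or the same array if there is none.
    public static byte[] stripUtf8Bom(byte[] bytes) {
        return hasUtf8Bom(bytes) ? Arrays.copyOfRange(bytes, 3, bytes.length) : bytes;
    }

    public static void main(String[] args) throws IOException {
        // Build a stand-in dictionary file that starts with a BOM.
        Path dict = Files.createTempFile("hotword", ".dic");
        byte[] entry = "wudaokou\n".getBytes(StandardCharsets.UTF_8);
        byte[] withBom = new byte[entry.length + 3];
        withBom[0] = (byte) 0xEF;
        withBom[1] = (byte) 0xBB;
        withBom[2] = (byte) 0xBF;
        System.arraycopy(entry, 0, withBom, 3, entry.length);
        Files.write(dict, withBom);

        // Rewrite the file in place only when a BOM is actually present.
        byte[] content = Files.readAllBytes(dict);
        if (hasUtf8Bom(content)) {
            Files.write(dict, stripUtf8Bom(content));
        }
        System.out.println("BOM still present: " + hasUtf8Bom(Files.readAllBytes(dict)));
        Files.delete(dict);
    }
}
```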

3. Segment text with IKAnalyzer
import java.io.IOException;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
// Package name as used by the original IK Analyzer project (an assumption here); confirm it in the jar you import.
import org.wltea.analyzer.lucene.IKAnalyzer;

public static void main(String[] args) throws IOException {
    // 1. Create an Analyzer object.
    Analyzer analyzer = new IKAnalyzer();
    // 2. Call the Analyzer's tokenStream method to get a TokenStream that holds the whole segmentation result.
    TokenStream tokenStream = analyzer.tokenStream("", "Wudaokou class factory install mysql-5.7.22 after -winx64 database service startup error: mysql after the service started on the local computer is stopped, when certain services are not used by another service or program will automatically stop and mysql official website to download the archive decompression did not come out on the wire installation teach ... Bowen from: novice road test, huh");
    // 3. Add a CharTermAttribute to the TokenStream; it acts as a pointer to the token the stream is currently on.
    CharTermAttribute charTermAttribute = tokenStream.addAttribute(CharTermAttribute.class);
    // 4. Call reset() to move the pointer to the start; skipping this call throws an exception.
    tokenStream.reset();
    // 5. Loop over the tokens. incrementToken() returns true when a token was read and false once the stream is exhausted.
    while (tokenStream.incrementToken()) {
        System.out.println(charTermAttribute.toString());
    }
    // 6. Close the stream.
    tokenStream.close();
}
4. Use IKAnalyzer in the program
IndexWriter indexWriter = new IndexWriter(directory, new IndexWriterConfig(new IKAnalyzer()));

 


Origin www.cnblogs.com/danxun/p/12363152.html