Step 1: Before copying the code below into your project, download these two jar packages:
IKAnalyzer2012.jar
lucene-core-3.6.0.jar
Baidu Netdisk download link:
https://pan.baidu.com/s/1oGec_mqU7PdqkKdA-H4k0Q
Extraction code: 9egm
Step 2: Copy the two jar packages into any folder in your project (for example, create a new lib folder and copy both files into it).
Step 3: Right-click the project and, in the menu that appears, click Configure Build Path... under Build Path.
Step 4: After Step 3, a window pops up; in it, click Add JARs...
Step 5: Find the two jar packages you copied into the project. Hold the Ctrl key to select both at once and add them together.
Step 6: After adding, the icon of each file changes to a small jar. Finally, click Apply; you can then copy the code and run it.
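Once the build path is configured, a quick way to confirm that both jars are visible to the project is to try loading one class from each. The sketch below is an assumption-laden helper, not part of the original tutorial: the class name ClasspathCheck and the method isOnClasspath are hypothetical, and the two class names checked are the ones imported by the tutorial code.

```java
public class ClasspathCheck {
    // Returns true if the named class can be located on the current classpath.
    // The class is looked up without being initialized.
    public static boolean isOnClasspath(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] required = {
            "org.wltea.analyzer.lucene.IKAnalyzer",   // expected in IKAnalyzer2012.jar
            "org.apache.lucene.analysis.TokenStream"  // expected in lucene-core-3.6.0.jar
        };
        for (String name : required) {
            System.out.println(name + " -> " + (isOnClasspath(name) ? "found" : "MISSING"));
        }
    }
}
```

If either line reports MISSING, the corresponding jar was probably not added in Steps 4 and 5.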
Step 7: Copy the code into your class file and run it.
package com.core.service.impl; // change this to your own package name

import java.io.IOException;
import java.io.StringReader;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.wltea.analyzer.lucene.IKAnalyzer;

public class ChineseWordSeg {
    public static void main(String[] args) throws IOException {
        // Prepare the text to be segmented
        String t = "你好,我现在还刚刚接触数据结构,所以还不是太了解!";
        // Create the analyzer (true enables IK's smart segmentation mode)
        Analyzer a = new IKAnalyzer(true);
        StringReader r = new StringReader(t);
        // Start segmenting the text that was read in
        TokenStream to = a.tokenStream("", r);
        // Get the CharTermAttribute, which holds the text of the current token
        CharTermAttribute te = to.getAttribute(CharTermAttribute.class);
        // Reset the stream before consuming it, as the TokenStream workflow expects
        to.reset();
        // Iterate over the tokens in turn, converting each one to a String
        while (to.incrementToken()) {
            System.out.print(te.toString() + ",");
        }
        // Signal end of stream, then release the stream and the reader
        to.end();
        to.close();
        r.close();
        System.out.println();
    }
}
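The loop above prints a comma after every token, so the output line ends with a trailing comma. One possible variation is to collect the terms into a list first and join them afterwards. The sketch below only illustrates that idea with the standard library: TokenJoinSketch, collect, and join are hypothetical names, a plain Iterator stands in for the Lucene token stream, and the sample tokens are made up rather than real IKAnalyzer output.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class TokenJoinSketch {
    // Drains an iterator of terms into a list. With Lucene, the equivalent loop
    // would be: while (to.incrementToken()) { tokens.add(te.toString()); }
    public static List<String> collect(Iterator<String> terms) {
        List<String> tokens = new ArrayList<String>();
        while (terms.hasNext()) {
            tokens.add(terms.next());
        }
        return tokens;
    }

    // Joins the tokens with commas, without a trailing separator.
    public static String join(List<String> tokens) {
        return String.join(",", tokens);
    }

    public static void main(String[] args) {
        // Hypothetical tokens; the real ones depend on the IK dictionary.
        List<String> tokens = collect(Arrays.asList("数据", "结构").iterator());
        System.out.println(join(tokens));
    }
}
```

Keeping the tokens in a list also makes them available for later processing, such as counting word frequencies, instead of only printing them.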
Original link: https://www.cnblogs.com/zhenyunboy/articles/13841075.html