Introduction to Lucene Search (1)

Before you can use the search features, you first need to build an index:

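import java.io.File;
import java.io.FileReader;
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

/**
 * Builds the index.
 * @author 王晨
 */
public class Indexer {

    private IndexWriter writer; // index writer instance

    /**
     * @param indexDir directory in which the index is stored
     */
    public Indexer(String indexDir) throws Exception {
        Directory dir = FSDirectory.open(Paths.get(indexDir)); // open the index directory
        Analyzer analyzer = new StandardAnalyzer(); // standard analyzer (tokenizer)
        IndexWriterConfig conf = new IndexWriterConfig(analyzer);
        writer = new IndexWriter(dir, conf);
    }

    /**
     * Closes the index writer (like a stream, it must be closed).
     * @throws Exception
     */
    public void close() throws Exception {
        writer.close();
    }

    /**
     * Indexes the files in the given directory, one by one.
     * @param dataDir directory containing the files to index
     * @return number of documents in the index
     * @throws Exception
     */
    public int index(String dataDir) throws Exception {
        File[] files = new File(dataDir).listFiles(); // all entries in the directory
        if (files == null) { // listFiles() returns null if the path is not a readable directory
            throw new IllegalArgumentException("Not a readable directory: " + dataDir);
        }
        for (File file : files) {
            if (file.isFile()) { // skip subdirectories, which FileReader cannot open
                indexFile(file);
            }
        }
        return writer.numDocs(); // number of documents now in the index
    }

    /**
     * Indexes a single file.
     * @param file the file to index
     * @throws Exception
     */
    private void indexFile(File file) throws Exception {
        System.out.println("Indexing file: " + file.getCanonicalPath());
        Document doc = getDocument(file);
        writer.addDocument(doc);
    }

    /**
     * Builds the Document for a file and sets its fields; each file becomes one Document.
     * @param file the file to read
     * @return the Document to add to the index
     * @throws Exception
     */
    private Document getDocument(File file) throws Exception {
        Document doc = new Document();
        doc.add(new TextField("contents", new FileReader(file))); // file contents: indexed, not stored
        doc.add(new TextField("filename", file.getName(), Field.Store.YES)); // file name: indexed and stored
        doc.add(new TextField("fullPath", file.getCanonicalPath(), Field.Store.YES)); // full path: indexed and stored
        return doc;
    }

    public static void main(String[] args) {
        String indexDir = "D:\\lucene"; // where the index is written
        String dataDir = "D:\\lucene\\data"; // the files to index
        Indexer indexer = null;
        int numIndexed = 0;
        long start = System.currentTimeMillis();
        long end = start;
        try {
            indexer = new Indexer(indexDir);
            numIndexed = indexer.index(dataDir); // build the index and get the document count
            end = System.currentTimeMillis();
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            try {
                if (indexer != null) { // the constructor may have failed
                    indexer.close();
                }
                System.out.println("Indexed " + numIndexed + " files in " + (end - start) + " ms");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
    }

}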

Search can be implemented in two ways: searching for a specific term (TermQuery), or parsing a query expression (QueryParser).

① Setup

// Fields and fixture methods of a JUnit 4 test class, shared by the tests below
private Directory dir;
private IndexReader reader;
private IndexSearcher is;

@Before
public void setUp() throws Exception {
    dir = FSDirectory.open(Paths.get("D:\\lucene")); // directory that holds the index
    reader = DirectoryReader.open(dir); // open the index for reading
    is = new IndexSearcher(reader);
}

@After
public void tearDown() throws Exception {
    reader.close();
    dir.close();
}

First, searching for a specific term with TermQuery (this approach is not commonly used):

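/**
 * TermQuery searches for one specific term; a document is returned only if it
 * contains exactly that term. The term is not analyzed, and StandardAnalyzer
 * lowercases tokens at index time, so search with the lowercase form.
 * @throws Exception
 */
@Test
public void testTermQuery() throws Exception {
    String searchField = "contents"; // field to search in
    String str = "java"; // term the user is looking for
    Term t = new Term(searchField, str);
    Query query = new TermQuery(t);
    TopDocs hits = is.search(query, 10); // top 10 hits
    for (ScoreDoc scoreDoc : hits.scoreDocs) {
        Document document = is.doc(scoreDoc.doc);
        System.out.println(document.get("fullPath"));
    }
}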

Second, searching by parsing a query expression with QueryParser:

/**
 * QueryParser parses a query expression. To match either of two terms, separate
 * them with a space; to require both, join them with AND; append ~ to a term for
 * a fuzzy (approximate) match, e.g. particula~
 * @throws Exception
 */
@Test
public void testQueryParser() throws Exception {
    Analyzer analyzer = new StandardAnalyzer(); // standard analyzer, same as at index time
    String searchField = "contents";
    String str = "TermQuery"; // other expressions to try: "java AND php", "jav~", "java php"
    QueryParser parser = new QueryParser(searchField, analyzer); // query parser
    Query query = parser.parse(str);
    TopDocs hits = is.search(query, 10); // top 10 hits
    System.out.println("Total hits for \"" + str + "\": " + hits.totalHits);
    for (ScoreDoc scoreDoc : hits.scoreDocs) {
        Document doc = is.doc(scoreDoc.doc);
        System.out.println(doc.get("fullPath"));
    }
}

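If you need pagination, for example 10 pages of 10 results each, note that is.search() takes no page or offset argument, so paging is left to the caller. There are two ways to implement it:

① Call is.search() once to fetch the top 100 hits, keep them in memory (for example in a list), and return a different slice each time the user clicks "next page".

② Call is.search() again on every click to fetch the top 100 hits, and return a different slice of them from the for loop.

The second approach is recommended: each is.search() call is fast, and when many users search at the same time, keeping every result set in memory puts heavy pressure on memory and can easily cause problems.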
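Both approaches come down to the same step: cutting one page out of the hits that came back. Here is a minimal sketch of that slicing in plain Java. HitPager and page(...) are hypothetical names made up for this illustration, they are not part of Lucene, and the list of integers stands in for the hits of a real search.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of Lucene): slices one page out of a list of hits. */
final class HitPager {

    /**
     * Returns the hits for a 1-based page number, or an empty list
     * when the page starts beyond the last available hit.
     */
    static <T> List<T> page(List<T> hits, int pageNumber, int pageSize) {
        if (pageNumber < 1 || pageSize < 1) {
            throw new IllegalArgumentException("pageNumber and pageSize must be >= 1");
        }
        long start = (long) (pageNumber - 1) * pageSize; // index of the first hit on the page
        if (start >= hits.size()) {
            return new ArrayList<>(); // requested page is past the end
        }
        long end = Math.min(start + pageSize, (long) hits.size()); // exclusive end index
        return new ArrayList<>(hits.subList((int) start, (int) end));
    }

    public static void main(String[] args) {
        // Stand-in for the hits returned by a search: 25 document ids.
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < 25; i++) {
            hits.add(i);
        }
        System.out.println(page(hits, 1, 10)); // first page
        System.out.println(page(hits, 3, 10)); // last, partial page
        System.out.println(page(hits, 4, 10)); // past the end
    }
}
```

With the search code from this post, one way to wire this up for approach ② would be to call is.search(query, pageNumber * pageSize) on each request and pass Arrays.asList(hits.scoreDocs) to page(...). That wiring is an assumption for illustration, so adapt it to your own result handling.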
