Lucene 4.3 full-text search engine: boosting index fields

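The previous post covered index management: adding, deleting, querying, and updating. Tonight's topic is adding a "weight" (a boost) to index fields. When searching, you sometimes want to assign different weights to different key values or different index fields, depending on your needs, so that content with a higher weight is easier for users to find and ranks nearer the top.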

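A field's boost has to be set before the index is built, that is, before the document is added to the index. At search time Lucene scores each document, and this scoring takes the boost into account: other things being equal, a higher boost yields a higher score.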
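To make the "other things being equal" claim concrete, here is a minimal sketch in plain Java with no Lucene dependency. It assumes a simplified form of classic TF-IDF scoring in which the field boost is multiplied into the length norm at index time. The class `BoostScoreSketch` and its methods are hypothetical names for illustration only; real Lucene also stores the norm lossily in a single byte and applies query normalization, so actual scores will not match these numbers.

```java
// Simplified, illustrative model of how an index-time field boost feeds
// into a score. This is an assumption-laden sketch, not Lucene's code.
class BoostScoreSketch {

    // Term-frequency factor: grows with the square root of the frequency.
    static double tf(int freq) {
        return Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms weigh more.
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Norm fixed at index time: field boost times a length normalization.
    static double norm(float fieldBoost, int numTermsInField) {
        return fieldBoost / Math.sqrt(numTermsInField);
    }

    // Score of a single-term query against one field of one document
    // (query normalization and coordination factors are left out).
    static double score(int freq, int docFreq, int numDocs,
                        float fieldBoost, int numTermsInField) {
        double idf = idf(docFreq, numDocs);
        return tf(freq) * idf * idf * norm(fieldBoost, numTermsInField);
    }

    public static void main(String[] args) {
        // Identical term statistics; only the field boost differs.
        double boosted = score(2, 6, 6, 2.0f, 8);
        double plain = score(2, 6, 6, 1.0f, 8);
        System.out.println("boost 2.0 -> " + boosted);
        System.out.println("boost 1.0 -> " + plain);
        System.out.println("ratio     -> " + boosted / plain);
    }
}
```

In this model the score scales linearly with the boost, so doubling the boost doubles the score when everything else is unchanged. In a real index, field length and term frequency also vary between documents, so the boost shifts the ranking without fully determining it.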

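import java.io.File;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

// Members of the IndexUtil class (the class declaration is not shown in the post)
private String[] ids={"1","2","3","4","5","6"};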
	
	private String[] emails={"[email protected]","[email protected]","[email protected]","[email protected]","[email protected]","[email protected]"};
	
	private String[] contents={"hello,how are you,@163.com,p,lucene,lucene","hi,I am fine!,@163.com,p,lucene","what is your name,@163.com,p,lucene","my name is summer,@qq.com,p,lucene,lucene","what is your number,@sina.cn,p,lucene,lucene","I will tell you,just wait a minute,@gmail.com,p,lucene"};
	
	private String [] names={"sam","holiday","issac","summer","coco","roy"};
	
	private int[] attachs={2,3,2,4,5,7};
	
	private Directory directory=null;
	
	private Map<String,Float> powerScores=new HashMap<String,Float>();

	
	
	
	public IndexUtil() throws IOException
	{
		powerScores.put("@163.com", 2.0f);
		powerScores.put("@qq.com", 1.5f);
		
		directory=FSDirectory.open(new File("E:/lucene/index02"));
		
	}
	
	/**
	 * Build the index
	 * @throws IOException
	 */
	public void index() throws IOException 
	{
		IndexWriter indexWriter=new IndexWriter(directory,new IndexWriterConfig(Version.LUCENE_43,new StandardAnalyzer(Version.LUCENE_43)));
		
		for(int i=0;i<ids.length;i++)
		{
			Document document=new Document();
			
			Field contentField=new Field("content",contents[i],Field.Store.NO,Field.Index.ANALYZED);
			document.add(new Field("id",ids[i],Field.Store.YES,Field.Index.NOT_ANALYZED_NO_NORMS));
			document.add(new Field("email",emails[i],Field.Store.YES,Field.Index.NOT_ANALYZED));
			document.add(new Field("name",names[i],Field.Store.YES,Field.Index.NOT_ANALYZED_NO_NORMS));
			document.add(contentField);
		
			
			String contentPower=contents[i].substring(contents[i].lastIndexOf("@"),contents[i].lastIndexOf("p")-1);
			
			System.out.println(contentPower);
			
			if(powerScores.containsKey(contentPower))
			{
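				// Boost the field: in this example, content containing @163.com gets 2.0, @qq.com gets 1.5, and anything else gets 0.5; the default boost is 1.0
				// Version 3.5 also had a document-level boost, but it no longer exists in 4.3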
				contentField.setBoost(powerScores.get(contentPower));
			}
			else
			{
				contentField.setBoost(0.5f);
			}
			
			
			indexWriter.addDocument(document);
		}
		
		if(indexWriter!=null) indexWriter.close();
	}
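
The boost lookup in `index()` relies on `lastIndexOf("p")-1`, which works only because of the exact shape of the sample strings (no token after the `p` marker contains the letter `p`). As a sketch of a less brittle alternative, the hypothetical helper `BoostLookup.boostFor` below, which is not part of the original code, picks the first comma-separated token that starts with `@` and falls back to 0.5 when that token has no entry, mirroring the values used above.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of the original IndexUtil.
class BoostLookup {

    // Same boost table as the powerScores map in the post.
    private static final Map<String, Float> POWER_SCORES = new HashMap<String, Float>();
    static {
        POWER_SCORES.put("@163.com", 2.0f);
        POWER_SCORES.put("@qq.com", 1.5f);
    }

    // Boost used when the domain is unknown or missing, as in the post.
    static final float DEFAULT_BOOST = 0.5f;

    // Returns the first comma-separated token starting with '@', or null.
    static String domainOf(String content) {
        for (String token : content.split(",")) {
            String trimmed = token.trim();
            if (trimmed.startsWith("@")) {
                return trimmed;
            }
        }
        return null;
    }

    // Maps a content string to the boost that its field should receive.
    static float boostFor(String content) {
        String domain = domainOf(content);
        Float boost = (domain == null) ? null : POWER_SCORES.get(domain);
        return boost != null ? boost : DEFAULT_BOOST;
    }

    public static void main(String[] args) {
        System.out.println(boostFor("hello,how are you,@163.com,p,lucene,lucene"));
        System.out.println(boostFor("my name is summer,@qq.com,p,lucene,lucene"));
        System.out.println(boostFor("what is your number,@sina.cn,p,lucene,lucene"));
    }
}
```

Inside the indexing loop this would be used as `contentField.setBoost(BoostLookup.boostFor(contents[i]));`, replacing the `substring` call and the `if`/`else` around it.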

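The code above is in fact almost identical to the code from the previous post on adding, deleting, querying, and updating the index; I only added the boost settings for one field on top of the existing code. The test example is the same as before, but the results will be quite different. I won't paste the test results here.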

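PS (just venting): I have classes during the day, so I can only squeeze in a little time each evening to write these posts. Earlier, a project took up even the time I had for posting, so I hope to get these articles finished quickly over this period. Well, it's almost one o'clock again, time for bed. I have a full morning of classes tomorrow. Ouch!!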


Reposted from zhh9106.iteye.com/blog/2036710