Using Facet Search in Lucene

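A facet is one aspect (dimension) of an item, and a facet query in Lucene is a query over those aspects. Take mobile phones as an example. A phone has several facets, such as brand, model, and carrier, and each combination of facet values selects a particular phone or set of phones. Brand = Xiaomi (小米) together with carrier = China Mobile (移动) selects every Xiaomi model released for China Mobile (Xiaomi 1, Xiaomi 2, Xiaomi 3, and so on), while brand = Xiaomi together with model = Xiaomi 4 selects every carrier edition of the Xiaomi 4 (the China Mobile, China Unicom, and China Telecom editions, and so on). We often search this way in practice: pick a brand first, then narrow step by step by model, carrier, and other facets until we reach the result we want. The rest of this post shows how to use facets in Lucene.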
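Before turning to Lucene, the narrowing idea can be sketched in plain Java. This is only an illustration of the concept, not Lucene code: the class `FacetFilterSketch`, its methods, and the English stand-in values ("Mi 1", "Xiaomi", "Mobile 4G", ...) are made up for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of faceted narrowing; no Lucene involved (hypothetical names). */
final class FacetFilterSketch {

    /** Builds one item as a map from attribute name to value. */
    static Map<String, String> item(String name, String brand, String network) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("name", name);
        m.put("brand", brand);
        m.put("network", network);
        return m;
    }

    /** A tiny sample catalog of three phones. */
    static List<Map<String, String>> catalog() {
        return Arrays.asList(
                item("Mi 1", "Xiaomi", "Mobile 4G"),
                item("Mi 4", "Xiaomi", "Unicom 4G"),
                item("Honor 6", "Huawei", "Mobile 4G"));
    }

    /** Returns the names of the items that match every selected (facet, value) pair. */
    static List<String> narrow(List<Map<String, String>> items, Map<String, String> selected) {
        List<String> names = new ArrayList<>();
        for (Map<String, String> it : items) {
            boolean matches = true;
            for (Map.Entry<String, String> s : selected.entrySet()) {
                if (!s.getValue().equals(it.get(s.getKey()))) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                names.add(it.get("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, String> selected = new LinkedHashMap<>();
        selected.put("brand", "Xiaomi");
        System.out.println(narrow(catalog(), selected)); // prints [Mi 1, Mi 4]
        selected.put("network", "Mobile 4G");
        System.out.println(narrow(catalog(), selected)); // prints [Mi 1]
    }
}
```

Each added facet value is one more condition that every remaining item has to satisfy, which is why the result set can only shrink as more facets are selected.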

1. Add the dependency

    <!-- Dependency for facet search -->
    <!-- https://mvnrepository.com/artifact/org.apache.lucene/lucene-facet -->
    <dependency>
      <groupId>org.apache.lucene</groupId>
      <artifactId>lucene-facet</artifactId>
      <version>7.2.1</version>
    </dependency>

2. Build the taxonomy index (TaxonomyIndex) alongside the regular index

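    // Imports used by this method. In Lucene 7.x, WhitespaceAnalyzer ships in the
    // lucene-analyzers-common artifact, so that artifact is needed besides lucene-facet.
    import java.nio.file.Paths;
    import org.apache.lucene.analysis.core.WhitespaceAnalyzer;
    import org.apache.lucene.document.Document;
    import org.apache.lucene.document.Field;
    import org.apache.lucene.document.TextField;
    import org.apache.lucene.facet.FacetField;
    import org.apache.lucene.facet.FacetsConfig;
    import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyWriter;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.junit.Test; // assumes JUnit 4

    // indexDir and taxoDir are fields of the enclosing test class (not shown here) that
    // hold the paths of the regular index and of the taxonomy index.
    @Test
    public void buildIndex() throws Exception{
        Directory directory = FSDirectory.open(Paths.get(indexDir));
        IndexWriter writer = new IndexWriter(directory, new IndexWriterConfig(new WhitespaceAnalyzer()));
        // DirectoryTaxonomyWriter writes the taxonomy index that facet queries need
        Directory taxoDirectory = FSDirectory.open(Paths.get(taxoDir));
        DirectoryTaxonomyWriter taxoWriter = new DirectoryTaxonomyWriter(taxoDirectory);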

        FacetsConfig config = new FacetsConfig();

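        // Sample values: 手机 = phone, 电视 = TV, 小米 = Xiaomi, 华为 = Huawei, 荣耀6 = Honor 6,
        // 移动4G = China Mobile 4G, 联通4G = China Unicom 4G
        Document doc = new Document();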
        doc.add(new TextField("device", "手机", Field.Store.YES));
        doc.add(new TextField("name", "米1", Field.Store.YES));
        doc.add(new FacetField("brand", "小米"));
        doc.add(new FacetField("network", "移动4G"));
        // config.build() updates the taxonomy index while the document goes into the regular index
        writer.addDocument(config.build(taxoWriter, doc));

        doc = new Document();
        doc.add(new TextField("device", "手机", Field.Store.YES));
        doc.add(new TextField("name", "米4", Field.Store.YES));
        doc.add(new FacetField("brand", "小米"));
        doc.add(new FacetField("network", "联通4G"));
        writer.addDocument(config.build(taxoWriter, doc));

        doc = new Document();
        doc.add(new TextField("device", "手机", Field.Store.YES));
        doc.add(new TextField("name", "荣耀6", Field.Store.YES));
        doc.add(new FacetField("brand", "华为"));
        doc.add(new FacetField("network", "移动4G"));
        writer.addDocument(config.build(taxoWriter, doc));

        doc = new Document();
        doc.add(new TextField("device", "电视", Field.Store.YES));
        doc.add(new TextField("name", "小米电视2", Field.Store.YES));
        doc.add(new FacetField("brand", "小米"));
        writer.addDocument(config.build(taxoWriter, doc));

        writer.close();
        taxoWriter.close();
    }

3. Drill down by dimension and retrieve the facet counts for each dimension

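    // Imports used by this method.
    import java.nio.file.Paths;
    import java.util.List;
    import org.apache.lucene.facet.DrillDownQuery;
    import org.apache.lucene.facet.DrillSideways;
    import org.apache.lucene.facet.FacetResult;
    import org.apache.lucene.facet.Facets;
    import org.apache.lucene.facet.FacetsCollector;
    import org.apache.lucene.facet.FacetsConfig;
    import org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts;
    import org.apache.lucene.facet.taxonomy.TaxonomyReader;
    import org.apache.lucene.facet.taxonomy.directory.DirectoryTaxonomyReader;
    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.Term;
    import org.apache.lucene.search.IndexSearcher;
    import org.apache.lucene.search.TermQuery;
    import org.apache.lucene.search.TopDocs;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.FSDirectory;
    import org.junit.Test; // assumes JUnit 4

    /**
     * Exercises facet search: a base query, two drill-downs, and a drill sideways.
     * @throws Exception
     */
    @Test
    public void testFacetSearch() throws Exception{
        Directory directory = FSDirectory
                .open(Paths.get(indexDir));
        DirectoryReader indexReader = DirectoryReader.open(directory);
        IndexSearcher searcher = new IndexSearcher(indexReader);
        // A taxonomy reader is needed in addition to the index reader
        Directory taxoDirectory = FSDirectory
                .open(Paths.get(taxoDir));
        TaxonomyReader taxoReader = new DirectoryTaxonomyReader(taxoDirectory);
        FacetsConfig config = new FacetsConfig();
        // A FacetsCollector is required to gather the hits that facet counts are computed from
        FacetsCollector facetsCollector = new FacetsCollector();

        // 1. Base query: documents whose device field is "手机" (phone)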
        System.out.println("---------手机----------");
        TermQuery query = new TermQuery(new Term("device", "手机"));
        TopDocs docs = FacetsCollector.search(searcher, query, 10, facetsCollector);
        printDocs(docs, searcher);
        System.out.println("----------facet-----------");
        Facets facets = new FastTaxonomyFacetCounts(taxoReader, config, facetsCollector);
        List<FacetResult> results = facets.getAllDims(10);
        // Print the facet counts for each dimension
        for (FacetResult tmp : results){
            System.out.println(tmp);
        }

        System.out.println("=======================");

        // 2. Drill down: select brand = "小米" (Xiaomi)
        System.out.println("-----小米手机-----");
        DrillDownQuery drillDownQuery = new DrillDownQuery(config, query);
        drillDownQuery.add("brand", "小米");
        FacetsCollector fc1 = new FacetsCollector(); // use a fresh collector; reusing the earlier one would accumulate counts
        docs = FacetsCollector.search(searcher, drillDownQuery, 10, fc1);
        printDocs(docs, searcher);
        System.out.println("----------facet-----------");
        facets = new FastTaxonomyFacetCounts(taxoReader, config, fc1);
        results = facets.getAllDims(10);
        // Distribution of Xiaomi phones: 2 in total; network: 移动4G (China Mobile 4G) 1, 联通4G (China Unicom 4G) 1
        for (FacetResult tmp : results) {
            System.out.println(tmp);
        }
        System.out.println("=======================");

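        // 3. Drill down further: with brand = "小米" selected, also select network = "移动4G" (China Mobile 4G)
        System.out.println("-----移动4G小米手机-----");
        // The same DrillDownQuery is reused; constraints on different dimensions are ANDed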
        drillDownQuery.add("network", "移动4G");
        FacetsCollector fc2 = new FacetsCollector();
        docs = FacetsCollector.search(searcher, drillDownQuery, 10, fc2);
        printDocs(docs, searcher);
        System.out.println("----------facet-----------");
        facets = new FastTaxonomyFacetCounts(taxoReader, config, fc2);
        results = facets.getAllDims(10);
        for (FacetResult tmp : results) {
            System.out.println(tmp);
        }
        System.out.println("=======================");

        // 4. Drill sideways: besides the hits, also get counts for the sibling values of the drilled-down dimension
        System.out.println("-----小米手机drill sideways-----");
        DrillSideways ds = new DrillSideways(searcher, config, taxoReader);
        DrillDownQuery drillDownQuery1 = new DrillDownQuery(config, query);
        drillDownQuery1.add("brand", "小米");
        DrillSideways.DrillSidewaysResult result = ds.search(drillDownQuery1, 10);
        docs = result.hits;
        printDocs(docs, searcher);
        System.out.println("----------facet-----------");
        results = result.facets.getAllDims(10);
        for (FacetResult tmp : results) {
            System.out.println(tmp);
        }
        System.out.println("=======================");

        indexReader.close();
        taxoReader.close();

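    }

    // printDocs is called above but its source is not part of this post. The version below is
    // an assumed reconstruction based on the printed output: it prints every stored field of
    // each hit as name:value, followed by a blank line. It additionally needs imports for
    // java.io.IOException, org.apache.lucene.document.Document,
    // org.apache.lucene.index.IndexableField and org.apache.lucene.search.ScoreDoc.
    private void printDocs(TopDocs docs, IndexSearcher searcher) throws IOException {
        for (ScoreDoc scoreDoc : docs.scoreDocs) {
            Document doc = searcher.doc(scoreDoc.doc);
            for (IndexableField field : doc.getFields()) {
                System.out.println(field.name() + ":" + field.stringValue());
            }
            System.out.println();
        }
    }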

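The output is shown below. Each additional drill-down constraint narrows the result set. In the drill sideways section, the brand dimension still reports 华为 (1) even though brand = 小米 is selected, because DrillSideways counts a drilled-down dimension as if its own constraint were not applied, while the network counts cover only the Xiaomi phones.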

---------手机----------
device:手机
name:米1

device:手机
name:米4

device:手机
name:荣耀6

----------facet-----------
dim=brand path=[] value=3 childCount=2
  小米 (2)
  华为 (1)

dim=network path=[] value=3 childCount=2
  移动4G (2)
  联通4G (1)

=======================
-----小米手机-----
device:手机
name:米1

device:手机
name:米4

----------facet-----------
dim=brand path=[] value=2 childCount=1
  小米 (2)

dim=network path=[] value=2 childCount=2
  移动4G (1)
  联通4G (1)

=======================
-----移动4G小米手机-----
device:手机
name:米1

----------facet-----------
dim=brand path=[] value=1 childCount=1
  小米 (1)

dim=network path=[] value=1 childCount=1
  移动4G (1)

=======================
-----小米手机drill sideways-----
device:手机
name:米1

device:手机
name:米4

----------facet-----------
dim=brand path=[] value=3 childCount=2
  小米 (2)
  华为 (1)

dim=network path=[] value=2 childCount=2
  移动4G (1)
  联通4G (1)

=======================
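To see where the numbers in the last two facet sections come from, here is a plain-Java sketch of the counting rule. It is a model of the semantics only, not how Lucene implements `DrillSideways`; the class `DrillSidewaysSketch`, its methods, and the English stand-in values (Xiaomi for 小米, Huawei for 华为, "Mobile 4G" for 移动4G, "Unicom 4G" for 联通4G) are assumptions made for this illustration.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of drill-down vs. drill-sideways counts (hypothetical names). */
final class DrillSidewaysSketch {

    static final int BRAND = 0;
    static final int NETWORK = 1;

    /** The three phones matched by the base query, as {brand, network} pairs. */
    static final List<String[]> PHONES = Arrays.asList(
            new String[] {"Xiaomi", "Mobile 4G"},
            new String[] {"Xiaomi", "Unicom 4G"},
            new String[] {"Huawei", "Mobile 4G"});

    /** Counts the values of one dimension, optionally restricted to the selected brand. */
    private static Map<String, Integer> count(int dim, String brand, boolean applyBrandFilter) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String[] phone : PHONES) {
            if (applyBrandFilter && !phone[BRAND].equals(brand)) {
                continue;
            }
            counts.merge(phone[dim], 1, Integer::sum);
        }
        return counts;
    }

    /** Drill-down: every dimension is counted over the hits of the narrowed query. */
    static Map<String, Integer> drillDownCounts(int dim, String selectedBrand) {
        return count(dim, selectedBrand, true);
    }

    /** Drill-sideways: the drilled-down dimension (brand) ignores its own constraint. */
    static Map<String, Integer> sidewaysCounts(int dim, String selectedBrand) {
        return count(dim, selectedBrand, dim != BRAND);
    }

    public static void main(String[] args) {
        System.out.println(drillDownCounts(BRAND, "Xiaomi"));   // prints {Xiaomi=2}
        System.out.println(drillDownCounts(NETWORK, "Xiaomi")); // prints {Mobile 4G=1, Unicom 4G=1}
        System.out.println(sidewaysCounts(BRAND, "Xiaomi"));    // prints {Xiaomi=2, Huawei=1}
        System.out.println(sidewaysCounts(NETWORK, "Xiaomi"));  // prints {Mobile 4G=1, Unicom 4G=1}
    }
}
```

The only difference between the two modes is the brand dimension: with drill sideways the sibling value (Huawei) stays visible, which is what a UI needs if the user should be able to switch brands without first clearing the selection.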

Reposted from blog.csdn.net/m0_37556444/article/details/82749114