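Solr: Getting Started in Practice

1, Running Solr

Official guide: https://lucene.apache.org/solr/guide/8_3/solr-tutorial.html
Note: as of today (2020-01-01) the latest release is 8.4.0, and the mirrors in China carry only the latest release. The 8.4 documentation is not available yet, so I followed the 8.3 guide; in practice the two are essentially the same.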

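Step | Command / notes
Download the package | http://mirror.bit.edu.cn/apache/lucene/solr/8.4.0/solr-8.4.0.tgz
Unpack, then start | ./solr start -e cloud (this enters an interactive setup: accept the defaults throughout, and at the end choose "sample_techproducts_configs" as the config name)
Create the collection (alternative) | ./solr create -c gettingstarted -d sample_techproducts_configs -s 2 -rf 2 (equivalent to creating the collection through the interactive prompts above; uses the configuration in server/solr/configsets/sample_techproducts_configs)
Open the web UI | Visit http://localhost:8983/solr to verify the collection
Add data to the collection | ./post -c gettingstarted …/example/exampledocs/*
Delete the collection | ./solr delete -c gettingstarted
Stop Solr | ./solr stop -all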
  • Deleting data from a collection
# Delete a single record by id
./post -c data1 -d "<delete><id>/en/45_2006</id></delete>"

# Delete all data
./post -c data1 -d "<delete><query>*:*</query></delete>"  
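
When ids or queries come from real data, characters such as & or < have to be escaped before they are embedded in the delete message. The following is a minimal Python sketch that builds the two payloads shown above. The helper names delete_by_id and delete_by_query are made up for illustration, and nothing here contacts Solr.

```python
from xml.sax.saxutils import escape


def delete_by_id(doc_id: str) -> str:
    """Build the XML message that deletes one document by its id."""
    # escape() replaces &, < and > so the id cannot break the XML.
    return f"<delete><id>{escape(doc_id)}</id></delete>"


def delete_by_query(query: str = "*:*") -> str:
    """Build the XML message that deletes every document matching a query."""
    return f"<delete><query>{escape(query)}</query></delete>"


if __name__ == "__main__":
    print(delete_by_id("/en/45_2006"))
    print(delete_by_query())
```

The resulting string can then be passed to ./post with the -d option, as in the commands above.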
  • Solr configuration files
wang@wang-pc:~/unpack/solr-8.4.0/bin$ find ../ -name solrconfig.xml
../server/solr/configsets/_default/conf/solrconfig.xml
../server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml
../example/files/conf/solrconfig.xml
../example/example-DIH/solr/tika/conf/solrconfig.xml
../example/example-DIH/solr/solr/conf/solrconfig.xml
../example/example-DIH/solr/atom/conf/solrconfig.xml
../example/example-DIH/solr/db/conf/solrconfig.xml
../example/example-DIH/solr/mail/conf/solrconfig.xml

2, exercise 1: query syntax

a, Match-all query: q=*:*

curl "http://localhost:8983/solr/gettingstarted/select?indent=on&q=*:*"

b, Single-term search: q=foundation

curl "http://localhost:8983/solr/gettingstarted/select?q=foundation"

c, Field query (key:val): q=cat:electronics

curl "http://localhost:8983/solr/gettingstarted/select?q=cat:electronics"

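d, Phrase query: q="CAS+latency"

Here the + stands in for a space: in a URL query string, + decodes to a space, so the phrase searched is "CAS latency".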

curl "http://localhost:8983/solr/gettingstarted/select?q=\"CAS+latency\""

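e, Combined query (and/or): +electronics +music

Here the + means AND: a term prefixed with + is required.
The encoding for + is %2B
The encoding for a space is %20 (two +terms must be separated by a space, which distinguishes them from a phrase query)
required: +
prohibited: -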

curl "http://localhost:8983/solr/gettingstarted/select?q=%2Belectronics%20%2Bmusic"

## In the +- below, the plus sign is still a URL-encoded space, so the query is: +electronics -music
curl "http://localhost:8983/solr/gettingstarted/select?q=%2Belectronics+-music"
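
The two roles of + (a literal operator versus a URL-encoded space) are easy to mix up, so here is a small Python sketch that encodes and decodes the queries used above with the standard library. The helper name encode_q is hypothetical; the sketch only manipulates strings and does not call Solr.

```python
from urllib.parse import quote, unquote_plus


def encode_q(query: str) -> str:
    """Percent-encode a Solr query for use as the value of the q parameter."""
    # safe=":*" keeps ':' and '*' readable; '+' becomes %2B and ' ' becomes %20.
    return quote(query, safe=":*")


if __name__ == "__main__":
    # Required terms: the literal + must be sent as %2B.
    print(encode_q("+electronics +music"))
    # Decoding shows what the server sees: a bare + in the URL is a space.
    print(unquote_plus("%2Belectronics+-music"))
    print(unquote_plus('"CAS+latency"'))
```

Decoding a URL this way before sending it is a quick check that each + ends up as the operator or the space that was intended.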

3, exercise 2: faceting

Preparation: create a collection and set up schemaless indexing (automatic field type guessing)

# Restart the Solr nodes that were stopped earlier
./solr start -c -p 8983 -s ../example/cloud/node1/solr
./solr start -c -p 7574 -s ../example/cloud/node2/solr -z localhost:9983

# Create the collection films, with no predefined schema
./solr create -c films -s 2 -rf 2

# Create a regular field: explicitly set the type of the field "name" to text_general
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' http://localhost:8983/solr/films/schema

# Create a copy field (many to one): copy the data of every field into the single field _text_
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field" : {"source":"*","dest":"_text_"}}' http://localhost:8983/solr/films/schema
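
Hand-written JSON inside shell quotes is easy to get wrong, so the request bodies can also be generated. This is a minimal Python sketch that produces the same two bodies as the curl commands above; the function names are hypothetical and only the bodies are built, with no request sent.

```python
import json


def add_field(name: str, field_type: str,
              multi_valued: bool = False, stored: bool = True) -> str:
    """JSON body for the add-field command used above."""
    return json.dumps({"add-field": {
        "name": name,
        "type": field_type,
        "multiValued": multi_valued,
        "stored": stored,
    }})


def add_copy_field(source: str, dest: str) -> str:
    """JSON body for the add-copy-field command used above."""
    return json.dumps({"add-copy-field": {"source": source, "dest": dest}})


if __name__ == "__main__":
    print(add_field("name", "text_general"))
    print(add_copy_field("*", "_text_"))
```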

# Load the data in example/films/: films.json, or films.csv, or films.xml
./post -c films ../example/films/films.json

##### At this point, the web UI can be used to run free-text queries ####

a, Field Facets: roughly equivalent to group by on a field

Query parameters:

  • (q=*:*)
  • (rows=0)
  • (facet=true/on)
  • (facet.field=genre_str)
curl "http://localhost:8983/solr/films/select?q=*:*&rows=0&facet=true&facet.field=genre_str"
# Variant: only return genres with at least 200 films
#curl "http://localhost:8983/solr/films/select?q=*:*&facet.field=genre_str&facet.mincount=200&facet=on&rows=0"

Response (roughly equivalent to group by on a field):
 "response":{"numFound":1100,"start":0,"maxScore":1.0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{
      "genre_str":[
        "Drama",552,
        "Comedy",389,
        "Romance Film",270,
        "Thriller",259,
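
Note that facet_fields returns each field as one flat list that alternates value and count. A minimal Python sketch for turning that list into pairs follows; the sample contains only the four rows visible in the excerpt above, and the helper name facet_pairs is made up for illustration.

```python
def facet_pairs(flat):
    """Turn ["Drama", 552, "Comedy", 389] into [("Drama", 552), ("Comedy", 389)]."""
    # Even positions hold the facet values, odd positions hold their counts.
    return list(zip(flat[0::2], flat[1::2]))


# The rows shown in the truncated response above.
sample = ["Drama", 552, "Comedy", 389, "Romance Film", 270, "Thriller", 259]

if __name__ == "__main__":
    for genre, count in facet_pairs(sample):
        print(f"{genre}: {count}")
```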

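b, Range Facets: roughly equivalent to group by over ranges

The web UI does not currently support this kind of query; enter the range facet URL directly in the browser address bar instead.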

url=$(echo "http://localhost:8983/solr/films/select?
facet.range=initial_release_date&
facet.range.start=NOW-20YEAR&
facet.range.end=NOW&
facet.range.gap=%2B1YEAR&
facet=true&
q=*%3A*&
rows=0"   |xargs |sed "s/[[:space:]]//g" )

# Quote the variable so the shell does not glob-expand the * and ? in the URL
curl "$url"
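
As an alternative to joining the lines with xargs and sed, the same URL can be assembled from a dictionary so that the encoding of + and : is handled automatically. This is a sketch under the assumption that the parameters are exactly those listed above; range_facet_url and BASE are hypothetical names, and no request is sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "http://localhost:8983/solr/films/select"


def range_facet_url(field, start, end, gap, base=BASE):
    """Build a range-facet query URL with every parameter percent-encoded."""
    params = {
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.range": field,
        "facet.range.start": start,
        "facet.range.end": end,
        "facet.range.gap": gap,  # a literal '+' is encoded as %2B
    }
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    print(range_facet_url("initial_release_date", "NOW-20YEAR", "NOW", "+1YEAR"))
```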

Response
 "response":{"numFound":1100,"start":0,"maxScore":1.0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{},
    "facet_ranges":{
      "initial_release_date":{
        "counts":[
          "1997-07-28T17:12:06.919Z",0,
          "1998-07-28T17:12:06.919Z",0,
          "1999-07-28T17:12:06.919Z",48,
          "2000-07-28T17:12:06.919Z",82,

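c, Pivot Facets: roughly equivalent to group by on a pivot field, then drilling down with group by on further fields

The web UI does not currently support this kind of query; enter the pivot facet URL directly in the browser address bar instead.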

curl "http://localhost:8983/solr/films/select?q=*:*&rows=0&facet=on&facet.pivot=genre_str,directed_by_str"

Response
 "response":{"numFound":1100,"start":0,"maxScore":1.0,"docs":[]
  },
  "facet_counts":{
    "facet_queries":{},
    "facet_fields":{},
    "facet_ranges":{},
    "facet_intervals":{},
    "facet_heatmaps":{},
    "facet_pivot":{
      "genre_str,directed_by_str":[{
          "field":"genre_str",
          "value":"Drama",
          "count":552,
          "pivot":[{
              "field":"directed_by_str",
              "value":"Ridley Scott",
              "count":5},
            {
              "field":"directed_by_str",
              "value":"Steven Soderbergh",
              "count":5},
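
The pivot response is a tree: each bucket may carry its child buckets under the "pivot" key. A minimal Python sketch that flattens such a tree into (path, count) rows is given below. The sample reproduces only the entries visible in the truncated excerpt above, and flatten_pivot is a hypothetical helper name.

```python
def flatten_pivot(nodes, path=()):
    """Yield (tuple_of_values, count) for every bucket in a facet_pivot tree."""
    for node in nodes:
        here = path + (node["value"],)
        yield here, node["count"]
        # Child buckets, when present, live under the "pivot" key.
        yield from flatten_pivot(node.get("pivot", []), here)


# Only the buckets shown in the excerpt above.
sample = [{
    "field": "genre_str", "value": "Drama", "count": 552,
    "pivot": [
        {"field": "directed_by_str", "value": "Ridley Scott", "count": 5},
        {"field": "directed_by_str", "value": "Steven Soderbergh", "count": 5},
    ],
}]

if __name__ == "__main__":
    for values, count in flatten_pivot(sample):
        print(" > ".join(values), count)
```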

Reposted from blog.csdn.net/eyeofeagle/article/details/103790607