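1, Running Solr
Official guide: https://lucene.apache.org/solr/guide/8_3/solr-tutorial.html
Note: as of today (2020-01-01) the latest release is 8.4.0, and the mirrors in China only carry the latest release. The 8.4 documentation is not available yet, so these notes follow the 8.3 guide; in practice the two versions behave almost identically.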
Step | Details |
---|---|
Download the package | http://mirror.bit.edu.cn/apache/lucene/solr/8.4.0/solr-8.4.0.tgz |
Unpack, then start from the bin directory | ./solr start -e cloud (this opens an interactive setup: accept the defaults throughout, and at the end choose "sample_techproducts_configs" as the config name) |
Create the collection (optional) | ./solr create -c gettingstarted -d sample_techproducts_configs -s 2 -rf 2 (equivalent to creating the collection in the interactive setup above; uses the configuration in server/solr/configsets/sample_techproducts_configs) |
Open the web UI | Visit http://localhost:8983/solr to verify the collection |
Add data to the collection | ./post -c gettingstarted …/example/exampledocs/* |
Delete the collection | ./solr delete -c gettingstarted |
Stop Solr | ./solr stop -all |
- Deleting data from a collection
# Delete a single record by id
./post -c data1 -d "<delete><id>/en/45_2006</id></delete>"
# Delete all data
./post -c data1 -d "<delete><query>*:*</query></delete>"
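When the id comes from a script rather than being typed by hand, characters such as & or < have to be XML-escaped before they go into the delete body. A minimal sketch, assuming a hypothetical helper named delete_by_id_xml (it is not part of Solr) that only builds the payload and sends nothing:

```shell
# Hypothetical helper (not part of Solr): build the XML body for a
# delete-by-id request, escaping the characters that are special in XML text.
delete_by_id_xml() {
  escaped=$(printf '%s\n' "$1" | sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g')
  printf '<delete><id>%s</id></delete>\n' "$escaped"
}

delete_by_id_xml '/en/45_2006'
# Possible usage with the post tool, run from the bin directory:
#   ./post -c data1 -d "$(delete_by_id_xml '/en/45_2006')"
```

The & replacement runs first so that the entities produced by the later replacements are not escaped a second time.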
- Solr configuration files
wang@wang-pc:~/unpack/solr-8.4.0/bin$ find ../ -name solrconfig.xml
../server/solr/configsets/_default/conf/solrconfig.xml
../server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml
../example/files/conf/solrconfig.xml
../example/example-DIH/solr/tika/conf/solrconfig.xml
../example/example-DIH/solr/solr/conf/solrconfig.xml
../example/example-DIH/solr/atom/conf/solrconfig.xml
../example/example-DIH/solr/db/conf/solrconfig.xml
../example/example-DIH/solr/mail/conf/solrconfig.xml
2, exercise 1: query syntax
a, Match-all query: q=*:*
curl "http://localhost:8983/solr/gettingstarted/select?indent=on&q=*:*"
b, Single-term search: q=foundation
curl "http://localhost:8983/solr/gettingstarted/select?q=foundation"
c, Field query (field:value): q=cat:electronics
curl "http://localhost:8983/solr/gettingstarted/select?q=cat:electronics"
d, Phrase query: q="CAS+latency"
Here + is the URL encoding of a space, so it only separates the two words of the phrase.
curl "http://localhost:8983/solr/gettingstarted/select?q=\"CAS+latency\""
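e, Combined query (required/prohibited terms): +electronics +music
Here + is a query operator: it marks a term as required, so the two terms are effectively ANDed.
The encoding for + is %2B
The encoding for a space is %20 (two + terms must be separated by a space, which distinguishes them from the + inside a phrase query)
required: +
prohibited: -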
curl "http://localhost:8983/solr/gettingstarted/select?q=%2Belectronics%20%2Bmusic"
## In the +- below, the plus sign is still a URL-encoded space, so the query is: +electronics -music
curl "http://localhost:8983/solr/gettingstarted/select?q=%2Belectronics+-music"
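The encoding rules above can be applied mechanically before a query string is placed in a URL. A minimal sketch, assuming a hypothetical helper named encode_q that handles only the characters used in this section (%, +, space and the double quote); it is not a general-purpose URL encoder:

```shell
# Hypothetical helper: percent-encode the few characters used in this section.
# % is handled first so the escapes produced afterwards are not encoded again.
encode_q() {
  printf '%s\n' "$1" | sed -e 's/%/%25/g' -e 's/+/%2B/g' -e 's/ /%20/g' -e 's/"/%22/g'
}

encode_q '+electronics +music'   # prints %2Belectronics%20%2Bmusic
encode_q '+electronics -music'   # prints %2Belectronics%20-music
# Possible usage, with Solr running:
#   curl "http://localhost:8983/solr/gettingstarted/select?q=$(encode_q '+electronics -music')"
```

Encoding the space as %20 instead of + keeps the two meanings of the plus sign apart: every literal + in the encoded URL is then written as %2B.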
3, exercise 2: faceting
Preparation: create a collection that uses schemaless indexing (automatic field type guessing)
# Start the Solr nodes that were stopped earlier
./solr start -c -p 8983 -s ../example/cloud/node1/solr
./solr start -c -p 7574 -s ../example/cloud/node2/solr -z localhost:9983
# Create the collection films, with no predefined schema
./solr create -c films -s 2 -rf 2
# Create a regular field: explicitly set the type of the name field to text_general
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' http://localhost:8983/solr/films/schema
# Create a copy field (many sources into one destination): copy the data of every field into the _text_ field
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field" : {"source":"*","dest":"_text_"}}' http://localhost:8983/solr/films/schema
# Load the data in example/films/: films.json, or films.csv, or films.xml
./post -c films ../example/films/films.json
##### At this point the web UI can be used to run free-text searches #####
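If several fields need an explicit type, the add-field body from the preparation steps can be generated instead of typed out each time. A minimal sketch, assuming a hypothetical helper named add_field_payload; it keeps multiValued and stored fixed at the values used above and does not escape quotes in its arguments:

```shell
# Hypothetical helper: build the JSON body for a Schema API add-field command.
# Arguments: field name, field type. The other attributes are fixed to the
# values used in the preparation steps.
add_field_payload() {
  printf '{"add-field": {"name":"%s", "type":"%s", "multiValued":false, "stored":true}}\n' "$1" "$2"
}

add_field_payload name text_general
# Possible usage, with Solr running:
#   curl -X POST -H 'Content-type:application/json' \
#     --data-binary "$(add_field_payload name text_general)" \
#     http://localhost:8983/solr/films/schema
```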
a, Field Facets: roughly equivalent to GROUP BY on a field
Query parameters:
- (q=*:*)
- (rows=0)
- (facet=true/on)
- (facet.field=genre_str)
curl "http://localhost:8983/solr/films/select?q=*:*&rows=0&facet=true&facet.field=genre_str"
#curl "http://localhost:8983/solr/films/select?q=*:*&facet.field=genre_str&facet.mincount=200&facet=on&rows=0"
Response (roughly a GROUP BY on the field):
"response":{"numFound":1100,"start":0,"maxScore":1.0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{
"genre_str":[
"Drama",552,
"Comedy",389,
"Romance Film",270,
"Thriller",259,
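b, Range Facets: roughly equivalent to GROUP BY over ranges (buckets)
The web UI does not support this kind of query yet; enter the Range Facets URL directly in the browser address bar instead.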
url=$(echo "http://localhost:8983/solr/films/select?
facet.range=initial_release_date&
facet.range.start=NOW-20YEAR&
facet.range.end=NOW&
facet.range.gap=%2B1YEAR&
facet=true&
q=*%3A*&
rows=0" |xargs |sed "s/[[:space:]]//g" )
curl "$url"
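The echo, xargs and sed pipeline above only serves to strip the line breaks from the URL. An alternative is to join the parameters explicitly. A minimal sketch, assuming a hypothetical helper named build_url that expects every key=value pair to be percent-encoded already:

```shell
# Hypothetical helper: join already-encoded key=value pairs into one URL.
build_url() {
  base=$1
  shift
  query=""
  for pair in "$@"; do
    # Prepend "&" only when the query string is no longer empty.
    query="${query:+$query&}$pair"
  done
  printf '%s?%s\n' "$base" "$query"
}

url=$(build_url "http://localhost:8983/solr/films/select" \
  "q=*%3A*" \
  "rows=0" \
  "facet=true" \
  "facet.range=initial_release_date" \
  "facet.range.start=NOW-20YEAR" \
  "facet.range.end=NOW" \
  "facet.range.gap=%2B1YEAR")
printf '%s\n' "$url"
# Then, with Solr running:
#   curl "$url"
```

Keeping each parameter as its own quoted argument avoids the whitespace stripping step and makes it easy to add or remove a parameter.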
Response:
"response":{"numFound":1100,"start":0,"maxScore":1.0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{},
"facet_ranges":{
"initial_release_date":{
"counts":[
"1997-07-28T17:12:06.919Z",0,
"1998-07-28T17:12:06.919Z",0,
"1999-07-28T17:12:06.919Z",48,
"2000-07-28T17:12:06.919Z",82,
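c, Pivot Facets: roughly equivalent to GROUP BY on a primary field, followed by a drill-down GROUP BY on further fields
The web UI does not support this kind of query yet; enter the Pivot Facets URL directly in the browser address bar instead.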
curl "http://localhost:8983/solr/films/select?q=*:*&rows=0&facet=on&facet.pivot=genre_str,directed_by_str"
Response:
"response":{"numFound":1100,"start":0,"maxScore":1.0,"docs":[]
},
"facet_counts":{
"facet_queries":{},
"facet_fields":{},
"facet_ranges":{},
"facet_intervals":{},
"facet_heatmaps":{},
"facet_pivot":{
"genre_str,directed_by_str":[{
"field":"genre_str",
"value":"Drama",
"count":552,
"pivot":[{
"field":"directed_by_str",
"value":"Ridley Scott",
"count":5},
{
"field":"directed_by_str",
"value":"Steven Soderbergh",
"count":5},