Recently my company started using Solr as its search engine, so I spent some time learning Solr; here I record and share some of what I learned.
Solr download address:
Note: after Solr 8.6, importing data from a database into Solr (the DataImportHandler) is deprecated and no longer supported.
You can check the Solr deprecation log:
https://cwiki.apache.org/confluence/display/SOLR/Deprecations
First, the installation of Solr; these two posts cover it:
- https://www.cnblogs.com/guxiong/p/6284938.html
- https://www.cnblogs.com/edison20161121/p/7826907.html
I won't record the installation in detail here. Essentially you deploy the Solr package under Tomcat, add the jars Solr needs (the Solr core jars, the IK analyzer, and the DataImportHandler jar for importing data), then create a solrHome and create a core. You can search for the detailed steps yourself.
With the index created, let's get to the main topic: using Solr with Spring Boot.
First, open the Solr admin page.
Let me first introduce the more important menus on the left.
Overview: shows statistics and metadata for the core.
Analysis: helps you design your Analyzer, Tokenizer, and Filter chain.
Dataimport: used to import data from a database into Solr; the next article will cover this.
Documents: provides a form that lets you add, delete, or modify the data in the core.
Files: not the business data stored in the core, but the core's configuration files, such as solrconfig.xml.
Query: query Solr (query, sort, highlight, etc.).
Ping: click to check whether the core is still alive, and see the response time.
Plugins: information and statistics on Solr's built-in plug-ins and the ones we installed.
Replication: shows the replication state of the current core and provides disable/enable controls.
Schema: displays the core's schema. If you use the managed-schema mode, you can also add, modify, or delete schema fields from this page (a practical improvement over Solr 4).
Segments info: shows the segment information of the underlying Lucene index.
The parameters on the query page are described below:
| Parameter | Description |
| --- | --- |
| q | Query condition |
| fq | Filter condition |
| sort | Sorting rules |
| start, rows | Paging parameters |
| hl | Whether to highlight |
| hl.fl | Field(s) to highlight |
| hl.simple.pre | Highlight prefix |
| hl.simple.post | Highlight suffix |
| dismax | Query parser used to adjust weighting and scoring |
| edismax | Extended dismax parser, also used to adjust weighting and scoring |
Both dismax and edismax are used to adjust weighting and scoring, but they differ: edismax supports the boost parameter, whose function score is multiplied into the relevance score, while dismax only has bf, whose effect is additive. When sorting over multiple dimensions, the relevance score should really be just one dimension among several, and the additive approach makes tuning troublesome.
For details, please refer to: https://blog.csdn.net/duck_genuine/article/details/8060026
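As a rough illustration of that difference, here is a sketch that assembles the request parameters for both parsers. The qf field names come from this article's example; the release_time field and the boost function are hypothetical, and the query-string builder is a plain JDK helper, not a Solr API.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class DismaxVsEdismax {

    // Build the query-string part of a Solr /select request from a parameter map.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(e.getKey()).append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // dismax: only bf is available, and its function score is ADDED to the relevance score
        Map<String, String> dismax = new LinkedHashMap<>();
        dismax.put("defType", "dismax");
        dismax.put("q", "java");
        dismax.put("qf", "course_name^100 expert_name^75 course_ware_name^50");
        dismax.put("bf", "recip(rord(release_time),1,2,3)"); // additive boost

        // edismax: additionally supports boost, which is MULTIPLIED into the score
        Map<String, String> edismax = new LinkedHashMap<>(dismax);
        edismax.put("defType", "edismax");
        edismax.remove("bf");
        edismax.put("boost", "recip(rord(release_time),1,2,3)"); // multiplicative boost

        System.out.println("/select?" + toQueryString(dismax));
        System.out.println("/select?" + toQueryString(edismax));
    }
}
```

Appending either query string to the core's /select URL shows the two boosting styles side by side.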
Then we write the code for our requirements.
First, there are the following fields:
- Course Title
- Name of lecturer
- Courseware name
- Release time
- Credit type
When a user enters a keyword, the keyword should be segmented (tokenized) and matched against the course name, lecturer name, and courseware name; the matched data is then highlighted and displayed according to the rules.
The rules are as follows:
Results that match on course name are displayed first, then those that match on lecturer name, and finally those that match on courseware name.
Ordinary sort ordering certainly cannot meet this need, so dismax is used.
Below is a brief introduction to the dismax parameters.
| Parameter | Description |
| --- | --- |
| q | The original input string |
| q.alt | Invokes the standard query parser; defines the query when the q parameter is empty |
| qf | Query fields: which fields to search (defaults to df). For example, qf="course_name^100 expert_name^75 course_ware_name^50": the number after ^ is the field's weight; the larger the number, the greater the weight |
| mm | Minimum should match. If mm is not set, the default from solrconfig.xml applies (100%, i.e. full match). A positive integer requires that many matching clauses; a negative integer means that many clauses may be missing; a percentage requires that fraction of the clauses; a negative percentage means that fraction may be missing; an expression such as 3<90% means: for 1-3 clauses all are required, for 4 or more, 90% are required |
| pf | Phrase fields: same syntax as qf; boosts the score of documents where the query terms appear as a phrase, to distinguish otherwise similar results |
| ps | Phrase slop, applied to the pf fields: how far apart terms may be and still count as a phrase. Changing ps does not change numFound or the result set, only the order of the results |
| qs | Query phrase slop: the number of positions two terms may be apart and still match an explicit phrase in the user's query; used with the qf fields |
| tie | Tie breaker: a floating-point number, 0.0 ≤ tie ≤ 1.0, default 0.0 |
| bq | Boost query: an additional query whose score is added to that of q |
| bf | Boost function, for example: recip(rord(myfield),1,2,3)^1.5 |
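To make the mm forms above concrete, here is a small sketch (my own helper, not Solr's actual implementation) that resolves an mm value to a required clause count. It handles the simple forms listed above plus a single conditional expression like 3<90%:

```java
public class MinShouldMatch {

    // Hypothetical helper mimicking how dismax's mm parameter resolves to a
    // required match count, given the number of optional clauses in the query.
    static int requiredMatches(String mm, int clauseCount) {
        mm = mm.trim();
        if (mm.contains("<")) {
            // "n<spec": for clauseCount <= n all clauses are required;
            // otherwise the spec after '<' applies.
            String[] parts = mm.split("<");
            int threshold = Integer.parseInt(parts[0].trim());
            if (clauseCount <= threshold) return clauseCount;
            return requiredMatches(parts[1], clauseCount);
        }
        if (mm.endsWith("%")) {
            int pct = Integer.parseInt(mm.substring(0, mm.length() - 1));
            if (pct >= 0) return clauseCount * pct / 100;        // fraction required, rounded down
            return clauseCount - (clauseCount * -pct / 100);     // that fraction may be missing
        }
        int n = Integer.parseInt(mm);
        if (n >= 0) return Math.min(n, clauseCount);             // fixed number required
        return Math.max(clauseCount + n, 0);                     // that many clauses may be missing
    }

    public static void main(String[] args) {
        System.out.println(requiredMatches("3<90%", 2));  // 2: with 1-3 clauses, all required
        System.out.println(requiredMatches("3<90%", 10)); // 9: with 4+ clauses, 90% required
        System.out.println(requiredMatches("-25%", 4));   // 3: 25% may be missing
    }
}
```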
In fact, only q and qf are needed to meet our requirements.
The query results are as follows
We can also put this in the Solr configuration: find solrconfig.xml under the core in our solrHome:
<requestHandler name="/select" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- Change from JSON to XML format (the default prior to Solr 7.0)
    <str name="wt">xml</str>
    -->
    <str name="pf">
      courseName expertName courseWareName
    </str>
    <str name="qf">
      courseName^100.0 expertName^50.0 courseWareName^10.0
    </str>
  </lst>
</requestHandler>
Then rebuild the index and restart Solr. (You have to do this every time the rules change, which is a hassle.)
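Instead of restarting Solr after each configuration change, a single core can be reloaded through the Core Admin API's RELOAD action. A minimal sketch of building that URL; the host, port, and core name below are placeholders, so substitute your own:

```java
public class CoreReload {

    // Build the Core Admin API URL that reloads one core without restarting Solr.
    // baseUrl and coreName are assumptions -- replace them with your deployment's values.
    static String reloadUrl(String baseUrl, String coreName) {
        return baseUrl + "/admin/cores?action=RELOAD&core=" + coreName;
    }

    public static void main(String[] args) {
        // Issue an HTTP GET against this URL (curl or a browser works) to reload the core.
        System.out.println(reloadUrl("http://localhost:8983/solr", "course_core"));
    }
}
```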
But in the end we want to do this from Spring Boot; the code is as follows:
public ResposeResultViewModel querySolr(String userId, String courseName, Integer pageIndex, Integer pageSize) throws IOException, SolrServerException {
    ResposeResultViewModel model = new ResposeResultViewModel();
    UserInfoViewModel userInfo = userService.getUserInfoById(userId);
    if (userInfo != null) {
        DataPaginationViewModel dataModel = new DataPaginationViewModel();
        if (pageIndex == null) {
            pageIndex = 1;
        }
        if (pageSize == null) {
            pageSize = 10;
        }
        SolrQuery solrQuery = new SolrQuery();
        // The caller sends the keyword Base64-encoded; decode it back to UTF-8 text
        final Base64.Decoder decoder = Base64.getDecoder();
        courseName = new String(decoder.decode(courseName), "UTF-8");
        solrQuery.setQuery(courseName);
        solrQuery.set("defType", "dismax");
        solrQuery.set("qf", "course_name^100 expert_name^75 course_ware_name^50");
        // Restrict results to the current user's organization
        solrQuery.setFilterQueries("organ_id:" + userInfo.getOrgan_id());
        // Enable highlighting on the three matched fields
        solrQuery.setHighlight(true);
        solrQuery.addHighlightField("course_name");
        solrQuery.addHighlightField("expert_name");
        solrQuery.addHighlightField("course_ware_name");
        // Set the highlight color: wrap matches in a red font tag
        solrQuery.setHighlightSimplePre("<font color='red'>");
        solrQuery.setHighlightSimplePost("</font>");
        solrQuery.setStart((pageIndex - 1) * pageSize);
        // rows is the page size, not pageIndex * pageSize
        solrQuery.setRows(pageSize);
        QueryResponse query = client.query(solrQuery);
        SolrDocumentList results = query.getResults();
        Map<String, Map<String, List<String>>> map = query.getHighlighting();
        results.forEach(res -> {
            // Replace each field with its highlighted version, if one was returned
            Map<String, List<String>> listMap = map.get(res.get("course_organ_assign_id"));
            List<String> courseName1 = listMap.get("course_name");
            List<String> courseWareName = listMap.get("course_ware_name");
            List<String> expertName = listMap.get("expert_name");
            res.put("course_name", courseName1 == null ? res.get("course_name") : courseName1.toString().substring(1, courseName1.toString().length() - 1));
            res.put("course_ware_name", courseWareName == null ? res.get("course_ware_name") : courseWareName.toString().substring(1, courseWareName.toString().length() - 1));
            res.put("expert_name", expertName == null ? res.get("expert_name") : expertName.toString().substring(1, expertName.toString().length() - 1));
            CourseMarkPojo courseMarkPojo = courseService.getCourseMarkAndStudyUserNumber(res.get("course_id").toString());
            res.put("mark", courseMarkPojo.getMark());
            res.put("study_user_number", courseMarkPojo.getStudy_user_sum());
        });
        long numFound = results.getNumFound();
        dataModel.setTips("Query successful");
        dataModel.setData_list(results);
        dataModel.setData_count((int) numFound);
        // The total page count would be ((int) numFound + pageSize - 1) / pageSize;
        // store it in a dedicated page-count field rather than overwriting data_count
        model.setCode(0);
        model.setMessage("Query successful");
        model.setBody(dataModel);
    } else {
        model.setCode(1);
        model.setMessage("User info is empty");
    }
    return model;
}
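The paging arithmetic the method above relies on is easy to get wrong (the original version passed pageIndex * pageSize to setRows, which makes each page grow). Pulled out as a tiny sketch, with a hypothetical Paging helper class:

```java
public class Paging {

    // Offset of the first result of a 1-based page: page 3 with size 10 starts at 20.
    static int start(int pageIndex, int pageSize) {
        return (pageIndex - 1) * pageSize;
    }

    // rows is simply the page size -- NOT pageIndex * pageSize.
    static int rows(int pageSize) {
        return pageSize;
    }

    // Total pages via ceiling division, as in the query method above.
    static int totalPages(long numFound, int pageSize) {
        return (int) ((numFound + pageSize - 1) / pageSize);
    }

    public static void main(String[] args) {
        System.out.println(start(3, 10));       // 20
        System.out.println(totalPages(101, 10)); // 11
    }
}
```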