Solr multi-field query sorting

Recently my company started using Solr as its search engine, so I spent some time learning it. Here are my notes, recorded and shared.

Solr download address:

Note: starting with Solr 8.6, importing data from a database into Solr (the DataImportHandler) is deprecated.

You can take a look at the Solr deprecation log:

https://cwiki.apache.org/confluence/display/SOLR/Deprecations

First, the installation of Solr:

I won't document the installation in detail here. In short: deploy the Solr web app under Tomcat, add the jars Solr needs (the Solr libraries, the IK analyzer, the data-import jars), then create a solrHome and create a core. You can Baidu guides covering these steps yourself.

Then create the index. Now let's get to the topic: using Solr with Spring Boot.

First we open Solr's admin page.

Let me briefly introduce the more important menus on the left.

Overview: shows statistics and metadata for the core.

Analysis: helps you design your analyzer, tokenizer, and filters.

Dataimport: used to import data from a database into Solr; the next article will cover this.

Documents: provides a form that lets you add, delete, or modify the data in the core.

Files: shows not the business data stored in the core but its configuration files, such as solrconfig.xml.

Query: query the core (search, sort, highlight, etc.).

Ping: click to check whether the core is still alive and see the response time.

Plugins: information and statistics about Solr's built-in plugins and any plugins we installed.

Replication: shows the replication state of the current core and provides disable/enable controls.

Schema: displays the core's schema. If you use ManagedSchema mode, you can also add, modify, or delete schema fields on this page (a practical improvement over Solr 4).

Segments info: shows the segment information of the underlying Lucene index.

 

The following is a description of the query page

q: query condition
fq: filter query (restricts results without affecting scores)
sort: sort rules
start, rows: paging parameters (offset and page size)
hl: whether to enable highlighting
hl.fl: the field(s) to highlight
hl.simple.pre: highlight prefix
hl.simple.post: highlight suffix
dismax / edismax: query parsers used to adjust field weighting and scoring
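As a rough illustration, these parameters are ultimately just key-value pairs in the query string of a /select request. The sketch below builds such a query string using only the JDK (no SolrJ); the field names such as course_name and organ_id mirror the ones used later in this article and are assumptions, not fixed Solr names.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class SolrQueryString {
    // Build the query-string part of a /select request, URL-encoding each value.
    public static String build(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) sb.append('&');
            sb.append(e.getKey()).append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    public static String example() {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", "java");                 // query terms
        p.put("fq", "organ_id:1001");       // filter, restricts but does not score
        p.put("sort", "release_time desc"); // sort rule
        p.put("start", "0");                // paging offset
        p.put("rows", "10");                // page size
        p.put("hl", "true");                // enable highlighting
        p.put("hl.fl", "course_name");      // field to highlight
        p.put("hl.simple.pre", "<em>");     // highlight prefix
        p.put("hl.simple.post", "</em>");   // highlight suffix
        return build(p);
    }
}
```

Appending the resulting string to `http://host:8983/solr/<core>/select?` gives the same request the admin Query page sends.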

Both dismax and edismax are used to adjust weighting and scoring, but there are some differences between the two.

edismax supports the boost parameter, whose function value is multiplied into the relevance score, while dismax only supports bf, whose value is added to the score. When sorting over multiple dimensions, the relevance score is really just one dimension among several, and the additive approach makes the weights troublesome to tune.

For details, please refer to: https://blog.csdn.net/duck_genuine/article/details/8060026
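To make the additive-versus-multiplicative distinction concrete, here is a toy sketch of the arithmetic (this is not a Solr API, just the two score formulas side by side):

```java
public class BoostModes {
    // dismax: bf adds the boost function's value to the relevance score
    static double dismaxScore(double relevance, double bfValue) {
        return relevance + bfValue;
    }

    // edismax: boost multiplies the relevance score by the function's value
    static double edismaxScore(double relevance, double boostValue) {
        return relevance * boostValue;
    }
}
```

With a multiplicative boost, a factor of 0.5 halves every score regardless of magnitude; with an additive bf, the same value shifts low and high scores by the same absolute amount, which is why additive tuning gets fiddly when scores vary widely.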

Then we write the code according to the requirements, haha.

First, there are the following fields:

  • Course Title
  • Name of lecturer
  • Courseware name
  • Release time
  • Credit type

The user enters a keyword to query; the keyword should be tokenized and matched against the course name, lecturer name, and courseware name, and the matched data should be highlighted and ordered according to the rules.

The rules are as follows:

Documents whose course name matches the keyword are shown first, followed by those whose lecturer name matches, and those whose courseware name matches come last.

A normal sort certainly cannot meet this requirement, so dismax is used.

Here is a brief introduction to the dismax parameters.

q: the original query string entered by the user
q.alt: a query, run through the standard query parser, used when the q parameter is empty
qf: query fields — which fields to search; defaults to df if not set. For example: qf="course_name^100 expert_name^75 course_ware_name^50". The number after ^ is the field's weight: the larger the number, the more that field contributes to the score
mm: minimum "should" match. If mm is not set, the default specified in solrconfig.xml applies (effectively 100%, i.e. all clauses must match). A positive integer requires at least that many optional clauses to match; a negative integer means all but that many must match; a percentage requires that proportion of clauses to match; a negative percentage means that proportion may go unmatched; a conditional expression such as 3<90% means: with 1–3 clauses all are required, with 4 or more 90% are required
pf: phrase fields — same syntax as qf; used to boost documents in which the query terms appear close together as a phrase, which helps separate otherwise similar results
ps: phrase slop — how far apart terms may be and still count as a phrase for pf. Changing ps does not change numFound or the result set, only the ordering
qs: query phrase slop — the slop applied to phrase queries explicitly entered by the user; used together with qf
tie: a floating-point number between 0.0 and 1.0 (default 0.0) controlling how much fields other than the best-matching one contribute to the score
bq: boost query — an extra query whose score is added to the main query's score
bf: boost function — for example: recip(rord(myfield),1,2,3)^1.5
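The mm expression syntax is easy to misread, so here is a small sketch that interprets a single conditional expression of the form "N<P%" the way described above (the class and method names are just for illustration, not part of Solr):

```java
public class MinShouldMatch {
    // Interpret an mm expression of the form "N<P%" (e.g. "3<90%"):
    // if the query has at most N optional clauses, all of them are required;
    // otherwise P percent of them are required (fractions rounded down).
    static int required(String mm, int clauseCount) {
        int lt = mm.indexOf('<');
        int n = Integer.parseInt(mm.substring(0, lt).trim());
        int pct = Integer.parseInt(mm.substring(lt + 1).replace("%", "").trim());
        if (clauseCount <= n) {
            return clauseCount; // all clauses required
        }
        return clauseCount * pct / 100; // integer division rounds down
    }
}
```

So with mm=3<90%, a 3-term query requires all 3 terms, while a 10-term query requires 9 of them.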


In fact, q and qf alone are enough to meet our requirement.

 

The query results are as follows

We can also configure these parameters in Solr's configuration file instead: find solrconfig.xml under the core's directory in our solrHome.

  <requestHandler name="/select" class="solr.SearchHandler" default="true">
    <lst name="defaults">
      <str name="defType">dismax</str>
      <!-- Change from JSON to XML format (the default prior to Solr 7.0)
        <str name="wt">xml</str>
      -->
      <str name="pf">courseName expertName courseWareName</str>
      <str name="qf">courseName^100.0 expertName^50.0 courseWareName^10.0</str>
    </lst>
  </requestHandler>

Then rebuild the index and restart Solr. (You have to do this every time you change the rules, which is rather cumbersome.)

In the end, though, we query Solr from our Spring Boot code. The specific code is as follows:

public ResposeResultViewModel querySolr(String userId, String courseName, Integer pageIndex, Integer pageSize) throws IOException, SolrServerException {
        ResposeResultViewModel model = new ResposeResultViewModel();
        UserInfoViewModel userInfo = userService.getUserInfoById(userId);
        if (userInfo != null) {
            DataPaginationViewModel dataModel = new DataPaginationViewModel();
            if (pageIndex == null) {
                pageIndex = 1;
            }
            if (pageSize == null) {
                pageSize = 10;
            }
            SolrQuery solrQuery = new SolrQuery();
            // The keyword arrives Base64-encoded; decode it back to UTF-8 text
            final Base64.Decoder decoder = Base64.getDecoder();
            courseName = new String(decoder.decode(courseName), "UTF-8");
            solrQuery.setQuery(courseName);
            solrQuery.set("defType", "dismax");
            // Field weights: course name > lecturer name > courseware name
            solrQuery.set("qf", "course_name^100 expert_name^75 course_ware_name^50");
            // Only return data belonging to the user's organization
            solrQuery.setFilterQueries("organ_id:" + userInfo.getOrgan_id());
            // Enable highlighting on the three searched fields
            solrQuery.setHighlight(true);
            solrQuery.addHighlightField("course_name");
            solrQuery.addHighlightField("expert_name");
            solrQuery.addHighlightField("course_ware_name");
            // Highlight color: wrap matched terms in a red font tag
            solrQuery.setHighlightSimplePre("<font color='red'>");
            solrQuery.setHighlightSimplePost("</font>");
            // Paging: start is the offset, rows is the page size
            solrQuery.setStart((pageIndex - 1) * pageSize);
            solrQuery.setRows(pageSize);
            QueryResponse query = client.query(solrQuery);
            SolrDocumentList results = query.getResults();
            Map<String, Map<String, List<String>>> map = query.getHighlighting();
            results.forEach(res -> {
                Map<String, List<String>> listMap = map.get(res.get("course_organ_assign_id"));
                List<String> courseNames = listMap.get("course_name");
                List<String> courseWareNames = listMap.get("course_ware_name");
                List<String> expertNames = listMap.get("expert_name");
                // Replace each field with its highlighted snippet(s) when present
                res.put("course_name", courseNames == null ? res.get("course_name") : String.join(", ", courseNames));
                res.put("course_ware_name", courseWareNames == null ? res.get("course_ware_name") : String.join(", ", courseWareNames));
                res.put("expert_name", expertNames == null ? res.get("expert_name") : String.join(", ", expertNames));
                CourseMarkPojo courseMarkPojo = courseService.getCourseMarkAndStudyUserNumber(res.get("course_id").toString());
                res.put("mark", courseMarkPojo.getMark());
                res.put("study_user_number", courseMarkPojo.getStudy_user_sum());
            });
            long numFound = results.getNumFound();
            dataModel.setTips("Query succeeded");
            dataModel.setData_list(results);
            // Total hit count; the page count would be (numFound + pageSize - 1) / pageSize
            dataModel.setData_count((int) numFound);
            model.setCode(0);
            model.setMessage("Query succeeded");
            model.setBody(dataModel);
        } else {
            model.setCode(1);
            model.setMessage("User info is empty");
        }

        return model;
    }
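One detail worth isolating from the handler above is the total page count: numFound is a long, and the page count is a ceiling division, which is easy to get wrong inline. A small helper keeps it in one place (the class and method names here are just for illustration):

```java
public class Paging {
    // Total pages = ceil(numFound / pageSize), kept in long arithmetic
    // so large hit counts do not overflow an int before the division.
    static long totalPages(long numFound, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        return (numFound + pageSize - 1) / pageSize;
    }
}
```

For example, 95 hits at 10 rows per page yields 10 pages, and 100 hits yields 10 as well.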

 

Origin blog.csdn.net/ChenLong_0317/article/details/111473438