ValueSource in Solr (Lucene)

As mentioned in the previous post, a ValueSource supplies a value for each doc. That value can come from one of the doc's fields, or it can be the result of a function computed on the fly. In this post I will go through the ValueSource implementations that already ship with Solr.

There are many implementation classes of ValueSource. In the Solr source code I found:

  1. ConstNumberSource: a value source in which all docs share the same value. This class has subclasses of its own, which we will look at later.

  2. The distance-related ValueSources used for map (geo) calculations, which live in Lucene's spatial package. I have not used them, so I have not looked at them.

  3. The ValueSources for the posting-list length (docFreq), idf, and tf:

     DocFreqValueSource: computes the length of the posting list of a given term. The field and the term text to look up are passed to the constructor. The FunctionValues it returns is a ConstIntDocValues, in which every doc has the same value: the length of the term's posting list, that is, its document frequency (docFreq).

     IDFValueSource: similar to DocFreqValueSource above, except that instead of the docFreq of a term it computes the term's idf. It also returns a constant FunctionValues.

     TermFreqValueSource: for each doc in the posting list of the given term, returns the number of times the term occurs in that doc.

     TFValueSource: similar to TermFreqValueSource above, except that it returns the result of applying the tf function to the term frequency, whereas TermFreqValueSource returns the raw term frequency with no function applied.

  4. DualFloatFunction: as the name suggests, a float function of two arguments. From the source code you can see that it is a ValueSource that takes two ValueSources and combines their values with some function. It is an abstract class, and its concrete implementations provide specific functions such as division, subtraction, and modulo. These simple functions all ship with Solr and can be found in the code of Solr's ValueSourceParser.

    Many of its implementations are inner classes, so I will not list them all here. Examples are division, DivFloatFunction (named div in Solr), which divides one value by another; subtraction (named sub in Solr, with ms as the variant that subtracts dates and returns milliseconds); and modulo (named mod in Solr).

  5. FieldCacheSource: a value source that reads its values from the FieldCache. It is very simple, and it has many implementation classes.
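
To make the difference between the constant sources (docFreq, idf), the per-doc sources (termFreq, tf), and a two-argument function more concrete, here is a minimal plain-Java sketch. It does not use the Lucene classes: `DocValueFn` and `TinyIndex` are hypothetical stand-ins, and the formulas tf = sqrt(freq) and idf = 1 + ln(numDocs / (docFreq + 1)) are assumptions chosen for illustration. In Lucene the actual tf and idf values come from the configured Similarity.

```java
import java.util.Arrays;
import java.util.List;

/** Simplified model of the value sources above; NOT the Lucene API. */
final class ValueSourceSketch {

    /** Hypothetical stand-in for FunctionValues: one float per doc id. */
    interface DocValueFn {
        float floatVal(int doc);
    }

    /** Hypothetical in-memory "index": each doc is the list of terms of one field. */
    static final class TinyIndex {
        private final List<List<String>> docs;

        TinyIndex(List<List<String>> docs) {
            this.docs = docs;
        }

        /** Raw number of occurrences of the term in one doc. */
        int termFreq(int doc, String term) {
            int n = 0;
            for (String t : docs.get(doc)) {
                if (t.equals(term)) {
                    n++;
                }
            }
            return n;
        }

        /** Number of docs containing the term, i.e. the length of its posting list. */
        int docFreq(String term) {
            int n = 0;
            for (int d = 0; d < docs.size(); d++) {
                if (termFreq(d, term) > 0) {
                    n++;
                }
            }
            return n;
        }

        int numDocs() {
            return docs.size();
        }
    }

    // Constant sources: computed once per term, identical for every doc.
    static DocValueFn docFreq(TinyIndex idx, String term) {
        final int df = idx.docFreq(term);
        return doc -> df;
    }

    static DocValueFn idf(TinyIndex idx, String term) {
        // Assumed formula, for illustration only.
        final float v = (float) (1.0 + Math.log((double) idx.numDocs() / (idx.docFreq(term) + 1)));
        return doc -> v;
    }

    // Per-doc sources: the value depends on the doc id.
    static DocValueFn termFreq(TinyIndex idx, String term) {
        return doc -> idx.termFreq(doc, term);
    }

    static DocValueFn tf(TinyIndex idx, String term) {
        // Assumed formula, for illustration only.
        return doc -> (float) Math.sqrt(idx.termFreq(doc, term));
    }

    /** Two-argument function in the spirit of DualFloatFunction: a divided by b. */
    static DocValueFn div(DocValueFn a, DocValueFn b) {
        return doc -> a.floatVal(doc) / b.floatVal(doc);
    }

    public static void main(String[] args) {
        TinyIndex idx = new TinyIndex(Arrays.asList(
                Arrays.asList("solr", "lucene", "solr"),
                Arrays.asList("lucene"),
                Arrays.asList("solr")));
        DocValueFn ratio = div(termFreq(idx, "solr"), docFreq(idx, "solr"));
        for (int doc = 0; doc < idx.numDocs(); doc++) {
            System.out.println("doc " + doc + ": termfreq/docfreq = " + ratio.floatVal(doc));
        }
    }
}
```

In this sample the term "solr" has a docFreq of 2, so the ratio is 1.0 for doc 0, 0.0 for doc 1, and 0.5 for doc 2: the numerator changes per doc while the denominator stays constant.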

 

In Solr we ultimately work with ValueSources, but when calling Solr's API we can only pass strings, so a parser is needed, just as with the QueryParser we already use. The class that parses ValueSources in Solr is called ValueSourceParser (analogous to QueryParser). Looking at this class shows which ValueSources come with Solr. It holds a static map (analogous to the array in QParserPlugin) that maps the name of each ValueSourceParser to its instance.

/* standard functions */
public static Map<String, ValueSourceParser> standardValueSourceParsers = new HashMap<String, ValueSourceParser>();

In its static block, many parsers are initialized and put into this map. Later, when a FunctionQuery is used, the matching ValueSourceParser is looked up in this map by the function name that was passed in, and that parser builds the ValueSource whose values are used for scoring and sorting. We can also write a custom ValueSourceParser: it only has to extend ValueSourceParser, implement its parse(FunctionQParser) method, and return a ValueSource. It also has to be registered in solrconfig.xml:

 

<valueSourceParser name="myfunc" class="com.....MyValueSourceParser"/>  

In this way we can implement our own ValueSourceParser to provide functions that we need and that Solr does not offer.
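
The name-to-parser lookup and the registration of a custom function can be modeled in a few lines of plain Java. This is only a sketch of the idea, not Solr's code: `MiniSource`, `MiniParser`, `REGISTRY`, and the body of `myfunc` are hypothetical stand-ins, and the arguments arrive already parsed, whereas a real parser reads them from the FunctionQParser passed to parse(FunctionQParser).

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of a name -> parser registry; NOT Solr's ValueSourceParser. */
final class ParserRegistrySketch {

    /** Hypothetical stand-in for a ValueSource: one value per doc id. */
    interface MiniSource {
        double value(int doc);
    }

    /** Hypothetical stand-in for a parser: builds a source from its argument sources. */
    interface MiniParser {
        MiniSource parse(List<MiniSource> args);
    }

    /** Plays the role of the static map of standard parsers. */
    static final Map<String, MiniParser> REGISTRY = new HashMap<>();

    static {
        // Built-in functions are registered once, in a static block.
        REGISTRY.put("div", args -> doc -> args.get(0).value(doc) / args.get(1).value(doc));
        REGISTRY.put("sub", args -> doc -> args.get(0).value(doc) - args.get(1).value(doc));
        REGISTRY.put("mod", args -> doc -> args.get(0).value(doc) % args.get(1).value(doc));
    }

    /** Looks up the parser by function name and lets it build the source. */
    static MiniSource build(String name, List<MiniSource> args) {
        MiniParser parser = REGISTRY.get(name);
        if (parser == null) {
            throw new IllegalArgumentException("Unknown function: " + name);
        }
        return parser.parse(args);
    }

    public static void main(String[] unused) {
        // Registering a custom function: the role the configuration entry plays in Solr.
        REGISTRY.put("myfunc", a -> doc -> 2 * a.get(0).value(doc) + 1);

        MiniSource docId = doc -> doc;   // a source whose value is the doc id
        MiniSource ten = doc -> 10.0;    // a constant source

        System.out.println(build("div", Arrays.asList(ten, docId)).value(4));
        System.out.println(build("myfunc", Arrays.asList(docId)).value(4));
    }
}
```

Keeping the registry as a plain map means a custom function is just one more entry, which is why a custom parser only needs a name and a class.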

 

  

One more thing to note: Solr's faceting uses a ValueSource in some cases (for example, when faceting on a field, the ValueSource is consulted first). So how does that ValueSource obtain its values? Solr's FieldType has the method org.apache.solr.schema.FieldType.getValueSource(SchemaField, QParser), and from it you can see that the values are obtained from the FieldCache. In other words, a ValueSource obtained through a FieldType in Solr goes through the FieldCache: it looks in docValues first and, if they are not available, falls back to the term dictionary.
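
The lookup order described above, docValues first and the indexed terms as a fallback, can be sketched in plain Java. This models the idea only: `FieldData`, `docValues`, and `postings` are hypothetical names, the sketch assumes at most one term per doc, and the real FieldCache is considerably more involved.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Simplified model of "docValues first, otherwise un-invert the terms"; NOT Solr's code. */
final class FieldLookupSketch {

    /** Hypothetical per-field data: an optional docValues column plus term -> doc ids. */
    static final class FieldData {
        private final String[] docValues;                  // null when the field has no docValues
        private final Map<String, List<Integer>> postings; // term dictionary with posting lists
        private final int maxDoc;
        private String[] uninverted;                       // built lazily, then reused

        FieldData(String[] docValues, Map<String, List<Integer>> postings, int maxDoc) {
            this.docValues = docValues;
            this.postings = postings;
            this.maxDoc = maxDoc;
        }

        /** Returns the field value of a doc: docValues first, un-inverted terms otherwise. */
        String value(int doc) {
            if (docValues != null) {
                return docValues[doc];
            }
            if (uninverted == null) {
                uninverted = uninvert();
            }
            return uninverted[doc];
        }

        /** Turns term -> docs into doc -> term (assumes at most one term per doc). */
        private String[] uninvert() {
            String[] byDoc = new String[maxDoc];
            for (Map.Entry<String, List<Integer>> e : postings.entrySet()) {
                for (int doc : e.getValue()) {
                    byDoc[doc] = e.getKey();
                }
            }
            return byDoc;
        }
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> postings = new TreeMap<>();
        postings.put("blue", Arrays.asList(1));
        postings.put("red", Arrays.asList(0, 2));

        FieldData indexedOnly = new FieldData(null, postings, 3);
        FieldData withDocValues = new FieldData(new String[] {"red", "blue", "red"}, postings, 3);

        for (int doc = 0; doc < 3; doc++) {
            System.out.println(doc + ": " + indexedOnly.value(doc) + " / " + withDocValues.value(doc));
        }
    }
}
```

Both paths return the same per-doc values here. The difference is that the docValues column is read directly, while the fallback has to walk every term's posting list once to build the doc-to-value array.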
