Blog quality score calculation - release version 5

1. Background

As the name suggests, the blog quality score measures the quality of a blog post, and it plays a key role in CSDN's hot list, recommendation, search, and other modules. The following figure shows the working mechanism of the quality score:

Figure 1 Working mechanism of quality score

To review: in the version 4 quality score (hereafter V4), the main change was smoothing the score so that the results are distributed more evenly and are not excessively concentrated in the head [80, 100] and tail [0, 20) intervals; see the V4 release blog post for details.

However, the V4 quality score system lacks a clear hierarchical structure (which also hurts interpretability): when a blogger adds new elements to a post (such as pictures, links, or code), the quality score does not increase step by step, and when a post contains elements that hurt readability (such as dead links, misleading links, or garbled code), the quality score does not decrease step by step.

In addition, although the V4 quality score distribution is more uniform than before, it is still not uniform enough; see Figure 5, which shows the distribution of quality scores for 10,000 randomly sampled blogs. The blue histogram is the V4 distribution: the scores are concentrated in two intervals, [0, 20] and [50, 94].

To address these problems, the fifth version of the quality score (hereafter V5) underwent a series of improvements. While retaining about 90% of the high-quality blogs identified by V4 (that is, about 90% of blogs scoring above 80 in V4 still score above 80 in V5), the scores are distributed more evenly and the score hierarchy is clearer.

The improvements in the version 5 quality score are elaborated below.

2. Quality score version 5

2.1 Version 4 Existing Problem Analysis

The following figure shows the V4 quality score calculation process:

Fig. 2 Flow chart of quality score calculation in V4 version

As can be seen from Figure 2, the V4 version quality score calculation process has the following problems:

  • In the positive cumulative score, the catalog and the standard catalog are two separate items, which is redundant;
  • In sigmoid normalization, the sigmoid function maps the score to (0.5, 0.938), so scores below 0.5 or above 0.938 rarely appear (here the score is the pre-scaling score, with range [0, 1]);
  • Sigmoid normalization should be the final stage of the pipeline, not an intermediate stage;
  • Whether the article has received votes can be folded directly into the positive cumulative score;
  • The calculation logic contains only positive cumulative items (also known as bonus items) and penalty items, but no subtraction items. For the scoring system design to be logically complete, it should be divided into the following three parts:
    • Bonus items: gradually add points, from 0 up to 1;
    • Subtraction items: gradually deduct points, from 1 down to 0;
    • Strong penalty factor: for serious violations, multiply directly by a low penalty factor, for example 0.1 or 0.2.
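As a minimal sketch of how these three parts could combine (the function name, weights, and item layout here are hypothetical illustrations, not the production implementation):

```python
def combine_scores(bonus_items, subtraction_items, penalty_factors):
    """Sketch of the three-part scoring logic described above.

    bonus_items: list of (sub_score in [0, 1], weight) pairs
    subtraction_items: list of deductions, each in [0, 1]
    penalty_factors: multiplicative factors in (0, 1] for serious violations
    """
    # Bonus items: weighted average, built up from 0 toward 1
    total_weight = sum(w for _, w in bonus_items)
    score = sum(v * w for v, w in bonus_items) / total_weight

    # Subtraction items: deduct points, floored at 0
    for deduction in subtraction_items:
        score = max(score - deduction, 0.0)

    # Strong penalty factors: multiply directly by a low factor, e.g. 0.1 or 0.2
    for factor in penalty_factors:
        score *= factor

    return score

# Example: high bonus score, one small deduction, no strong penalty
print(combine_scores([(1.0, 2), (0.8, 1)], [0.05], []))
```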

Beyond the problems in the overall calculation process, the V4 quality score also has problems in several individual scoring items:

  • Code score: as the name (lines of code) suggests, this score should be based on the number of code lines, but the V4 calculation actually uses the number of tokens in the code;
  • Some sub-score items are not smooth enough: take the content length score as an example, which uses Min–max normalization (code shown below). This normalization truncates: once the input exceeds the maximum value, the score no longer changes, and the curve is not smooth. For example:
    • Content length score: if the maximum article length is 2000, article 1 has length 2000, and article 2 has length 3000, then articles 1 and 2 receive the same score.

    • Table of contents score: this score is based on the number of multi-level subheadings in the body; the more headings, the higher the score. V4 applies Min–max normalization directly, with min = 0 and max = 10, which is not smooth enough: a post with only 2 subheadings scores just 0.2. But the original intent of the quality score is to encourage correct use of multi-level headings, and some articles do not need many of them; as few as 2 subheadings can already divide an article's structure clearly. In other words, two subheadings should earn a score above 0.5.

      def min_max_normalization(value, max_value, min_value):
          """ Min-max normalization used in V4; note the truncation at both ends. """
          # swap the bounds first if they were passed in the wrong order
          if min_value > max_value:
              min_value, max_value = max_value, min_value

          # truncation: values outside [min_value, max_value] are clamped,
          # so any value above max_value gets the same score of 1.0
          if value > max_value:
              value = max_value
          if value < min_value:
              value = min_value

          # guard against division by zero when the bounds coincide
          if max_value == min_value:
              return 0.0

          norm_value = (value - min_value) / (max_value - min_value)

          return norm_value
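The truncation problem from the examples above can be reproduced directly (a quick standalone check, restating the function with max = 2000, min = 0):

```python
def min_max_normalization(value, max_value, min_value):
    """Min-max normalization with clamping, as used in V4."""
    if min_value > max_value:
        min_value, max_value = max_value, min_value
    value = max(min(value, max_value), min_value)  # clamp into [min, max]
    if max_value == min_value:
        return 0.0
    return (value - min_value) / (max_value - min_value)

# Articles of length 2000 and 3000 get the identical score:
print(min_max_normalization(2000, 2000, 0))  # 1.0
print(min_max_normalization(3000, 2000, 0))  # 1.0

# And a post with only 2 subheadings (max = 10) scores just:
print(min_max_normalization(2, 10, 0))  # 0.2
```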
      

2.2 Version 5 Improvements

In response to the problems in V4, the V5 version makes corresponding improvements. The improved calculation process is shown in the figure below:

Figure 3 V5 version quality score calculation flow chart

As can be seen from Figure 3, the V5 version has made the following improvements to the calculation process of the V4 version:

  • Merge the catalog and the standard catalog into a unified catalog score;
  • Sigmoid smoothing is moved to the final stage, and the new sigmoid function maps scores to (0.017, 0.983). Compared with the previous (0.5, 0.938), the resulting score distribution is more uniform. The function curves are shown below:
Figure 4. Score smoothing functions of V4 and V5 versions
  • Whether the article has received votes is placed directly in the bonus items;
  • The calculation logic is divided into three parts: bonus items, subtraction items, and strong penalty items;
  • Added a subtraction item for non-IT-technical articles;
  • "Article structure too simple" is moved from a strong penalty factor to a subtraction item, because an overly simple structure is already reflected in the content length score and the tag diversity score, so a strong penalty is unnecessary;
  • Added an image score;
  • Optimized the weights of the bonus sub-items so that the score rises or falls stepwise;
  • The rounding in the last step of the calculation is changed from int(score * 100) to round(score * 100), because Python's int() truncates (rounds toward zero), while round() rounds to the nearest integer.
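These two changes can be sketched together. The exact rescaling of the sigmoid is an assumption here (the release notes only state the target range): a standard sigmoid stretched over [-4, 4] covers roughly (0.018, 0.982), which matches the stated (0.017, 0.983).

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def smooth_score(score):
    """Final-stage smoothing: map score in [0, 1] through a sigmoid
    stretched over [-4, 4]. The 8x - 4 rescaling is an assumption;
    V5 only documents the output range (0.017, 0.983)."""
    return sigmoid(8 * score - 4)

# Endpoints now approach 0 and 1, instead of being squeezed into (0.5, 0.938):
print(round(smooth_score(0.0), 3))  # 0.018
print(round(smooth_score(1.0), 3))  # 0.982

# Final rounding: int() truncates, round() rounds to the nearest integer
score = 0.876
print(int(score * 100))    # 87
print(round(score * 100))  # 88
```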

Beyond the calculation-process optimizations, V5 also optimizes the calculation logic of the individual sub-score items, as follows:

  • The code score is now measured directly by the number of lines of code and the number of code blocks;
  • For the sub-score items that are not smooth enough, V5 reduces the use of Min–max normalization and replaces it with piecewise functions or other smoother curves. V5 smooths and fine-tunes the calculation logic of multiple sub-scores, including the content length score, catalog score, code score, link score, and image score, for example:
    • For the truncation problem in the content length score, a piecewise function is used:
      def __cal_content_length_score(self, content):
        """ Compute the content length score (v5)
        """
        content_len_base = self.content_len_range["max"] / 2
        content_len_cut_off_point = sigmoid(self.content_len_range["max"] / content_len_base)

        content_len = len(content)

        # piecewise function: smooth the score when the content length is large
        if content_len <= self.content_len_range["max"]:
            score = min_max_normalization(
                content_len, self.content_len_range["max"], self.content_len_range["min"])
            score *= content_len_cut_off_point
        else:
            score = sigmoid(content_len / content_len_base)

        return score
      
    • For the insufficient smoothness of the catalog score, a power function is used for smoothing:
      def __cal_heads_toc_score(self, sample):
        """ Compute the catalog (table of contents) score (v5)
        """
        # 1. score for the multi-level headings (h1, h2, h3, h4) in the body
        heads_list = sample["catalog"]
        heads_num = len(heads_list)

        # smoothing: when heads_num is small, the score change is not too small
        heads_score = min(math.pow(heads_num / self.heads_num_para["max"], 0.25), 1)

        # 2. TOC score
        if "toc" in sample:
            toc_score = 1.0
        else:
            toc_score = 0.0

        # 3. weighted sum
        score = heads_score * self.heads_toc_weight["heads"] \
            + toc_score * self.heads_toc_weight["toc"]

        return score
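The effect of the power-function smoothing above can be checked against the motivating example from Section 2.1: with max = 10, two subheadings now score well above 0.5 instead of 0.2. A standalone sketch of just the heads_score line (the helper name is illustrative):

```python
import math

def heads_score(heads_num, max_heads=10):
    """Power-function smoothing from __cal_heads_toc_score: the 0.25
    exponent boosts the score when the heading count is small; capped at 1."""
    return min(math.pow(heads_num / max_heads, 0.25), 1.0)

print(round(heads_score(2), 3))   # 0.669 -- vs 0.2 under Min-max normalization
print(round(heads_score(10), 3))  # 1.0
print(round(heads_score(15), 3))  # 1.0 (capped)
```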
      

2.3 Ablation Analysis

Some ablation experiments were conducted to test the influence of each factor in the V5 version:

2.3.1 Positive Cumulative Score Ablation Experiment

By cumulatively removing factors that affect the quality score, one at a time, observe how the quality score changes.

In the table below, Quality score-V5-sigmoid is the final V5 quality score, and Quality score-V5-base is the V5 quality score before sigmoid smoothing. Due to the shape of the sigmoid (see Figure 4), it compresses differences among high scores and among low scores, while making differences in the middle range more pronounced. This matches a common-sense assumption: "the higher the score, the harder it is to improve it".

Therefore, to observe the impact of each element on the quality score, it suffices to compare Quality score-V4 with Quality score-V5-base. The scores in the table below show that V5 improves on V4: V5 better reflects a stepwise decrease in the quality score as elements are removed.

Blog  Quality score-V4  Quality score-V5-base  Quality score-V5-sigmoid  Length  Title  Picture  Link  Table of contents  Standard catalog  Code  Vote  Element diversity
Test Blog 1 91 97 98 1 1 1 1 1 1 1 1 8
Test Blog 2 91 92 97 1 1 1 1 1 1 1 0 8
Test Blog 3 89 86 95 1 1 1 1 1 1 0 0 7
Test Blog 4 83 84 95 1 1 1 1 1 0 0 0 6
Test Blog 5 82 81 93 1 1 1 1 0 0 0 0 6
Test Blog 6 79 75 90 1 1 1 0 0 0 0 0 4
Test Blog 7 78 70 84 1 1 0 0 0 0 0 0 4
Test Blog 8 76 65 78 1 0 0 0 0 0 0 0 4
Test Blog 9 76 64 77 0.75 0 0 0 0 0 0 0 4
Test Blog 10 76 62 76 0.5 0 0 0 0 0 0 0 4
Test Blog 11 68 46 55 0.25 0 0 0 0 0 0 0 4
Test Blog 12 10 23 2 0 0 0 0 0 0 0 0 2

2.3.2 Positive Cumulative Score Single-Variable Experiment

Observe the quality score changes by removing only one influencing factor at a time.

For the same reason as above, compare Quality score-V4 directly with Quality score-V5-base. The scores in the table below show that V5 improves on V4: each time a factor is removed, the V5 score decreases more noticeably.

Blog  Quality score-V4  Quality score-V5-base  Quality score-V5-sigmoid  Length  Title  Picture  Link  Table of contents  Standard catalog  Code
Test Blog 1 91 92 97 1 1 1 1 1 1 1
Test Blog 2 89 86 95 1 1 1 1 1 1 0
Test Blog 3 86 90 97 1 1 1 1 1 0 1
Test Blog 4 91 88 96 1 1 1 1 0 1 1
Test Blog 5 91 88 96 1 1 1 0 1 1 1
Test Blog 6 91 87 96 1 1 0 1 1 1 1
Test Blog 7 90 87 96 1 0 1 1 1 1 1
Test Blog 8 91 92 97 0.75 1 1 1 1 1 1
Test Blog 9 90 91 97 0.5 1 1 1 1 1 1
Test Blog 10 89 86 95 0.25 1 1 1 1 1 1
Test Blog 11 15 52 58 0.1 1 1 1 1 1 1

2.3.3 Non-high-scoring article ablation experiment

The two comparison experiments above used high-scoring articles, where the sigmoid's strong smoothing of the high range makes differences hard to see. To further demonstrate the advantages of V5, the table below compares non-high-scoring articles. The V5 quality score changes more smoothly; the clearest example is rows 1 and 2: under V4, growing the 12-point blog in row 1 word by word makes the quality score jump from 12 straight to 70, while under V5 the change is gradual.

Blog  Quality score-V4  Quality score-V5
Case Study of Software Testing - Leap Year 1 12 30
A Case Study of Software Testing - Leap Year 2 70 54
Case Study of Software Testing - Leap Year 3 67 41
A Case Study of Software Testing - Leap Year 4 71 60
A Case Study of Software Testing - Leap Year 4.1 73 71
A Case Study of Software Testing - Leap Year 4.2 84 84
A Case Study of Software Testing - Leap Year 4.2 (plus vote) 85 89
A Case Study of Software Testing - Leap Year 5 83 95

2.4 Comparison of quality score distribution between V4 and V5 versions

Randomly sample 10,000 blog posts and compare the V4 and V5 quality score distributions. As the figure below shows, compared with V4 (blue), the V5 distribution (red) is more uniform and covers a wider range of scores.

Figure 5 Comparison of quality score distribution between V4 and V5 versions

3. Summary

The V5 quality score is a major update. The comparison experiments in Section 2.3 show that, compared with V4, the V5 score changes more uniformly and reasonably as article content changes. The distribution comparison in Section 2.4 shows that the V5 score distribution is more uniform and covers a wider range. A further benefit of these changes is that changes in the quality score are more interpretable.

In addition to the proactive optimizations above, several hidden bugs were found while modifying the code, further ensuring the correctness of the quality score calculation.

Finally, to limit the impact of the new V5 version on blogs rated high-quality by V4 (above 80 points), the scoring functions were tuned so that about 90% of V4's high-quality blogs still score above 80.

We hope all users will offer their valuable suggestions; they are the driving force for our continued optimization. Thank you!

Origin blog.csdn.net/u010280923/article/details/131449478