"Suggesting Natural Method Names to Check Name Consistencies" Paper Reading Summary

" Suggesting Natural Method Names to Check Name Consistencies " Paper Reading Summary

"Stay Hungry, Stay Young"

@lizzy_0323

1. The proposed method

1.1 Background

Misleading method names in projects and software libraries often prevent developers from accurately understanding what the code does and how its APIs should be used, which leads to API misuse.

This article therefore introduces MNIRE, a machine-learning approach for checking whether a given method name is consistent with its implementation. MNIRE first generates a candidate name and compares it with the current name; if the two names are sufficiently similar, the method is considered consistent. While studying name generation, the authors found that a high proportion of the tokens in a method name can be found in three contexts of the method: its body (implementation), its interface (parameter types and return type), and the name of its enclosing class. Even when a token does not literally appear in these contexts, it can still be predicted from them, because the tokens co-occur with the contexts frequently. The key idea of the article is to treat name generation as abstractive summarization of the tokens collected from the program entity names in these three contexts.

1.2 Evaluation results

On a dataset of 14M methods, MNIRE improves recall and precision by 10.4% and 11%, respectively, when detecting inconsistent method names, and by 18.2% and 11.1%, respectively, when recommending method names.

2. Introduction of the method

2.1 The importance of the problem

Naming conventions and coding standards matter: misleading API names confuse software developers. Researchers have introduced automatic tools that verify the consistency between a method name and its body. Their key idea is that methods with similar bodies should have similar names. In practice, however, two methods with similar names are often used in different settings, and their bodies are frequently not similar. Approaches based on information retrieval (IR) can only search for existing names of methods with similar bodies and recommend one of them; they cannot compose a new name.

2.2 Advantages over existing approaches

The key idea of code2vec is that two methods whose implementations have similar AST (abstract syntax tree) structures, for example one written with a for loop and one with a while loop, likely perform the same task and can therefore be given the same name. The important limitation is that code2vec cannot compose a new name for a method; it can only reuse names it has already seen.

Allamanis et al. used a neural network model that projects all the names in the method body and the sub-tokens of the method name into the same vector space; their model then selects the nearest words in that space to compose a new name. However, the names of program entities are essentially different from method names: an entity name carries a complete meaning on its own, while the individual sub-tokens of a method name carry different partial meanings, so the two should not be mapped into the same space.

A series of experiments shows that 62.9% of method names are unique, while 78.1% of the sub-tokens in method names have been seen in previously observed names. A name-prediction model should therefore work at the level of the individual sub-tokens of a method name rather than treating the whole name as a unit. In addition, the sub-tokens of a method name can often be found in the method's body, its interface, and its enclosing class: in 35.9% of the cases, every sub-token of the name appears somewhere in these contexts. Even when a sub-token cannot be found there, it can still be predicted from the contexts, because these sub-tokens co-occur with the contexts with high probability. The interface reflects the input and output of the method, and the enclosing class reflects the general context of the task the method implements. These naturalness properties make it reasonable to train statistical models on a large corpus.

2.3 Introduction to the method

Based on these observations, the method-name generation problem is cast as an abstractive text summarization problem. Each context is treated as an input sentence, the method name is decomposed into a sequence of sub-tokens, and those sub-tokens are generated as a summary of the input sentences. In this way the approach can create an abstractive summary, i.e., a name that may contain tokens never seen in the input.
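As a concrete illustration (not from the paper), the sketch below splits a Java method name into the kind of sub-token sequence such a summarization model is trained to generate; the camelCase/underscore splitting rule is an assumption on my part:

```python
import re

def split_subtokens(name: str) -> list[str]:
    """Split an identifier like 'getXMLReader' into lower-cased sub-tokens."""
    # Break on underscores and camelCase boundaries (a common convention;
    # the paper's exact tokenizer may differ).
    parts = re.findall(r'[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+', name.replace('_', ' '))
    return [p.lower() for p in parts]

print(split_subtokens("getXMLReader"))   # ['get', 'xml', 'reader']
print(split_subtokens("is_empty"))       # ['is', 'empty']
```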

For the model architecture, this article chooses an encoder-decoder model, which statistically encodes the input to summarize the meaning of the sentences. The model captures the context of the input sentences and restates them as a short sequence, possibly using different words; this short sequence is the predicted method name.

3. Use cases

3.1 Inconsistent method names

Two typical scenarios lead to inconsistency:

1. A method is given a confusing name from the start;

2. The method name drifts away from the method's functionality as the software is continuously updated.

3.2 Generating a better name from the method

From the method itself, a more accurate name can be derived. This name can be used to detect whether the current name is consistent, or offered as a suggestion at naming time. A good name typically depends on the following factors:

1. A method name that summarizes the method's purpose is related to the names of the program entities used in its implementation and to the description of its functionality. This relationship shows up at two levels: first, the method name is connected to the variables and fields in the body and to the methods it invokes; second, the sub-tokens of a good method name and the sub-tokens of the program entities in the body frequently co-occur (as noted above).

2. The parameter types and the return type of the method are also part of the method declaration. They describe the method's input and output and strongly influence how other code calls the method.

3. In object-oriented programming, a method m defines a behavior of the objects of its class C; when m executes, an object o of C can be viewed as the subject of that action, so the class name also helps in inferring the method name.

4. Case study

4.1 Uniqueness and size characteristics of the method

The dataset contains 3,402,550 unique method names and 120,303 unique sub-tokens in those names. On average, each method name has 2.64 sub-tokens, the median is 3, and the longest name has 83 sub-tokens. Meanwhile, a method body contains on average 17.3 times as many identifiers as its name, so most method names are much shorter than their bodies.
Most method names are unique; the non-unique cases come from a number of very common names that are reused widely. Conversely, the sub-tokens that make up a method name mostly consist of previously seen words.

4.2 The connection between method name and context

  1. Common identifiers shared by a method name and its contexts: on average, about two out of every three sub-tokens of a method name can be found in the three contexts. A name sub-token is most likely to be found in the implementation (62.%), then the interface (14.9%), then the enclosing class (6.1%).

  2. How common the shared sub-tokens are: for 84.6% of methods, at least one third of the name's sub-tokens can be found in the contexts; for 79.8% of methods, at least half can be found; and for 36.7% of methods, all of the name's sub-tokens appear in the contexts. The proportion of method-name sub-tokens shared with entity names is therefore high.

  3. Likelihood of a name sub-token given the contexts: for a method, this is measured as a conditional probability (a toy computation is sketched below):

     P(t | C) = |{methods whose name contains t and whose contexts match C}| / |{methods whose contexts match C}|

     The numerator is the number of methods whose name contains the sub-token t, and the denominator is the number of methods whose contexts match C. The higher this probability, the better the contexts predict the sub-token t.
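As a toy illustration (not the paper's code), the following computes this conditional probability over a small made-up corpus of (name sub-tokens, context sub-tokens) pairs:

```python
# Each entry: (set of name sub-tokens, set of context sub-tokens)
corpus = [
    ({"get", "name"},  {"name", "string", "person"}),
    ({"set", "name"},  {"name", "string", "person"}),
    ({"get", "age"},   {"age", "int", "person"}),
    ({"save", "file"}, {"file", "path", "writer"}),
]

def conditional_prob(t: str, C: set[str]) -> float:
    """P(t | C): share of methods with context C whose name contains t."""
    with_context = [names for names, ctx in corpus if C <= ctx]
    if not with_context:
        return 0.0
    with_token = [names for names in with_context if t in names]
    return len(with_token) / len(with_context)

print(conditional_prob("get", {"person"}))          # 2/3 ≈ 0.67
print(conditional_prob("name", {"name", "string"})) # 2/2 = 1.0
```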

5. Characteristics of the model

5.1 Key Concepts of the Model

1. The naturalness of identifier names
2. Abstractive summarization

5.2 Text extraction

First, the texts of the implementation, the interface, and the enclosing class are extracted and labeled as the IMP, INF, and ENC contexts, respectively. All context sentences are then concatenated into one sequential representation of the three contexts, separated by "."; within INF, the input and output parts are separated by ",". For IMP and INF, the names and types of the identifiers are kept in their original order in the code; an experiment with randomized order showed that the order of names/types does not affect the results.
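A rough sketch (my own illustration, not the paper's implementation) of how such a concatenated context representation might be assembled for one method; the helper name build_context_sequence and the exact spacing are assumptions:

```python
def build_context_sequence(imp_tokens, inf_inputs, inf_output, enc_name):
    """Concatenate the IMP, INF and ENC contexts into one token sequence.

    imp_tokens: identifier names/types from the method body, in code order
    inf_inputs: parameter names/types; inf_output: return type
    enc_name:   enclosing class name
    """
    imp = " ".join(imp_tokens)
    inf = " ".join(inf_inputs) + " , " + inf_output   # "," splits input/output
    enc = enc_name
    return " . ".join([imp, inf, enc])                # "." splits the contexts

# Hypothetical example for: String getUserName(int userId) in class UserAccount
seq = build_context_sequence(
    imp_tokens=["user", "id", "name", "lookup"],
    inf_inputs=["int", "user", "id"],
    inf_output="String",
    enc_name="UserAccount",
)
print(seq)
# "user id name lookup . int user id , String . UserAccount"
```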

5.3 Abstractive summarization model

MNIRE uses a Seq2Seq-based architecture with an attention mechanism; the model structure is shown as a figure in the paper.

In this model, the encoder's input is the embedded context sentence x = (x1, x2, ..., xm), which is encoded into hidden representations h = (h1, h2, ..., hm). The decoder is responsible for predicting the method name y = (y1, y2, ..., yk) from h. The probability of each yi is computed from the decoding state si of a recurrent neural network (RNN), the token yi-1 predicted in the previous step, and a context vector ci:

P(yi | y1, ..., yi-1, x) = g(yi-1, si, ci)

The context vector ci, also called the attention vector, is computed from the decoder state and the encoder hidden states h as a weighted sum:

ci = Σj αij · hj,   with αij = softmax_j( a(si-1, hj) )

where a(·, ·) is an attention function that computes the unnormalized alignment scores between the decoder and encoder states. In general, the context vector ci helps the decoder decide which sentences, and which parts of them, to focus on at each step when generating yi.
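As a concrete illustration of this attention step (my own sketch with made-up dimensions, not MNIRE's implementation), the following uses an additive alignment function as one possible choice for a(·, ·):

```python
import numpy as np

rng = np.random.default_rng(0)

m, d = 5, 8                      # 5 encoder positions, hidden size 8
h = rng.normal(size=(m, d))      # encoder hidden states h_1..h_m
s_prev = rng.normal(size=(d,))   # previous decoder state s_{i-1}

# Additive alignment function a(s, h_j) = v^T tanh(W1 s + W2 h_j)
W1 = rng.normal(size=(d, d))
W2 = rng.normal(size=(d, d))
v = rng.normal(size=(d,))

scores = np.array([v @ np.tanh(W1 @ s_prev + W2 @ h[j]) for j in range(m)])

# Normalize the alignment scores into attention weights (softmax)
alpha = np.exp(scores - scores.max())
alpha /= alpha.sum()

# Context (attention) vector: weighted sum of encoder hidden states
c = alpha @ h                    # shape (d,)
print(alpha.round(3), c.shape)
```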

5.4 Method name consistency check

To check consistency, the similarity Sim(p, c) between p and c is computed, where p is the name generated by MNIRE and c is the method's current name. The similarity lies in the range 0 to 1 and is defined as the fraction of sub-tokens that p shares with c.
The consistency of method m is then decided by a threshold T: if Sim(p, c) is below T, MNIRE classifies c as inconsistent with the method's implementation; otherwise it is classified as consistent.
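A minimal sketch of this check (my own illustration; the exact similarity formula in the paper may differ, here the overlap is measured against the larger of the two sub-token sets):

```python
def similarity(predicted: list[str], current: list[str]) -> float:
    """Fraction of sub-tokens shared by the two names (illustrative definition)."""
    shared = len(set(predicted) & set(current))
    return shared / max(len(set(predicted)), len(set(current)))

def is_consistent(predicted, current, threshold=0.5):
    # Below the threshold T, the current name is flagged as inconsistent.
    return similarity(predicted, current) >= threshold

print(is_consistent(["get", "user", "name"], ["get", "name"]))  # True  (2/3 >= 0.5)
print(is_consistent(["save", "file"], ["load", "config"]))      # False (0/2 < 0.5)
```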

6. Evaluation settings, procedures and metrics

Comparative study: for each application, method name consistency checking (MCC) and method name recommendation (MNR), each model under study was trained on its own training set and then tested on the corresponding test set.
Context analysis: for each application, to study the impact of the different contexts, variants of MNIRE with different context combinations were created and their performance measured.
Sensitivity analysis: for each application, the influence of the following factors was studied by varying them and measuring performance: code representation, similarity threshold, context, and data size.
Metrics: for MCC, the predicted cases are compared against the consistent and inconsistent method names provided as part of the MCC corpus. For MNR, the predicted name is compared against the good name in the MNR oracle, which is built from code2vec's dataset. MCC performance is measured with four metrics, precision, recall, F-score, and accuracy, computed separately for the inconsistent class and for the consistent class, following the standard definitions:

Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F-score = 2 · Precision · Recall / (Precision + Recall)
Accuracy = (TP + TN) / (TP + TN + FP + FN)

where, for the inconsistent class, TP counts methods that are actually inconsistent and are reported as inconsistent (and analogously for the consistent class).
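For concreteness, a small sketch (not the paper's evaluation script) that computes these per-class metrics from predicted and actual labels:

```python
def mcc_metrics(predicted, actual, positive="inconsistent"):
    """Precision/recall/F-score for one class plus overall accuracy.

    predicted, actual: lists of labels, "inconsistent" or "consistent".
    """
    tp = sum(p == positive and a == positive for p, a in zip(predicted, actual))
    fp = sum(p == positive and a != positive for p, a in zip(predicted, actual))
    fn = sum(p != positive and a == positive for p, a in zip(predicted, actual))
    correct = sum(p == a for p, a in zip(predicted, actual))

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    accuracy = correct / len(actual)
    return precision, recall, f_score, accuracy

pred = ["inconsistent", "consistent", "inconsistent", "consistent"]
act  = ["inconsistent", "inconsistent", "consistent", "consistent"]
print(mcc_metrics(pred, act))  # (0.5, 0.5, 0.5, 0.5)
```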

For specific analysis of these performance metrics, you can refer to related materials and blogs.


For MNR, precision and recall between the original (oracle) name e and the predicted name r are computed as:

Precision = |token(e) ∩ token(r)| / |token(r)|
Recall = |token(e) ∩ token(r)| / |token(e)|

where token(n) denotes the set of sub-tokens in a name n.
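A brief sketch of these two metrics over hand-split sub-token sets (my own illustration, assuming the sub-tokens have already been extracted):

```python
def name_precision_recall(oracle_tokens: set[str], predicted_tokens: set[str]):
    """Sub-token precision and recall between an oracle name and a predicted name."""
    shared = len(oracle_tokens & predicted_tokens)
    precision = shared / len(predicted_tokens) if predicted_tokens else 0.0
    recall = shared / len(oracle_tokens) if oracle_tokens else 0.0
    return precision, recall

# e = getUserName, r = getName (sub-tokens written out by hand)
print(name_precision_recall({"get", "user", "name"}, {"get", "name"}))  # (1.0, 0.667)
```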

7. Experimental results

7.1 Accuracy comparison

1. Accuracy in method name consistency checking (MCC):
For inconsistent method names: MNIRE improves recall by 10.4% and precision by 10.8%. MNIRE relies on the names of program entities, whereas the compared approach relies on the principle that methods with similar implementations should have similar names, and vice versa.
For consistent method names: MNIRE detects consistent names better than the baseline, improving recall by 16.6% and precision by 9%.
2. Accuracy in method name recommendation (MNR): compared with code2vec, MNIRE improves recall by 18.2% and precision by 11.1%. The higher recall means MNIRE correctly predicts more of the name's sub-tokens; the higher precision means a larger share of the predicted sub-tokens is correct. This is because MNIRE uses richer contexts, such as the enclosing class, whose name correlates strongly with the method name.
3. Accuracy when generating new method names: the authors examine MNIRE's performance on recommending names that do not appear in the training data. It still predicts such unseen names well, which shows that it learns to compose names rather than merely retrieving names memorized from the training corpus.
4. Accuracy versus method size: MNIRE is very effective on methods of ordinary size, but its accuracy decreases as method length increases.

7.2 Context analysis results

Adding the interface and enclosing-class texts on top of the implementation improves accuracy for both applications. For the MNR problem, precision and recall improve significantly; for the MCC problem, there is also a slight improvement.
Compared with IMP+INF, IMP+ENC gives a smaller improvement over IMP alone. The reason is that INF shares more identifiers with method names than ENC does, and the number of sub-tokens in INF and ENC is much smaller than in IMP, so the gain over IMP alone is modest.

7.3 Sensitivity analysis results

1. Code representations built from the parsed code before using the seq2seq model:

  1. Lexeme: all lexical tokens are collected
  2. Tree: the sequence of AST tokens is used as the seq2seq input, with separators encoding the tree structure
  3. Graph: the method body is built into a program dependence graph (PDG), converted into a vector with graph2vec, and fed into the seq2seq model

Two methods with the same AST do not necessarily share the same lexical tokens, so the Lexeme model imposes a stricter similarity condition; its precision is therefore higher and its recall lower. The Graph model has a looser similarity condition than the Tree model, and the Tree model's F-score is higher than the Graph model's.

2. Effect of context length on accuracy: the longer the context, the better MNIRE performs.
3. Effect of the number of meaningful words in the context on accuracy: the more meaningful words, the higher the accuracy.
4. Effect of dataset size on accuracy: as the dataset grows, accuracy increases.

7.4 Time complexity

On the MCC task, MNIRE's training time is much lower than code2vec's. This is because MNIRE does not need to construct abstract syntax tree nodes, so MNIRE is more efficient.

8. Conclusion

This article introduces MNIRE, a machine learning approach for predicting method names and detecting inconsistent ones. The conclusions are: first, to predict a good name, relying on the naturalness of the program entity names in the contexts works better than relying on AST or PDG structure; second, each full method name is fairly unique, but its individual sub-tokens recur frequently, so MNIRE exploits the regularity of sub-tokens in program entity names to generate the predicted name; finally, the generative approach is more effective at predicting new names than IR-based corpus-search approaches.

Origin blog.csdn.net/weixin_45717055/article/details/112688166