What to do when a large model's knowledge is out of date? A Zhejiang University team explores model editing, a method for updating large model parameters

Originally published by Xi Xiaoyao Technology Talk
| Authors: Xiaoxi, Python

Behind the enormous scale of large models lies an intuitive question: how should a large model be updated?

Given the enormous computational cost of large models, updating their knowledge is not a simple "learning task". Ideally, as the world changes, a large model should keep pace with the times, but the cost of training a new large model from scratch makes instant updates impossible. This is why a new concept, "model editing", has emerged: making an effective change to the model's behavior on data in a specific domain without adversely affecting its outputs on other inputs.

The concept of model editing was proposed by Mitchell et al. in 2022. As shown in the figure above, the model editing process uses an edit descriptor $(x_e, y_e)$, in the figure the corrected question-answer pair "Who is the President of the United States?" with answer "Biden", to adjust a base model $f_{\theta}$ into an edited model $f_{\theta_e}$ such that $f_{\theta_e}(x_e) = y_e$.

On the other hand, model editing also requires that an edit in one domain does not affect the model's normal outputs on inputs from other domains. Formally, this requires:

$$f_{\theta_e}(x)= \begin{cases} y_e & \text{if } x \in I(x_e, y_e) \\ f_\theta(x) & \text{if } x \in O(x_e, y_e) \end{cases}$$

Here $I(x_e, y_e)$ denotes the "effective neighborhood" of $(x_e, y_e)$, and $O(x_e, y_e)$ denotes the inputs outside its scope. An edited model should satisfy three properties: reliability, generality, and locality. Reliability means the edited model should correctly answer the cases the pre-edit model got wrong, measured by the average accuracy on the edited cases. Generality means the model should also give the correct output on the neighborhood of $(x_e, y_e)$, measured by the average accuracy on samples drawn uniformly from that neighborhood. Locality means that on examples outside the edit's scope, the edited model should retain its pre-edit accuracy, measured by comparing the average accuracy before and after editing. As shown in the figure below, when the position held by "Trump" is edited, other public attributes of the entity should not change; meanwhile, other entities such as "Secretary of State", despite sharing characteristics similar to "President", should not be affected either.
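
To make the three metrics concrete, here is a minimal Python sketch of how each one reduces to an average accuracy over a different set of prompts. This is not code from the paper; `model`, `edited_model`, and the case fields are illustrative assumptions.

```python
from typing import Callable, Dict, List

def accuracy(model: Callable[[str], str], cases: List[Dict[str, str]]) -> float:
    """Average exact-match accuracy of `model` over (prompt, target) cases."""
    if not cases:
        return 0.0
    hits = sum(model(c["prompt"]) == c["target"] for c in cases)
    return hits / len(cases)

def evaluate_edit(edited_model, base_model, edit_cases, neighbor_cases, unrelated_cases):
    """Illustrative scoring of one batch of edits (all names are assumptions).

    edit_cases:      the edited facts themselves                  -> reliability
    neighbor_cases:  in-scope paraphrases, i.e. I(x_e, y_e)       -> generality
    unrelated_cases: out-of-scope inputs, i.e. O(x_e, y_e)        -> locality
    """
    reliability = accuracy(edited_model, edit_cases)
    generality = accuracy(edited_model, neighbor_cases)
    # Locality: the edited model should still agree with the pre-edit model
    # on inputs that the edit is not supposed to touch.
    agree = sum(
        edited_model(c["prompt"]) == base_model(c["prompt"]) for c in unrelated_cases
    )
    locality = agree / max(len(unrelated_cases), 1)
    return {"reliability": reliability, "generality": generality, "locality": locality}
```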

Today's paper from Zhejiang University examines model editing from the perspective of large models, describing in detail its problems, methods, and future in the era of large models. It also constructs a new benchmark dataset and evaluation metrics to help evaluate existing techniques more comprehensively and decisively, and to offer the community meaningful advice and insights on method selection:

Paper title:
Editing Large Language Models: Problems, Methods, and Opportunities

Paper link:
https://arxiv.org/pdf/2305.13172.pdf


Mainstream Methods

Current model editing methods for large language models (LLMs) fall into two main paradigms, as shown in the figure below: adding extra parameters while keeping the original model parameters frozen (figure (a)), and modifying the model's internal parameters (figure (b)).

Let us first look at the relatively simple approach of adding extra parameters, also known as memory-based model editing. The representative method SERAC first appeared in Mitchell et al.'s model editing paper. Its core idea is to keep the original model parameters frozen and handle the modified facts with an independent set of parameters. Concretely, such methods typically first add a "scope classifier" to judge whether a new input falls within the scope of the re-edited facts; if it does, the input is processed by the independent parameter set, which assigns a higher probability to the "correct answer" stored in the cache. Building on SERAC, T-Patcher and CaliNET introduce additional trainable parameters into the feed-forward modules of the PLM (rather than an external model); these parameters are trained on the modified fact dataset to achieve the editing effect.
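
The routing idea behind memory-based editing can be sketched roughly as follows. This is a simplified illustration rather than the actual SERAC implementation; `base_model`, `counterfactual_model`, and `scope_classifier` are assumed stand-ins for the trained components.

```python
class MemoryBasedEditor:
    """Minimal sketch of a SERAC-style memory-based editor (illustrative only)."""

    def __init__(self, base_model, counterfactual_model, scope_classifier):
        self.base_model = base_model                      # frozen original LLM
        self.counterfactual_model = counterfactual_model  # small model trained on edits
        self.scope_classifier = scope_classifier          # decides if input hits an edit
        self.edit_memory = []                             # cached (x_e, y_e) descriptors

    def add_edit(self, x_e: str, y_e: str) -> None:
        # Edits are stored externally; the base model's weights are never touched.
        self.edit_memory.append((x_e, y_e))

    def __call__(self, x: str) -> str:
        # If the input falls within the scope of a cached edit, answer with the
        # counterfactual model conditioned on that edit; otherwise defer to the base model.
        for x_e, y_e in self.edit_memory:
            if self.scope_classifier(x, x_e):
                return self.counterfactual_model(x, x_e, y_e)
        return self.base_model(x)
```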

The other family of methods modifies the parameters of the original model, mainly by applying a $\Delta$ matrix to update part of the model parameters $\theta$. Parameter-modifying methods can be divided into two kinds: "Locate-Then-Edit" and meta-learning. As the name suggests, Locate-Then-Edit methods first locate the parameters in the model that matter most for the target fact and then modify them to realize the edit. Representative methods include Knowledge Neurons (KN), which identifies "knowledge neurons" in the model and updates them, and ROME, which is similar to KN but uses causal mediation analysis to locate the editing region. In addition, MEMIT can update a whole series of edit descriptors at once. The biggest problem with this family is that it generally rests on an assumption of locality of factual knowledge, which has not been widely verified, and editing many parameters may lead to unexpected results.
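
As a rough illustration of the "modify internal parameters" idea, the sketch below applies a rank-one update $\Delta = u v^{\top}$ to a located weight matrix. It only shows the shape of the update used by ROME/MEMIT-style methods; the vectors here are random placeholders, not the key/value directions those methods actually solve for.

```python
import torch

def apply_rank_one_edit(weight: torch.Tensor, u: torch.Tensor, v: torch.Tensor) -> torch.Tensor:
    """Return an edited copy of a located weight matrix: W' = W + u v^T.

    weight: (d_out, d_in) matrix of the located feed-forward layer
    u:      (d_out,) direction that writes the new value (e.g. "Biden")
    v:      (d_in,)  direction that selects the edited key (e.g. "US president")
    All vectors here are placeholders; real methods solve for them from the model.
    """
    delta = torch.outer(u, v)   # rank-one update matrix Δ
    return weight + delta       # every other parameter is left untouched

# Illustrative usage with random tensors standing in for a real layer:
W = torch.randn(8, 16)
u, v = torch.randn(8), torch.randn(16)
W_edited = apply_rank_one_edit(W, u, v)
```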

Meta-learning methods differ from Locate-Then-Edit methods in that they use a hypernetwork, that is, a network that generates weight updates for another network. Specifically, in the Knowledge Editor method, the authors use a bidirectional LSTM to predict the weight update induced by each data point, realizing constrained optimization toward the target knowledge. This kind of knowledge editing is hard to apply to LLMs because of their huge number of parameters, so Mitchell et al. proposed MEND (Model Editor Networks with Gradient Decomposition), which lets a single edit descriptor effectively update an LLM. MEND works on a low-rank decomposition of the fine-tuning gradient, so that LLMs can be updated with minimal resources. Compared with Locate-Then-Edit methods, meta-learning methods usually take longer and have a higher memory cost.
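
The resource saving in MEND comes from working with a low-rank factorization of the fine-tuning gradient rather than the full weight matrix. The toy sketch below illustrates only that factorization step; the real hypernetwork is replaced by a trivial stand-in, and all shapes and names are assumptions.

```python
import torch

def mend_style_update(x: torch.Tensor, delta: torch.Tensor, g) -> torch.Tensor:
    """Illustrative MEND-style edit of one linear layer (shapes/names are assumptions).

    For a single edit example, the gradient of a linear layer factorizes as
    grad_W = outer(delta, x), a rank-one product, so the editor network `g`
    only needs to transform the two small factors instead of the full matrix.
    """
    x_tilde, delta_tilde = g(x, delta)        # hypernetwork refines the two factors
    return torch.outer(delta_tilde, x_tilde)  # low-rank parameter update ΔW

# Toy stand-in for the trained editor network: here it just rescales the factors.
def toy_editor(x, delta):
    return 0.5 * x, 0.5 * delta

x = torch.randn(16)      # layer input for the edit example
delta = torch.randn(8)   # gradient of the loss w.r.t. the layer output
delta_W = mend_style_update(x, delta, toy_editor)
```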

Method Evaluation

These methods are evaluated on two mainstream model editing datasets: ZsRE (a question answering dataset in which questions rewritten by back-translation serve as the effective neighborhood) and COUNTERFACT (a counterfactual dataset in which the subject entity is replaced by a synonymous entity to form the effective neighborhood). As shown in the figure below, the experiments mainly use two relatively large LLMs, T5-XL (3B) and GPT-J (6B), as base models. An efficient model editor should strike a balance among model performance, inference speed, and storage space.

Compared with the fine-tuning (FT) results in the first column, SERAC and ROME perform well on both ZsRE and COUNTERFACT; SERAC in particular exceeds 90% on several metrics. MEMIT's generality is not as good as SERAC's and ROME's, but it performs well on reliability and locality. T-Patcher is highly unstable: on COUNTERFACT it shows good reliability and locality but lacks generality, while on GPT-J its reliability and generality are excellent but its locality is poor. Notably, KE, CaliNET, and KN perform relatively poorly; compared with their good results on "small models", these experiments suggest that such methods are not well suited to the large model setting.

In terms of time, KE and MEND perform quite well once their networks are trained, whereas methods such as T-Patcher are far too time-consuming:

In terms of memory consumption, most methods use memory of the same order of magnitude, although methods that introduce extra parameters incur additional memory overhead:

Model editing in practice also has to handle batch edits and sequential edits, that is, updating many facts at once and updating facts one after another. The overall results for batch editing are shown in the figure below: MEMIT can edit more than 10,000 facts at the same time while keeping both metrics stable, whereas MEND and SERAC perform poorly:

For sequential editing, SERAC and T-Patcher perform well and remain stable, while ROME, MEMIT, and MEND all show rapidly degrading performance after a certain number of edits:
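
The difference between the two protocols is simply whether the edit descriptors are applied in one call or folded in one at a time. A schematic of the two evaluation loops, assuming a hypothetical `editor.apply(model, edits)` interface, might look like this:

```python
def batch_editing(editor, model, edit_requests):
    """Apply all edit descriptors in a single call (e.g. MEMIT's setting)."""
    return editor.apply(model, edit_requests)

def sequential_editing(editor, model, edit_requests, evaluate):
    """Fold edits in one at a time, re-evaluating after each step.

    `editor.apply` and `evaluate` are assumed interfaces for illustration;
    performance curves like those in the paper come from the per-step scores.
    """
    scores = []
    for request in edit_requests:
        model = editor.apply(model, [request])
        scores.append(evaluate(model))
    return model, scores
```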

Finally, the authors found that the construction and evaluation metrics of current datasets largely focus on changes in sentence wording and do not probe how a model editor affects the many logically related facts. For example, if the answer to "Which college did Watts Humphrey attend?" is changed from Trinity College to the University of Michigan, then when we ask the edited model "Which city did Watts Humphrey live in during college?", an ideal model should answer Ann Arbor rather than Hartford. The authors therefore introduce a "portability" metric on top of the first three metrics to measure how effectively the edited model transfers the edited knowledge.

To this end, the authors constructed a new dataset with GPT-4: for an original question $s$ whose answer is changed from $o$ to $o^{*}$, they construct a further question $r^{*}$ whose correct answer is $o'^{*}$, forming a triplet $(o^{*}, r^{*}, o'^{*})$. Feeding $(o^{*}, r^{*})$ to the edited model, if the model correctly outputs $o'^{*}$, the edit is shown to be "portable". Using this method, the paper reports the portability scores of several existing methods, shown in the figure below:
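
Under this construction, the portability score again reduces to an average accuracy, this time over the follow-up question rather than the edited prompt itself. A minimal sketch, with assumed field names, could be:

```python
def portability(edited_model, portability_cases):
    """Average accuracy on reasoning-hop questions derived from each edit.

    Each case holds the follow-up question r* and the answer o'* that only
    holds if the edited fact is actually propagated (e.g. "University of
    Michigan" -> "Which city ...?" -> "Ann Arbor").
    """
    if not portability_cases:
        return 0.0
    hits = sum(
        edited_model(case["hop_question"]) == case["hop_answer"]
        for case in portability_cases
    )
    return hits / len(portability_cases)
```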

Almost all model editing methods fall short on portability: SERAC, which performed so well earlier, scores below 10% portability accuracy, and even the best performers, ROME and MEMIT, reach only about 50%. This shows that current model editing methods can hardly extend or propagate edited knowledge at all, and model editing still has a long way to go.

Discussion and Future

In any case, the problems that model editing raises have great potential in the coming "era of large models". Model editing still needs to answer a series of very hard questions, such as "in which parameters is model knowledge stored?" and "how can an editing operation avoid affecting the outputs of other modules?". On the other hand, besides making models "editable", another way to solve the problem of "outdated" models is to let them learn continually throughout their lifetime and "forget" sensitive knowledge. Whether through model editing or lifelong learning, such research will make meaningful contributions to the security and privacy of LLMs.


Source: blog.csdn.net/xixiaoyaoww/article/details/130864619