[Notes] Big model, big data

With a large enough model and enough training data, loss keeps dropping and accuracy keeps rising.

1. Large models

1.1 Model's Aha Moment

An example of "half-knowing": a mid-sized model may produce partially correct reasoning yet still land on the wrong final answer, so measured accuracy stays flat until the model is large enough, at which point the ability seems to appear all at once.

1.2 Behaviors that only appear at scale

Chain of thought (CoT): prompting the model to reason step by step before answering.

CoT only improves results once the model is large enough.
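A minimal sketch of what such a prompt looks like; the arithmetic question and the "Let's think step by step" suffix follow the common zero-shot-CoT recipe and are illustrative, not tied to any specific API:

```python
# Zero-shot chain-of-thought prompt: the suffix nudges the model to
# emit intermediate reasoning before the final answer.
prompt = (
    "Q: A cafeteria had 23 apples. They used 20 and bought 6 more. "
    "How many apples do they have now?\n"
    "A: Let's think step by step."
)
# A sufficiently large model tends to reply with the intermediate steps
# ("23 - 20 = 3, then 3 + 6 = 9") before the final answer; smaller
# models usually gain little or nothing from this kind of prompt.
```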

Calibration: whether the model's stated confidence in an answer tracks how often that answer is actually correct. Larger models tend to be better calibrated.
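Calibration is commonly summarized by the expected calibration error (ECE); a minimal sketch, assuming we already have per-answer confidences and correctness labels:

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """Bin predictions by confidence and compare average confidence
    with empirical accuracy in each bin (a standard ECE estimate)."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            gap = abs(confidences[mask].mean() - correct[mask].mean())
            ece += mask.mean() * gap  # weight the gap by the bin's share
    return ece

# A well-calibrated model: 80%-confidence answers are right ~80% of the time.
print(expected_calibration_error([0.9, 0.8, 0.6, 0.95], [1, 1, 0, 1]))
```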

"u-shape" will appear

2. Big data

Learning grammar takes relatively little data; learning world knowledge and common sense takes far more.

2.1 Data preprocessing

Deduplicate repeated training data so that the model does not simply memorize it verbatim.
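A minimal exact-deduplication sketch; real pipelines also apply fuzzy near-duplicate matching such as MinHash, which is out of scope here:

```python
import hashlib

def deduplicate(documents):
    """Drop exact-duplicate documents by hashing whitespace-normalized,
    lowercased text."""
    seen, unique = set(), []
    for doc in documents:
        digest = hashlib.sha256(" ".join(doc.split()).lower().encode()).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique

docs = ["The cat sat.", "the  cat sat.", "A different sentence."]
print(deduplicate(docs))  # the near-identical second document is dropped
```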

2.2 Fixed compute budget

 

Given a fixed amount of compute, there is an optimal ratio between the amount of training data and the number of model parameters (the compute-optimal, or "Chinchilla", scaling result).

A small model trained on lots of data beats a big model trained on little data.

[LLaMA] follows this idea as well, training comparatively small models on a very large number of tokens.
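A back-of-the-envelope sketch of that trade-off, assuming the common approximation C ≈ 6·N·D for training FLOPs and the Chinchilla rule of thumb of roughly 20 tokens per parameter; both constants are approximations, not exact laws:

```python
def compute_optimal_split(flops_budget, tokens_per_param=20.0):
    """Split a training FLOPs budget between parameters (N) and tokens (D)
    using C ~ 6*N*D and the Chinchilla heuristic D ~ 20*N."""
    n_params = (flops_budget / (6.0 * tokens_per_param)) ** 0.5
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Roughly Chinchilla's own budget: ~70B parameters and ~1.4T tokens.
n, d = compute_optimal_split(5.76e23)
print(f"params ~ {n / 1e9:.0f}B, tokens ~ {d / 1e9:.0f}B")
```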

2.3 Model tuning

2.3.1 Instruction tuning

Fine-tune the model on instruction-response pairs so that it learns to carry out the task described in the prompt.
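A sketch of what one training example might look like; the field names follow the common Alpaca-style convention and are illustrative, not prescribed by these notes:

```python
# A hypothetical instruction-tuning example pair.
example = {
    "instruction": "Translate the sentence into French.",
    "input": "The weather is nice today.",
    "output": "Il fait beau aujourd'hui.",
}

# Flatten into a single training string; the model is fine-tuned to
# produce the text after "Response:" given everything before it.
prompt = (
    f"Instruction: {example['instruction']}\n"
    f"Input: {example['input']}\n"
    f"Response: {example['output']}"
)
print(prompt)
```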

2.3.2 Overall training pipeline

pretraining -> fine-tuning -> reinforcement learning (RLHF)

 

(1) A small model that has been fine-tuned can beat a much larger model that has not been.

(2) Likewise, a small model trained with reinforcement learning (from human feedback) can beat a much larger model without it.

3. Breaking out of "big model, big data"

3.1 kNN-LM

Conventional approach: treat next-token prediction as a classification problem over the vocabulary.

kNN-LM:

(1) Encode the current context (the target) and the stored training contexts (the source) as vectors.

(2) Compute the distances between the target vector and the source vectors, and retrieve the next tokens recorded for the nearest neighbors.

(3) Combine the resulting kNN distribution with the conventional LM distribution by weighted interpolation (the red box in the original slide), as sketched below.
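A minimal sketch of the interpolation step, assuming the datastore of context vectors (`keys`) and their next tokens (`values`) has already been built; the softmax-over-negative-distances weighting is one common choice, not the only one:

```python
import numpy as np

def knn_lm_distribution(p_lm, query, keys, values, vocab_size, k=4, lam=0.3):
    """Blend a conventional LM distribution with a kNN distribution.
    keys: stored context vectors; values: their recorded next tokens."""
    # steps (1)-(2): distances between the target (query) and the sources
    dists = np.linalg.norm(keys - query, axis=1)
    nearest = np.argsort(dists)[:k]
    # turn negative distances of the k neighbors into a distribution
    weights = np.exp(-dists[nearest])
    weights /= weights.sum()
    p_knn = np.zeros(vocab_size)
    for idx, w in zip(nearest, weights):
        p_knn[values[idx]] += w
    # step (3): weighted interpolation with the conventional LM output
    return lam * p_knn + (1.0 - lam) * p_lm

vocab = 5
p_lm = np.full(vocab, 1 / vocab)            # dummy uniform LM distribution
keys = np.random.randn(10, 8)               # 10 stored context vectors
values = np.random.randint(0, vocab, 10)    # their recorded next tokens
print(knn_lm_distribution(p_lm, keys[0], keys, values, vocab))
```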

3.1.1 Disadvantages

Inference is too slow: every prediction requires a nearest-neighbor search over a very large datastore.

3.2 RETRO

Avoid relying on the model's memory by querying an external database instead (e.g., the digits of π do not need to be stored in the weights).
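RETRO itself retrieves text chunks and attends to them through a dedicated cross-attention mechanism; the sketch below only illustrates the underlying idea of fetching a fact from external storage instead of the model's weights, with a hypothetical toy database:

```python
# Deliberately simplified: the "database" and lookup key are toy
# placeholders, not part of RETRO's actual interface.
database = {
    "value of pi": "3.14159265358979",
    "speed of light": "299792458 m/s",
}

def answer_with_retrieval(question, lookup_key):
    # fetch the fact from external storage rather than from model memory
    fact = database.get(lookup_key, "<not found>")
    prompt = f"Context: {fact}\nQuestion: {question}\nAnswer:"
    return prompt  # a real system would pass this prompt to the LM

print(answer_with_retrieval("What is the value of pi?", "value of pi"))
```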


Origin blog.csdn.net/weixin_50862344/article/details/130083043