Introduction to CTR Estimation Models in Targeted Display Advertising

A platform reserves some of its slots for advertising. The platform cares about its overall benefit, users care about their experience, and advertisers need to deliver ads to the right audience to improve the conversion rate.

When ads are displayed to users, the most important step is ranking. The earlier ranking formula was ctr*Bid, where ctr is the ad's historical click-through rate and Bid is the advertiser's bid. Its drawbacks are the cold-start problem for ads and the lack of personalization: a newly launched ad has too few impressions, so its observed click-through rate is biased. One remedy is a smoothed click-through rate, which gives every ad default impression and click counts before it is served by adding constants to both: (C + alpha) / (I + alpha + beta), where C is the number of clicks, I is the number of impressions, and alpha and beta are the smoothing constants.
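As a minimal sketch of the smoothing formula above, the snippet below compares a brand-new ad, an ad with very few impressions, and a mature ad. The function name `smoothed_ctr` and the values alpha = 1 and beta = 19 (a default click-through rate of 5%) are assumptions chosen for illustration, not values from the article.

```python
def smoothed_ctr(clicks: int, impressions: int, alpha: float, beta: float) -> float:
    """Smoothed click-through rate: (C + alpha) / (I + alpha + beta)."""
    if clicks < 0 or impressions < 0 or clicks > impressions:
        raise ValueError("need 0 <= clicks <= impressions")
    return (clicks + alpha) / (impressions + alpha + beta)


# Hypothetical smoothing constants: alpha / (alpha + beta) = 5% default CTR.
ALPHA, BETA = 1.0, 19.0

new_ad = smoothed_ctr(0, 0, ALPHA, BETA)           # no data yet: falls back to the default
lucky_ad = smoothed_ctr(1, 2, ALPHA, BETA)         # raw CTR is 50% from only 2 impressions
mature_ad = smoothed_ctr(800, 10000, ALPHA, BETA)  # enough data: close to the raw 8%

for name, value in [("new", new_ad), ("lucky", lucky_ad), ("mature", mature_ad)]:
    print(f"{name}: {value:.4f}")
```

With few impressions the estimate stays near the default, and as impressions accumulate it converges to the observed rate, which is why smoothing reduces the bias for newly launched ads.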

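For ranking display ads, the formula commonly used in the industry is pCTR*Bid, where pCTR is the estimated click-through rate:

pCTR = p(click|ad,user)

The pCTR model is trained with logistic regression, which in its standard form is:

p(y = 1 | x) = 1 / (1 + exp(-w·x))

where x is the feature vector of the (ad, user) pair, w is the learned weight vector, and y is the label: 1 is click and 0 is no click.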
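The logistic regression model and the pCTR*Bid ranking described above can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the two features (ad quality and user-ad topic match), the toy data, the learning rate, and all function names are hypothetical, and a production system would normally use a library implementation and far more data.

```python
import math
import random


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_ctr(weights, bias, features):
    """pCTR = p(click | ad, user) under a logistic regression model."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train(samples, labels, lr=0.1, epochs=300, seed=0):
    """Fit weights by stochastic gradient descent on the log loss."""
    rng = random.Random(seed)
    weights, bias = [0.0] * len(samples[0]), 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            error = predict_ctr(weights, bias, samples[i]) - labels[i]
            weights = [w - lr * error * x for w, x in zip(weights, samples[i])]
            bias -= lr * error
    return weights, bias


def rank_ads(ads, weights, bias):
    """Sort candidate ads by pCTR * Bid, highest first."""
    scored = [
        (ad["id"], predict_ctr(weights, bias, ad["features"]) * ad["bid"])
        for ad in ads
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy training set. Hypothetical features: [ad quality, user-ad topic match].
samples = [[0.9, 0.8], [0.8, 0.9], [0.7, 0.6], [0.2, 0.1], [0.1, 0.3], [0.3, 0.2]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = click, 0 = no click
weights, bias = train(samples, labels)

candidates = [
    {"id": "ad_a", "features": [0.9, 0.9], "bid": 1.0},
    {"id": "ad_b", "features": [0.1, 0.1], "bid": 1.0},
    {"id": "ad_c", "features": [0.5, 0.5], "bid": 2.0},
]
ranking = rank_ads(candidates, weights, bias)
for ad_id, score in ranking:
    print(f"{ad_id}: pCTR*Bid = {score:.3f}")
```

Because the score multiplies pCTR by the bid, an ad with a moderate estimated click-through rate can still outrank a more clickable ad if its bid is high enough.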

 

[Figure: estimated click-through rate model for targeted advertising]


Candidate features for the click-through rate model fall into two groups: features of the ad itself and features that associate the ad with the user. Ad features include ad quality, historical click-through rate, popularity, and so on. The user's preference for ad features is treated as the feature that associates the ad with the user. Introducing these association features addresses the cold-start and sparsity problems of ads. If the ad's text description is used directly as its feature representation, and the user's preference is computed for every ad feature, the dimensionality is high and the computation is expensive. Therefore, the topics of the ad are extracted instead, and the user's preference for the ad's topics is used as the feature representation for the click-through rate estimate p(c|u,ad).
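One way to picture the topic-based association feature is sketched below. The article does not specify how the preference is computed, so the weighted sum over the ad's topic distribution, the function names, and all numbers here are assumptions made for illustration.

```python
def topic_preference(user_topic_pref, ad_topics):
    """Assumed form: sum over the ad's topics of p(topic|ad) * p(topic|u)."""
    return sum(
        weight * user_topic_pref.get(topic, 0.0)
        for topic, weight in ad_topics.items()
    )


def build_features(ad_quality, ad_smoothed_ctr, user_topic_pref, ad_topics):
    """Combine ad features with the user-ad association feature."""
    return [
        ad_quality,
        ad_smoothed_ctr,
        topic_preference(user_topic_pref, ad_topics),
    ]


# Hypothetical user topic preferences p(topic|u).
user_pref = {"sports": 0.6, "travel": 0.3, "finance": 0.1}

# A newly launched ad has no click history, but its topics can still be
# extracted from its text description (hypothetical distribution).
new_ad_topics = {"sports": 0.7, "travel": 0.3}

features = build_features(0.8, 0.05, user_pref, new_ad_topics)
print(features)
```

The point of the sketch is the cold-start behaviour: even with no clicks of its own, the new ad receives an informative association feature, and the feature vector has one dimension for topic preference instead of one per word of ad text.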

The user topic preference model is trained on the ads that users have clicked on the platform, combined with each user's profile tags u(f1,f2,f3,...). Training uses binary logistic regression to obtain the topic preference model p(topic|u), which is then combined with a user's profile tags to compute that user's preference for each topic. User tags have thousands of dimensions. To avoid overfitting and reduce model complexity, the tag features need to be filtered and reduced in dimension by selecting the top N tags that contribute the most to click page views (click PV).
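The top-N tag selection can be sketched as below. The article does not define how a tag's contribution to click PV is measured, so this sketch assumes the simplest reading: count the click page views made by users who carry each tag. The data, the function names, and that counting rule are all hypothetical; other criteria, such as information gain, could be substituted.

```python
from collections import Counter


def top_n_tags_by_click_pv(click_log, user_tags, n):
    """Return the n tags whose users account for the most click page views."""
    counts = Counter()
    for user_id in click_log:  # one entry per click page view
        for tag in user_tags.get(user_id, ()):
            counts[tag] += 1
    return [tag for tag, _ in counts.most_common(n)]


def encode_user(tags, selected_tags):
    """Binary feature vector restricted to the selected tags."""
    tag_set = set(tags)
    return [1 if tag in tag_set else 0 for tag in selected_tags]


# Hypothetical profile tags and click log.
user_tags = {
    "u1": ["male", "sports_fan"],
    "u2": ["sports_fan", "student"],
    "u3": ["student"],
}
click_log = ["u1", "u1", "u1", "u2", "u2", "u3", "u3", "u3", "u3"]

selected = top_n_tags_by_click_pv(click_log, user_tags, n=2)
print(selected)
print(encode_user(user_tags["u1"], selected))
```

Only the selected tags are kept as inputs to the topic preference model, which shrinks the feature space from thousands of tags to N.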

Once the candidate features are selected, the next focus is sample selection and feature preprocessing, which are covered in a separate chapter.
