Advertising algorithm

Background:

This competition problem comes from Lookalike, a real advertising product in Tencent's social advertising business.
The product's purpose is to find, within a very large population, other groups of people who resemble a target group.
In practice, Lookalike starts from an advertiser's existing consumers and finds potential consumers similar to them,
helping the advertiser reach new customers and expand its business. Tencent's Lookalike audience-expansion product
currently draws on the first-party data and advertising performance data supplied by advertisers (the seed package
crowd mentioned later), combined with Tencent's rich data labeling capabilities. Using deep neural network mining, it
expands high-quality potential customers with similar characteristics for multiple advertisers at once, online and in
real time.

Problem:

Lookalike takes a seed population (also called a seed package) provided by the advertiser and automatically computes
a population similar to it (called the expanded population). Contestants are given hundreds of seed populations, user
features for a massive set of candidate users, and advertisement features for each seed population. All data is
desensitized to protect sensitive private information. The dataset is split into a training set and a test set. In the
training set, each user in the candidate crowd is labeled as belonging to the seed package or not (positive and
negative samples, respectively). The test set checks whether a contestant's algorithm can correctly determine whether
each test user belongs to the corresponding seed package. The training set and the test set cover exactly the same
seed packages.

The data consists of four parts: the training set data file, the test set data file, the user feature file, and the
advertisement feature file.

Each line of the training set data file train.csv represents one training sample, with comma-separated fields in the
format "aid,uid,label". Here aid uniquely identifies an advertisement and uid uniquely identifies a user.
The label is +1 or -1, where +1 marks a seed user and -1 marks a non-seed user. To simplify the problem, each seed
package corresponds to exactly one advertisement aid, so the two are in a one-to-one relationship.
Each line of the test set data file test.csv represents one test sample, with comma-separated fields in the format
"aid,uid". The fields have the same meaning as in the training set.
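A minimal sketch of reading both sample files follows. It assumes the files may start with a header row naming the fields, which is not stated above; the `Sample` type, the `read_samples` function, and the sample rows are made up for illustration.

```python
import csv
import io
from typing import Iterable, Iterator, NamedTuple, Optional


class Sample(NamedTuple):
    aid: str
    uid: str
    label: Optional[int]  # +1 or -1 for train.csv, None for test.csv


def read_samples(lines: Iterable[str]) -> Iterator[Sample]:
    """Yield samples from comma-separated "aid,uid[,label]" lines."""
    for row in csv.reader(lines):
        # Skip blank lines and an optional header row (assumed, not confirmed).
        if not row or row[0] == "aid":
            continue
        label = int(row[2]) if len(row) > 2 else None
        yield Sample(row[0], row[1], label)


# Made-up rows for illustration only.
train_text = "aid,uid,label\n101,5001,1\n101,5002,-1\n"
test_text = "aid,uid\n101,5003\n"
train = list(read_samples(io.StringIO(train_text)))
test = list(read_samples(io.StringIO(test_text)))
```

The same function handles both files because the test file only lacks the third field; with real files, pass an open file object instead of `io.StringIO`.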

Each line of the user feature file userFeature.data holds the feature data of one user in the format
"uid|features", with uid and features separated by a vertical bar "|".
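Since the candidate pool is described as massive, it may help to stream userFeature.data one line at a time rather than load it whole. The sketch below splits each line at the first vertical bar only; the internal layout of the features part is not described here, so it is kept as a raw string. The function name and the sample lines are hypothetical.

```python
import io
from typing import Iterable, Iterator, Tuple


def iter_user_features(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (uid, raw_features) pairs from "uid|features" lines."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # ignore blank lines
        # Split at the first "|" only, in case the features part
        # contains further vertical bars (an assumption, not confirmed).
        uid, _, features = line.partition("|")
        yield uid, features


# Hypothetical sample lines; the real feature layout may differ.
sample = io.StringIO("5001|f1 3|f2 7 9\n5002|f1 4\n")
users = dict(iter_user_features(sample))
```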

The advertisement feature file adFeature.csv has the format
"aid,advertiserId,campaignId,creativeId,creativeSize,adCategoryId,productId,productType". Here aid uniquely
identifies an advertisement and the remaining fields are features of that advertisement; fields are separated by commas.
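Because every sample carries an aid, the advertisement features can be attached to a sample with a simple lookup. The following is a sketch under the assumption that adFeature.csv begins with a header row listing the field names above; `load_ad_features` and the sample values are invented for illustration.

```python
import csv
import io
from typing import Dict, Iterable

AD_FIELDS = [
    "aid", "advertiserId", "campaignId", "creativeId",
    "creativeSize", "adCategoryId", "productId", "productType",
]


def load_ad_features(lines: Iterable[str]) -> Dict[str, Dict[str, str]]:
    """Map each aid to its row of advertisement features."""
    reader = csv.DictReader(lines)  # first row is taken as the header (assumed)
    if reader.fieldnames != AD_FIELDS:
        raise ValueError(f"unexpected header: {reader.fieldnames}")
    return {row["aid"]: row for row in reader}


# Made-up values for illustration only.
ad_text = (
    "aid,advertiserId,campaignId,creativeId,creativeSize,"
    "adCategoryId,productId,productType\n"
    "101,7,70,700,35,2,0,6\n"
)
ads = load_ad_features(io.StringIO(ad_text))

# Attach the advertisement features to one (aid, uid) sample.
sample = {"aid": "101", "uid": "5001"}
joined = {**sample, **ads[sample["aid"]]}
```

All values are kept as strings here; whether to treat them as categorical IDs or as numbers is a modeling choice left open.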

 

Evaluation method:

Among the expanded similar users, a user who shows a relevant performance behavior (a click or a conversion) during
ad delivery counts as a positive example; a user who shows no such behavior counts as a negative example.
Each seed package to be evaluated comes with the following information: the advertisement aid corresponding to the
seed package and its features, and the corresponding candidate user set (uid and features). For each seed package, the
contestant computes a score for every user in the test set, and the competition computes the AUC of each seed package
from these scores. With AUCi denoting the AUC of the i-th seed package, the final evaluation metric is the average AUC
over all m seed packages being evaluated, that is, (AUC1 + AUC2 + ... + AUCm) / m.
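The metric can be reproduced offline on a labeled hold-out split. The sketch below is not the organizers' scoring code: it computes a rank-based AUC per aid (tied scores share an average rank) and then averages over aids. The function names and sample rows are made up, and it assumes every aid has at least one positive and one negative sample.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def auc(labels: List[int], scores: List[float]) -> float:
    """Rank-based AUC for +1/-1 labels; tied scores get their average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC is undefined without both classes")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def mean_auc(rows: Iterable[Tuple[str, int, float]]) -> float:
    """Average the per-aid AUC over (aid, label, score) rows."""
    labels: Dict[str, List[int]] = defaultdict(list)
    scores: Dict[str, List[float]] = defaultdict(list)
    for aid, label, score in rows:
        labels[aid].append(label)
        scores[aid].append(score)
    per_aid = [auc(labels[aid], scores[aid]) for aid in labels]
    return sum(per_aid) / len(per_aid)


# Made-up predictions for two seed packages.
rows = [
    ("a", 1, 0.9), ("a", -1, 0.1),
    ("b", 1, 0.2), ("b", -1, 0.8), ("b", 1, 0.9),
]
result = mean_auc(rows)
```

Because AUC depends only on the ordering of scores within one seed package, scores do not need to be comparable across different aids.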

 
