Algorithm masters gather again: here is your walkthrough, no thanks needed

Registration for the second Tencent Advertising Algorithm Competition opened recently. So far, applications have been received from more than 20 countries and regions, including China, the United States, Europe, and Australia, and many algorithm experts have joined this mentally demanding contest. To help contestants perform better in the rounds ahead, Tencent Social Ads presents a detailed analysis of this year's task along with "entry guides" from previous winners.


Throughout the competition, Tencent Social Advertising will keep sharing contestants' experience and practical material from technical experts. For more, follow the official subscription account of the Tencent Advertising Algorithm Competition (ID: TSA-Contest).



First-hand analysis of the competition task

The first university algorithm competition focused on estimating the conversion rate of mobile apps. This time the topic is Lookalike, chosen to help the industry improve overall advertising efficiency. Lookalike works by finding users who are similar to a given group of users.


Figure 1: Finding similar users for audience expansion based on seed users


Advertisers have long faced two main problems: high-potential users are hard to find, and precision is hard to balance against scale. The core goal remains reaching potential users effectively and at scale. As early as 2013, Tencent Social Advertising began researching Lookalike technology, designing it to find similar groups based on seed-user profiles and relationship chains. In other words, the system expands the audience automatically from the attributes the seed group has in common, widening coverage of potential users and improving advertising results.


Although Lookalike technology has been in development for years and has delivered good results, it can still benefit from newer techniques. For this competition, Tencent Social Ads has brought algorithm experts together and uses a simulated data package from its similar-audience expansion product as the competition data, abstracting the task of finding similar audiences into a machine learning problem.


Specifically, a seed package contains a batch of known seed users submitted by an advertiser; these serve as positive samples. The advertising platform also has a large number of non-seed users and a large amount of historical advertising data, which help generate negative samples. With positive and negative samples in hand, similar-audience expansion becomes a binary classification problem. In online use, the platform measures how similar each candidate user is to the users in the seed package using the posterior probability P(y|x) computed by the binary classifier, and returns the candidates with the highest similarity as the final result.
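The pipeline described above can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the competition's actual setup: the two-dimensional feature vectors, the user IDs, and the helper names (`train_logistic_regression`, `score`, `expand_audience`) are all hypothetical, and a small hand-written logistic regression stands in for whichever classifier a contestant would really use.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic_regression(samples, labels, epochs=200, lr=0.1, seed=0):
    # Plain stochastic gradient descent on the log loss.
    rng = random.Random(seed)
    weights = [0.0] * len(samples[0])
    bias = 0.0
    order = list(range(len(samples)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = samples[i], labels[i]
            p = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)
            err = p - y  # gradient of the log loss w.r.t. the logit
            weights = [w - lr * err * v for w, v in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def score(model, x):
    # Posterior P(y=1 | x): how seed-like the user with features x looks.
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


def expand_audience(model, candidates, top_k):
    # Rank candidate users by P(y=1 | x) and keep the top_k most similar.
    ranked = sorted(candidates.items(),
                    key=lambda kv: score(model, kv[1]),
                    reverse=True)
    return [user_id for user_id, _ in ranked[:top_k]]


# Toy, made-up data: seed users are positives, non-seed users are negatives.
seed_users = [[1.0, 0.9], [0.8, 1.0], [0.9, 0.7]]
non_seed_users = [[0.1, 0.2], [0.0, 0.3], [0.2, 0.1]]
samples = seed_users + non_seed_users
labels = [1] * len(seed_users) + [0] * len(non_seed_users)

model = train_logistic_regression(samples, labels)

candidates = {
    "u1": [0.95, 0.85],
    "u2": [0.05, 0.10],
    "u3": [0.60, 0.70],
    "u4": [0.20, 0.30],
}
expanded = expand_audience(model, candidates, top_k=2)
print(expanded)
```

In practice `top_k` would be set by the audience size the advertiser wants, which is where the trade-off between precision and scale shows up.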


In this binary classification problem, the choice of model and the choice of features are the two most important factors for good results. Common binary classification models such as SVM, FM, GBDT, LR, and NN are all worth trying.


The competition has three stages: the preliminary round, the semi-finals, and the defense. The semi-final data will be larger than the preliminary-round data, so when applying an existing binary classification algorithm, contestants need to consider its computational complexity and may need to re-implement a published algorithm to meet performance requirements. Contestants also need to put substantial effort into user feature engineering: at every step, including data cleaning, feature selection, and constructing new features, good results depend on choosing the operations most relevant to the Lookalike problem.
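As one possible illustration of constructing new features while keeping memory bounded on larger data, the sketch below hashes categorical fields into a fixed number of buckets and adds a crossed feature. The article does not prescribe this technique; the hashing trick is swapped in here as a common option, and the field names (`age_bucket`, `gender`, `ad_category`) and helper names are hypothetical.

```python
import zlib


def hash_index(field, value, num_buckets):
    # Map a "field=value" token to a stable bucket index.
    # zlib.crc32 is used because Python's built-in hash() of strings
    # changes between interpreter runs.
    token = f"{field}={value}".encode("utf-8")
    return zlib.crc32(token) % num_buckets


def hashed_features(record, cross_pairs=(), num_buckets=2 ** 18):
    # Return the sorted indices of the active binary features.
    indices = {hash_index(f, v, num_buckets) for f, v in record.items()}
    for left, right in cross_pairs:
        if left in record and right in record:
            # Crossed feature: one token for the combination of two fields.
            crossed = f"{record[left]}_{record[right]}"
            indices.add(hash_index(f"{left}x{right}", crossed, num_buckets))
    return sorted(indices)


# Hypothetical record joining user attributes with an ad attribute.
record = {"age_bucket": "25-34", "gender": "F", "ad_category": "games"}
features = hashed_features(record, cross_pairs=[("age_bucket", "ad_category")])
print(features)
```

The feature dimension stays fixed at `num_buckets` however many distinct values appear, at the cost of occasional collisions between unrelated tokens.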


Tips for clearing the competition

With the task clarified, contestants should push ahead boldly while avoiding the pitfalls along the way. Although the two competitions have different topics, the experience of competing carries over. Here are a few bonus Easter eggs: contestants from the first competition who received offers from the Goose Factory (Tencent) are here to pass on their tips for clearing the competition.


Zhang Jianmin, winner of the first algorithm competition


The first Easter egg comes from Zhang Jianmin, a talented Peking University contestant from the "Is it right?" team in the previous competition. Her team not only placed fourth overall but also won the Best Performance Award in the defense. Writing from the angle of self-improvement through competition, she shared her team's experience in the article "Past Players Share Experience: How to Improve Yourself in the Competition?". Starting from a deep understanding of the business logic, the team looked for key data and features, analyzed the strengths of different models on the data in order to improve the models they used, and integrated multiple models step by step to improve their results. Zhang Jianmin advises planning for large-scale data when designing the overall processing method and pipeline, so that processing stays fast when the data changes in the final stage.
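The article does not say how the team combined its models. As an assumption, the sketch below shows one common approach, a weighted average of each model's predicted probabilities; the model names, scores, and weights are made up for illustration.

```python
def blend_predictions(model_scores, weights=None):
    # Weighted average of per-model probability lists of equal length.
    names = list(model_scores)
    if weights is None:
        weights = {name: 1.0 for name in names}
    n = len(model_scores[names[0]])
    if any(len(model_scores[name]) != n for name in names):
        raise ValueError("all models must score the same number of users")
    total = sum(weights[name] for name in names)
    return [
        sum(weights[name] * model_scores[name][i] for name in names) / total
        for i in range(n)
    ]


# Hypothetical probabilities from two models for the same three users.
model_scores = {
    "lr": [0.2, 0.8, 0.6],
    "gbdt": [0.4, 0.6, 0.9],
}
blended = blend_predictions(model_scores, weights={"lr": 1.0, "gbdt": 3.0})
print(blended)
```

Weights like these would normally be chosen on held-out validation data rather than fixed by hand.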


Li Qiang, winner of the first algorithm competition


The second Easter egg comes from Li Qiang, an algorithm ace from Dalian University of Technology and a member of "Raymone", the runner-up team in the previous competition. He is already a prospective employee of Tencent Social Advertising and will soon join the Goose Factory.


For this competition, Li Qiang compiled the "Tencent Advertising Algorithm Competition New Guide" for newcomers, sharing problems he encountered in the competition and how he solved them. It covers substantial practical topics such as data set splitting, feature engineering, data scale, and model selection and fusion. A quiet highlight: the guide also stresses understanding the logic behind the actual business, doing your homework, and learning from the veterans!


As the ancients said, "When three walk together, one of them can be my teacher." The experience of the "Is it right?" and "Raymone" teams shows that progress comes from being good at learning: consulting reference material, asking experienced friends for advice, and exchanging ideas with other contestants. We look forward to this year's contestants carrying on this fine tradition and achieving great results!


