Big data and algorithm projects are popular at every large company; whether for interviews or real-world work, they are must-have skills. The editor selected material from more than 50 first-tier companies, including Ali, Baidu, Tencent, Byte, and Meituan, and compiled this 987-page collection of their core big data and algorithm experience!
Don't just bookmark it and let it gather dust! Be sure to read through it when you have time, and I wish you a promotion and a raise! This document summarizes content from more than 50 first-tier companies, so I won't show all of it here. Friends who need this PDF can forward this post, follow, and send a private message with "learning" to get it for free!
Big-Company Algorithms
Big data
Semantic Understanding Technology and Application Based on Knowledge Graph
Many challenges in multiple text forms and business scenarios
Baidu Chinese Error Correction Technology
1. Overview of error correction
Language is complicated. Each language has evolved over hundreds or even thousands of years, forming a complex set of grammar and syntax rules. These rules are intricate and changeable. For example, some words or phrases have multiple pronunciations, meanings, and uses, which places higher demands on language users; once users lack a sufficient grasp of the language or are careless, it is very easy to make mistakes such as improper word choice. Although these errors may seem trivial, as the saying goes, "a miss by an inch is a miss by a mile"; especially in certain scenarios (such as diplomatic occasions), even a small language error can have a very bad impact.
Common tasks in natural language processing include lexical analysis, syntactic analysis, and semantic computation. For these tasks to achieve good results, accurate input data is the basic premise. From the overall technical perspective of NLP, text error correction therefore plays a safeguarding role.
Project Objectives
- Broad coverage: multiple error types, including typos, extra words, missing words, and word-order errors
- Multi-modality: error correction for text, voice, and other input forms
- Scene migration: fast, flexible, configurable deep customization
Tencent Information Flow Content Understanding Technology Practice
Project Background
1 Evolution of content understanding technology
① Portal era: 1995~2002. Representative companies: Yahoo, Netease, Sohu, Tencent. In the early days of the Internet there was little data, so a place that aggregated content was needed for people to find information quickly. Portals therefore organized content by "content type" and met user needs through channel pages. Because there was so little data, news was initially sorted manually. As the volume of data grew, manual classification became impractical, so the major companies introduced classification technology to automate text classification. Since then, text classification technology has developed rapidly.
RALM: Application of Real-time Look-alike Algorithm in WeChat Take a Look
Introduction: This talk covers a paper published at KDD 2019 by the WeChat Take a Look team. The long-tail problem is a classic problem in recommender systems, but the currently popular click-through rate estimation methods cannot fundamentally solve it. Building on the look-alike method, the paper designs a real-time look-alike framework for the WeChat Take a Look scenario, which both addresses the long-tail problem and meets the high timeliness requirements of information recommendation.
Core requirements
Real time
- New item distribution without retraining the model
- Real-time completion of seed user expansion
Efficient
- Strengthen the distribution of long-tail content while maintaining CTR
- Learn more accurate and diverse user representations
Quick
- Streamline prediction calculation
- Meet online latency requirements
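The idea of seed user expansion can be sketched in a few lines. This is only a minimal illustration of a generic look-alike step, not the RALM model itself, whose internals are not described here. It assumes user embeddings are already available, represents the seed audience of an item by the plain average of its seed users' embeddings, and ranks candidates by cosine similarity. The function names, the toy embeddings, and the averaging choice are all assumptions made for illustration.

```python
import math


def mean_vector(vectors):
    """Average a non-empty list of equal-length embeddings."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def expand_seed_users(seed_embeddings, candidate_embeddings, top_k):
    """Rank candidate users by similarity to the seed-user centroid.

    seed_embeddings: list of embeddings of users who engaged with the item.
    candidate_embeddings: dict mapping user id -> embedding.
    Returns the top_k (user_id, score) pairs, highest score first.
    """
    centroid = mean_vector(seed_embeddings)
    scored = [(uid, cosine(centroid, emb))
              for uid, emb in candidate_embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy data (hypothetical): two seed users and three candidates.
seeds = [[1.0, 0.0], [0.8, 0.2]]
candidates = {"u1": [0.9, 0.1], "u2": [0.0, 1.0], "u3": [0.5, 0.5]}
audience = expand_seed_users(seeds, candidates, top_k=2)
print(audience)
```

Because the seed representation is recomputed from existing embeddings whenever the seed set changes, a new item can be scored without retraining any model, which is the property the real-time requirement asks for.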
Advertising Algorithms in Practice for Ali Entertainment User Growth
Introduction: Since 2019, Youku has used DSP to place video ads on platforms such as Toutiao and Alimama to achieve steady user growth. We combine the user growth field with the advertising bidding field, borrow from practice in the recommendation field, and have developed a series of algorithms based on our own business background. With cost and budget kept under control, we ultimately gained the ability to bring in millions of DAU. This article mainly introduces the design and optimization of externally placed advertising algorithms for user growth, which address the problem of maximizing DAU under constraints.
The following will expand around four points:
- Youku User Growth Business Introduction
- Advertising ranking algorithm and optimization
- Automated bidding algorithm
- Summary and follow-up planning
Application of Content Understanding in Sina Weibo Advertising
Introduction: People who work on algorithms often say that "data is king", while for people who work in advertising, content understanding is the foundation. This talk introduces the role of content understanding in Weibo advertising. The main contents include:
- Introduction to Commercialization of Advertising System and Weibo Content
- Problems caused by insufficient content understanding
- Build content understanding and specific business applications
Long-term interest modeling in Alimama's click-through rate estimation
Progress in dynamic style modeling and feature expression learning for Ali CTR estimation
JD e-commerce recommendation system practice
This document summarizes content from more than 50 top-tier companies (Ali, JD, Baidu, Tencent, Meituan, and others), so I won't show all of it here. Friends who need this PDF can forward this post, follow, and send a private message with "learning" to get it for free!