Baidu Search Innovation Competition, a technology carnival for 2,800 people

Introduction 

This is a technology competition built around the theme of "innovation". Over four months, 2,800 people took part, and five tracks opened up a broad range of AI application scenarios. Students from 95 Project 985/211 universities gathered to learn, exchange ideas, challenge one another and grow. In this rapidly changing era, young people are engaging with cutting-edge technology and driving a wave of AI innovation in their own way.

On December 1, the 2nd Baidu Search Innovation Competition, themed "New Search·Novelty", came to a successful conclusion. The competition had five tracks, with topics covering semantic retrieval, multi-modality, joint software-hardware optimization and other core search business scenarios. It encouraged contestants to identify needs in search scenarios and solve problems through innovation in AI and methods, and it attracted widespread attention. This article introduces the champions of the five tracks and their work.

Search and young talent meet halfway

The competition ran for four months and attracted more than 2,800 registrants from 45 provinces, cities and overseas locations. 81% of the contestants were university students, and more than half were graduate students. The organizing committee received more than 1,600 resumes, mainly from applicants working in machine/deep learning and AI product innovation, specializations that closely match the talent needs of search.

Meanwhile, the organizing committee ran nearly 20 online and offline training activities during the competition. More than 50,000 students took part directly in the related courses, and the event materials and courses reached a community of 1 million developers. The competition supports talent development and skill building in retrieval and artificial intelligence, and further stimulates students' enthusiasm and motivation.

Contestants went through fierce preliminary rounds, semi-finals and finals. The review team weighed technical depth, innovation, application value and other dimensions, and 28 teams ultimately won awards. Xiao Yang, vice president of Baidu Group and head of the search platform, said at the award ceremony: "The wave of large language models has just begun, and the innovations it triggers will definitely accelerate the evolution of search engines. Through the Search Innovation Competition, we want to fully open up search, one of the largest AI application scenarios, so that the ingenuity of more young people can meet search."

1 How can search engines improve user satisfaction? Track 1 "Search Answer Organization" gave the answer

Competition question

How can a generative model be used to organize the multiple results a search engine returns for a user query into a single answer that is correct, rich, fluent and fully meets the user's needs, and thereby improve users' final satisfaction with the search engine?

A total of 719 people registered for this track and 220 entries were submitted. The winner was a student team from the Institute of Computing Technology, Chinese Academy of Sciences. The team fine-tuned an LLM with LoRA, added selected public question-answering data to the training set, used distillation from a large model to improve learning, and applied NEFTune noisy embeddings to make the model more robust. These measures brought the results on the test data closer to user needs. For each technical choice the team gave a reasonable motivation, solid analysis and credible conclusions, reflecting an in-depth understanding of the search answer organization problem and strong research ability.
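
To make the noisy-embedding idea above concrete, here is a minimal sketch of NEFTune-style noise in plain Python. It is an illustration under stated assumptions, not the team's code: the function name `neftune_noise`, the list-of-lists embedding layout and the default `alpha` are hypothetical. The one thing taken from the NEFTune technique itself is the scaling rule: uniform noise in [-1, 1] is multiplied by alpha / sqrt(L * d), where L is the sequence length and d the embedding dimension, and it is applied during training only.

```python
import math
import random


def neftune_noise(embeddings, alpha=5.0, rng=None):
    """Return a noisy copy of `embeddings` (a seq_len x dim list of lists).

    Each value receives uniform noise in [-1, 1] scaled by
    alpha / sqrt(seq_len * dim). Intended for training time only.
    """
    rng = rng or random.Random()
    seq_len = len(embeddings)
    dim = len(embeddings[0])
    scale = alpha / math.sqrt(seq_len * dim)
    return [
        [value + rng.uniform(-1.0, 1.0) * scale for value in row]
        for row in embeddings
    ]


if __name__ == "__main__":
    token_embeddings = [[0.0] * 4 for _ in range(4)]  # toy 4 x 4 input
    noisy = neftune_noise(token_embeddings, alpha=5.0, rng=random.Random(0))
    print(noisy[0])
```

Because the scale shrinks as the sequence grows longer or the embedding grows wider, the overall magnitude of the perturbation stays roughly comparable across inputs of different sizes.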

Li Yiming, the representative of the championship team, said in an interview: "'What is learned from books is always shallow; to truly understand something, you must practice it yourself.' Through this competition, I went from knowing only a little about NLP to being able to debug and optimize a large model step by step, and throughout that process I experienced the joy of gaining knowledge and improving my skills. Offline exchange and presentation opportunities such as the Baidu Search Innovation Competition not only let us apply what we have learned to real problems in the industry, but also help us gain a deeper understanding of our future career paths."

In fact, this is the original intention of the Baidu Search Innovation Competition: to let every contestant recognize their own strengths under fierce competition, produce distinctive results of their own through practice, and form plans for their future development.

2 Track 2 focuses on “TopK search based on vector intersection” and seeks innovation in classic problems

Competition question

Given a doc data set and a query, find the TopK docs in the full doc set ranked by the number of elements they share with the query.

A total of 549 people registered for this track and 113 entries were submitted. The champion is a stay-at-home dad who graduated from Wuhan University. He has temporarily left the workforce but has kept following developments in the industry. In this competition, his scores in both the machine test and the defense were far ahead of the field. His solution, built around the requirements of the problem, uses multi-threaded, multi-stream parallelism and batch optimization to solve the problem of low GPU utilization. It also introduces an efficient bitset approach for counting vector intersections, which further improves GPU computing efficiency. In addition, he proposed an iterative, threshold-based method for finding the TopK that reduces the amount of computation by narrowing the search range, ultimately achieving a 23-fold performance improvement.
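
The two ideas named above, counting intersections with bitsets and using a threshold to skip work, can be sketched on the CPU in a few lines of Python. This is only an illustration and not the champion's GPU implementation: the function names are hypothetical, ids within a doc are assumed to be unique non-negative integers, and the pruning rule (skip a doc whose length cannot beat the current k-th best count) is one simple way to use a threshold, not necessarily the iterative scheme used in the competition.

```python
import heapq


def to_bitset(ids):
    """Pack non-negative integer ids into one Python int used as a bitset."""
    bits = 0
    for i in ids:
        bits |= 1 << i
    return bits


def intersection_count(a, b):
    """Number of ids present in both bitsets (popcount of the bitwise AND)."""
    return bin(a & b).count("1")


def topk_by_intersection(query_ids, docs, k):
    """Return up to k (doc_index, count) pairs, highest intersection first.

    Ties are broken in favour of the lower doc index.
    """
    if k <= 0:
        return []
    query_bits = to_bitset(query_ids)
    heap = []  # min-heap of (count, -doc_index); heap[0] holds the threshold
    for idx, doc in enumerate(docs):
        # Threshold pruning: with unique ids, a doc cannot share more
        # elements than it contains, so it cannot beat the k-th best count.
        if len(heap) == k and len(doc) <= heap[0][0]:
            continue
        item = (intersection_count(query_bits, to_bitset(doc)), -idx)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    ranked = sorted(heap, key=lambda t: (-t[0], -t[1]))
    return [(-neg_idx, count) for count, neg_idx in ranked]


if __name__ == "__main__":
    docs = [[1, 2, 3], [2, 3, 4, 5], [7, 8], [3, 4, 5, 6]]
    print(topk_by_intersection([3, 4, 5], docs, k=2))
```

On a GPU the same counting step maps naturally onto a bitwise AND followed by a hardware population count over fixed-width words, which is why a bitset layout tends to suit this problem.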

Champion Chen Xi said in an interview: "The final performance improvement was not achieved overnight; it came from accumulating many very small optimizations. Across the industry, there are very few competitions oriented towards engineering optimization, so it is really rare for Baidu Search to provide such a platform. At this critical moment when large AI models are taking off, search technology has also reached a turning point for innovation. The competition topics outline the direction in which search technology is developing. Let us work together to promote the prosperity and development of the industry."

Participating in the competition is just an experience, but the series of issues involved and the results achieved are worth remembering.

3 Track 3 "Design an AI native application that solves the needs of search users", the inherent logic of technology creating value has gradually emerged

Competition question

Drawing on thorough research into search users, contestants identify users' needs in search scenarios and combine AI capabilities to build AI applications that directly and effectively address user pain points and needs.

A total of 530 people signed up for this track and 83 entries were submitted. The championship team includes students from Nanjing University of Aeronautics and Astronautics, China University of Petroleum and other institutions, as well as developers from outside academia. Its members, ranging from a product manager, an NLP master's graduate and a prototype designer to front-end and back-end engineers, are all versatile people with one core specialty and several additional skills. The team's work, the "AI Resume Assistant", does well at uncovering and understanding user needs and covers candidates' end-to-end needs in recruitment scenarios. The team's strong execution also ensured an eye-catching final result. During the defense, the team fully demonstrated its thinking, innovation, implementation and evaluation, and won unanimous praise from the judges.

Li Kechen, the representative of the champion team, said in an interview: "Through this competition, we now have more specific goals and directions for our future career planning. During the competition, our product research gave us an in-depth understanding of cutting-edge developments in artificial intelligence. At the same time, using the Baidu Lingjing platform gave us a deeper understanding of the practical applications of AI and LLMs, and sparked a strong interest in research and development work in this field. In the future, we will keep studying in depth and in breadth. We hope to continue learning and growing in machine learning, data science or algorithm development, and we also hope to have the opportunity to join Baidu Search."

Drawing on contestants' thinking and technology to generate new ideas, promote technological innovation, and keep updating and improving search methods and technologies so that they better serve the needs of users and society: this is the mission of the Baidu Search Innovation Competition, and it is also the direction Baidu Search is committed to exploring.

4 Track 4 "GPU-based approximate nearest neighbor search algorithm challenge" to improve the efficiency and accuracy of the search algorithm

Competition question

Given a billion-scale data set and a test set, contestants design their own approximate nearest neighbor search algorithm that returns, for each query, the TopK most similar samples in the data set. The organizers provide a unified virtual environment and benchmark framework, and QPS-recall is the only evaluation metric for the algorithm.
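
As a rough illustration of the two quantities behind a QPS-recall metric, the sketch below computes recall@k against exact ground truth and measures queries per second for an arbitrary search function. It makes several assumptions: the helper names `recall_at_k` and `measure_qps` are hypothetical, the official benchmark framework is not reproduced, and the article does not state how the competition combined the two numbers into a score.

```python
import time


def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of true top-k neighbours that the approximate search returned.

    approx_ids and exact_ids hold one list of neighbour ids per query.
    """
    hits = sum(
        len(set(approx[:k]) & set(exact[:k]))
        for approx, exact in zip(approx_ids, exact_ids)
    )
    return hits / (k * len(exact_ids))


def measure_qps(search_fn, queries):
    """Run search_fn on every query; return (results, queries per second)."""
    start = time.perf_counter()
    results = [search_fn(q) for q in queries]
    elapsed = max(time.perf_counter() - start, 1e-9)  # avoid division by zero
    return results, len(queries) / elapsed


if __name__ == "__main__":
    exact = [[1, 2, 9], [4, 5, 6]]
    approx = [[1, 2, 3], [4, 5, 6]]
    print(recall_at_k(approx, exact, k=3))
    _, qps = measure_qps(lambda q: sorted(q)[:3], [[3, 1, 2], [9, 8, 7]])
    print(qps > 0)
```

The two numbers pull against each other: an index that inspects fewer candidates answers more queries per second but misses more true neighbours, which is why approximate nearest neighbor results are usually reported as a QPS-versus-recall trade-off.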

A total of 273 people registered for this track and 30 entries were submitted. The championship team comes from the Knowledge Graph Laboratory of Hangzhou Dianzi University. Team members have won several awards in major programming competitions at home and abroad, and have co-authored several papers on vector retrieval at top international conferences such as VLDB and NeurIPS. During the competition, the team optimized the algorithm with pipelining and reached 1.5 times the baseline score early on, placing at the top of the rankings, but they did not stop there. To secure the championship, they kept pushing the limits of the algorithm and, in the middle and late stages of the competition, identified its bandwidth bottleneck. They then doubled its performance again through model index compression, reaching 3 times the baseline score and winning the championship of this track.

The champion representative said in an interview: "The competition is an experience, and experience brings rewards. Through this Baidu Search Competition, we not only improved our teamwork skills, but also exercised our spirit of never giving up."

Of course, this is also one of the purposes of the Baidu Search Innovation Competition, to provide opportunities and platforms for every young person with ideas.

5 Can AI create works of art that match your mood? Take on the "Controllable Image Generation Algorithm" in Track 5!

Competition question

With the text-to-image task at its core and building on the diffusion framework, contestants optimize their own generation model through training methods and prompt engineering.

A total of 390 people registered for this track and 50 entries were submitted. The championship team comes from Beijing Institute of Technology and consists mainly of two doctoral and three master's-level members. Their goal was to fully understand user needs and produce images that are relevant, beautiful, clear and innovative. The team combined several methods. First, they built their own large-scale data set by crawling and mining data, then cleaning, annotating, aligning and augmenting it, and they also thoroughly cleaned Baidu's official data set. On this basis, they mixed multiple LoRA models and obtained preliminary results. They then reused their self-collected data for further training and integrated the result with the multiple LoRA models. After repeated exploration, controlled-variable experiments and careful analysis of causes, they won first place with a fivefold improvement over the base model.
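
One common way to mix several LoRA models is to add each adapter's low-rank update to the base weight with its own mixing weight, i.e. W' = W + sum_i w_i * (B_i A_i). The sketch below shows that arithmetic on tiny matrices in plain Python. It is an assumption about what "mixing" can look like, not a description of the team's pipeline: the function names are hypothetical, and LoRA's usual alpha / r scaling factor is assumed to be folded into each mixing weight.

```python
def matmul(a, b):
    """Multiply matrix a (m x r) by matrix b (r x n), both lists of lists."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def merge_loras(base, adapters):
    """Return base + sum of weight * (B @ A) over all adapters.

    adapters is a list of (weight, B, A) tuples, where B is out_dim x rank
    and A is rank x in_dim. The base matrix is not modified.
    """
    merged = [row[:] for row in base]
    for weight, lora_b, lora_a in adapters:
        delta = matmul(lora_b, lora_a)  # low-rank update of this adapter
        for i, row in enumerate(delta):
            for j, value in enumerate(row):
                merged[i][j] += weight * value
    return merged


if __name__ == "__main__":
    base = [[1.0, 0.0], [0.0, 1.0]]
    style_adapter = (0.5, [[1.0], [0.0]], [[0.0, 2.0]])    # rank-1 toy adapter
    detail_adapter = (0.25, [[0.0], [1.0]], [[4.0, 0.0]])  # rank-1 toy adapter
    print(merge_loras(base, [style_adapter, detail_adapter]))
```

Adjusting the per-adapter weights is then a cheap way to trade off what each adapter contributes without retraining the base model.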

The champion representative said in an interview: "Through the competition, we deeply felt the importance of teamwork. Through continuous brainstorming, we gradually achieved an effect where one plus one is greater than two. At the same time, we realized that although many open-source models are available, our thinking must be innovative and cannot stop there. We should be realistic, pragmatic and down-to-earth, and implement our own algorithms step by step to reach our goals. Change happens all the time. The theme of this competition, 'New Search·Novelty', emphasizes the continuous development and innovation of search technology, and also reflects the continuous change in what people need from search and how they search."

Achieve excellence with outstanding AI talents

The Baidu Search Innovation Competition is the professional search competition with the largest coverage, the widest influence, the most achievements and the highest standards in China. It is known as the "Olympics of the search industry", but it is more than just a competition. The competition is a starting point: we hope to find the best AI innovations, embrace inspiration together with young people, and pursue ideals alongside them. The competition is also a platform: we hope to exchange ideas with young talent and innovation teams across fields and disciplines, and to inject new vitality into our technological DNA. In this process, we will strengthen the evangelism of search product technology, step up the tracking and support of outstanding entries, and broaden channels and provide assistance for turning innovative results into practice.

The trend of AI innovation has been set off. Achieving excellence together with outstanding talents is the value of the competition.

——END——

Origin my.oschina.net/u/4939618/blog/10322674