Baidu: China's first large model data labeling base opens in Haikou

According to an official Baidu announcement, the Baidu Smart Cloud (Haikou) Artificial Intelligence Basic Data Industry Base, located in Xiuying District, Haikou City, has officially begun operation. It is China's first large model data labeling center, jointly built by Baidu Smart Cloud and the Haikou Municipal Government.

Baidu Smart Cloud said that it has worked with local governments across the country to build more than ten data labeling bases, creating more than 11,000 stable local jobs and indirectly supporting 50,000 jobs.

To ensure the quality of data labeling, Baidu Smart Cloud has also built a talent pipeline covering the full data service process. The Haikou data labeling base now has hundreds of full-time large model data labelers, all of whom hold at least a bachelor's degree.

"Unlike traditional data annotators, large model annotators are required to have a bachelor's degree or above. I think the main reason is that large model data covers a wide range of knowledge and the evaluation criteria are complex, which tests the annotator's language comprehension and logical reasoning ability. In the first two months after joining, the company gives us group training and assessment, and we are formally employed after passing the assessment," said Wang Jieyu, a large model data labeler at Baidu Smart Cloud.

At present, large models are in the early stage of industrialization, and high-quality data is a key element of that process. For generative AI represented by ChatGPT and Wenxin Yiyan, training on massive data, manual labeling, instruction fine-tuning, and reinforcement learning from human feedback (RLHF) can continuously align large models with human values and ways of thinking, making large AI models more usable.

Origin www.oschina.net/news/255509