The Heart of the Machine Editors: Du Wei, Dan Jiang
Compared with LLaMA, many more Chinese scholars took part in the development of Llama 2.
The recent open-sourcing of Llama 2 prompted Yann LeCun and others in the industry to say outright that "the landscape of large models has changed dramatically."
Beyond open-sourcing the model, Meta also announced that Llama 2 is free for commercial use! Llama 2 comes in 7 billion, 13 billion and 70 billion parameter versions, and it was trained on 40% more data than Llama 1, reaching 2 trillion tokens. The fine-tuned Chat model is trained on 1 million human-labeled examples.
According to the results, Llama 2 outperforms other open-source language models on many external benchmarks, including reasoning, coding, proficiency and knowledge tests. It even approaches GPT-3.5 on some datasets, and it matches or exceeds Google's PaLM (540B).
While people marvel at Meta's commitment to open source, attention has also turned to the Llama 2 development team. Notably, several core authors of LLaMA, such as Gautier Izacard, Armand Joulin, Edouard Grave, Guillaume Lample and Timothee Lacroix, no longer appear among the authors of Llama 2.
Llama 2 Technical Report: https://ai.meta.com/research/publications/llama-2-open-foundation-and-fine-tuned-chat-models/
Besides the changes among the core authors, more than 10 of the nearly 70 authors of Llama 2 are Chinese scholars.
Heart of the Machine has compiled the following list of Chinese scholars who took part in the research and development of Llama 2. If there are any errors or omissions, please point them out in the comments.
Moya Chen
Moya Chen is a large language model (LLM) research engineer at Meta who temporarily left in July. Since joining in 2015, she has worked on Platform/Business Reputation, Computational Camera (CV, AR), WorldXR (CV, AR, and XR), FAIR Labs (chatbots), and FAIR/GenAI (LLM).
She holds a BA in Computer Science from Caltech.
Jeremy Fu
Jeremy Fu is now a research engineer at FAIR, where he focuses mainly on large language models. He previously worked on content understanding and user modeling at Instagram Machine Learning. He has worked full-time at Meta since January 2021.
He holds a Bachelor of Computer Science and Business from the University of New South Wales, Sydney.
Wenyin Fu
Wenyin Fu is now a Data Center ML Performance Engineer at Meta, mainly working on the design and optimization of large-scale ML platform solution deployments and on evaluating data center hardware solutions for optimal capacity ROI. He joined Meta in May 2019 and previously worked at Nvidia, AMD and Intel.
He holds a bachelor's degree in electrical and electronic engineering from Shanghai Jiao Tong University and a Ph.D. in electrical and computer engineering from the University of Wisconsin-Madison.
Cynthia Gao
Cynthia Gao is currently a project manager in Meta's product data operations department, mainly working on human data annotation and collection projects for machine translation and large generative AI models. She previously worked in several departments, including FAIR.
She has studied at Beijing Normal University, UC Davis (BA, Psychology and Chinese Language and Culture) and Monterey Institute of International Studies (MA, Translation and Localization Management).
Rui Hou
Rui Hou is now a research scientist at Meta GenAI, focusing on generative AI technology and related production applications. He joined Meta in April 2020, having previously interned at organizations such as the Toyota Research Institute.
He holds a bachelor's degree from Tongji University, and a master's degree (a dual degree in intelligent systems and computer science) and a doctorate (intelligent systems) from the University of Michigan.
Google Scholar: https://scholar.google.com/citations?user=PKHKqX0AAAAJ&hl=en
Yinghai Lu
Yinghai Lu is now a chief software engineer at Meta and the head of AI inference technology in the Meta infra group, where he currently works on deploying generative AI inference. He joined Meta in 2016 and has led the deployment of GPU inference for Ads and Reels recommendation models.
He studied Electrical Engineering at Tongji University and holds a Ph.D. in Electrical Engineering from Fudan University.
Google Scholar: https://scholar.google.com/citations?user=prBXsm8AAAAJ&hl=zh-CN
Yuning Mao
Yuning Mao is currently a research scientist at Meta GenAI. He holds a bachelor's degree from the IEEE Honors Class of Shanghai Jiao Tong University and a Ph.D. in Computer Science from the University of Illinois at Urbana-Champaign, where he was supervised by Professor Jiawei Han.
The goal of his research is to help people access information and knowledge more effectively and efficiently. To that end, he has worked on a wide range of research topics, such as text summarization and generation, question answering, parameter-efficient fine-tuning, and taxonomy construction. Most recently, he has been involved in the development of the Meta LLaMA model series, especially with respect to safety for large models.
Personal homepage: https://morningmoni.github.io/
Yixin Nie
Yixin Nie is now a research scientist at Meta AI. He holds a bachelor's degree from China University of Geosciences, a master's degree from the University of Chicago, and a Ph.D. in computer science from the University of North Carolina at Chapel Hill.
His work focuses on machine learning and natural language processing, and his research interests stem from the idea of machine natural language acquisition.
Personal homepage: https://easonnie.github.io/
Xiaoqing Ellen Tan
Xiaoqing Ellen Tan is currently a data science researcher at Meta AI. She received her BS in Pharmacy and Computer Science from Sun Yat-sen University in 2018, was a visiting computer science student at Carnegie Mellon University from 2019 to 2021, and received her PhD in Biostatistics from the University of Pittsburgh in 2022.
Her research interests lie in the development of novel statistical and machine learning methods in the areas of causal inference, data integration, and decision fairness.
Personal homepage: https://ellenxtan.github.io/
Puxin Xu
Puxin Xu is now a senior data engineer at Meta AI, mainly working on multimodal datasets (text, image and video) and pre-training data for large models. He received his undergraduate degree (double major in human resources and urban and rural planning management, statistics) from Sun Yat-sen University and his master's degree (industrial and systems engineering) from Lehigh University.
Zheng Yan
Zheng Yan is now a software engineer at Meta, using AI to solve problems encountered by the account access team. He previously worked as a data analyst at the Sean N. Parker Center for Allergy & Asthma Research at Stanford University. He holds an undergraduate degree in Computer Science from Stanford University.
Yuchen Zhang
Yuchen Zhang is now a software engineer (machine learning)/research engineer at Meta AI, working on the training and scaling of large (language and multimodal) models and on responsible AI research for large models. She holds a bachelor's degree from Emory University and a master's degree in engineering and data science from the University of Pennsylvania.
Personal homepage: https://zycalice.github.io/
Angela Fan
Angela Fan is a research scientist at Meta AI Research Paris, focusing on machine translation. Previously, she studied text generation for her Ph.D. at INRIA in Nancy and FAIR in Paris. Before that, she was a research engineer and earned a BA in Statistics from Harvard University.
Personal homepage: https://ai.meta.com/people/angela-fan/
Reference link:
https://www.36kr.com/p/2176578148315396