Meng Wei, Chairman of the LF AI & Data Foundation: Open source and commercialization of large models are still in a gray area

[CSDN editor's note] The accelerating development of large models has brought more shared problems to the surface. What path should the commercialization of open source large models take? What kind of open source license could win consensus and help move the field forward? Meng Wei, who has worked in standards and open source for more than ten years and now serves as chairman of the board of directors of LF AI & Data, is eager to help everyone find the answers.

Interviewee: Meng Wei, Director of Open Source Strategy, ZTE Corporation, Chairman of the Board of Directors of LF AI & Data

Meng Wei is Director of Open Source Strategy at ZTE Corporation. Since 2016 he has led ZTE's pre-research work in artificial intelligence, focusing on the intersection of AI and 5G. In 2018 he was elected chairman of WG3 (the group on machine learning applied to 5G network architecture) of ITU-T ML5G, under the United Nations' International Telecommunication Union. He has also served as deputy leader of the overall group of the China Artificial Intelligence Industry Development Alliance and as a member of the board of directors of the LF AI & Data Foundation, and in 2023 he was elected chairman of that board. He has served as a PTL in the Linux Foundation's ODL (OpenDaylight) project and as mentor of the Adlik project, and has initiated and published a number of international standards in the IETF and ITU-T covering artificial intelligence and network function virtualization. He also holds more than 30 granted Chinese and international patents as first author.

By the year he was elected chairman of the LF AI & Data Foundation, Meng Wei had been working in standards and open source for more than ten years.

In March 2018, the Linux Foundation established a sub-foundation, the LF AI & Data Foundation (formerly the Deep Learning Foundation). As one of its ten founding members, ZTE has spared no effort in promoting the development of the AI ecosystem. In 2019, ZTE incubated Adlik, an open source inference-side toolchain project, in the foundation, attracting dozens of companies from China and abroad to help build the community ecosystem.

Meng Wei's relationship with open source began even earlier. In college he started using Linux servers and open source tools to build private VPN and NAS systems. In 2015 he formally began contributing to the community, completing the shift from using open source to contributing to it. Now he calls himself a "volunteer of the foundation", focusing on issues such as open source and commercialization of large AI models, building an AI ecosystem, and helping more developers embrace open source.

In this exclusive interview, Meng Wei, the new chairman of the board of directors of the LF AI & Data Foundation, is invited to share his story of growing up with open source and AI.

CSDN: You were elected chairman of the LF AI & Data Foundation in June this year. What interesting stories can you share from the campaign?

Meng Wei: On the board of directors of LF AI & Data, the share of Chinese members is higher than on the boards of other global foundations: roughly half are Chinese and half are from other countries. The campaign itself may not have been very interesting, and everyone was quiet throughout the voting. All votes are anonymous, and only the final result is announced. The election process is fairly clear. First, the secretary of the foundation announces the nomination process. One week after the nomination deadline, a completely anonymous vote begins. Votes must be sent directly to the secretary's mailbox, so nominees do not know who voted for or against them.

The voting lasts about a week, and the votes are tallied and announced one or two days after voting closes. In addition, the election must follow one principle: the number of people who vote must exceed half of those eligible to vote for the result to be valid. If fewer than half vote, it means people are not engaged with the matter, so the vote is invalid. During that period, the foundation staff and I kept in active contact with the board members and encouraged everyone to take part in the vote. As I understand it, the role of the foundation's chairman is more like that of a volunteer. I am willing to serve everyone and the open source community.
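The validity rule described above can be sketched in a few lines. This is only an illustration of the "more than half must vote" principle as stated in the interview; the function name and parameters are hypothetical and are not taken from the foundation's bylaws.

```python
def election_is_valid(ballots_received: int, eligible_voters: int) -> bool:
    """Return True when strictly more than half of the eligible voters cast a ballot.

    Hypothetical helper: it models only the turnout rule described in the
    interview, not the foundation's actual election procedure.
    """
    if eligible_voters <= 0:
        raise ValueError("eligible_voters must be positive")
    if not 0 <= ballots_received <= eligible_voters:
        raise ValueError("ballots_received must be between 0 and eligible_voters")
    # Multiply instead of dividing so that odd board sizes are handled exactly.
    return 2 * ballots_received > eligible_voters
```

With a ten-person board, for instance, five ballots would not be enough under this reading of the rule, while six would.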

CSDN: What are the founding background, main work, and goals of the LF AI & Data Foundation?

Meng Wei: In the second half of 2017, the Linux Foundation had already put forward proposals and was preparing to establish an artificial intelligence sub-foundation. At that time, artificial intelligence was very popular, and many AI organizations, such as industry alliances and standardization bodies, appeared in China and abroad. From the end of 2017 to the beginning of 2018, these organizations developed rapidly. During that first wave of AI, people were focused on deep learning and neural networks, so the Linux Foundation proposed an AI-related sub-foundation, the Deep Learning Foundation, hoping to use the momentum of deep learning to help open source AI take root and drive the development of the industry. This was the predecessor of LF AI & Data.

When the Deep Learning Foundation was first established, it had fewer than 10 projects. Adlik, donated by ZTE in 2019, was the foundation's sixth project. The foundation now has nearly 70 projects, and its membership has grown from just over 10 at the beginning to 46 so far. The mission of LF AI & Data is to build and support an open source community for artificial intelligence and data, provide members with new opportunities for collaboration and creation, and promote innovation and industry adoption in the fields of AI and data. As the new chairman of the foundation, I very much hope that its membership will grow rapidly, that it will gain real influence in the industry, and that it will promote the adoption of artificial intelligence, including the currently very popular large models, in industry.

While taking part in the work of the LF AI & Data board of directors and the TAC committee, I was deeply impressed by this internationally renowned foundation's code of conduct and rules of procedure. Take meetings as an example. At each meeting, the chair reviews the minutes of the previous meeting. However, the chair does not directly ask everyone to raise their hands to approve or object. Instead, at least two voting members must first bring the matter to a vote: one makes the motion (first motion) and another seconds it (second motion). If no one moves and seconds, the vote is dropped. Requiring motions in advance avoids pointless votes and makes the meeting more efficient. It is a solemn, ceremonial code of conduct.

CSDN: How did you get involved with open source and embark on the path of open source? 

Meng Wei: I first came into contact with Linux and used it in college. The science and engineering students in my dormitory were all geeks, and we studied Linux systems and various open source tools in the dorm. We set up Linux servers mainly to build private VPN and NAS systems, so we could access the data on the hard disk in the dormitory from an Internet cafe, which felt very cool at the time. My real participation in community contributions and open source community activities began in 2015 in the OpenDaylight community, an open source networking project under the Linux Foundation. The project has been called the next-generation network operating system and is mainly used as an SDN controller.

The first open source project we contributed to the community was OF-CONFIG. I presented the project at its review for admission to the community, and after it was successfully incubated I led its development as PTL (Project Team Leader), including the iteration of subsequent versions. OF-CONFIG has now graduated in the OpenDaylight community, and OpenDaylight has been widely used in the network deployment and operations industry.

CSDN: On your way to becoming an open source professional, what had the greatest impact on you?

Meng Wei: The question of closed source versus open source is always hotly debated. I think there is no clear boundary between open source and closed source companies. One of the things that impressed me most happened at an open source summit in 2016. Microsoft, the "big brother" of traditional closed source operating systems, was the top sponsor of the conference, and its slogan was printed on T-shirts and badges: "Microsoft Loves Linux!" I was very surprised when I attended. Later I realized that this is the charm of open source: it brings together developers of different ethnicities, backgrounds, and industries from around the world to speak freely in the community. At many open source conferences, participants walk around in slippers. It is a very open and casual setting, and people in suits and ties look out of place. Attendees with long hair, braids, board shorts, and flip-flops often turn out to be the real heavyweights. Open source culture brings talent, developers, and geniuses from around the world together to maximize value. This is a sign of the vitality of open source, and it is one of the reasons I remain enthusiastic about it.

CSDN: When did ZTE start making open source related contributions? What challenges have you encountered in your work on standards and open source strategy? 

Meng Wei: ZTE was founded in 1985, and its early mobile phones were very well known in China. In addition to Linux, ZTE actively joined the Android development community and contributed its own code when Android was just getting started. At that time, no one deliberately emphasized open source; people simply took an active part in it.

The open source group of the ZTE Standard Strategy Committee has dozens of people, covering legal affairs, security, compliance, and so on. It is not called an OSPO, but it performs similar duties. ZTE regards open source as a de facto standard, so we put traditional standards and de facto standards (open source) under the same strategic system. The open source group needs to systematically identify competitive open source projects within the company, as well as projects the company hopes to donate, draw up community-based operation plans, and drive the top-level design of the open source ecosystem.

There are indeed some difficulties. First, as a communications equipment manufacturer, ZTE's open source projects in the communications field do not match those in the operating system and database fields in quantity or quality, although they have reached a certain scale. Second, ZTE is expanding its IT capabilities and conducting extensive research and development on databases, operating systems, and chip design. However, many teams originally focused on networking, and their thinking may still be rooted in developing customized network software, so the understanding of open source varies greatly between teams. An important task for the future is therefore to evangelize open source within the company, build a more open culture around it, and raise developers' awareness of and enthusiasm for open source.

CSDN: The degree of open source in AI varies by field. Frameworks and computer vision were open sourced relatively early, while large models have been relatively slow to follow. What is the reason behind this?

Meng Wei: There are many directions for AI open source. AI frameworks are a very important area, and there are many excellent projects in China and abroad. There are so many frameworks now that it takes developers a long time to adapt to each new one. It would be better to have more tools that abstract away the underlying frameworks, so that developers do not have to pay attention to them.

LF AI & Data hopes that more excellent AI tools will emerge to help enterprises carry out digital transformation quickly and help developers work more effectively, for example by using AI tools to handle project releases and feature implementation. The open sourcing of large models has developed gradually with the popularity of ChatGPT. I have taken part in several industry forums recently, and when the question is whether open source large models are needed, my answer is always yes. When a closed source large model becomes the industry benchmark, the competitors behind it will open source their own models one after another, and open source large models will become a trend.

CSDN: What points are you more concerned about when it comes to the open source of large models? 

Meng Wei: I am most concerned about the commercialization of open source large models and about open source licenses. LLaMA has spawned a whole series of large models in the "alpaca family", but there may be problems with commercializing them. Before LLaMA 2 came out, the open source large models that could be used commercially were not very easy to use. When a large model is open sourced, can it be treated like traditional code? The problem is not that simple. If you open source a piece of code you wrote yourself, you can apply an existing license directly. However, an open source large model involves investment in computing power and data assets (which may include both free and purchased data sets), and it raises privacy, data circulation, and security issues. The problem facing open source large models is that there is no truly standardized, unified license.

Some time ago, a large model developed by a country in the Middle East and posted on Hugging Face caused widespread controversy. After Falcon-40B, a large model with 40 billion parameters, was open sourced, its license caused an uproar in open source circles. Most of the license is based on the Apache License, Version 2.0, which is business-friendly: users can modify the code to meet their needs and release or sell the result as an open source or commercial product. However, Falcon modified some of the license terms: the model can be used commercially, but once usage exceeds a certain amount, corresponding fees must be paid. This caused great controversy. People with an open source utopian spirit believe it damages open source culture. Personally, I think the move is wrong in principle but understandable, because companies also face revenue pressure. The most fundamental cause is that there is no unified license that sets out what rules open source large models should follow, or how to avoid privacy and legal problems. These issues are very sensitive, and companies may pay a high price if they step on a landmine.

What exactly is the business model for open source large models? We have been thinking about this.

Some people have analyzed several routes to commercializing open source. The first is to offer both an open source version and a commercial version: the open source version is free, and users who need more services and advanced features purchase the commercial version. The second is to charge for services, as Red Hat does, where much of the revenue comes from services. I think there is nothing embarrassing about stating commercial aims openly. What is the business model for open source large models? Why should large models be open source? Where is the value? All of these questions need further discussion so that the rules can be improved, the AI open source process can proceed in an orderly way, and a virtuous cycle can form.

In addition, the licenses of the models that are now open source are in a very vague state. If a legal dispute arises, there are no international precedents to refer to. However, the China Academy of Information and Communications Technology and the China Electronics Standardization Institute under the Ministry of Industry and Information Technology, together with relevant organizations in various industries, are currently discussing license agreements for large models, and ZTE is also involved. China has gone from being a heavy participant in the game to helping write its rules, which is a very big step forward.

CSDN: At the current stage, the development of AI technology is very hot. Many people are discussing whether programmers will be replaced. How do you view the relationship between programmers and AI? 

Meng Wei: Frankly, I think replacement is inevitable, although many companies have not explicitly stated that they are doing this work. For companies, labor costs are a huge expense, so they all hope to reduce them. If you can achieve a 10% or 20% reduction in labor costs, it is a very remarkable result, so companies have an incentive to achieve this goal through large models.

There is no need to be too anxious. Every era produces its own sought-after experts and leading industries. Some time ago, generative AI job postings abroad surged by 20%, and the situation in China is similar. Programmers can look at it from another angle: switch tracks, study generative AI and large models, work hard to improve themselves, and become part of the group that cannot be replaced. In fact, programmers are not the ones who should be most anxious. Writers should feel a greater sense of crisis, because writing code may be harder than writing text. It is already very convenient to use AI tools to write text or do simultaneous interpretation.

CSDN: What stage is China's open source development currently at? What problems need to be solved? 

Meng Wei: China's open source development is currently very rapid, which is undeniable. The number of Chinese developers ranks among the top three in the world. But I think it has not yet reached a very mature stage, and we face many problems.

First, it is very easy to become a participant, but it is a long road to becoming a rule maker. The open source playbook, including licenses and rules of procedure, first took shape abroad. For a long time China did not have its own licenses; licenses such as Apache and GPL all came from abroad. However, China has gradually researched and formulated open source licenses such as the Mulan License, and has been moving from the role of heavy participant toward that of top-level rule maker.

The second issue is "community over code". Many Chinese engineers have been taking exams since childhood, from the high school entrance examination to the college entrance examination, and they stay in that head-down, top-student mode right up until they start work as programmers. Community over code means that to form an active community, it is not enough to keep your head down and carve your piece of woodwork into a flower. An open source community needs more communication between people, and an active community matters more than beautiful code. At standards and open source conferences abroad, the people who dominate the microphone are all foreigners. Chinese attendees rarely do this, and most take notes in the audience. This is a common phenomenon. We need to communicate not only among Chinese companies and colleagues, but also go out and talk with outstanding developers abroad. This is where we need to improve in the future.

The third problem is language, namely English. Reading and writing English is not a problem for most programmers, since all code is written in English. The point is that when communicating, you should not fixate on whether your grammar and accent are correct. Many Chinese people, when talking with foreigners, worry constantly about whether their grammar is right and whether their pronunciation sounds authentic. These are not the most important things. Speak up bravely, and people will understand what you want to say.

CSDN: What would you like to say to developers who want to enter the open source field? 

Meng Wei: Open source is a very good track to be on, especially open source in artificial intelligence. Open source will rapidly advance AI technology. I think emotional value is something artificial intelligence currently finds hard to replace. People chat with robots without becoming emotionally attached to them, at least for now. So first, programmers must step out of their own circles so that their value, knowledge, and code can benefit others, and so that their enthusiasm can spread to others. This is one of the things that makes each person irreplaceable.

Second, you must learn English well. Learning it well is not just about writing code or emails; it is more about being able to express yourself. At the Linux Foundation's conference in March this year, one thing impressed me most. An attendee whose English was not very good kept chatting with the head of the foundation. Later he explained that the head of the foundation was very friendly, and that talking with him was also a way to practice spoken English. Many programmers have realized the importance of learning English well. Open source will ultimately face the whole world, and communication is indispensable. If you are not proactive, you may miss many opportunities.

Third, when writing code, comments must be concise and easy to understand. In open source, you are dealing with people from different industries and with different levels of background knowledge, and they may not understand why you wrote a program the way you did. A former R&D leader at ZTE advised that when writing programs, you should use language ordinary people can understand rather than relying blindly on jargon. If a beginner can understand the comment you wrote, it is a successful comment. You should develop this habit: comments are written for the convenience of others, and they let more people get up to speed on your project quickly.
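As a small illustration of this advice, the sketch below contrasts a jargon-heavy comment with a plain-language one on the same function. The function, its parameters, and its default values are hypothetical and were chosen only for the example; they do not come from the interview or from any ZTE project.

```python
def next_retry_delay(attempt: int,
                     base_seconds: float = 1.0,
                     cap_seconds: float = 30.0) -> float:
    # Jargon-heavy comment (hard for newcomers to follow):
    #   "Capped exp. backoff, factor 2."
    #
    # Plain-language comment saying the same thing:
    #   Wait longer after each failed attempt so we do not flood a server
    #   that is already struggling. The wait doubles every time
    #   (1s, 2s, 4s, ...) but never goes above cap_seconds.
    if attempt < 0:
        raise ValueError("attempt must be 0 or greater")
    return min(cap_seconds, base_seconds * (2 ** attempt))
```

Both comments describe the same behavior, but the second explains why the code exists and what it does in words a reader from another field could follow.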

Reprint丨CSDN

Editor丨Hu Xinyuan

Introduction to Kaiyuan Society

Kaiyuan Society was founded in 2014. It is made up of individual members who volunteer to contribute to the open source cause, and it is organized on the principles of "contribution, consensus, and co-governance". It has always remained vendor-neutral, public-interest, and non-profit. It is the first open source community federation to take "open source governance, international integration, community development, and project incubation" as its mission. Kaiyuan Society works closely with communities, enterprises, and government-related organizations that support open source. With the vision of "based in China, contributing to the world", it aims to create a healthy and sustainable open source ecosystem and to help the Chinese open source community become an active participant in and contributor to the global open source system.

In 2017, Kaiyuan Society transformed itself into an organization composed entirely of individual members, operating on the governance model of top international open source foundations such as the ASF. Over the past nine years, it has connected tens of thousands of open source practitioners, brought together thousands of community members and volunteers and hundreds of speakers from China and abroad, and worked with hundreds of sponsors, media outlets, and community partners.

Origin blog.csdn.net/kaiyuanshe/article/details/132505509