On formulating AI open source licenses: the game between law and ethics


Artificial intelligence is sweeping the world at lightning speed. AI open source licenses and standards act like channels in this wave, guiding the healthy, safe, and orderly open development of AI. One question remains contested: should license drafting rest on ethical consensus, or on well-developed laws and regulations?

CSDN's "Open Talk" column, under the theme "Open Source Protocol in the AGI Era", invited three guests: Liu Tiandong, co-founder of Kaiyuan Society and official member of the Apache Software Foundation; Meng Wei, ZTE Open Source Strategy Director and Chairman of the Board of Directors of LF AI & Data; and Tan Zhongyi, LF AI & Data TAC member and founder of the Generative AI Committee. Taking AI open source licenses as the entry point, they discussed the development trends of AI governance.

Liu Tiandong discusses the paradox between licenses, technology, and ethics from the perspective of open source collaboration and sharing. Meng Wei analyzes the differences between AI open source and traditional open source, clarifies the challenges AI open source faces, and places AI within the future blueprint of 6G communications. Tan Zhongyi focuses on large model applications and on bringing the capabilities of AI to bear in various fields.

Let's set out on this tour of AI open source and see what it has to offer.

A preview of the key viewpoints:

  • Meng Wei: Traditional open source usually covers only the intellectual output of programmers, the product of their mental work. AI open source is more complicated: beyond that human intellectual output, it also involves two other important elements, data and computing power.

  • Liu Tiandong: This is the call of artificial intelligence. You should not answer, but you will definitely answer, because curiosity cannot be curbed. This is human nature.

  • Tan Zhongyi: If an industry lacks rules, people will be cautious; once regulations exist, they can move forward more actively. This will promote the prosperity of ToB (business-to-business) and ToC (business-to-consumer) applications. For many years to come, applying the capabilities of large models to all walks of life will be a hot topic.


CSDN: What do you think of AI open source? How is it different from traditional open source?

Meng Wei: Traditional open source usually covers only the intellectual output of programmers, the product of their mental work. Open source in the AI field is more complex: beyond that human intellectual output, it also involves two other important elements, data and computing power.

Data plays a key role in AI open source; without data, large models are almost impossible to build. Data is not only the product of individual mental work but also raises issues of privacy, ethics, and compliance. This makes data an ethical and compliance consideration that the open source world cannot ignore.

AI open source also involves computing power. Before the rise of large models, open source projects mainly originated from technologically advanced regions such as Europe and the United States. With the rise of large models, oil-producing countries in the Middle East have invested heavily in computing resources and promoted the research and development of large models. Computing power may also affect the balance among open source projects and brings certain cost challenges.

CSDN: How is the open source license established?

Meng Wei: Drafting a license is similar to setting an industry standard. It usually involves several rounds of soliciting opinions, culminating in a widely agreed-upon version. The process may be organized by a lead organization, which may be a civil society organization or an official agency, and it is designed to ensure that the content of the license is broadly applicable and broadly accepted.

In China, a license is more commonly regarded as an agreement or contract, similar to the user agreement encountered when logging in to a website. When we click "Agree", we enter into an agreement with the website, that is, we agree to abide by its terms. In some Western countries, by contrast, especially the United States, licenses are more closely tied to copyright and intellectual property rights, and whether a license is a "contract" or simply a "license" is still a matter of some controversy.

CSDN: There are more and more open source large language models in the industry, and their performance keeps improving. However, the licenses of some well-known large models, such as LLaMA and Falcon-40B, have frequently caused controversy. What open source licenses and regulatory standards for AI currently exist in the industry?

Liu Tiandong: We can divide rules into different levels. At the top are laws and regulations, which are usually the most consistent and the most durable because they pass through multiple layers of legislation and approval. Next, at the intermediate level, are standards, which change quickly and are relatively less consistent. Then come licenses, which are more flexible and come in many types, including open source licenses. Below them are customized business contracts that can be modified as needed. Finally, there are customizable agreements suited to different scenarios, and these rules may change with the times.

The rise of artificial intelligence has brought many new challenges, and ethical concepts may differ across cultures and regions. Within the open source community, discussions about ethical norms have also caused much controversy. The question is, who should define ethics: Western cultures, Eastern cultures, or others? Finding a balance between ethical concepts and data privacy and security across regions is an urgent issue today.

Tan Zhongyi: Hugging Face (one of the world's largest model hosting platforms) hosts many models under different licenses, which can be roughly divided into three categories. The first category is traditional open source software licenses, such as GPL, LGPL, and AGPL. The second category is Creative Commons licenses, such as the CC series; these are popular for pictures, audio, video, and similar content, and cover different conditions such as sharing and commercial use. The third category is licenses specific to models and data, such as "BigScience OpenRAIL-M" and "CreativeML OpenRAIL-M". At present there are relatively few legal actions over license violations involving models and data, and licenses are at a stage where many varieties coexist. As artificial intelligence develops, however, legal cases in this area will gradually increase, pushing the industry toward further standardization and development.
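
As a rough illustration of the three categories described above, the following Python sketch sorts license identifiers into families. It is a minimal sketch, not an official mapping: the function name `classify_license`, the `LicenseFamily` enum, and the lowercase hyphenated identifier patterns (such as `gpl-3.0` or `bigscience-openrail-m`) are all assumptions made for this example, and real hosting platforms may tag licenses differently.

```python
from enum import Enum


class LicenseFamily(Enum):
    """The three rough categories from the interview, plus a fallback."""
    SOFTWARE = "traditional open source software license"
    CREATIVE_COMMONS = "Creative Commons license"
    MODEL_SPECIFIC = "model- or data-specific license"
    UNKNOWN = "unclassified"


# Illustrative patterns only (assumed: lowercase, hyphenated identifiers).
SOFTWARE_PREFIXES = ("gpl", "lgpl", "agpl")
CREATIVE_COMMONS_PREFIXES = ("cc-", "cc0")
MODEL_SPECIFIC_MARKERS = ("openrail",)


def classify_license(license_id: str) -> LicenseFamily:
    """Map a license identifier to one of the three families (hypothetical helper)."""
    normalized = license_id.strip().lower()
    # Check model-specific markers first, since they appear mid-string.
    if any(marker in normalized for marker in MODEL_SPECIFIC_MARKERS):
        return LicenseFamily.MODEL_SPECIFIC
    if normalized.startswith(CREATIVE_COMMONS_PREFIXES):
        return LicenseFamily.CREATIVE_COMMONS
    if normalized.startswith(SOFTWARE_PREFIXES):
        return LicenseFamily.SOFTWARE
    return LicenseFamily.UNKNOWN


if __name__ == "__main__":
    for license_id in ("gpl-3.0", "cc-by-nc-4.0", "bigscience-openrail-m", "custom-eula"):
        print(f"{license_id}: {classify_license(license_id).value}")
```

Anything the patterns do not recognize falls into `UNKNOWN` rather than being guessed, which suits a landscape where, as noted above, many license varieties still coexist.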


CSDN: How to balance global applicability and regional differences when formulating open source licenses? Will there be a globally recognized unified standard in the future?

Tan Zhongyi: China has gradually improved its rules for industrial development over the past few decades. If an industry lacks rules, people have to be cautious. Once regulations exist, they can move forward more actively and promote the prosperity of ToB and ToC applications, which will give a huge boost to the development of domestic artificial general intelligence.

Developing a universal license for AI models is difficult. A model reflects people's values, and values vary by region, so global, universal applicability is hard to achieve. I think model licenses may be global in some respects while needing to be tailored to regional characteristics in others. Formulating a perfect license that in theory fits the laws of every region is therefore impractical; even if one were written, it might never be adopted in practice. What the industry needs now is a license that meets its actual needs and that developers and upstream and downstream users can easily understand and apply. Rather than pursuing an ideal license that suits every situation, it is more important to solve current problems, advance the AI industry, and keep iterating in a positive direction.

Liu Tiandong: The EU's Artificial Intelligence Act and China's Generative Artificial Intelligence Management Measures already place quite strict restrictions on the management of data and generative artificial intelligence. An open source license therefore does not need to repeat what those already cover. It should focus on open source itself, leaving the law to the law and ethics to ethics. Whether for software or for large models, openness should be encouraged. Open source model licenses should follow simple and clear principles and encourage everyone to share and disseminate software and models. Whether for training, retraining, or redistribution, use should be free.

When it comes to data, especially data involving personal privacy, every country needs to be more cautious and consider privacy and security. At the same time, issues of open source software should not be confused with issues of data privacy. Open source software and models should remain open, while data privacy can be protected through national regulations and accountability.

Meng Wei: New technologies may be chaotic when they first begin to develop, because of competing opinions and interests, but over time they will converge. This follows the old rule that what has long been divided must unite, and what has long been united must divide.

Regarding AI licenses, China has already begun to take action. Two major standards organizations are already formulating licenses for open source large models, such as the "Paper Kite" open artificial intelligence model license and the Mulan series of licenses. Among these, the open source data license is already relatively complete. As issues such as large model licenses gradually come to the fore, real cases will make people more aware of how important the problem is. We are already actively exploring and resolving these issues to ensure the healthy development of the open source field.

CSDN: When choosing open source models and licenses, how can companies better support adoption in their industries?

Tan Zhongyi: Although certain licenses are disputed, at present the easiest choice for AI is to use Apache License 2.0 uniformly, whether for code, models, or data. It is widely recognized in the open source software world as a business-friendly license that takes into account the interests of both authors and users, and it also costs the least effort to understand. That makes it relatively the most convenient for developers to adopt.

However, when companies choose software, models, and data, the license is only a small part of the decision. The first thing to consider is whether the offering can solve the company's problems. If it cannot, companies won't adopt it even under a friendly license. The most critical thing is therefore to meet developers' needs within budget; the license is only one influencing factor.


CSDN: Overly powerful AI can easily give people a sense of crisis. How do you view this dilemma?

Meng Wei: This is not only a challenge but also an opportunity. There are always areas where machines cannot replace humans, such as interpersonal relationships and emotional communication. For example, Teacher Tan and I share an emotional connection outside of work, and this kind of comradeship cannot be replicated by machines. In a future of rapidly developing artificial intelligence, we need to bring our emotional value into full play instead of just laboring mechanically. I encourage programmers to get out more and join various circles of friends to communicate and learn from each other, not only exchanging knowledge but also interacting with sincere emotion.

Liu Tiandong: People will not be replaced. We should explore, accept, and integrate what lies outside ourselves. The only way forward for humans is to transform themselves and enjoy both the intelligence of AI and the longevity of machines, so that they can conquer the stars and the sea. For now, we can only move in new directions and explore the unknown. Face the future with courage and reject pessimism.

Many experts in artificial intelligence and deep learning have jointly called for a halt to the rapid development of AI, but I don't think such calls are of much use. Recall the warning in the famous science fiction novel "The Three-Body Problem": Don't answer! Don't answer! Yet we still won't stop exploring. The same goes for the experts' calls on artificial intelligence: you shouldn't answer, but you will, because curiosity is unstoppable, and that's human nature. Be brave enough to face the unknown instead of avoiding it. To bring artificial intelligence and humans together and embrace the future, I think we need an optimistic attitude.

CSDN: What issues are you concerned about in the future of AI open source?

Tan Zhongyi: Right now I am most concerned with application development on large models, that is, LLMOps. Relatively few people can develop foundation models, and not many can develop industry-specific models. Most of the work lies in applying the capabilities of large models to all walks of life and integrating them with existing software and applications, which is what we call large model application development. The Xingce community has been organizing events on this recently, inviting peers engaged in large model application development to share and exchange their experience. I think this will be a hot topic for many years to come.

Meng Wei: In the communications industry, where I work, our focus has gradually shifted from general-purpose large models to how to apply them to communications. Especially in the evolution from 5G to 6G, how to integrate artificial intelligence and its capabilities (algorithms, computing power, data, and so on) into the 6G network has become the current focus of our research.

Liu Tiandong: Let's cross boundaries together. This is what I am doing, and it is also the mission of Kaiyuan Society. I recently attended some international conferences and found that representatives from Asia, especially China, are heard far too little. I hope China's voice can reach more international foundations, open source communities, government agencies, and enterprises, and that open source will spread further.


Reprinted from | CSDN

Editor丨Wuriliga



Introduction to Kaiyuan Society


Kaiyuanshe (English name: "KAIYUANSHE") was established in 2014. It is an open source community made up of individual volunteers who contribute to the open source cause, founded on the principles of "contribution, consensus, and co-governance". Kaiyuan Society has always upheld the concept of "vendor neutrality, public welfare, and non-profit". Its vision is "based in China, contributing to the world, and promoting open source as a way of life in the new era", and its mission is "open source governance, international integration, community development, and project incubation", with the aim of creating a healthy and sustainable open source ecosystem.

Kaiyuan Society works closely with communities, universities, enterprises, and government-related organizations that support open source. It is also the first member in China of OSI, the global open source license certification organization.

Since 2016, it has held the China Open Source Annual Conference (COSCon) every year, regularly released the "China Open Source Annual Report", and jointly launched the "China Open Source Pioneer List", the "China Open Source Code Power List", and others, with wide influence at home and abroad.


Origin blog.csdn.net/kaiyuanshe/article/details/132928772