Tsinghua released the first most comprehensive large-scale model security evaluation system, and ChatGPT topped the list!

c5a84e93d374c1bdc82250faa943a8b2.jpegXi Xiaoyao Technology said the original
author | Tianyudaodao Python's
current popularity of large-scale language models, we don't need to go into details. With Baidu Wenxin's words, the first shot of domestic commercial large-scale models, Huawei Pangu, Ali Tongyiqian Q, domestic companies such as Zhipu ChatGLM, HKUST Xunfei Xinghuo and other domestic companies have begun to deploy.

On the other hand, due to well-known policy reasons, compared with various large-scale models that are in full swing, there are very few commercial landing products generated by domestic AIGC content. According to the Measures for the Administration of Generative Artificial Intelligence Services (Draft for Comment) issued by the Cyberspace Administration of China on April 11, 2023:

Article 4 The provision of generative artificial intelligence products or services shall comply with the requirements of laws and regulations, respect social morality, public order and good customs... 

Article 5 Organizations and individuals that use generative artificial intelligence products to provide services such as chat and text, image, and sound generation (hereinafter referred to as "providers"), including supporting others to generate text, image, and sound by providing programmable interfaces, etc. etc., assume the responsibility of the producer of the content generated by the product; if personal information is involved, assume the statutory responsibility of the personal information processor and fulfill the obligation to protect personal information.

Article 6 Before using generative artificial intelligence products to provide services to the public, a security assessment shall be submitted to the national network information department in accordance with the "Regulations on the Security Assessment of Internet Information Services with Public Opinion Attributes or Social Mobilization Capabilities" and the "Internet Information Service Algorithm Recommendations" "Management Regulations" to carry out the procedures of algorithm filing, modification, and cancellation filing.

In other words, even artificial intelligence must abide by the Basic Law, which requires positive energy!

This also means that there is an urgent need in the industry for an evaluation method specifically for testing the moral and legal views of Chinese large-scale language models!

The CoAI team from the Department of Computer Science and Technology of Tsinghua University has brought us a systematic security evaluation framework! Their work has been compiled into the form of a paper [1], and the related public benchmark dataset has also been published on the HuggingFace platform [2]. Teams and individuals who want to further conduct diversified security evaluations on the model can also contact the CoAI team [3][4] to test on hidden evaluation data.

A major contribution of the team is to design and summarize a relatively complete security classification system:

8 typical security scenarios and 6 command attack security scenarios.

54e3fec4e01ce560efa18d493af7869d.png fc83b5a18212af020b06b398d0745d96.png

The figure below shows the model leaderboard of the top 10 security performance on the public test set so far.

ea10a86104564b966cddee359a0d4e32.png

We can see that some large-scale commercial models, such as Wenxin Yiyan and Tongyi Qianwen, did not participate in the test, so they did not make the list. This may be due to the limited time of the author team.

However, due to the randomness of the content generated by the large model, the testing process designed by the author's team will inevitably involve some manual evaluation work. This is also a sore point of the current evaluation benchmarking process: efficiency and cost are at odds.  The author also mentioned in the paper that they will further add more challenging offensive prompts and further optimize the evaluation process.

However, for companies that urgently need to launch AIGC services, this benchmark test set is an excellent resource for quickly testing product capabilities and limitations. Students who want to use large models to make money should not miss this good project.

Chong Duck~

f50d70866d7e6608c3528165a638b793.png c6f30fead8c80d5c9af6ae4d4f0e6a91.png 21280d6106719956e69b3f261c36749c.png
347823934085d0cebecb7fa4cd3f6f8a.png

[1]Safety Assessment of Chinese Large Language Models, https://arxiv.org/pdf/2304.10436.pdf

[2]Datasets: thu-coai/Safety-Prompts, https://huggingface.co/datasets/thu-coai/Safety-Prompts

[3]Github: thu-coai/Safety-Prompts, https://github.com/thu-coai/Safety-Prompts

[4] Chinese large model security evaluation platform, http://coai.cs.tsinghua.edu.cn/leaderboard/

Guess you like

Origin blog.csdn.net/xixiaoyaoww/article/details/130498068