Peking University's large legal model ChatLaw goes viral!

Large language models keep expanding into vertical industries, and this time a model from Peking University has broken into the mainstream.

A large model has "exploded" in popularity again.

Last night, a large legal model called ChatLaw topped Zhihu's hot-search list, with its popularity score peaking at around 20 million.

ChatLaw was released by a team at Peking University and aims to provide accessible, inclusive legal services. On the one hand, practicing lawyers are in short supply nationwide, and supply falls far short of legal demand; on the other hand, ordinary people have a natural gap in legal knowledge and cannot use the law to protect themselves.

The recent rise of large language models offers ordinary people an excellent opportunity to consult on legal issues in a conversational way.


Currently, there are three versions of ChatLaw, as follows:

  • ChatLaw-13B, an academic demo version trained on Ziya-LLaMA-13B-v1 ("Jiang Ziya"), performs well on general Chinese tasks. However, it handles logically complex legal Q&A poorly and needs a model with more parameters;

  • ChatLaw-33B, also an academic demo version, is trained on Anima-33B and has much stronger logical reasoning. However, because Anima's Chinese corpus is small, English text often appears in its answers;

  • ChatLaw-Text2Vec, a BERT-based similarity-matching model trained on a dataset of 930,000 court judgments, which matches a user's question to the corresponding legal provisions.

According to the official demo, ChatLaw lets users upload legal materials such as documents and recordings, helps summarize and analyze them, and generates visualizations such as maps and charts. In addition, ChatLaw can produce legal advice and legal documents based on the facts provided. The project has already reached 1.1k stars on GitHub.


Official website address: https://www.chatlaw.cloud/

Paper address: https://arxiv.org/pdf/2306.16092.pdf

GitHub address: https://github.com/PKU-YuanGroup/ChatLaw

Currently, because of the project's popularity, the server has temporarily gone down and its computing capacity has hit its limit. The team is working on a fix; in the meantime, interested readers can deploy the beta model from GitHub.

The editor is still in the queue for the closed beta, so here is an official conversation example provided by the ChatLaw team about the "seven-day no-reason return" issue you may run into when shopping online. ChatLaw's answer is quite comprehensive.

[Screenshot: ChatLaw's answer to the "seven-day no-reason return" question]

The editor did find that the academic demo version of ChatLaw can be tried out. Unfortunately, it does not include the legal consultation features and only provides a simple dialogue service. Here are a few questions we tried.

[Screenshots: trial conversations with the ChatLaw academic demo]

In fact, Peking University is not the only group to release a large legal model recently. At the end of last month, Power Law Intelligence and Zhipu AI released PowerLawGLM, a hundred-billion-parameter legal vertical model that reportedly shows distinct advantages in Chinese legal scenarios.

[Image: PowerLawGLM. Image source: Power Law Intelligence]

ChatLaw's data sources and training framework

First, the data composition. ChatLaw's data mainly consists of forums, news, statutes, judicial interpretations, legal consultations, bar-exam questions, and court judgments, from which conversational data is constructed through cleaning, data augmentation, and similar steps. At the same time, by cooperating with the Peking University School of International Law and well-known law firms, the ChatLaw team can keep the knowledge base up to date while ensuring the professionalism and reliability of the data. Specific examples follow.

Example constructed from laws, regulations, and judicial interpretations:

[Screenshot]

Example of scraped real legal consultation data:

[Screenshot]

Example constructed from bar-exam multiple-choice questions:

[Screenshot]
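
The screenshots above show the constructed samples. As a purely illustrative sketch of what one such sample might look like (the field names and wording are assumptions, not the team's actual data schema):

```python
# Purely illustrative sketch of one constructed training sample; the field
# names and wording are assumptions, not the ChatLaw team's actual schema.
sample = {
    # cleaned consultation-style user question
    "instruction": "我在网上买的商品想退货，商家说不支持七天无理由退货，怎么办？",
    # matched legal provision attached as grounding
    "reference": "《消费者权益保护法》第二十五条（网络购物七日无理由退货）",
    # target answer that cites the provision
    "output": "根据《消费者权益保护法》第二十五条，网络购物一般可自收到商品之日"
              "起七日内无理由退货（法律规定的例外情形除外）。可先与商家协商，"
              "协商不成可向消费者协会或市场监管部门投诉。",
}
print(sample["instruction"])
```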

Then there is the model level. To train ChatLaw, the research team fine-tuned Ziya-LLaMA-13B with Low-Rank Adaptation (LoRA). The study also introduced a self-suggestion role to mitigate model hallucination. Training ran on multiple A100 GPUs, with DeepSpeed used to further reduce training cost.
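
The article does not include the training script, but a minimal sketch of LoRA fine-tuning in this spirit, assuming the Hugging Face transformers, peft, and datasets libraries, might look as follows; the hyperparameters, data file, and DeepSpeed config path are placeholders, not the team's actual settings.

```python
# Minimal LoRA fine-tuning sketch (illustrative only; the hyperparameters,
# data file, and DeepSpeed config path are assumptions, not the ChatLaw
# team's actual configuration).
import torch
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "IDEA-CCNL/Ziya-LLaMA-13B-v1"  # base model named in the article
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # LLaMA tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(base, torch_dtype=torch.float16)

# Attach low-rank adapters; only these small matrices are updated during training.
lora_cfg = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                      target_modules=["q_proj", "v_proj"],
                      task_type="CAUSAL_LM")
model = get_peft_model(model, lora_cfg)

# Hypothetical JSONL file of constructed legal dialogues: one {"text": ...} per line.
data = load_dataset("json", data_files="legal_dialogues.jsonl")["train"]
data = data.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=1024),
                remove_columns=data.column_names)

args = TrainingArguments(output_dir="chatlaw-lora",
                         per_device_train_batch_size=2,
                         gradient_accumulation_steps=16,
                         num_train_epochs=3, learning_rate=2e-4, fp16=True,
                         deepspeed="ds_config.json")  # DeepSpeed, as mentioned in the article

Trainer(model=model, args=args, train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)).train()
```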

Below is the ChatLaw architecture diagram. The study injects legal data into the model and applies special processing and augmentation to this knowledge. At inference time, multiple modules are introduced to integrate the general model, the domain-specific model, and the knowledge base into a single pipeline.

The study also constrains the model during inference to ensure that it cites correct laws and regulations and to reduce hallucination as much as possible.

[Figure: ChatLaw architecture diagram]

At first, the team tried conventional retrieval approaches such as MySQL and Elasticsearch, but the results were unsatisfactory. They therefore pre-trained a BERT model for embeddings and used tools such as Faiss to compute cosine similarity and extract the top-k laws and regulations relevant to the user's query.
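
As a rough sketch of this retrieval step (illustrative only; the encoder shibing624/text2vec-base-chinese and the shortened statute labels are placeholders, not the team's actual model or data), the provisions can be embedded and searched with Faiss using inner product over normalized vectors, which is equivalent to cosine similarity:

```python
# Sketch of embedding-based statute retrieval with Faiss (illustrative only).
import faiss
import numpy as np
from sentence_transformers import SentenceTransformer

# Placeholder encoder; ChatLaw-Text2Vec itself (BERT trained on 930k judgments)
# is not assumed to be available here.
encoder = SentenceTransformer("shibing624/text2vec-base-chinese")

statutes = [  # shortened labels, not full statutory text
    "消费者权益保护法 第25条（网络购物七日无理由退货）",
    "民法典 第577条（违约责任）",
]

# Normalize embeddings so that inner product equals cosine similarity.
vecs = np.asarray(encoder.encode(statutes, normalize_embeddings=True), dtype="float32")
index = faiss.IndexFlatIP(vecs.shape[1])
index.add(vecs)

query = "网上买的东西不满意，可以七天无理由退货吗？"  # "Can I return an online purchase within 7 days?"
q = np.asarray(encoder.encode([query], normalize_embeddings=True), dtype="float32")
scores, ids = index.search(q, 2)  # top-k most similar provisions
for score, i in zip(scores[0], ids[0]):
    print(f"{score:.3f}  {statutes[i]}")
```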

This approach often produces suboptimal results when the user's question is vague. The researchers therefore extract key information from the user's query and design an algorithm that uses vector embeddings of this information to improve matching accuracy.

Since large models have a clear advantage in understanding user queries, the study fine-tuned an LLM to extract keywords from them. After obtaining multiple keywords, Algorithm 1 is used to retrieve the relevant legal provisions.

[Figure: Algorithm 1]
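
As a loose, illustrative approximation of the idea (not the paper's exact Algorithm 1; the encoder and the way scores are combined are assumptions), each candidate provision could be scored against the embeddings of all extracted keywords:

```python
# Keyword-weighted statute scoring: a loose approximation of the idea behind
# Algorithm 1, not the paper's exact procedure.
import numpy as np
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("shibing624/text2vec-base-chinese")  # placeholder encoder

def retrieve_by_keywords(keywords, statutes, top_k=3):
    """Rank statutes by combining their best and average similarity to the keywords."""
    kw_vecs = encoder.encode(keywords, normalize_embeddings=True)
    st_vecs = encoder.encode(statutes, normalize_embeddings=True)
    sims = st_vecs @ kw_vecs.T               # cosine similarities: statutes x keywords
    scores = 0.5 * sims.max(axis=1) + 0.5 * sims.mean(axis=1)
    order = np.argsort(-scores)[:top_k]
    return [(statutes[i], float(scores[i])) for i in order]

# The keywords would come from the LLM fine-tuned to extract them from the user query.
keywords = ["网络购物", "七日无理由退货", "商家拒绝退货"]
statutes = ["消费者权益保护法 第25条（网络购物七日无理由退货）",
            "民法典 第577条（违约责任）",
            "电子商务法 第49条（电子合同的成立）"]
print(retrieve_by_keywords(keywords, statutes, top_k=2))
```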

Experimental results

The study collected national judicial examination questions from more than ten years and compiled a test set of 2,000 questions with reference answers to measure each model's ability to handle legal multiple-choice questions.

However, the study found that every model's accuracy was generally low, so comparing accuracy alone is not very informative. The study therefore borrows the Elo rating mechanism used in competitive games such as League of Legends and sets up pairwise model battles to evaluate each model's ability to handle legal multiple-choice questions more effectively. Below are the Elo scores and win-rate charts:

[Figure: Elo scores and win rates of the evaluated models]
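
For context, a minimal sketch of the standard Elo update used in such pairwise comparisons is shown below; the K-factor of 32 and the 1500 starting rating are conventional defaults, not values reported in the paper.

```python
# Standard Elo update for pairwise "model battles" (illustrative only; the
# K-factor of 32 and the 1500 starting rating are conventional defaults,
# not values reported in the paper).

def elo_update(r_a, r_b, score_a, k=32):
    """Return updated ratings after one match; score_a is 1 (A wins), 0.5, or 0."""
    expected_a = 1 / (1 + 10 ** ((r_b - r_a) / 400))
    new_a = r_a + k * (score_a - expected_a)
    new_b = r_b + k * ((1 - score_a) - (1 - expected_a))
    return new_a, new_b

ratings = {"model_a": 1500.0, "model_b": 1500.0}
# One "match": both models answer the same multiple-choice question and the
# better answer (judged against the reference) is declared the winner.
ratings["model_a"], ratings["model_b"] = elo_update(
    ratings["model_a"], ratings["model_b"], score_a=1.0)
print(ratings)  # {'model_a': 1516.0, 'model_b': 1484.0}
```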

Analyzing the above experimental results, we can make the following observations:

(1) Introducing law-related Q&A data and statutory provisions can improve the model's performance on multiple-choice questions to a certain extent;

(2) Adding training data for a specific type of task significantly improves the model's performance on that task. For example, ChatLaw outperforms GPT-4 here largely because a large number of multiple-choice questions were used as training data;

(3) Legal multiple-choice questions require complex logical reasoning, so models with more parameters usually perform better.

Reference Zhihu link:

https://www.zhihu.com/question/610072848

Other reference links:

https://mp.weixin.qq.com/s/bXAFALFY6GQkL30j1sYCEQ

Source: blog.csdn.net/spider_py/article/details/131566491