Alibaba Cloud's Tongyi Qianwen 14B model is now open source! Its performance surpasses Llama 2 and other models of the same size

On September 25, Alibaba Cloud open-sourced Qwen-14B, the 14-billion-parameter Tongyi Qianwen model, together with its conversational counterpart Qwen-14B-Chat. Both are free for commercial use. Qwen-14B outperforms models of the same size in multiple authoritative evaluations, and some of its scores approach those of Llama2-70B. Alibaba Cloud had previously open-sourced the 7-billion-parameter model Qwen-7B and related models, which passed 1 million downloads in just over a month and became well known in the open source community.

Qwen-14B is a high-performance open source model that supports multiple languages. Compared with similar models, it is trained on more high-quality data: the training corpus exceeds 3 trillion tokens, which gives the model stronger reasoning, cognition, planning, and memory abilities. Qwen-14B supports a maximum context window of 8k.

Figure 1: Qwen-14B outperforms same-size SOTA large models across the board in twelve authoritative evaluations

Qwen-14B-Chat is a dialogue model obtained by careful supervised fine-tuning (SFT) of the base model. Building on the strong base model, Qwen-14B-Chat generates markedly more accurate content that is better aligned with human preferences, and its creative output is noticeably more imaginative and varied.

Qwen has excellent tool-calling capabilities, allowing developers to build Qwen-based agents faster. Developers can use simple instructions to teach Qwen to use complex tools, for example using the Code Interpreter tool to execute Python code for complex mathematical calculations, data analysis, and chart drawing. They can also develop "advanced digital assistants" with capabilities such as multi-document Q&A and long-form writing.
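To make "teaching a model to use a tool with simple instructions" concrete, here is a minimal sketch of a ReAct-style tool-calling loop in Python. It is an illustration under stated assumptions, not the format Alibaba Cloud documents: the prompt template, the `add` tool, and the helper names `build_prompt`, `parse_action`, and `run_tool` are all hypothetical. The model call itself is left out; only the prompt construction and reply parsing are shown.

```python
import json
import re

# Assumed ReAct-style template; the article does not specify the prompt format.
REACT_TEMPLATE = """Answer the following question as best you can. You have access to the following tools:

{tool_descs}

Use the following format:

Question: the input question you must answer
Thought: what to do next
Action: the tool to use, one of [{tool_names}]
Action Input: the tool arguments as a JSON object
Observation: the result of the tool call
Final Answer: the final answer to the question

Begin!

Question: {query}"""


def build_prompt(tools, query):
    """Describe each tool in plain text and embed the user question."""
    tool_descs = "\n".join(
        f"{t['name']}: {t['description']} Parameters: {json.dumps(t['parameters'])}"
        for t in tools
    )
    tool_names = ", ".join(t["name"] for t in tools)
    return REACT_TEMPLATE.format(
        tool_descs=tool_descs, tool_names=tool_names, query=query
    )


def parse_action(reply):
    """Return (tool_name, arguments) from a model reply, or None if no tool is called."""
    match = re.search(
        r"Action:\s*(.+?)\s*\nAction Input:\s*(.+?)\s*(?:\nObservation:|$)",
        reply,
        re.DOTALL,
    )
    if match is None:
        return None
    # Assumes the model emitted valid JSON; real code should handle decode errors.
    return match.group(1), json.loads(match.group(2))


def run_tool(name, args, registry):
    """Dispatch a parsed tool call to a Python function in the registry."""
    return registry[name](**args)
```

In a full agent, the string returned by `run_tool` would be appended to the prompt after `Observation:` and the model would be called again until it produces a `Final Answer:` line.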

Large language models in the sub-10-billion-parameter class are currently the mainstream choice for developers building and iterating on applications. Qwen-14B further raises the performance ceiling for small models and stands out among models of the same size. It achieves the best results in 12 authoritative evaluations, including MMLU, C-Eval, GSM8K, MATH, and GaoKao-Bench, surpassing all SOTA (state-of-the-art) large models in those evaluations and outperforming Llama-2-13B across the board. It also holds its own against the Llama 2 34B and 70B models. At the same time, Qwen-7B has been fully upgraded, with core metrics improving by up to 22.5%.

Figure 2: Qwen-14B performance surpasses models of the same size

Users can download the models directly from the ModelScope community, or access and call Qwen-14B and Qwen-14B-Chat through the Alibaba Cloud Lingji (DashScope) platform. Alibaba Cloud provides users with a full range of services, including model training, inference, deployment, and fine-tuning.
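As a rough illustration of the download path, the sketch below fetches the chat checkpoint from ModelScope and loads it with Hugging Face `transformers`. It assumes the `modelscope`, `transformers`, and `accelerate` packages are installed and that the machine has enough GPU memory for a 14B model. The helper names `load_chat_model` and `chat_once` are hypothetical, and the `chat` method comes from the custom code shipped with the Qwen chat checkpoints, which is why `trust_remote_code=True` is needed.

```python
# ModelScope model id, taken from the links at the end of this article.
MODEL_ID = "qwen/Qwen-14B-Chat"


def load_chat_model(model_id: str = MODEL_ID):
    """Download the checkpoint from ModelScope and load it with transformers."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    from modelscope import snapshot_download
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_dir = snapshot_download(model_id)  # path of the local cache directory
    tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        model_dir,
        device_map="auto",       # spread layers over available devices (needs accelerate)
        trust_remote_code=True,  # run the model code bundled with the checkpoint
    ).eval()
    return tokenizer, model


def chat_once(prompt: str, history=None):
    """Send one user message and return (response, updated_history)."""
    tokenizer, model = load_chat_model()
    # `chat` is defined by the checkpoint's remote code, not by transformers itself.
    response, history = model.chat(tokenizer, prompt, history=history)
    return response, history
```

Calling `chat_once("Hello")` downloads the weights on first use, so expect a long initial wait. For repeated calls, load the model once with `load_chat_model()` and reuse the returned objects.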

In August, Qwen-7B, the 7-billion-parameter Tongyi Qianwen base model open-sourced by Alibaba Cloud, made the trending lists on HuggingFace and GitHub. In just over a month, cumulative downloads exceeded 1 million. More than 50 models based on Qwen have appeared in the open source community, and many well-known community tools and frameworks have integrated Qwen.

Tongyi Qianwen is the most deeply deployed and widely used large model in China. Several domestic applications with more than 100 million monthly active users have connected to Tongyi Qianwen, and a large number of small and medium-sized enterprises, scientific research institutions, and individual developers are building their own dedicated large models or application products on top of it. Examples include Alibaba's Taobao, DingTalk, and Future Elf, as well as external research institutions and start-ups.

Zhejiang University and Higher Education Press developed Zhihai-Sanle, a vertical large model for education, based on Qwen-7B. It has been deployed in 12 universities across the country and provides intelligent question answering, test question generation, learning navigation, teaching evaluation, and other capabilities. The model is served externally through the Alibaba Cloud Lingji platform and can be called with one line of code. Zhejiang Youlu Robot Technology Co., Ltd. has integrated Qwen-7B into its road-cleaning robots, allowing a robot to interact with users in natural language in real time, understand their requests, analyze and break down high-level instructions, perform high-level logical analysis and task planning, and complete cleaning tasks.

Alibaba Cloud CTO Zhou Jingren said that Alibaba Cloud will continue to embrace open source and openness and to promote the development of China's large-model ecosystem. Alibaba Cloud firmly believes in the power of open source and has taken the lead in open-sourcing its self-developed large models, hoping to bring large-model technology to small and medium-sized enterprises and individual developers faster.

Alibaba Cloud also led the creation of ModelScope, China's largest open source community for AI models, bringing the whole industry together to make large-model technology more accessible and more widely applied. Over the past two months, model downloads in the ModelScope community have soared from 45 million to 85 million, nearly doubling.

Attached:

ModelScope community model addresses:

https://www.modelscope.cn/models/qwen/Qwen-14B-Chat/summary

https://www.modelscope.cn/models/qwen/Qwen-14B/summary

ModelScope community model demo:

https://modelscope.cn/studios/qwen/Qwen-14B-Chat-Demo/summary

Alibaba Cloud Lingji (DashScope) platform addresses:

https://help.aliyun.com/zh/dashscope/developer-reference/tongyi-qianwen-7b-14b-api-detailes

https://dashscope.console.aliyun.com/model

Qwen paper address:

https://qianwen-res.oss-cn-beijing.aliyuncs.com/QWEN_TECHNICAL_REPORT.pdf

GitHub:

https://github.com/QwenLM/Qwen

HuggingFace:

https://huggingface.co/Qwen/Qwen-14B

https://huggingface.co/Qwen/Qwen-14B-Chat

Origin blog.csdn.net/FL63Zv9Zou86950w/article/details/133278586