Large language models
Large language models are among the most important developments in machine learning and natural language processing in recent years. This section traces their evolution using OpenAI's GPT series as an example.
The GPT series is based on the Transformer architecture and is designed to understand and generate human language. These models learn the patterns and structure of language by pre-training on large amounts of text, and can then be fine-tuned for specific tasks such as text classification, sentiment analysis, and question answering. They have demonstrated a remarkable ability to capture complex semantic relationships and long-range dependencies, and have driven much of the recent progress in natural language processing.
GPT-1: Released in 2018, GPT-1 was OpenAI's first language model built on the Transformer architecture, with 117 million parameters. It was trained to generate fluent, coherent text and performed well on a variety of language tasks, but tended to produce repetitive output on prompts or long passages outside the range of its training data.
GPT-2: Released in 2019, GPT-2 scaled up to 1.5 billion parameters, far larger than GPT-1. It showed clear improvements on many natural language processing tasks and could generate more coherent, realistic text, but still struggled with tasks requiring complex reasoning or sustained use of context.
GPT-3: Released in 2020, GPT-3 has 175 billion parameters, more than 100 times larger than GPT-2 and over a thousand times larger than GPT-1. GPT-3 can produce sophisticated responses across a range of natural language processing tasks with few or even no task-specific examples. It still has notable problems, however: it can return biased, inaccurate, or inappropriate responses, or generate text entirely unrelated to the prompt, showing that the model still has difficulty with context and background knowledge.
GPT-4: Released on March 14, 2023, GPT-4 improves significantly on GPT-3. Although the details of its training data and architecture have not been disclosed, GPT-4 clearly builds on the strengths of GPT-3 and overcomes some of its limitations.
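The pre-training objective shared by every GPT-scale model above is next-token prediction: given the tokens seen so far, predict the most likely next one. A minimal sketch of that idea, using a toy character-level bigram model in plain Python (the corpus and function names are illustrative, not from any real model):

```python
from collections import defaultdict

def train_bigram_lm(text):
    """Count successor frequencies: an empirical P(next char | current char)."""
    counts = defaultdict(lambda: defaultdict(int))
    for cur, nxt in zip(text, text[1:]):
        counts[cur][nxt] += 1
    return counts

def predict_next(counts, ch):
    """Greedy decoding: return the most frequent successor, or None if unseen."""
    followers = counts.get(ch)
    if not followers:
        return None
    return max(followers, key=followers.get)

# "Pre-training" on a tiny corpus
corpus = "the cat sat on the mat, the dog ate the treat"
model = train_bigram_lm(corpus)
print(predict_next(model, "t"))  # 'h' — the most frequent successor of 't' in this corpus
```

Real GPT models replace the count table with a Transformer over subword tokens and sample from a predicted probability distribution instead of taking the argmax, but the training signal — predict the next token — is the same.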
Large language models in China and abroad
Large model list
No. | Company | Model | Location | Category | Official site | Notes |
---|---|---|---|---|---|---|
1 | Baidu | ERNIE Bot (Wenxin Yiyan), Lingyi Bot | Beijing | general | ✔ | Trial requires an account; mobile app available |
2 | Alibaba Cloud | Tongyi Qianwen, Qwen-7B | Hangzhou, Zhejiang | general | ✔ | Trial requires an account; the 7B models Qwen-7B and Qwen-7B-Chat are open source |
3 | iFLYTEK | Spark (Xinghuo) | Hefei, Anhui | general | ✔ | Trial requires an account; mobile app available |
4 | DataGrand | Cao Zhi | Shanghai | finance, industrial | ✔ | Trial requires an account |
5 | Fudan University | MOSS | Shanghai | research | ✔ | Trial requires an account |
6 | Tsinghua University | ChatGLM, NowcastNet | Beijing | research | ✔ | Open-sourced ChatGLM-6B and ChatGLM2-6B (with Zhipu AI); NowcastNet is a weather-nowcasting model |
7 | Huawei | Pangu, Pangu-Weather, Pangu-Σ | Shenzhen, Guangdong | industry | ✔ | Huawei + Peng Cheng Laboratory; Huawei Cloud Pangu |
8 | Beijing Academy of Artificial Intelligence (BAAI) | Wudao Aquila, Wudao EMU | Beijing | general | ✔ | Wudao 3.0 and its vision models; Aquila series (Aquila-7B, AquilaChat-7B, AquilaCode-7B-NV, AquilaCode-7B-TS) on HuggingFace; EMU based on LLaMA |
9 | Zhejiang University | Qizhen, PromptProtein, TableGPT | Hangzhou, Zhejiang | vertical | ✔ | The Qizhen medical model ships in three versions, based on LLaMA-7B, CaMA-13B, and ChatGLM-6B; PromptProtein targets protein modeling |
10 | Baichuan Intelligence | Baichuan-7B, Baichuan-13B | Beijing | general | ✔ | Model downloads: Baichuan-13B-Base, Baichuan-13B-Chat, Baichuan-7B; open source and commercially usable |
11 | Shanghai AI Laboratory | InternLM (Shusheng Puyu), OpenMEDLab Puyi | Shanghai | general & vertical | ✔ | Technical report; open-sourced InternLM-7B; model weights on HuggingFace |
12 | Beike (Lianjia) | BELLE | Beijing | vertical | ✔ | Multiple models based on BLOOMZ or LLaMA |
13 | Harbin Institute of Technology | BenCao (本草), HuoZi (活字) | Harbin, Heilongjiang | medicine | ✔ | BenCao (medical) is based on LLaMA; the companion Med-ChatGLM is based on ChatGLM; HuoZi is based on BLOOM-7B |
14 | Unisound | Shanhai (山海) | Beijing | medicine | ✔ | |
15 | OpenBMB | CPM, CPM-Bee | Beijing | general | ✔ | ModelBest (Mianbi Intelligence); CPM-Bee-10B |
16 | CUHK-Shenzhen | HuatuoGPT, Phoenix | Shenzhen, Guangdong | medicine | ✔ | Chinese University of Hong Kong (Shenzhen) and Shenzhen Research Institute of Big Data; demo available; both Huatuo and Phoenix are based on BLOOMZ |
17 | XVERSE Technology | XVERSE-13B | Shenzhen, Guangdong | general | ✔ | Model download available |
18 | Tigerobo | TigerBot | Shanghai | finance | ✔ | Based on BLOOM |
19 | Northeastern University | TechGPT, PICA | Shenyang, Liaoning | research | ✔ | TechGPT derives from BELLE (itself LLaMA-based), for knowledge-graph construction and reading-comprehension QA; PICA is an emotional-support model based on ChatGLM2-6B |
20 | Shanghai Jiao Tong University | K2, Magnolia (Baiyulan) | Shanghai | K2: earth science; Magnolia: science | ✔ | Demo; GeoLLaMA, based on LLaMA; on HuggingFace |
21 | IDEA Research Institute | Fengshenbang, MindBot | Shenzhen, Guangdong | general | ✔ | Ziya (Jiang Ziya) model series |
22 | Du Xiaoman | XuanYuan (轩辕) | Beijing | finance | ✔ | Based on BLOOM |
23 | 360 | Zhinao (智脑), vision model | Beijing | general | ✔ | |
24 | iWrite Technology | Anima | Hangzhou, Zhejiang | marketing | ✔ | Based on Guanaco (itself LLaMA-based), fine-tuned with QLoRA |
25 | School of Information Engineering, Peking University | ChatLaw | Beijing | law | ✔ | ChatLaw-13B is based on Ziya-LLaMA-13B-v1 (LLaMA); ChatLaw-33B on Anima-33B (Guanaco, LLaMA-based) |
26 | Institute of Automation, Chinese Academy of Sciences | Zidong Taichu (紫东·太初) | Beijing | general | ✔ | Zidong Taichu 2.0 claims 100B parameters and full multimodality |
27 | Institute of Computing Technology, Chinese Academy of Sciences | BaiLing (百聆) | Beijing | research | ✔ | Based on LLaMA; 7B and 13B weight diffs available for download; demo |
28 | Chengdu Institute of Computer Applications, Chinese Academy of Sciences | Cornucopia (聚宝盆) | Chengdu, Sichuan | finance | ✔ | Finance model based on LLaMA |
29 | Xiaoduo Technology + National Supercomputing Center in Chengdu | XPT (晓模型) | Chengdu, Sichuan | customer service | ✔ | Trial by application |
30 | NetEase Youdao | Ziyue (子曰) | Beijing | education | ✔ | Youdao Speed Reading is recommended; a handy tool for reading papers |
31 | Beijing Language and Culture University | TaoLi (桃李) | Beijing | education | ✔ | Based on LLaMA; joint work with Tsinghua, Northeastern, and Beijing Jiaotong University |
32 | South China University of Technology | BianQue (扁鹊), SoulChat (灵心) | Guangzhou, Guangdong | medicine | ✔ | |
33 | SenseTime | SenseNova (日日新) | Shanghai | general | ✔ | |
34 | National Supercomputer Center of Tianjin | Tianhe Tianyuan (天河天元) | Tianjin | general | ✘ | |
35 | Beijing Jiaotong University | TransGPT (致远) | Beijing | transportation | ✔ | TransGPT·Zhiyuan, based on LLaMA-7B |
36 | Hundsun | LightGPT | Hangzhou, Zhejiang | finance | ✘ | |
37 | MiniMax (Xiyu Technology) | MiniMax | Shanghai | general | ✔ | GLOW virtual social app |
38 | Zuoshou Yisheng | Zuoyi GPT (左医GPT) | Beijing | medicine | ✔ | Medical; trial requires a key |
39 | ShanghaiTech University | DoctorGLM | Shanghai | medicine | ✔ | Medical model; paper available |
40 | East China Normal University | EmoGPT, EduChat | Shanghai | education | ✘ | EmoGPT was built with the Shanghai Key Laboratory of Mental Health and Psychological Crisis Intervention and Jingxiang Technology; the education model EduChat is based on BELLE (itself LLaMA-based) |
41 | Transwarp | Wuya (无涯), Qiusuo (求索) | Shanghai | finance | ✘ | Wuya: finance; Qiusuo: big-data analytics |
42 | Macao Polytechnic University | XrayGLM, IvyGPT | Macau | medicine | ✔ | IvyGPT is based on ChatGLM2; XrayGLM on VisualGLM-6B |
43 | Shuhui Shikong (数慧时空) | Changcheng (长城) | Beijing | earth science | ✘ | Natural resources, remote sensing |
44 | Zhonggong Hulian (中工互联) | Zhigong (智工) | Beijing | industry | ✘ | Joint work with the Fudan NLP lab; industrial domain |
45 | Chuangye Heima (创业黑马) | Tianqi (天启) | Beijing | venture capital | ✘ | With 360; services for the sci-tech innovation sector |
46 | Zhuiyi Technology | Bowen (博文) | Shenzhen, Guangdong | customer service | ✘ | |
47 | Athena Eyes (智慧眼) | Bianshi (砭石) | Changsha, Hunan | medicine | ✘ | Medical domain |
48 | Hong Kong University of Science and Technology | Robin (罗宾) | Hong Kong | research | ✔ | Based on LLaMA; HKUST also open-sourced LMFlow |
49 | Kunlun Tech | Tiangong (天工) | Beijing | customer service | ✔ | Co-developed with Singularity AI (Qidian Zhiyuan) |
50 | Zhimei Open Source Research Institute | Zhimei (智媒) | Shenzhen, Guangdong | media | ✔ | Based on LLaMA; aimed at self-media creators |
51 | Medical computing network consortium | Uni-talk | Shanghai | medicine | ✘ | Shanghai Unicom + Huashan Hospital + Shanghai Supercomputer Center + Huawei |
52 | Ant Group | Zhenyi (贞仪) | Hangzhou, Zhejiang | finance | ✘ | Reportedly two variants: language and multimodal |
53 | Silicon Intelligence (硅基智能) | Yandi (炎帝) | Nanjing, Jiangsu | culture & tourism | ✘ | |
54 | Westlake Xinchen | Xihu (西湖) | Hangzhou, Zhejiang | research | ✔ | |
55 | TRS | Tuotian (拓天) | Beijing | media | ✘ | TRSGPT |
56 | TAL Education | MathGPT | Beijing | education | ✘ | Xueersi |
57 | Qingbo Intelligence | Xianwen (先问) | Beijing | agriculture | ✘ | Built on structured data |
58 | Zhizi Engine | Yuan Chengxiang (元乘象) | Nanjing, Jiangsu | customer service | ✔ | |
59 | Tuoshi Technology | Tuoshi (拓世) | Nanchang, Jiangxi | finance | ✘ | |
60 | Recurrent AI | Pangu (盘古) | Beijing | customer service | ✔ | Recurrent AI, Tsinghua University, Huawei |
61 | Huiyan Technology + Tianjin University | Haihe Diting (海河·谛听) | Tianjin | research | ✘ | |
62 | 4Paradigm | Shishuo (式说) | Beijing | customer service | ✔ | |
63 | ByteDance | Grace | Beijing | general | ✘ | Internal codename |
64 | Mobvoi | Xulie Houzi (序列猴子, "Sequence Monkey") | Beijing | marketing | ✔ | |
65 | DataStory | SocialGPT | Guangzhou, Guangdong | social | ✘ | |
66 | CloudWalk | Congrong (从容) | Guangzhou, Guangdong | government | ✔ | |
67 | Inspur | Yuan (源) | Jinan, Shandong | general | ✘ | |
68 | Agricultural Bank of China | ChatABC (小数) | Beijing | finance | ✘ | |
69 | APUS | AiLMe (天燕) | Beijing | IT operations | ✔ | |
70 | Taiwan Web Service (台智云) | FFM (福尔摩斯, "Holmes") | Taiwan | industry | ✔ | ASUS subsidiary |
71 | MedLinker | medGPT | Chengdu, Sichuan | medicine | ✘ | |
72 | China Telecom Digital Intelligence | Xinghe (星河) | Beijing | telecom | ✘ | General-purpose vision; China Telecom |
73 | iDeepWise | Dongni | Beijing | media | ✔ | |
74 | Memect (文因互联) | Wenyin (文因) | Hefei, Anhui | finance | ✘ | Finance model |
75 | Yinxiang Biji (印象笔记) | Daxiang GPT (大象GPT) | Beijing | media | ✘ | |
76 | Golaxy (中科闻歌) | YaYi (雅意) | Beijing | media | ✘ | |
77 | Langboat Technology | Mengzi (孟子) | Beijing | finance | ✔ | |
78 | JD.com | Yanxi (言犀) | Beijing | commerce | ✘ | |
79 | Zhizhen Intelligence | Huazang (华藏) | Shanghai | customer service | ✘ | Xiaoi Robot |
80 | H3C | Baiye Lingxi (百业灵犀) | Hangzhou, Zhejiang | industry | ✘ | |
81 | Peng Cheng Laboratory | Peng Cheng Mind (鹏城·脑海) | Shenzhen, Guangdong | research | ✘ | |
82 | Uniview | Wutong (梧桐) | Hangzhou, Zhejiang | IT operations | ✘ | AIoT industry |
83 | Lixiang Technology (理想科技) | Dao (大道) | Beijing | IT operations | ✘ | IT-operations model |
84 | Meiya Pico | Tianqing (天擎) | Xiamen, Fujian | security | ✘ | Public security |
85 | Sailingli Technology (赛灵力科技) | Darwin (达尔文) | Guangzhou, Guangdong | medicine | ✘ | Sailingli, Tsinghua Pearl River Delta Research Institute, Cyagen Biosciences, and the Greater Bay Area sci-tech innovation service center |
86 | Intelligent Indeed (实在智能) | TARS (塔斯) | Hangzhou, Zhejiang | customer service | ✘ | |
87 | PCI Technology (佳都科技) | Jiadu Zhixing (佳都知行) | Guangzhou, Guangdong | transportation | ✘ | Transportation domain |
88 | Zhihu | Zhihaitu (知海图) | Beijing | media | ✘ | In partnership with ModelBest (面壁科技) |
89 | NetEase Fuxi | Yuyan (玉言) | Guangzhou, Guangdong | general | ✘ | |
90 | Qingrui Intelligence | ArynGPT | Suzhou, Jiangsu | education | ✘ | |
91 | Weimob | WAI | Shanghai | commerce | ✔ | |
92 | Northwestern Polytechnical University + Huawei | Qinling·Aoxiang (秦岭·翱翔) | Xi'an, Shaanxi | industry | ✘ | Fluid-dynamics model: turbulence and flow fields |
93 | Singularity AI (奇点智源) | Tiangong Zhili (天工智力) | Beijing | general | ✔ | Yaoguang and Tianshu |
94 | Linker Technology (联汇科技) | Om (欧姆) | Hangzhou, Zhejiang | general | ✔ | OmModel multimodal (vision-language) model |
95 | China Unicom | Honghu (鸿湖) | Beijing | telecom | ✘ | |
96 | AISpeech | DFM-2 | Suzhou, Jiangsu | industry | ✘ | |
97 | ThunderSoft | Rubik (魔方) | Beijing | industry | ✘ | |
98 | CETC Taiji (电科太极) | Xiaoke (小可) | Beijing | government | ✘ | Applications for party, government, and enterprise sectors |
99 | China Mobile | Jiutian (九天) | Beijing | telecom | ✘ | |
100 | China Telecom | TeleChat | Beijing | telecom | ✘ | |
101 | Ronglian Cloud (容联云) | Chitu (赤兔) | Beijing | customer service | ✘ | Customer service, marketing |
102 | Intellifusion | Tianshu (天书) | Shenzhen, Guangdong | government | ✘ | |
103 | Leyan Technology | Leyan (乐言) | Shanghai | customer service | ✘ | |
104 | Shanghai-Chongqing AI Research Institute (沪渝人工智能研究院) | Zhaoyan (兆言) | Chongqing | research | ✘ | Also known as the Chongqing AI Research Institute of Shanghai Jiao Tong University |
105 | China Media Group | Yangshiting (央视听) | Beijing | media | ✘ | CMG Media GPT |
106 | SuperSymmetry Technologies | Qianyuan (乾元) | Beijing | finance | ✔ | |
107 | Midu (蜜度) | Wenxiu (文修) | Shanghai | media | ✘ | Intelligent proofreading |
108 | China Electronics Cloud | Xingzhi (星智) | Wuhan, Hubei | government | ✘ | Government-affairs model |
109 | Li Auto | MindGPT | Beijing | industry | ✘ | |
110 | China Literature | Miaobi (妙笔) | Shanghai | culture & tourism | ✘ | Model for web fiction |
111 | Trip.com | Wendao (问道) | Shanghai | culture & tourism | ✘ | Travel-industry model |
112 | Tencent | Hunyuan (混元) | Shenzhen, Guangdong | general | ✘ | |
113 | Ruibo (瑞泊) | VIDYA | Beijing | industry | ✔ | |
114 | Youlianyun (有连云) | Qilin (麒麟) | Shanghai | finance | ✘ | |
115 | WAYZ (维智科技) | CityGPT | Shanghai | public services | ✘ | City-scale model |
116 | Yonyou | YonGPT | Beijing | enterprise services | ✘ | |
117 | Tianyun Data | Elpis | Beijing | finance | ✘ | Securities laws and regulations |
118 | Kidswant | KidsGPT | Nanjing, Jiangsu | education | ✘ | |
119 | Qichacha | Zhibi Alpha (知彼阿尔法) | Suzhou, Jiangsu | commerce | ✘ | |
120 | Jinlifang (今立方) | 12333 | Xiamen, Fujian | government | ✘ | Human resources and social security |
121 | Sunshine Insurance Group | Zhengyan (正言) | Shenzhen, Guangdong | finance | ✘ | |
122 | CETC Digital (电科数字) | Zhiyi (智弈) | Shanghai | water conservancy | ✘ | |
123 | Lingxin Intelligence | CharacterGLM | Beijing | gaming | ✘ | |
124 | Dajing TCM (大经中医) | Qihuang Wendao (岐黄问道) | Nanjing, Jiangsu | medicine | ✘ | |
125 | Mengniu | MENGNIU.GPT | Hohhot, Inner Mongolia | food | ✘ | |
126 | Kuaishangtong (快商通) | Hanchao (汉朝) | Xiamen, Fujian | marketing | ✘ | |
127 | Unittec (众合科技) | UniChat | Hangzhou, Zhejiang | transportation | ✘ | |
128 | Kingdee | Cangqiong (苍穹) | Shenzhen, Guangdong | enterprise services | ✘ | |
129 | Yunwen Technology | Yunzhong Wendao (云中问道) | Nanjing, Jiangsu | marketing | ✘ | Released jointly with the Xi'an Future AI Computing Center |
130 | Tianrang Intelligence | Xiaobai (小白) | Shanghai | general | ✘ | |
131 | Xiaomi | MiLM-6B | Beijing | commerce | ✘ | |
132 | Changhong | Changhong Chaonao (长虹超脑) | Mianyang, Sichuan | media | ✘ | |
Large models outside China
Company | Model | Notes |
---|---|---|
OpenAI | ChatGPT | |
Microsoft | Bing Chat | |
Google | PaLM 2, Bard, Gemini | Bard supports image input |
Anthropic | Claude | Claude 2 can ingest PDF, TXT, CSV, and other files for analysis, summarization, and Q&A |
Meta | LLaMA,LLaMA-2 | |
Stability AI | StableLM | |
Amazon | Titan | |
Bloomberg | BloombergGPT | |
MosaicML | MPT | |
Intel | Aurora genAI | |
UC Berkeley, Microsoft Research | Gorilla | |
inflection.ai | Inflection-1 | |
xAI | | From OpenAI to xAI |
Cohere | Cohere | |
Scale AI | Scale | |
Character AI | Character | |
Colossal-AI | ColossalChat |