Writer, the ChatGPT rival that raised $100 million, open-sources its models: 8 models, available for commercial use

Last Tuesday, "AIGC Open Community" introduced Writer, a generative AI platform that raised US$100 million. Raising a total of US$126 million in just three years and becoming one of ChatGPT's main competitors would not have been possible without strong technology. It also suggests that the company's models have successful real-world applications and are recognized by investors and users.

Writer has now open-sourced Palmyra, the large language model family it uses, on Hugging Face. There are 8 models, including small, base, 20b-chat, Instruct-20b, and med-20b, all of which are available for commercial use and support fine-tuning on your own data.

Open source address: https://huggingface.co/Writer
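
For readers who want to try one of these checkpoints locally, the following is a minimal sketch rather than Writer's official usage. It assumes the `transformers` and `torch` packages are installed and that the checkpoints load through the generic causal-LM classes. The repository name used in the usage note is an assumption; check the page above for the exact names.

```python
ORG = "Writer"  # organization name taken from the URL above


def repo_id(model_name: str) -> str:
    """Build a Hugging Face repository id such as 'Writer/<model_name>'."""
    return f"{ORG}/{model_name}"


def generate(model_name: str, prompt: str, max_new_tokens: int = 40) -> str:
    """Download a checkpoint and let it continue the prompt.

    Assumes `transformers` and `torch` are installed and that the
    repository exists under the Writer organization.
    """
    # Imported lazily so repo_id() works without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id(model_name))
    model = AutoModelForCausalLM.from_pretrained(repo_id(model_name))
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

A call such as `generate("palmyra-small", "Write a tagline for a bakery:")` would download the weights on first use. The 20b variants need far more memory, so starting with the smallest model is the safer choice.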

Online free trial address: https://app.writer.com/organization/

Palmyra's technical highlights include: a small parameter count with strong capabilities, which helps small and medium-sized enterprises and individual developers with limited computing resources; training on business writing and marketing data, aimed mainly at enterprise users; and enterprise-grade data security with multiple built-in safety guardrails.

In addition to generating text, it can summarize the content of videos, PDFs, and audio. It also supports fine-tuning, so enterprises can build their own "ChatGPT"-style assistants.

Below, "AIGC Open Community" introduces several notable Palmyra models.

InstructPalmyra-20b

This is an instruction-tuned model built on the Palmyra-20b base model, designed for advanced natural language processing tasks and customized use cases.

The InstructPalmyra-20b model was trained on a data set of approximately 70,000 instruction-response records, generated by Writer's in-house language modeling and fine-tuning team.

InstructPalmyra-20b has an excellent ability to process complex instructions and generate accurate, contextual responses. This makes it an ideal model for developing a wide range of applications such as virtual assistants, customer support, content generation, and more.

In addition, the model's comprehensive training enables it to adapt and perform well under different conditions and contexts, further expanding its potential use cases.

Palmyra-Med-20b

Palmyra-Med is Writer's model built specifically for the healthcare industry, instruction fine-tuned on medical data.

Palmyra-Med achieved the top score on PubMedQA, a leading biomedical question-answering benchmark, with an accuracy of 81.1%, outperforming GPT-4 and medically trained human testers.
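
As a reminder of what an accuracy figure like this measures, here is a small sketch of exact-match accuracy over labeled answers. The labels below are made up for illustration and are not PubMedQA data; the only fact assumed about the benchmark is that its answers are yes/no/maybe.

```python
def accuracy(predictions, references):
    """Fraction of predictions that exactly match the reference labels."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        raise ValueError("cannot score an empty evaluation set")
    correct = sum(p == r for p, r in zip(predictions, references))
    return correct / len(references)


# Made-up labels for illustration only.
gold = ["yes", "no", "maybe", "yes", "no"]
pred = ["yes", "no", "yes", "yes", "no"]
print(f"accuracy = {accuracy(pred, gold):.1%}")  # accuracy = 80.0%
```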

It can translate specialized medical terminology, summarize medical notes, analyze large volumes of medical data, and automatically generate medical insights.

Palmyra Large 20B

Palmyra-Large is a causal decoder model built by Writer, trained on 800 billion tokens of Palmyra-Index-Data, a high-quality corpus.

Palmyra Large is pre-trained with a causal language modeling (CLM) objective. Like GPT-3, it learns in a self-supervised way by predicting each token from the tokens that precede it.
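
To make the CLM objective concrete, the sketch below computes the average next-token negative log-likelihood for a toy sequence in plain Python. It illustrates the general objective only, not Writer's training code; the function names and the three-token vocabulary are invented for the example.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of scores."""
    peak = max(logits)
    log_total = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return [x - log_total for x in logits]


def causal_lm_loss(token_ids, logits_per_position):
    """Average next-token negative log-likelihood.

    logits_per_position[t] holds the scores over the vocabulary after the
    model has seen token_ids[: t + 1]; its target is token_ids[t + 1].
    """
    targets = token_ids[1:]  # shift by one: each position predicts the next token
    if not targets:
        raise ValueError("need at least two tokens")
    if len(logits_per_position) < len(targets):
        raise ValueError("need one logit vector per predicted token")
    total = 0.0
    for logits, target in zip(logits_per_position, targets):
        total -= log_softmax(logits)[target]
    return total / len(targets)


# Toy vocabulary of 3 tokens; uniform logits mean the model is guessing.
tokens = [0, 2, 1]
uniform = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
print(causal_lm_loss(tokens, uniform))  # ln(3), about 1.0986
```

Because the targets are just the input shifted by one position, no human labels are needed, which is why the objective counts as self-supervised.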

This model runs quickly and consumes few resources. It is suited to building tailor-made AI assistants for business scenarios such as medical care, marketing, IT, design, and human resources.

Performance evaluation

Palmyra received the highest score on Stanford HELM, surpassing well-known open source models such as Falcon 40B and LLaMA-30B. HELM is a widely used benchmark platform from Stanford University's Center for Research on Foundation Models (CRFM).

Palmyra ranked first on several important tests, scoring 60.9% on Massive Multi-Task Language Understanding (MMLU), 89.6% on BoolQ, and 79.0% on NaturalQuestions.

Palmyra ranked second in two other key tests, with a contextual Q&A score of 49.7% and a TruthfulQA score of 61.6%. The overall performance is very strong.

In short, developers who want to commercialize large language models would do well to study Palmyra's architecture and capabilities and learn from its success.

The material of this article comes from Writer’s official website. If there is any infringement, please contact us to delete it.

Origin blog.csdn.net/weixin_57291105/article/details/133272143