Breaking! Microsoft's new work LongNet: Scaling Transformers to 1 billion tokens


Reprinted from: Heart of the Machine

Sequence length has been extended to 1 billion tokens. Could the entire Internet one day be processed as a single sequence?

As everyone keeps upgrading and iterating their own large models, the context window an LLM (Large Language Model) can handle has become an important evaluation metric.

For example, the star model GPT-4 supports 32k tokens, equivalent to about 50 pages of text; Anthropic, founded by former OpenAI members, has pushed Claude's context to 100k tokens, about 75,000 words, roughly enough to summarize the first Harry Potter book in one pass.

In a recent study, Microsoft extended the Transformer directly to 1 billion tokens. This opens up new possibilities for modeling very long sequences, such as treating an entire corpus, or even the entire Internet, as one sequence.

For comparison, an average person can read 100,000 tokens in about 5 hours, and may take even longer to digest, memorize, and analyze that information; Claude does it in under a minute. Scaled up to the sequence lengths in Microsoft's work, the numbers are staggering.


  • Paper address: https://arxiv.org/pdf/2307.02486.pdf

  • Project address: https://github.com/microsoft/unilm/tree/master

Specifically, the study proposes LONGNET, a Transformer variant that can scale sequence length to more than 1 billion tokens without sacrificing performance on shorter sequences. The paper also proposes dilated attention, whose attentive field expands exponentially as the distance between tokens grows.

LONGNET has the following advantages:

1) It has linear computational complexity;

2) It can serve as a distributed trainer for extremely long sequences;

3) Dilated attention is a drop-in replacement for standard attention and integrates seamlessly with existing Transformer-based optimization methods.

Experimental results show that LONGNET exhibits strong performance on both long sequence modeling and general language tasks.

On the research motivation, the paper notes that scaling neural networks has become a trend in recent years, and many well-performing networks have been studied. Ideally, sequence length, as one axis of a neural network, would be unlimited; in reality it is usually the opposite, so breaking the sequence-length limit brings significant advantages:

  • First, it provides the model with a large memory capacity and receptive field, enabling it to interact effectively with humans and the world.

  • Second, longer contexts contain more complex causal relationships and reasoning paths in the training data that models can exploit. In contrast, shorter dependencies introduce more spurious correlations, which hurts generalization.

  • Third, longer sequence lengths can help models explore longer contexts, and extremely long contexts can also help models mitigate catastrophic forgetting.

However, the main challenge in extending sequence length is finding the right balance between computational complexity and model expressive power.

For example, RNN-style models have been used to increase sequence length, but their sequential nature limits parallelization during training, which is crucial for modeling long sequences.

More recently, state-space models have become attractive for sequence modeling: they can run as CNNs during training and convert to efficient RNNs at test time. However, such models do not perform as well as Transformers at regular lengths.

Another way to extend sequence length is to reduce the Transformer's complexity, namely the quadratic complexity of self-attention. Several efficient Transformer variants have been proposed, including low-rank attention, kernel-based methods, downsampling methods, and retrieval-based methods. However, none of these has scaled the Transformer to 1 billion tokens (see Figure 1).

[Figure 1: trend of Transformer sequence lengths over time]

The following table compares the computational complexity of different methods, where N is the sequence length and d is the hidden dimension.

Recurrent: O(Nd^2)
Vanilla attention: O(N^2 d)
Sparse attention: O(N√N d)
Dilated attention (LONGNET): O(Nd)
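To get a feel for the gap, here is a small back-of-envelope script (an illustration written for this post, not taken from the paper) comparing the leading terms N^2 d and Nd at a few sequence lengths:

```python
# Rough comparison of the leading cost terms from the table above.
# Order-of-magnitude illustration only, not measured FLOPs.

def vanilla_attention_cost(n, d):
    """O(N^2 d): every token attends to every other token."""
    return n * n * d

def dilated_attention_cost(n, d):
    """O(N d): cost grows linearly with sequence length."""
    return n * d

d = 4096  # hidden dimension (illustrative value)
for n in [32_000, 1_000_000, 1_000_000_000]:
    ratio = vanilla_attention_cost(n, d) / dilated_attention_cost(n, d)
    print(f"N = {n:>13,}: quadratic cost is about {ratio:,.0f}x the linear cost")
```

At 1 billion tokens the quadratic term is a billion times larger than the linear one, which is why reducing attention complexity is the crux of the method.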

Method

The study's solution, LONGNET, successfully extends the sequence length to 1 billion tokens. Specifically, it proposes a new component called dilated attention and uses it to replace the attention mechanism of the vanilla Transformer. The general design principle is that the attention allocation decreases exponentially as the distance between tokens grows. The study shows that this design achieves linear computational complexity and a logarithmic dependency between tokens, resolving the tension between limited attention resources and access to every token.

[Figure: building blocks of dilated attention]
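Following that design principle, a single dilated-attention pattern can be sketched as below. This is a minimal, non-causal sketch written for this post: the names dilated_attention, segment_length, and dilation_rate are illustrative, and the actual LONGNET implementation mixes several segment-length/dilation pairs, applies a causal mask, weights the patterns when combining them, and processes segments in parallel rather than in a Python loop.

```python
import torch

def dilated_attention(q, k, v, segment_length, dilation_rate):
    """Sketch of one dilated-attention pattern: split the sequence into
    segments, keep every r-th position inside each segment, run standard
    attention on the sparsified segment, then scatter the results back.
    q, k, v: (batch, seq_len, dim); seq_len assumed divisible by segment_length.
    """
    _, n, d = q.shape
    w, r = segment_length, dilation_rate
    out = torch.zeros_like(q)

    for start in range(0, n, w):                  # one segment at a time
        idx = torch.arange(start, start + w, r)   # dilated positions in this segment
        qs, ks, vs = q[:, idx], k[:, idx], v[:, idx]
        attn = torch.softmax(qs @ ks.transpose(-2, -1) / d ** 0.5, dim=-1)
        out[:, idx] = attn @ vs                   # scatter back to original positions
    return out
```

Because each position only attends within its sparsified segment, the total cost grows with the sequence length rather than with its square.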

In implementation, LONGNET can be converted into a dense Transformer, so it seamlessly supports existing optimizations for Transformers (such as kernel fusion, quantization, and distributed training). Taking advantage of the linear complexity, LONGNET can be trained in parallel across nodes, using distributed algorithms to break the compute and memory constraints.

In the end, the study efficiently scales the sequence length to 1B tokens with almost constant runtime, as shown in the figure below, whereas the runtime of the vanilla Transformer suffers from quadratic complexity.

[Figure: runtime of dilated attention vs. vanilla attention as the sequence length grows]

The study further introduces a multi-head dilated attention mechanism. As shown in Figure 3 below, the computation differs across heads because different parts of the query-key-value pairs are sparsified.

[Figure 3: dilated attention with multiple heads]
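A minimal sketch of the idea, written for this post: each head gets a different offset into the dilated pattern (here simply head % dilation_rate, which follows the spirit of the paper's shifting scheme but may differ in detail), so together the heads cover complementary subsets of positions.

```python
import torch

def multihead_dilated_indices(segment_length, dilation_rate, num_heads):
    """Sketch: give each head a different offset into the dilated pattern of a
    single segment, so the heads jointly cover different subsets of positions."""
    indices = []
    for head in range(num_heads):
        offset = head % dilation_rate              # heads cycle through offsets
        idx = torch.arange(offset, segment_length, dilation_rate)
        indices.append(idx)
    return indices

# Example: a 16-token segment, dilation 4, 4 heads -> disjoint position sets.
for h, idx in enumerate(multihead_dilated_indices(16, 4, 4)):
    print(f"head {h}: {idx.tolist()}")
```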

Distributed training

Although the computational complexity of dilated attention has been greatly reduced to O(Nd), it is still infeasible to scale the sequence length to the millions on a single GPU because of compute and memory constraints. There are distributed training algorithms for large-scale models, such as model parallelism [SPP+19], sequence parallelism [LXLY21, KCL+22], and pipeline parallelism [HCB+19], but these are not sufficient for LONGNET, especially when the sequence dimension is extremely large.

This research exploits LONGNET's linear computational complexity to distribute training along the sequence dimension. Figure 4 below shows the distributed algorithm on two GPUs, which can be further extended to any number of devices.

[Figure 4: distributed training of LONGNET across two GPU devices]
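Conceptually, the algorithm splits the input along the sequence dimension, keeps queries local to each device, and only communicates keys and values when an attention segment spans more than one device. The sketch below illustrates this with torch.distributed; it is a conceptual sketch written for this post, assumes an already-initialized process group, and glosses over the fact that the paper gathers the already-sparsified keys/values, which is what keeps the communication cost constant as more devices are added.

```python
import torch
import torch.distributed as dist

def sequence_parallel_attention(q_local, k_local, v_local, segment_length):
    """Conceptual sketch of sequence parallelism: each rank holds a contiguous
    chunk of the sequence. If the attention segment fits inside the local chunk,
    attention is purely local; otherwise keys/values are all-gathered across
    ranks while queries (and outputs) stay local.
    Shapes: (batch, local_len, dim)."""
    _, local_len, d = q_local.shape

    if segment_length <= local_len:
        k, v = k_local, v_local                       # no communication needed
    else:
        world = dist.get_world_size()
        k_list = [torch.empty_like(k_local) for _ in range(world)]
        v_list = [torch.empty_like(v_local) for _ in range(world)]
        dist.all_gather(k_list, k_local)              # collect K from every rank
        dist.all_gather(v_list, v_local)              # collect V from every rank
        k, v = torch.cat(k_list, dim=1), torch.cat(v_list, dim=1)

    attn = torch.softmax(q_local @ k.transpose(-2, -1) / d ** 0.5, dim=-1)
    return attn @ v                                   # output stays on this rank
```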

Experiments

The study compares LONGNET with the vanilla Transformer and the sparse Transformer. The architectures differ only in the attention layer; the other layers are the same. The researchers scale the sequence length of these models from 2K to 32K while reducing the batch size to keep the number of tokens per batch constant.
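As a concrete illustration of that batching rule (the 2K-token, batch-size-256 baseline below is an assumed example, not a number from the paper):

```python
# Keep the number of tokens per batch fixed while the sequence length grows,
# so every setting sees the same amount of data per optimization step.
tokens_per_batch = 2_048 * 256   # assumed baseline: 2K-token sequences, batch size 256
for seq_len in [2_048, 8_192, 32_768]:
    batch_size = tokens_per_batch // seq_len
    print(f"sequence length {seq_len:>6}: batch size {batch_size}")
```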

Table 2 summarizes the results of these models on the Stack dataset, using perplexity as the evaluation metric. The models are tested with sequence lengths ranging from 2k to 32k. When the input length exceeds the maximum length the model supports, the study applies blockwise causal attention (BCA) [SDP+22], a state-of-the-art extrapolation method for language model inference.

In addition, the study removes absolute position encoding. First, the results show that increasing the sequence length during training generally leads to a better language model. Second, extrapolating the sequence length at inference does not work when the length is much larger than what the model supports. Finally, LONGNET consistently outperforms the baselines, demonstrating its effectiveness in language modeling.

[Table 2: perplexity of LONGNET and baselines on the Stack dataset at different sequence lengths]
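As a rough illustration of this kind of evaluation (a simplification written for this post, not the exact BCA procedure of [SDP+22]), the sketch below scores a long token sequence block by block against a hypothetical model(x) that returns next-token logits:

```python
import math
import torch

@torch.no_grad()
def blockwise_perplexity(model, token_ids, max_len):
    """Score a 1-D tensor of token ids in blocks of at most `max_len` inputs,
    so sequences longer than the model's supported length can still be
    evaluated. `model(x)` is assumed to return logits of shape (1, len, vocab)."""
    total_nll, total_tokens = 0.0, 0
    for start in range(0, token_ids.size(0) - 1, max_len):
        block = token_ids[start:start + max_len + 1].unsqueeze(0)  # inputs + shifted targets
        logits = model(block[:, :-1])
        log_probs = torch.log_softmax(logits, dim=-1)
        targets = block[:, 1:]
        nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)
        total_nll += nll.sum().item()
        total_tokens += targets.numel()
    return math.exp(total_nll / total_tokens)
```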

Sequence length scaling curves

Figure 6 plots the sequence-length scaling curves of the vanilla Transformer and LONGNET. Compute is estimated by counting the total FLOPs of matrix multiplications. The results show that both the vanilla Transformer and LONGNET benefit from a larger context length during training, but LONGNET scales the context length more efficiently, reaching a lower test loss with less compute. This demonstrates the advantage of longer training inputs over extrapolation and shows that LONGNET is a more effective way to extend the context length of language models, because it learns longer dependencies more efficiently.

[Figure 6: sequence length scaling curves of the vanilla Transformer and LONGNET]

Scaling up model size

An important property of large language models is that the loss scales as a power law as compute increases. To verify whether LONGNET still follows a similar scaling law, the study trains a series of models with different sizes, from 125 million to 2.7 billion parameters. The 2.7 billion model is trained with 300B tokens, while the rest use about 400B tokens. Figure 7(a) plots LONGNET's scaling curve with respect to compute, with perplexity measured on the same test set. This shows that LONGNET still follows the power law, which means a dense Transformer is not a prerequisite for scaling language models, and that LONGNET achieves both scalability and efficiency.

[Figure 7: (a) scaling curve of LONGNET with respect to compute; (b) test loss as the context window in the prompt grows]
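As an illustration of what "following a power law" means here, the sketch below fits L(C) = a · C^(-b) in log-log space; the (compute, loss) pairs are made-up placeholders, not values from the paper.

```python
import numpy as np

compute = np.array([1e18, 1e19, 1e20, 1e21])   # training FLOPs (made-up values)
loss = np.array([3.2, 2.9, 2.6, 2.35])          # test loss (made-up values)

# A power law is a straight line in log-log space: log L = log a - b * log C.
slope, intercept = np.polyfit(np.log(compute), np.log(loss), deg=1)
a, b = np.exp(intercept), -slope
print(f"fitted power law: L(C) ~ {a:.2f} * C^(-{b:.3f})")
```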

Long-context prompting

Prompting is an important way to guide a language model and provide it with additional information. This study experimentally verifies whether LONGNET benefits from a longer context window in the prompt.

The study reserves a prefix as the prompt and tests the perplexity of the suffix, gradually extending the prompt from 2K to 32K. For a fair comparison, the length of the suffix is kept constant while the length of the prefix is increased up to the model's maximum length. Figure 7(b) reports the results on the test set: the test loss of LONGNET gradually decreases as the context window grows, demonstrating LONGNET's ability to make full use of long context to improve a language model.
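A minimal sketch of this prefix/suffix evaluation, assuming the same hypothetical model(x) interface as above (next-token logits of shape (1, len, vocab)); only the suffix tokens are scored, so a longer prefix can only help through the extra context it provides:

```python
import math
import torch

@torch.no_grad()
def suffix_perplexity(model, prefix_ids, suffix_ids):
    """Feed prefix + suffix through the model and compute perplexity over the
    suffix tokens only. prefix_ids and suffix_ids are 1-D tensors of token ids."""
    ids = torch.cat([prefix_ids, suffix_ids]).unsqueeze(0)
    logits = model(ids[:, :-1])
    log_probs = torch.log_softmax(logits, dim=-1)
    targets = ids[:, 1:]
    nll = -log_probs.gather(-1, targets.unsqueeze(-1)).squeeze(-1)   # (1, len-1)
    suffix_nll = nll[:, prefix_ids.size(0) - 1:]   # positions that predict suffix tokens
    return math.exp(suffix_nll.mean().item())
```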


Origin blog.csdn.net/amusi1994/article/details/131606913