Is Baidu's large-model-based coding assistant "Comate" really easy to use?

Institute of Computer Vision

On June 6, at the Wenxin Large Model Technology Exchange Conference (Chengdu), Baidu Smart Cloud launched the "Comate" code assistant and officially opened invitation-only testing. Drawing on the understanding and reasoning capabilities of the Wenxin large model, "Comate" offers fast code completion, code recommendation from natural language, and automatic bug detection, with the aim of improving developers' R&D efficiency across the board. In the future, developers will be able to use the "Comate" code assistant in mainstream development software through plug-ins and other forms. There are already many code assistant tools on the market; will Baidu's stand out?

01

Background

As early as June 2021, to prepare for future large-model training tasks, Baidu Smart Cloud began planning a new high-performance GPU cluster. Together with NVIDIA, it completed an IB (InfiniBand) network architecture design that can accommodate 10,000 or more cards. Every GPU card across the cluster's nodes is connected through the IB network. Cluster construction was completed in April 2022, providing EFLOPS-level computing power in a single cluster.

In March 2023, Wenxin Yiyan was born on this high-performance cluster, and it has continued to iterate on new capabilities since. The cluster is still expanding. Dr. Lai Junjie, General Manager of Solutions and Engineering at NVIDIA China, said: "GPU clusters interconnected by a high-speed IB network are the key infrastructure in the era of large models. The largest high-performance GPU/IB cluster in the domestic cloud computing market, jointly built by NVIDIA and Baidu Smart Cloud, will accelerate Baidu's breakthroughs in the field of large models."

  • Covering the whole life cycle of large models: more comprehensive

Provides full-featured services for data labeling, model training and evaluation, inference services, and application integration.

  • Significantly improved training and inference performance: more efficient

Training performance on the MLPerf list is world-leading, and both the acceleration of distributed parallel training for 100-billion-parameter models and the utilization of computing power have been greatly improved.

  • Rapid application orchestration and plug-in integration: more open

Comes preset with Baidu Wenxin large models and third-party large models, supports flexible orchestration of plug-ins and applications, and helps large models be applied in multiple scenarios.

  • Built-in sensitive word filtering: more secure

Robust authentication and flow-control security mechanisms, built-in sensitive word filtering, and a double guarantee of machine review plus human review.

Built-in Wenxin large model base

  • Technology leadership

    Knowledge-enhanced large model, unified paradigm supports multiple types of downstream tasks

    Advanced parallel strategy supports large model training, compression and deployment

    Controllable and reliable language understanding and generation capabilities

  • Full scene coverage

    Support dialogue interaction, free question and answer, copywriting and other capabilities

    Covering energy, finance, aerospace, industry, media and other fields

  • Low threshold and easy to use

    One line of code to call the service

    One-click automatic model fine-tuning

    A small amount of data to complete the implementation of multi-scenario AI applications

  • Practical and deployable

    Provides enterprise-level one-stop customer service

    Connects the four-layer architecture of chip + platform + model + application

    Works with multiple partners to achieve end-to-end application deployment

02

Large Model Code Assistant

With the growing demand for digital transformation, AI is being applied in more and more enterprises. The high threshold for AI development, complex and diverse application scenarios, and dependence on scenario-specific annotated data have become challenges for deploying AI at scale, while the emergence of pre-trained large models has brought new opportunities and hope.

As an important lever for government and enterprises in promoting the artificial intelligence industry, large models have shown significant advantages and great potential in generalization, versatility, and transferability across AI tasks such as recognition, understanding, decision-making, and generation. A code assistant that can easily and accurately help programmers with repetitive, simple, and trivial tasks is no longer a fantasy.

More and more developers now rely on this kind of tool. The current mainstream AI programming assistants include GitHub Copilot X, Codeium, Tabnine, Replit Ghostwriter, and Amazon CodeWhisperer.

  • GitHub Copilot X

Copilot X is an upgrade to Copilot, which was released in 2021. It is connected to GPT-4 and adds features such as chat and voice. With Copilot X, you only need to speak and it can write your code, and write the test cases for you along the way. It can also explain code snippets you don't understand and help you debug directly. It is a thoughtful little assistant for programmers.

Following the release of OpenAI's GPT-4 model, GitHub announced GitHub Copilot X, whose AI model uses OpenAI's GPT-4. GitHub Copilot X aims to improve the developer experience: it provides chat and voice interfaces, supports pull requests, answers questions about documentation, and enables a more personalized developer experience through GPT-4. It can explain what a piece of code does, try to fix bugs when they come up, and even generate unit tests along the way.

  • Replit Ghostwriter

Replit Ghostwriter is an AI-based code assistance tool that helps developers quickly write, generate, convert, and explain code, and lets them search for and import open-source code from within the editor. Replit itself is an online integrated development environment (IDE) that supports multiple programming languages, such as Python, JavaScript, and Ruby, allowing developers to create, run, and share code in the browser. Replit also provides multi-person collaboration, version control, and cloud deployment, so developers can easily build and release applications. Ghostwriter is a new Replit feature that leverages OpenAI's GPT-4 model to give developers AI-powered coding assistance.

Now Baidu Smart Cloud has built a new generation of coding assistance tool based on the Wenxin large model: the code assistant Comate!

During development, Comate predicts code by reading the declared function name and combining it with the surrounding context, comments, and code. It automatically fills in repetitive code, while still letting developers review the suggestions and manually edit the suggested code.

It works by learning from leading open-source code in GitHub repositories worldwide, collecting data and trying to find the most relevant code, and it keeps training on the feedback data it receives to improve recommendation accuracy. Its core capabilities are single-line recommendation, multi-line recommendation, and natural-language-to-code conversion.

[Demo animation: single-line recommendation]

[Demo animation: multi-line recommendation]

[Demo animation: natural-language-to-code conversion]
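
To make the three capabilities concrete, here is a small hand-written Python sketch of what each mode might look like from the developer's side. It is not actual Comate output: the names `top_words`, `mean`, and `scores`, and the parts marked as "suggested", are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


# Natural language to code: the developer writes only the comment below and
# the signature; the body is the kind of suggestion an assistant might offer.
# (Hypothetical illustration, not real Comate output.)
# Return the n most frequent words in a text, ignoring case.
def top_words(text: str, n: int = 3) -> list[tuple[str, int]]:
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words).most_common(n)


# Multi-line recommendation: the developer types the `def` line, and an
# assistant might propose the whole body from the declared function name.
def mean(values: list[float]) -> float:
    if not values:
        raise ValueError("mean() of an empty list")
    return sum(values) / len(values)


# Single-line recommendation: after the developer types `average = `, the
# rest of the line is the kind of completion an assistant might propose.
scores = [3.0, 4.0, 5.0]
average = mean(scores)

print(top_words("the cat and the hat and the bat", 2))
print(average)
```

In every mode the developer stays in control: a suggestion is only a proposal that can be accepted, edited, or discarded.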

In extensive internal testing, developers adopted 30%-50% of the code that Comate suggested, and the adopted code accounts for more than 10% of officially committed new code; it is being applied in the development of more and more products. Comate supports mainstream IDEs and currently covers 30+ languages, especially mainstream languages such as C/C++, Python, Java, Go, PHP, and JavaScript.
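
The two figures above measure different things: one is the fraction of suggestions that get accepted, the other is the fraction of all new code that came from accepted suggestions. The article does not say how Baidu computes them, so the sketch below uses assumed definitions and made-up counts purely to show the arithmetic.

```python
def adoption_rate(accepted: int, shown: int) -> float:
    """Fraction of shown suggestions that developers accepted (assumed definition)."""
    if shown <= 0:
        raise ValueError("shown must be positive")
    return accepted / shown


def share_of_new_code(accepted_lines: int, new_lines: int) -> float:
    """Fraction of all newly committed lines that came from accepted suggestions
    (assumed definition)."""
    if new_lines <= 0:
        raise ValueError("new_lines must be positive")
    return accepted_lines / new_lines


# Hypothetical counts, not data from Baidu.
rate = adoption_rate(accepted=400, shown=1000)
share = share_of_new_code(accepted_lines=1200, new_lines=10000)

print(f"adoption rate: {rate:.0%}")
print(f"share of new code: {share:.0%}")
```

With these made-up counts the adoption rate falls inside the reported 30%-50% range while the share of new code is just above 10%, which shows how both figures can hold at once: most new code is still written by hand.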

© THE END 

For reprinting, please contact this official account for authorization

ABOUT

Institute of Computer Vision

The Institute of Computer Vision works mainly in deep learning, focusing on research directions such as object detection, object tracking, and image segmentation. The institute regularly shares the algorithm frameworks of the latest papers, and the platform emphasizes both "research" and "practice". Going forward, we will share hands-on practice for the corresponding fields, so that everyone can experience real-world scenarios beyond theory and cultivate the habit of enjoying programming and thinking for themselves!

Origin blog.csdn.net/gzq0723/article/details/131118364