Institute of Computer Vision
Official account | Computer Vision Research Institute
On June 6, at the Wenxin Large Model Technology Exchange Conference (Chengdu), Baidu Smart Cloud launched the "Comate" code assistant and officially opened invitation-only testing. Drawing on the understanding and reasoning capabilities of the Wenxin large model, "Comate" offers fast code completion, code recommendation from natural language, and automatic bug detection, improving developers' R&D efficiency across the board. Going forward, developers will be able to use "Comate" in mainstream development software through plug-ins and other forms. With many code assistants already on the market, will Baidu's stand out?
01
Background
As early as June 2021, to prepare for future large-model training tasks, Baidu Smart Cloud began planning a new high-performance GPU cluster. Together with NVIDIA, it completed an InfiniBand (IB) network architecture design that can accommodate 10,000 or more cards. Every GPU card in the cluster's nodes is connected through the IB network. Cluster construction was completed in April 2022, providing single-cluster EFLOPS-level computing power.
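To give a sense of what "EFLOPS-level" means at this scale, here is a back-of-the-envelope sketch. The per-card throughput of 100 TFLOPS is a hypothetical round number chosen for illustration; the article does not say which GPU model or numeric precision the cluster uses, and the helper name cluster_eflops is made up for this example.

```python
# Back-of-the-envelope check of "EFLOPS-level" cluster throughput.
# ASSUMPTION: 100 TFLOPS per card is a hypothetical round number,
# not a figure from the article or the spec of any particular GPU.
TFLOPS_PER_EFLOPS = 1_000_000  # 1 EFLOPS = 10^18 FLOPS = 10^6 TFLOPS


def cluster_eflops(num_cards: int, tflops_per_card: float) -> float:
    """Peak aggregate throughput in EFLOPS, ignoring communication overhead."""
    return num_cards * tflops_per_card / TFLOPS_PER_EFLOPS


if __name__ == "__main__":
    # 10,000 cards at the assumed per-card rate
    print(cluster_eflops(10_000, 100))
```

Under that assumption, 10,000 cards reach the 1 EFLOPS mark at peak; sustained throughput in real training would be lower because of communication and utilization losses.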
In March 2023, Wenxin Yiyan was born on this high-performance cluster, and it has continued to gain new capabilities through iteration. The cluster is still expanding today. Dr. Lai Junjie, General Manager of Solutions and Engineering at NVIDIA China, said: "GPU clusters interconnected by a high-speed IB network are the key infrastructure of the large-model era. The largest high-performance GPU/IB cluster in the domestic cloud computing market, jointly built by NVIDIA and Baidu Smart Cloud, will accelerate Baidu's breakthroughs in the field of large models."
Covering the whole life cycle of large models - more comprehensive
Provides end-to-end services for data labeling, model training and evaluation, inference serving, and application integration
Significantly improved training and inference performance - more efficient
Training performance ranks among the world leaders on the MLPerf list, and both the acceleration of distributed parallel training for 100-billion-parameter models and compute utilization have improved substantially
Rapid application orchestration and plug-in integration - more open
Presets Baidu Wenxin large models and third-party large models, supports flexible orchestration of plug-ins and applications, and helps large models land in multiple scenarios
Built-in sensitive word filtering - more secure
Complete authentication and flow-control security mechanisms, built-in sensitive word filtering, and a double safeguard of machine review plus human review
Built-in Wenxin large model base
Technology leadership
Knowledge-enhanced large model, unified paradigm supports multiple types of downstream tasks
Advanced parallel strategy supports large model training, compression and deployment
Controllable and reliable language understanding and generation capabilities
Full scene coverage
Support dialogue interaction, free question and answer, copywriting and other capabilities
Covering energy, finance, aerospace, industry, media and other fields
Low threshold and easy to use
One line of code to call the service
One-click automatic model fine-tuning
A small amount of data is enough to deliver multi-scenario AI applications
Practical and deployable
Provide enterprise-level one-stop customer service
Connect the four-tier architecture of chip + platform + model + application
Cooperate with multiple partners to deploy applications end to end
02
Large Model Code Assistant
As demand for digital transformation grows and enterprises apply AI more widely, the high threshold for AI development, complex and diverse application scenarios, and dependence on scenario-specific annotated data have become obstacles to deploying AI at scale. The emergence of pre-trained large models has brought new opportunities and hope.
As an important lever for government and enterprises in advancing the artificial intelligence industry, large models have shown significant advantages and great potential in generalization, versatility, and transferability across AI tasks such as recognition, understanding, decision-making, and generation. It is no longer a fantasy for programmers to have a code assistant that can easily and accurately help with repetitive, simple, and trivial tasks.
More and more developers now rely on this kind of tool. The current mainstream AI programming assistants include GitHub Copilot X, Codeium, Tabnine, Replit Ghostwriter, and Amazon CodeWhisperer.
GitHub Copilot X
Copilot X is an upgrade to Copilot, which was released in 2021. It is connected to GPT-4 and adds functions such as chat and voice. With Copilot X, you only need to speak and it writes your code, and it writes the test cases for you along the way. It can also explain code snippets you don't understand and help you debug directly, making it a thoughtful little assistant for programmers.
Following the release of OpenAI's GPT-4 model, GitHub announced GitHub Copilot X, whose AI model uses the latest OpenAI GPT-4. GitHub Copilot X is committed to improving the developer experience: it will provide chat and voice interfaces, support pull requests, answer documentation questions, and enable a more personalized developer experience through GPT-4. Developers can ask Copilot X to explain what a piece of code does, have it try to fix bugs they run into, and even generate unit tests along the way.
Replit Ghostwriter
Replit Ghostwriter is an AI-based coding assistant that helps developers quickly write, generate, convert, and explain code, and it can search for and import open-source code from within the editor. Replit is an online integrated development environment (IDE) that supports multiple programming languages, such as Python, JavaScript, and Ruby, allowing developers to create, run, and share code in the browser. Replit also provides multi-person collaboration, version control, and cloud deployment, so developers can easily build and release applications. Replit AI Ghostwriter is a new Replit feature that leverages OpenAI's GPT-4 model to provide developers with an AI-powered coding assistant.
Now Baidu Smart Cloud has built a new generation of coding assistant on top of the Wenxin model: the code assistant Comate!
During development, Comate predicts code by reading declared function names together with the surrounding context and comments. Developers can view suggestions and manually edit the suggested code, while repetitive code is filled in automatically.
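To make the idea of reading function names, context, and comments more concrete, here is a minimal sketch of how a completion tool might collect those signals from the text before the cursor. This is an assumption-laden illustration, not Comate's actual implementation: the helper extract_context and the two regular expressions are hypothetical, and a real assistant would pass far richer context to its model.

```python
import re

# ASSUMPTION: an illustrative sketch of "context" gathering, not Comate's
# actual implementation. extract_context is a hypothetical helper.
DEF_PATTERN = re.compile(r"^[ \t]*def\s+([A-Za-z_]\w*)\s*\(", re.MULTILINE)
COMMENT_PATTERN = re.compile(r"^[ \t]*#\s?(.*)$", re.MULTILINE)


def extract_context(prefix: str) -> dict:
    """Collect signals a completion tool might read before the cursor."""
    function_names = DEF_PATTERN.findall(prefix)
    comments = COMMENT_PATTERN.findall(prefix)
    return {
        # The most recently declared function is the likeliest completion target.
        "current_function": function_names[-1] if function_names else None,
        # Full-line comments often describe the developer's intent.
        "comments": [c.strip() for c in comments if c.strip()],
    }


if __name__ == "__main__":
    editor_prefix = (
        "# Return the n-th Fibonacci number.\n"
        "def fibonacci(n):\n"
    )
    print(extract_context(editor_prefix))
```

The point of the sketch is only that both the declared name and the nearby comment carry intent; how a production assistant weighs them is not described in the article.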
It works by learning from leading open-source code in GitHub repositories worldwide, collecting data and trying to find the code most relevant to the task at hand, and continuously training on the feedback data it receives to improve recommendation accuracy. Its core capabilities are single-line recommendation, multi-line recommendation, and natural-language-to-code conversion.
Single-line recommendation
Multi-line recommendation
Natural-language-to-code conversion
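The three capabilities can be illustrated with a small hand-written example. None of the code below is real Comate output; the function names and the "suggested" markers are hypothetical and only show what kind of text the developer supplies and what kind of text an assistant would propose in each mode.

```python
# Illustrative only: what each mode takes as input and proposes as output.
# None of this is real Comate output.

# Single-line recommendation: the developer types the signature and the
# assistant proposes the next line.
def celsius_to_fahrenheit(celsius: float) -> float:
    return celsius * 9 / 5 + 32  # <- suggested line


# Multi-line recommendation: the assistant proposes a whole function body.
def count_words(text: str) -> dict:
    counts = {}  # <- suggested from here down
    for word in text.lower().split():
        counts[word] = counts.get(word, 0) + 1
    return counts


# Natural language to code: the developer writes only the comment below,
# and the assistant proposes the function that follows it.
# Check whether a string reads the same forwards and backwards, ignoring case.
def is_palindrome(text: str) -> bool:
    normalized = text.lower()
    return normalized == normalized[::-1]
```

In every mode the developer stays in control: a suggestion can be accepted as is, edited, or discarded.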
In extensive internal testing, developers adopted 30%-50% of the code Comate suggested, and the adopted code accounts for more than 10% of the new code officially added. Comate is being applied to the development of more and more products. It supports mainstream IDEs and currently covers 30+ languages, with particular strength in C/C++, Python, Java, Go, PHP, JavaScript, and other mainstream languages.
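The two percentages measure different things: the first is an adoption rate over suggestions, the second is a share of all new code. A short sketch of the arithmetic follows. The counts are hypothetical, chosen only so the results fall inside the ranges quoted above; the article reports percentages, not raw numbers, and does not say whether it counts suggestions or lines.

```python
# ASSUMPTION: every count below is hypothetical. The article reports only
# the resulting percentages, not the underlying numbers or units.

def adoption_rate(accepted_suggestions: int, shown_suggestions: int) -> float:
    """Fraction of shown suggestions that developers accepted."""
    if shown_suggestions <= 0:
        raise ValueError("shown_suggestions must be positive")
    return accepted_suggestions / shown_suggestions


def share_of_new_code(assistant_lines: int, total_new_lines: int) -> float:
    """Fraction of newly added lines that came from accepted suggestions."""
    if total_new_lines <= 0:
        raise ValueError("total_new_lines must be positive")
    return assistant_lines / total_new_lines


if __name__ == "__main__":
    # 400 of 1,000 suggestions accepted
    print(f"adoption rate: {adoption_rate(400, 1_000):.0%}")
    # 1,200 of 10,000 new lines came from the assistant
    print(f"share of new code: {share_of_new_code(1_200, 10_000):.0%}")
```

A high adoption rate can coexist with a modest share of new code, because much code is still written without a suggestion being shown or accepted.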
For reprinting, please contact this official account for authorization
ABOUT
Institute of Computer Vision
The Institute of Computer Vision works mainly in deep learning, focusing on research directions such as object detection, object tracking, and image segmentation. The institute regularly shares the algorithm frameworks of the latest papers, and the platform emphasizes both "research" and "practice". Going forward, we will share hands-on walkthroughs for these fields, so that readers can move beyond theory into real scenarios and build the habits of enjoying programming and thinking things through.