[Xiao Muxue NLP] Summary of AI-assisted programming tools

1. Introduction

AI-assisted programming tools are tools that use artificial intelligence technology to help programmers write and maintain code more efficiently. These tools use machine learning algorithms to analyze code bases, learn programming patterns and preferences, and automate programming tasks, reducing programmer effort and errors.

2. Domestic

2.1 aiXcoder

aiXcoder: China's first intelligent software development tool based on deep learning. It uses AI to provide automatic code generation, automatic code completion, and intelligent code search, improving development efficiency and code quality.

Official website address:
https://aixcoder.com/#/

  • Token-level code generation and completion: runs as a local service and automatically recommends completions of one or more tokens.
  • Line-level code generation and completion: runs as a cloud service and automatically generates or completes entire lines of code.
  • Method-level code generation and completion: runs as a cloud service and generates or completes method-level code from a natural language description of the function and the surrounding context.

2.1.1 Tool features

  • Various mainstream programming languages

    • Supports Java, Python, C#, C/C++, JavaScript, TypeScript, Go, and other programming languages (the cloud intelligent programming service currently supports only Java)
    • aiXcoder for enterprises can add support for new programming languages according to enterprise needs
  • Various mainstream IDEs

    • Compatible with IntelliJ IDEA, CLion, GoLand, PyCharm, WebStorm, Visual Studio Code, Eclipse and other IDEs (the cloud intelligent programming service is currently only compatible with IntelliJ IDEA)
    • aiXcoder for enterprises can customize plug-in functions according to enterprise needs

2.1.2 Deployment method

  • Extremely fast local model
    You can run aiXcoder's deep learning model in a "completely private" environment. The model is downloaded to your computer and all queries are processed locally, which allows rapid code completion. This mode supports offline use.

  • Large models in the cloud
    The deep learning models deployed in the cloud require a stable network connection. They provide intelligent programming functions such as line-level code completion and method-level code generation.

2.1.3 Usage fees

aiXcoder Enterprise Edition adopts a flexible pricing model that consists of license fees and model customization fees. The specific amount depends on the size of the enterprise. Please contact the sales team for details.

2.1.4 Code testing

2.1.4.1 Code search engine

https://codesearch.aixcoder.com/#/

2.1.4.2 Online experience

https://aixcoder.com/nl2code/

2.2 CodeGeeX

https://codegeex.cn/zh-CN

CodeGeeX is a multilingual pre-trained code generation model with 13 billion parameters. It is implemented in Huawei's MindSpore framework and was trained on 192 nodes (1,536 domestically produced Ascend 910 AI processors in total) of Pengcheng Lab's "Pengcheng Cloud Brain II". As of June 22, 2022, CodeGeeX had been pre-trained for two months on a code corpus of more than 850 billion tokens covering more than 20 programming languages.
Simply put, CodeGeeX is an AI-powered code generation tool that helps you write code quickly.

CodeGeeX2 is the second generation of the multilingual code generation model CodeGeeX (KDD'23). Unlike the first generation (trained entirely on the domestic Huawei Ascend chip platform), CodeGeeX2 is based on the ChatGLM2 architecture with additional code pre-training. Thanks to the stronger performance of ChatGLM2, CodeGeeX2 improves on multiple metrics (+107% over CodeGeeX; with only 6 billion parameters, it outperforms the 15-billion-parameter StarCoder-15B by nearly 10%).

2.2.1 Tool features

  • High-precision code generation: supports code generation in a variety of mainstream programming languages such as Python, C++, Java, JavaScript, and Go. It achieves a 47%~60% solution rate on the HumanEval-X code generation task, a better average performance than other open source baseline models.
  • Cross-language code translation: supports automatic translation of code snippets between programming languages. The translations are highly accurate and surpass other baseline models on the HumanEval-X code translation task.
  • Automatic programming plug-in: the CodeGeeX plug-in is available in the VS Code plug-in marketplace (completely free). Users can use its strong few-shot generation capabilities to customize the style and capabilities of code generation and get better assistance when writing code.
  • Cross-platform and open source: all code and model weights are open source and available for research purposes. CodeGeeX supports both the Ascend and NVIDIA platforms and can run inference on a single Ascend 910 or NVIDIA V100/A100.
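
Solution rates on benchmarks such as HumanEval-X are commonly reported as pass@k: the probability that at least one of k generated samples passes a problem's unit tests. The text above does not say which k its 47%~60% figure uses, so the following is only a minimal Python sketch of the standard unbiased pass@k estimator, not CodeGeeX's own evaluation code. The function names and the sample counts in the usage lines are illustrative assumptions.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k for one problem.

    n: number of samples generated for the problem
    c: number of those samples that pass all unit tests
    k: number of samples the metric is allowed to "draw"
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k failing samples: every draw of k contains a pass.
        return 1.0
    # 1 - P(all k drawn samples are failures)
    return 1.0 - comb(n - c, k) / comb(n, k)


def benchmark_pass_at_k(results: list[tuple[int, int]], k: int) -> float:
    """Average pass@k over problems; results holds (n, c) per problem."""
    return sum(pass_at_k(n, c, k) for n, c in results) / len(results)


# Hypothetical counts for three problems, 10 samples each (not real data).
results = [(10, 3), (10, 0), (10, 10)]
print(f"pass@1 = {benchmark_pass_at_k(results, k=1):.3f}")
```

With k = 1 the estimator reduces to the fraction of passing samples per problem, averaged over the benchmark.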

Architecture: CodeGeeX is a large-scale pre-trained programming language model based on the Transformer. It is a left-to-right autoregressive decoder that takes code or natural language tokens as input and predicts the probability distribution of the next token. CodeGeeX contains 40 Transformer layers. The hidden dimension of each self-attention block is 5120, the feed-forward layer dimension is 20480, and the total parameter count is 13 billion. The maximum sequence length supported by the model is 2048.
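
As a rough sanity check, the 13 billion figure can be approximated from the layer count and dimensions given above. The sketch below assumes a standard decoder block (four attention projection matrices plus a two-matrix feed-forward layer) and ignores biases, layer norms, and embeddings; the vocabulary size is not given in the text, so the embedding contribution is left out rather than guessed. The function name is hypothetical.

```python
def decoder_block_params(layers: int, hidden: int, ffn: int) -> int:
    """Approximate weight count of a stack of standard decoder blocks.

    Assumes each block has Q, K, V, and output projections (4 * hidden^2)
    and a feed-forward layer with up and down projections (2 * hidden * ffn).
    Biases, layer norms, and embeddings are ignored.
    """
    attention = 4 * hidden * hidden
    feed_forward = 2 * hidden * ffn
    return layers * (attention + feed_forward)


# Dimensions quoted in the architecture description.
total = decoder_block_params(layers=40, hidden=5120, ffn=20480)
print(f"{total:,} weights, about {total / 1e9:.2f} billion")
```

Under these assumptions the Transformer blocks alone account for about 12.58 billion weights, which is consistent with the stated total of 13 billion once the omitted terms are included.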

Corpus: The training corpus of CodeGeeX consists of two parts. The first part is open source code datasets, The Pile and CodeParrot. The Pile includes a selection of open source GitHub repositories with more than 100 stars, from which code in 23 programming languages was selected. The second part is supplementary data: Python, Java, and C++ code crawled directly from open source GitHub repositories.

CodeGeeX supports a variety of mainstream IDEs, such as VS Code, IntelliJ IDEA, PyCharm, and Vim, as well as Python, Java, C++/C, JavaScript, Go, and other languages.

2.2.2 Deployment method

For individuals: an online service delivered through the VS Code plug-in.
For enterprises: private deployment of CodeGeeX.

2.2.3 Usage fees

The CodeGeeX plug-in is completely free for individual users. All you need to do is download it from the plug-in store.

2.2.4 Code testing

Based on CodeGeeX, the CodeGeeX team has developed a free VS Code plug-in (search for "CodeGeeX" in the plug-in marketplace to download it) to assist multi-language development.

2.3 Alibaba Cloud AI Coding Assistant (Cosy)

https://developer.aliyun.com/tool/cosy
Alibaba Cloud AI Coding Assistant is an AI programming assistant that provides intelligent code completion and code sample search, helping you write high-quality code faster and more efficiently.

3. Abroad

The current mainstream AI programming assistants include GitHub Copilot, Codeium, Tabnine, Replit Ghostwriter, and Amazon CodeWhisperer.

3.1 GitHub Copilot

https://github.com/features/copilot

In June 2022, GitHub Copilot was officially released to the public. Developers around the world were delighted, having waited for this day since the beta launched in 2021, and the wait proved worthwhile: Copilot achieves its goal of reducing developers' workload while speeding up their coding. It is almost perfect, except for one thing: Copilot is a paid product.

3.2 Codeium

https://codeium.com/

Codeium is a free AI code acceleration toolkit built on cutting-edge artificial intelligence technology. It provides code completion, smart search, and AI chat in more than 20 languages. Codeium works with all popular integrated development environments (IDEs), including Visual Studio Code, IntelliJ IDEA, and Eclipse.

3.3 Amazon CodeWhisperer

https://aws.amazon.com/cn/codewhisperer/

Amazon CodeWhisperer is trained on billions of lines of Amazon and publicly available code. It understands comments written in natural language (English) and generates multiple code suggestions in real time to improve developer productivity. The service suggests complete functions and logical code blocks (typically up to 10–15 lines of code) directly in the code editor of the integrated development environment (IDE). The generated code resembles what you would write yourself, following your style and naming conventions. You can quickly accept the top suggestion (Tab key), view more suggestions (arrow keys), or continue writing your own code. Before accepting a suggestion, be sure to review it; you may need to edit it so that it matches exactly what you intended. CodeWhisperer can even suggest completions for comments as you type.

Amazon CodeWhisperer provides developers with real-time code suggestions directly in the integrated development environment (IDE). Individual developers can use CodeWhisperer for free. Organizations pay a fixed subscription fee on a per-user-per-month basis to use CodeWhisperer, with no upfront costs or long-term commitments.

3.4 Tabnine

https://www.tabnine.com/

Tabnine is a relatively young development tool that made a strong impression when it was first released. Shortly after OpenAI open-sourced the GPT-2 model, Tabnine further trained GPT-2 on a massive amount of code to create a deep learning engine for code that recognizes the preceding context and provides long-sequence code completions. It has since been acquired by Codota, which promotes it as its main product and claims support for all mainstream programming languages.

Tabnine comes in basic, enhanced (Pro), and enterprise editions. The free basic edition provides only basic completion, while the paid Pro edition offers better completion quality.

3.5 Replit Ghostwriter

https://replit.com/site/ghostwriter

AI-assisted programming has produced a new unicorn: Replit, a cloud development platform founded in 2016, which launched its own Copilot-like tool, Ghostwriter, last year. Replit itself is an online integrated development environment that supports more than 50 programming languages. It lets novice developers skip complex environment setup and start programming right away; it also suits experienced developers, who can use it for collaborative programming and for building and testing a variety of applications.

Ghostwriter is not free. It currently costs $10 USD per month, payable either directly or as 1,000 Cycles (a virtual currency issued by the Replit platform).
Ghostwriter is also available through Replit's Pro plan.

3.6 Microsoft IntelliCode

https://learn.microsoft.com/zh-cn/visualstudio/intellicode/intellicode-visual-studio

Visual Studio 2022 comes with IntelliCode, an AI-assisted code companion that lets developers type less and work more efficiently. IntelliCode can complete an entire line of code, allowing users to write reliable code with just two presses of the Tab key. IntelliCode can also detect repeated edits and suggest the same change wherever similar patterns exist in the code base.

3.7 OpenAI Codex

https://openai.com/blog/openai-codex/

The Codex family of models is a descendant of the GPT-3 family. Its training data contains natural language and billions of lines of source code from public sources, including code in public GitHub repositories. Codex is most capable in Python, but it is also proficient in more than a dozen other languages, including C#, JavaScript, Go, Perl, PHP, Ruby, Swift, TypeScript, SQL, and even Shell. Its memory for Python code is 14KB, compared with only 4KB for GPT-3, so it can take more than three times as much contextual information into account when performing a task.

Conclusion

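If you found this method or code even a little useful, feel free to give the author a like or buy the author a coffee; ╮( ̄▽ ̄)╭
If you think the method or code is not up to scratch //(ㄒoㄒ)//, leave a comment and the author will keep improving it; o_O???
If you need custom development of code for related features, you can leave a comment or send the author a private message; (✿◡‿◡)
Thank you all for your support! ( ´ ▽´ )ノ ( ´ ▽´)! ! !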

Origin blog.csdn.net/hhy321/article/details/132916887