Coder Online IDE + Local Deployment of the AI Coding Assistant Tabby

code-server + Tabby: an AI assistant for programmers

Introduction

In the digital age, artificial intelligence (AI) is making its way into every industry at a remarkable pace. For programmers, AI has become an indispensable assistant in daily work. Today I would like to introduce a distinctive setup: combining code-server, the open-source web version of VS Code, with the open-source AI assistant Tabby, to build a feature-rich, customizable programming environment that helps you code better.

Features

Coder

  1. First, VS Code is a popular integrated development environment (IDE) developed by Microsoft, widely used to write and debug applications in many programming languages. code-server is an open-source project that runs the VS Code editor as a web application, which means you can access it through a browser from any device, as long as you have a stable internet connection.
  2. By deploying code-server on your own server, you get all of VS Code's features, including code editing, debugging, and version control, and you can customize extensions and settings to meet your programming needs. Because everything runs on the server, your code is saved and synced in one place, letting you switch seamlessly between devices.

Tabby

  1. Next up is Tabby, the AI assistant. Tabby is an open-source AI assistant designed to support the programming and development process. Integrated with code-server, it provides smart suggestions, auto-completion, and error checking by analyzing code structure, patterns, and context, and it can point you to documentation and reference material to help you solve problems and learn new concepts faster.

Coder+Tabby

  1. By combining code-server and Tabby, you get a powerful, personalized development environment: a complete programming workbench on your own server that is not tied to any device and can be accessed anytime, anywhere. With Tabby's smart support, you can write code more efficiently and benefit from its suggestions and hints.
  2. This setup is useful for individual developers, team collaboration, and remote work alike. Its powerful tools and intelligent assistance can greatly improve development efficiency and code quality, and since both projects are open source, you can customize and extend them to fit your own needs.

Core Needs

For a more flexible and cost-effective solution, I chose to deploy code-server and Tabby on my own server. Here are some reasons:

  1. Idle Mac Mini server: I have an idle Mac Mini, and by deploying code-server and Tabby on it I can put this resource to use as a powerful programming environment.
  2. No Copilot plugin: there is no Copilot plugin for code-server, which limits the convenience and efficiency of my code-writing process.
  3. Billing: GitHub Copilot, Cursor, and ChatGPT Plus all charge fees of $10, $16, and $20 per month, respectively. By deploying the environment myself, I avoid these recurring costs.
  4. Self-deployment: by deploying Tabby myself, I ensure my code is never uploaded to someone else's server to generate hints, which improves data security and privacy.
  5. Continuous prompting: compared with the paid options, Tabby responds lightning-fast and prompts continuously.
    • Although it is based on a GPT-2-class model and its accuracy may be lower, it prompts faster and has no API rate limit. With the trigger interval set to 150 ms, it can offer suggestions continuously at a rapid pace; in this trade-off, speed takes the place of quality.
    • In some cases, quick feedback matters more than completely accurate results. This is especially useful when developers want to get ideas quickly, iterate, or need frequent references and hints. Despite the lower accuracy, Tabby provides a fast and smooth development experience through constant quick prompts.
    • This can be very valuable, since it provides immediate feedback and inspiration without long waits. The trade-off comes down to individual preference, and for developers who prize speed and iteration, Tabby's responsiveness may be an attractive feature.
  6. First-class support for model fine-tuning: Tabby does not yet expose a fine-tuning API, but support in this area can be expected in the future, which would let me personalize the model to my needs.
  7. The test was successful: I personally tried the Java language on Tabby's official website, asking it to generate validation code for a billing interface. The test went very well, demonstrating Tabby's reliability in producing accurate suggestions.
    By deploying code-server and Tabby on my own server, I get a feature-rich, customized programming environment. This not only improves coding efficiency and quality but also avoids extra costs and data-security risks. It's an ideal solution for individual developers and teams alike.

Deployment Guide

A friendly reminder: don't rush into installing; first read the full text below!

  1. Install brew: First, install Homebrew (brew), a popular package manager, on your Mac. You can find installation instructions on Homebrew's official website (https://brew.sh).

  2. Install code-server: Using brew, you can install code-server by running:

    brew install code-server
    

    This will download and install code-server from the Homebrew repository.
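code-server keeps its settings in `~/.config/code-server/config.yaml`. A minimal sketch of that file follows; the password is a placeholder, and it is written to a local scratch path here purely for illustration.

```shell
# Minimal code-server config. The real file lives at
# ~/.config/code-server/config.yaml; a scratch path is used here for illustration.
CONFIG=./config.yaml
cat > "$CONFIG" <<'EOF'
bind-addr: 127.0.0.1:8080
auth: password
password: change-me-to-something-strong
cert: false
EOF
cat "$CONFIG"
```

With `auth: password`, code-server prompts for the password on first visit; change `bind-addr` to `0.0.0.0:8080` if you need LAN access (and put it behind TLS).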

  3. Install Docker: Next, you need to install Docker, an open-source containerization platform. Visit Docker's official website (https://www.docker.com) and follow its guide to install it.

  4. Configure and start code-server: Once code-server is installed, you can start it with the following command:

    code-server
    # or use launchctl (launchctl load xxx, exact invocation forgotten) to start it at login,
    # and set the Mac to start up automatically when power is connected
    

    This will start code-server on the default address (usually 127.0.0.1:8080); open that address in a browser.
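The half-remembered `launchctl load` in the comment above refers to a launchd agent. A minimal sketch follows; the binary path `/opt/homebrew/bin/code-server` is my assumption for Homebrew on Apple Silicon, and the file is written to a scratch path here rather than `~/Library/LaunchAgents/`.

```shell
# Sketch of a launchd agent so code-server starts automatically at login.
# Real destination: ~/Library/LaunchAgents/com.coder.code-server.plist
# The program path below is an assumption (Homebrew on Apple Silicon).
PLIST=./com.coder.code-server.plist
cat > "$PLIST" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
  "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
  <key>Label</key><string>com.coder.code-server</string>
  <key>ProgramArguments</key>
  <array><string>/opt/homebrew/bin/code-server</string></array>
  <key>RunAtLoad</key><true/>
  <key>KeepAlive</key><true/>
</dict>
</plist>
EOF
# Activate it with:  launchctl load ~/Library/LaunchAgents/com.coder.code-server.plist
cat "$PLIST"
```

Note that powering the Mac on automatically when it gets power is a separate energy setting in macOS, not something launchd controls.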

  5. Install the plugins required for Java development: For Java development, you need to install Java plugins in code-server. A commonly used one is the "Extension Pack for Java", an extension pack that bundles several Java-related plugins. You can search for and install it in the VS Code Marketplace (https://marketplace.visualstudio.com). Once installed, it provides the tools and features needed for Java development.

  6. Install the Tabby plugin: To use Tabby in code-server, install the Tabby plugin, which integrates Tabby's functionality into the editor. Search for "Tabby" in the VS Code Marketplace and follow the installation instructions to install it.
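Both plugins can also be installed from the command line with code-server's `--install-extension` flag. `vscjava.vscode-java-pack` is the Java pack's ID; `TabbyML.vscode-tabby` is my assumption for the Tabby plugin's ID, so verify it against the Marketplace listing.

```shell
# Install the Java extension pack and the Tabby plugin from the CLI.
# The Tabby extension ID below is an assumption; check the Marketplace listing.
for ext in vscjava.vscode-java-pack TabbyML.vscode-tabby; do
  if command -v code-server >/dev/null 2>&1; then
    code-server --install-extension "$ext"
  else
    echo "code-server not found; would install: $ext"
  fi
done | tee install-exts.log
```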

  7. If you don't have an M1/M2, use Docker to deploy Tabby on your own server: To deploy Tabby with Docker, you can build a Docker image from Tabby's official source code. You can find the source code and build instructions in Tabby's GitHub repository (https://github.com/TabbyML/tabby).

    • If you are not on a Mac:
      docker run --name tabby -it -d --network=host -v /Users/apple/data:/data -v /Users/apple/data/hf_cache:/home/app/.cache/huggingface tabbyml/tabby serve --model TabbyML/SantaCoder-1B
      
  8. You have an M1/M2: For Apple Silicon Macs, the project provides a pre-compiled binary package; download it and use it directly.

  9. You have now "successfully" installed brew, code-server, and Docker on your Mac and deployed Tabby. You can access this complete development environment by opening the code-server URL and pointing the Tabby plugin at the Tabby service's port.
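Before wiring the plugin up, it is worth a quick smoke test that the Tabby server answers at all. The `/v1/health` path is an assumption from the build I ran; if it 404s, check your server's Swagger UI for the actual routes.

```shell
# Quick reachability check for the Tabby server.
# The /v1/health route is an assumption; consult your build's Swagger UI.
TABBY=http://localhost:8080
if curl -fsS "$TABBY/v1/health" >/dev/null 2>&1; then
  echo "tabby: up at $TABBY"
else
  echo "tabby: not reachable at $TABBY"
fi | tee tabby-check.log
```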

But your Tabby doesn't work

  • If you have an idle Mac Mini M2 Pro or Mac Mini M2 Ultra, congratulations, you can solve most of the problems with hardware upgrades without any extra effort.
  • But if you only have a base Mac Mini M2 with lower specs, then we're the ones who have to tinker our way through, so let's keep at it.

Keep Tinkering

I firmly believe that we should be the drivers of AI, not the ones replaced by it. Through deep interaction with AI technology, we can better control it and use it as a powerful assistant for our programming.

While following the steps mentioned before, you may encounter issues with the Tabby backend not functioning properly.

  1. I found the pre-trained language models that Tabby uses at https://huggingface.co/TabbyML. This discovery was very valuable, because Hugging Face hosts many excellent pre-trained models for all kinds of AI tasks.
  2. Information about Tabby on the web is very limited, and I had some difficulty trying to compile it, so I decided to deploy Tabby with Docker instead. This approach simplifies deployment and provides better portability.
  3. In mainland China's network environment, failing to download Hugging Face models inside the container is a common problem. To work around it, you can use a proxy server or accelerator to improve download speed and stability, look for a domestic mirror source, or download the model locally and point the target URL at it by modifying the hosts file.
  4. Regarding NVIDIA cards, GPU acceleration, and dependencies: running Tabby with GPU acceleration requires an NVIDIA GPU, and the GPU version additionally needs the NVIDIA CUDA Toolkit installed. Make sure your system has the correct hardware and drivers, follow the relevant installation guides, and read Tabby's documentation for its GPU acceleration requirements and dependencies.
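Putting item 3 into practice, one way is to pass a mirror endpoint and/or proxy into the container as environment variables. The mirror URL and proxy address below are placeholders, and whether your Tabby build honors `HF_ENDPOINT` depends on its huggingface_hub version, so treat this as a sketch.

```shell
# Sketch: wrap the docker invocation so the container downloads the model
# through a mirror and/or proxy. Both addresses below are placeholders.
cat > run-tabby.sh <<'EOF'
#!/bin/sh
docker run --name tabby -it -d --network=host \
  -e HF_ENDPOINT=https://hf-mirror.com \
  -e HTTPS_PROXY=http://127.0.0.1:7890 \
  -v /Users/apple/data:/data \
  -v /Users/apple/data/hf_cache:/home/app/.cache/huggingface \
  tabbyml/tabby serve --model TabbyML/SantaCoder-1B
EOF
chmod +x run-tabby.sh
cat run-tabby.sh
```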

Performance Testing

Last comes the conclusion. Since I don't have a spare NVIDIA GPU machine (my own computer only has an MX250, and it holds important study materials), I chose to rent a GPU cloud server for the actual tests.

  1. First, I deployed the CPU version of Tabby on an 8-core, 32 GB cloud server. Benchmarking with Apache's ab tool, each request took about 2.5 seconds; CPU load exceeded 90% while memory use was only 1.6 GB. At a concurrency of 10 the server froze. When I connected code-server to the Tabby service it was almost unusable: code hints dropped out constantly, succeeding only occasionally.
  2. Then I switched to a cloud server with 6 cores and 64 GB of memory. The average request completion time was 3 seconds, and the main bottleneck was still CPU performance.
  3. Finally, I tried the GPU version on a T4. According to information online, the Tesla T4 is roughly equivalent to a GTX 1060 (mid-range among entry-level cards). In testing, the response time was 200 milliseconds; even with 20 requests at a concurrency of 10, each request still took only 200 milliseconds, which suggests serial execution (200 ms per request appears to be the processing ceiling). CPU load peaked at only about 60%. Connected to code-server, the Tabby service was smooth, fast, and very stable.
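For anyone who wants to reproduce these numbers, the ab invocation looks roughly like this. The completion endpoint path and JSON payload shape are assumptions that vary by Tabby version, so adjust them to your server's API.

```shell
# Reproduce the load test: 20 requests at a concurrency of 10, POSTing a
# completion request. Endpoint path and payload shape are assumptions.
cat > payload.json <<'EOF'
{"language": "python", "prompt": "def fib(n):"}
EOF
ab -n 20 -c 10 -p payload.json -T application/json \
  http://localhost:8080/v1/completions \
  || echo "ab run skipped (server not reachable)"
```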

Conclusion and Deployment Recommendations

Finally, the conclusion: if you want your own intranet-secure AI coding assistant, namely code-server + Tabby, you need an M2 (the official site indicates that M1/M2 is suitable for personal use) or a server with performance equivalent to a GTX 1060 (the T4 cloud server's performance is comparable to a GTX 1650).

For graphics card recommendations, I don't claim deep expertise. Based on information found online, I located some mini PCs priced below 5,000 RMB with the following options:

  • RTX 2060 6G, RTX 3050 8G, GTX 1060, M2, GTX 1650 4G, etc.
  • According to my survey, their CUDA core counts are, respectively, 1920, roughly 2500, 1280, over 300, and 896.
  • The M2 is a chip integrated into Apple computers; its architecture and core count don't lend themselves to direct CUDA comparison. The M2 has a built-in neural-network hardware accelerator, but no published CUDA-core-equivalent figure.
  • Personally, I feel that every card except the GTX 1650 should be up to the job.
    • The GTX 1650's performance is harder to pin down; it may be slightly below the GTX 1060.
    • Its response time would be roughly 0.5 seconds (the GTX 1060 measured 0.25 seconds versus about 3 seconds on CPU; assuming the 1650 is about 25% slower, this is an estimate, not a measurement).
  • This is just my personal suggestion; it depends on each person's needs and judgment.

In addition, I personally like to play with Blender, so I may lean toward NVIDIA cards. The M2's advantage is low power consumption; it feels good to use, runs quieter, and saves electricity.

All that said, choosing the right graphics card or server is a personal decision. Different cards have different performance characteristics and advantages, so choose according to your budget, needs, and preferences. Note that technology and the market are constantly evolving, so it's best to do further research before deciding.

Thank you for reading! I hope you all have fun and get the help you need while using the code-server and tabbyml-demo environment. If you hit unsatisfactory test results or other problems, you can ask for help in the comments, though I may not be able to solve every problem directly. I wish you all the best in using these tools to improve your productivity and coding experience!


Origin blog.csdn.net/u012296499/article/details/131156493