While researchers at West Virginia University see potential in educational settings for the latest official ChatGPT plugin, called Code Interpreter, they have also found limitations in its use by scientists who work with biological data and apply computational methods to pursue targeted treatments for cancer and genetic diseases.
"The Code Interpreter is a promising tool, especially in education, as it makes programming in STEM fields more accessible to students." Assistant Professor, Department of Microbiology, Immunology and Cell Biology, West Virginia University School of Medicine , and director of the Bioinformatics Core, Gangqing "Michael" Hu said: "However, it does not provide all the functions required in the field of bioinformatics. These are problems that can be solved through technical improvements. Code interpreters may be used in the future." Expand its application areas, including bioinformatics, finance, and economics.”
Since its release in November 2022, ChatGPT, the popular artificial intelligence chatbot, has attracted attention from business, education and the public. However, it does not fully meet the needs of those working in biomedical research, including the interdisciplinary field of bioinformatics, and these scientists had been eagerly awaiting OpenAI's Code Interpreter plugin in the hope that it would fill these gaps.
Hu and his team tested Code Interpreter on a variety of tasks to evaluate its capabilities. Their findings, published in the Annals of Biomedical Engineering, show that while the plugin excels in some ways, it still has limitations.
For example, someone without a scientific background can easily get started with coding through Code Interpreter. It is also cost-effective and stimulates students' curiosity to explore data analysis, increasing their interest in learning, Hu said. He pointed out that users still need to understand how to interpret the data, judge whether the results are accurate, and know how to interact with the chatbot.
Bioinformaticians rely on precise programming, computer software programs and Internet access to store, analyze and interpret biological data such as DNA and the human genome for the advancement of modern medicine.
While specific improvements for bioinformatics are needed, Hu said Code Interpreter helps users determine whether an answer is accurate or whether it is a so-called "hallucination," a fabricated answer that can in some cases be misleading.
"People know that ChatGPT can do a lot of impressive things, but it's not very good at providing citations or references to support its answers. If asked where it came from to support a response, it might start making up references," Hu explained. "Code interpreters provide a solution to minimize hallucinations. For problems that can be solved programmatically, the code itself can serve as a source or citation. This is an important advance."
Hu's collaborators include postdoc Lei Wang from West Virginia University's Department of Microbiology, Immunology and Cell Biology; Xijin Ge from South Dakota State University; and Li Liu from Arizona State University.
The team found good results in the code interpreter's ability to turn data into charts and graphs.
Suggested upgrades to Code Interpreter include providing Internet access to download genomic data, installing bioinformatics-specific software, expanding storage capacity, and supporting more programming languages. Additionally, the researchers noted that the tool would need to comply with privacy and security regulations such as HIPAA.
While testing data analysis, they found some limitations. The plugin supports only one programming language, Python, and only some of the software packages dedicated to bioinformatics. It also cannot access data on the Internet and cannot handle large files.
Three examples of using the code interpreter to create graphs
"It only allows processing of files around 100 megabytes or so, but we're processing files in the gigabyte range," Hu said. "Also, it doesn't support the parallel processing required for large datasets, resulting in slower performance." Hu said he plans to use the plugin in next year's course, although he expects more upgrades to the code interpreter, To help students understand data visualization. "Artificial intelligence is a rapidly developing field. I hope that by that time, OpenAI can overcome some limitations so that it can be used for broad bioinformatics programming." Finally, Hu said he will continue to monitor and test new AI Programming and functionality, as there are still many innovative uses waiting to be discovered in this space.
Shengxin Baodian: After ChatGPT provides the prompts and code, large datasets still need to be run locally; otherwise, uploading, downloading and analysis would be too slow and take up too many resources.
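As a rough illustration of what running locally can look like for a file too large to upload, the Python sketch below streams a FASTA file line by line and tallies bases without loading the whole file into memory. The function name count_bases and the file name in the usage comment are hypothetical, and the sketch assumes a plain-text, uncompressed FASTA file.

```python
from collections import Counter


def count_bases(fasta_path: str) -> Counter:
    """Tally bases in a FASTA file while reading one line at a time."""
    counts = Counter()
    with open(fasta_path, "r", encoding="ascii") as handle:
        for line in handle:
            # Header lines start with ">" and carry no sequence data.
            if line.startswith(">"):
                continue
            # Add each character of the sequence line to the tally.
            counts.update(line.strip().upper())
    return counts


# Hypothetical usage on a local file:
# totals = count_bases("genome.fa")
# print(totals.most_common())
```

Reading one line at a time keeps memory use roughly constant regardless of file size, which is why this style of processing suits gigabyte-scale files on a local machine.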