How can ChatGPT be applied to specific data governance scenarios?

Since ChatGPT took off, the tech world has seen wave after wave of explosive news, enough to make your head spin. It suddenly feels like "one day in the human world is one year in AI". Here are a few of last week's big stories:

  • New Bing opened to all users; just register and start using it
  • On Wednesday, Google released Bard, its rival to ChatGPT
  • On Thursday, GitHub launched GitHub Copilot X
  • On Friday, the OpenAI team launched the "ChatGPT Plug-in System"
  • ……

The news keeps arriving faster and hitting harder. It has shaken me like nothing before and leaves no time to catch a breath. For the data industry I work in, this year's policy of building a digital China is a major boost, yet the explosion of AI applications led by ChatGPT makes me wonder where the industry is heading.


1. What impact does ChatGPT have on the data industry?

ChatGPT is the product of the rapid development of data science and artificial intelligence. There is no doubt that it will have a huge impact on the data industry.

The popularity of ChatGPT has made the industry realize that data is a core element of corporate decision-making, and the status of data has risen accordingly. How to obtain and accumulate more valuable data, and how to create more value by mining and using it, have become the top concerns of business owners. Data scientists, data analysts, and machine learning engineers will very likely become sought-after talent, and companies that already employ such people should hold on to them.

ChatGPT's automated natural language processing, data analysis and mining, and data generation capabilities are likely to replace some practitioners whose jobs consist of mechanical, repetitive, and extremely time-consuming basic data processing. This is the harsh reality.

Step by step, ChatGPT is reshaping this industry.


2. What specific data business scenarios can ChatGPT be applied to?

ChatGPT's powerful natural language processing capabilities and text generation capabilities provide new possibilities and opportunities for many data business scenarios. Data processing is one of them, and it is very easy to implement. Specific scenarios may include:

  • Data quality management: Let ChatGPT analyze data fields, text content, etc. to identify data quality issues such as missing data, inconsistent data formats, wrong data types, etc.
  • Metadata management: Let ChatGPT generate metadata descriptions, such as the name, abstract, classification, source, version, etc. of the dataset. This can help organizations better manage and understand datasets.
  • Data classification and labeling: Let ChatGPT automatically classify and label data, such as topic classification, entity recognition, etc. for text data. This can help organizations better organize and manage data.
  • Data security and privacy: Let ChatGPT analyze sensitive information in data, such as personally identifiable information, financial data, etc., and help organizations take corresponding measures, such as encryption, authorization, etc., to ensure data security and privacy.
  • Data dictionary and glossary management: Let ChatGPT generate data dictionaries and glossaries to better understand and describe data. This can help organizations better manage data and facilitate the sharing and exchange of data.
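
As an illustration of the data quality scenario above, here is a minimal sketch: it computes a small column profile locally and embeds it in a prompt that could then be pasted into ChatGPT. No API call is made. The function names `profile_columns` and `build_quality_prompt`, the sample rows, and the table name `customers` are all hypothetical and used only for illustration.

```python
import json


def profile_columns(rows):
    """Count missing values and collect the Python type names seen per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            stats = columns.setdefault(name, {"missing": 0, "types": set()})
            if value is None or value == "":
                stats["missing"] += 1
            else:
                stats["types"].add(type(value).__name__)
    # Sets are turned into sorted lists so the profile is stable and JSON-serializable.
    return {
        name: {"missing": s["missing"], "types": sorted(s["types"])}
        for name, s in columns.items()
    }


def build_quality_prompt(table_name, profile):
    """Build a plain-text prompt asking for a data quality review of the profile."""
    return (
        f"You are reviewing the table '{table_name}'.\n"
        "Column profile (JSON):\n"
        f"{json.dumps(profile, indent=2)}\n"
        "List likely data quality issues (missing values, mixed types, "
        "inconsistent formats) and suggest a fix for each."
    )


# Hypothetical sample data with a missing date, two date formats, and mixed types.
sample_rows = [
    {"id": 1, "signup_date": "2023-03-01", "age": 34},
    {"id": 2, "signup_date": "", "age": "unknown"},
    {"id": 3, "signup_date": "01/03/2023", "age": None},
]

profile = profile_columns(sample_rows)
prompt = build_quality_prompt("customers", profile)
print(prompt)
```

Sending only a profile rather than the raw rows also keeps sensitive values out of the prompt, which matters for the data security scenario listed above.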


3. With ChatGPT, do I still need to keep so many technical staff?

ChatGPT is powerful, and some types of work will be greatly affected. In practice, however, professionals are still needed to complete the key tasks, and they will not be completely replaced.

Technical personnel and types of work related to data governance that may be affected

  1. Data classification and labeling personnel: ChatGPT can be used to classify and label data, so some tasks that currently require manual classification and labeling may be automated.
  2. Data entry personnel: ChatGPT can recognize text fields in forms and auto-fill form data, which may reduce the need for some data entry staff.
  3. Automated test engineers: ChatGPT can automatically test and verify the accuracy and quality of natural language text, which may reduce the need for some automated test engineers.

Technical personnel and types of work related to data governance that will not be affected for the time being

  1. Data collectors: Data collectors are professionals who are responsible for collecting and curating data, and they can use various tools and techniques to obtain data from different sources such as social media, sensors, websites, etc.
  2. Data stewards: Data stewards are professionals responsible for managing data. They can be responsible for formulating data management strategies, developing data security measures, ensuring data quality, monitoring data processes, etc.
  3. Data analysts: Data analysts are professionals who analyze data with a variety of tools and techniques to identify trends, correlations, and anomalies, and to uncover the insights behind the data.


4. What auxiliary functions can ChatGPT provide to the data industry at present?

ChatGPT can provide a variety of auxiliary functions for the data industry, including but not limited to the following aspects:

  1. Natural language processing: As a powerful natural language processing tool, ChatGPT can process and analyze text data, such as sentiment analysis, topic classification, text generation, machine translation, etc., thereby helping the data industry to better understand and utilize text data.
  2. Speech data: ChatGPT itself works on text, so speech must first be transcribed by a speech recognition model (for example, OpenAI's Whisper). ChatGPT can then help the data industry process and analyze the resulting transcripts.
  3. Data annotation and classification: ChatGPT can help automate the annotation and classification of text data, such as sentiment classification, topic classification, and entity recognition. This helps improve the quality and accuracy of text data, and can speed up data processing and analysis.
  4. Data cleaning: ChatGPT can be used for cleaning and preprocessing of text data, thus helping the data industry to better process and manage large amounts of text data.
  5. Data mining: ChatGPT can be used to mine and analyze key information and knowledge in text data, thereby helping the data industry to better explore and utilize the value of data.
  6. Data visualization: ChatGPT can generate natural language text and help the data industry better display and communicate the results and meaning of data.

In general, ChatGPT, as a powerful natural language processing tool and machine learning model, can be applied at various stages such as data labeling and classification, data cleaning, data mining, and data visualization. It can improve the efficiency and accuracy of data work and play an important auxiliary role.


5. My impressions after using ChatGPT in depth

ChatGPT is genuinely powerful, and I have to admit it. But its limitations are just as obvious: it lacks creativity and imagination. It cannot "really" create new things; it only "repackages existing information and knowledge" that it learned from a large amount of training content. Nor does it know everything. When I tried it out, I often received plausible-sounding answers that it had simply made up.

After discussing it with my team, we concluded that ChatGPT is not yet ready to be deeply integrated into our current workflow. There is a limit on the length of its input and output, and its data security and service stability have yet to be verified.
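
One common way to work around an input length limit is to split long text into smaller pieces and submit them one at a time. The sketch below is a simplified illustration: the function name `split_into_chunks` and the `max_words` parameter are hypothetical, and counting whitespace-separated words is only a rough stand-in, since the actual limits are measured in tokens rather than words.

```python
def split_into_chunks(text, max_words=500):
    """Split text into consecutive chunks of at most max_words words each."""
    if max_words < 1:
        raise ValueError("max_words must be at least 1")
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


# Example: a short record description split into chunks of at most four words.
chunks = split_into_chunks("order id customer id amount currency created at", max_words=4)
for chunk in chunks:
    print(chunk)
```

Splitting this way loses context across chunk boundaries, so it suits independent records better than long connected documents.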

However, some work can be done with ChatGPT's assistance. The simplest and most common scenario is helping data management staff generate SQL queries and check them for syntax errors, which can greatly improve efficiency. In addition, ChatGPT can help with data sorting, filling in missing values, and data difference calculation.
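
Because ChatGPT can make things up, it is worth checking any SQL it generates before running it against real data. The following is a minimal sketch under stated assumptions: it uses Python's built-in `sqlite3` module and an in-memory database, so it only validates against SQLite's dialect, which may differ from the target database. The function name `check_sql` and the `customers` schema are hypothetical.

```python
import sqlite3

# Hypothetical schema used only for this illustration.
SCHEMA = "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);"


def check_sql(sql, schema_ddl):
    """Compile a single SQL statement against an empty in-memory copy of the schema.

    Returns (True, None) if SQLite accepts the statement, else (False, error message).
    EXPLAIN makes SQLite compile the statement without executing it.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema_ddl)
        conn.execute("EXPLAIN " + sql)
        return True, None
    except sqlite3.Error as exc:
        return False, str(exc)
    finally:
        conn.close()


good = check_sql("SELECT name FROM customers WHERE age > 30", SCHEMA)
typo = check_sql("SELEC name FROM customers", SCHEMA)
bad_column = check_sql("SELECT nme FROM customers", SCHEMA)

print(good)
print(typo)
print(bad_column)
```

The check catches syntax errors and references to tables or columns that do not exist in the schema. It does not verify that the query returns what the user actually asked for, so a human review is still needed.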

ChatGPT is still evolving. The closed beta of the plug-in system has already caused a stir, and I can only imagine how big the stir will be once it is fully open. Every technological upgrade makes some jobs disappear and creates a batch of new ones at the same time. There is no need to resist, and resistance rarely works anyway.

No matter how intelligent AI becomes, it is only an auxiliary tool and cannot truly replace humans. As human beings, we should take pride in being "human". We should build on our own "creativity and judgment" and, by working together with AI, gain a boost in "efficiency, speed and combination ability".

The rapid development of AI will only drive human beings to keep progressing and make us better!

So don't panic, it's useless to panic, go learn it!

Origin blog.csdn.net/caryxp/article/details/130165134