Pandas is an open source toolkit that provides data manipulation and analysis capabilities for Python. The library has become a must-have tool for data scientists and analysts, offering an efficient way to manage structured data through its two core types, Series and DataFrame.
In the field of artificial intelligence, Pandas is often used as a preprocessing step in machine learning and deep learning pipelines. Through data cleansing, reshaping, merging, and aggregation, Pandas can convert raw datasets into structured, ready-to-use two-dimensional tables that can be fed into artificial intelligence algorithms.
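As a rough illustration of those four preprocessing steps, here is a minimal sketch. The tables and numbers are made up for the example (they are not real statistics), and the names raw, regions, clean, merged, mean_gdp and wide are hypothetical.

```python
import pandas as pd

# Made-up input: one row per (country, year), with one missing value.
raw = pd.DataFrame({
    "country": ["France", "France", "Spain", "Spain"],
    "year": [2020, 2021, 2020, 2021],
    "gdp": [10.0, None, 20.0, 30.0],  # illustrative numbers only
})
# Made-up lookup table used for the merge step.
regions = pd.DataFrame({
    "country": ["France", "Spain"],
    "region": ["West", "South"],
})

# Cleansing: drop rows where the GDP value is missing.
clean = raw.dropna(subset=["gdp"])

# Merging: attach the region of each country.
merged = clean.merge(regions, on="country", how="left")

# Aggregation: mean GDP per country.
mean_gdp = merged.groupby("country")["gdp"].mean()

# Reshaping: one row per country, one column per year.
wide = clean.pivot(index="country", columns="year", values="gdp")

print(mean_gdp)
print(wide)
```

The resulting wide table is the kind of two-dimensional structure that is typically handed to a model or, as in the rest of this article, to PandasAI.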
Install Pandas AI using pip
pip install pandasai
Import PandasAI and set up OpenAI
In the next step, we import the pandasai library installed earlier, and then import the LLM (Large Language Model) class. As of May 2023, pandasai only supports OpenAI models, which we will use to understand the data.
import pandas as pd
from pandasai import PandasAI
# Sample DataFrame
df = pd.DataFrame({
"country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
"gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832, 1745433788416, 1181205135360, 1607402389504, 1490967855104, 4380756541440, 14631844184064],
"happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12]
})
# Instantiate a LLM
from pandasai.llm.openai import OpenAI
llm = OpenAI(api_token="your_API_key")
pandas_ai = PandasAI(llm)
pandas_ai.run(df, prompt='Which are the 5 happiest countries?')
6 Canada
7 Australia
1 United Kingdom
3 Germany
0 United States
Name: country, dtype: object
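To sanity-check an answer like this, the same question can be asked in plain pandas. The sketch below rebuilds only the two columns it needs from the sample DataFrame above and uses DataFrame.nlargest; it is not the code that PandasAI generates internally, which may differ.

```python
import pandas as pd

# Same countries and happiness scores as the sample DataFrame above.
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12],
})

# Keep the 5 rows with the highest happiness_index, then select the country names.
top5 = df.nlargest(5, "happiness_index")["country"]
print(top5)
```

The printed Series lists the same five countries, with the same index labels, as the PandasAI output.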
To use the OpenAI API, you must generate your own unique API key.
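Rather than pasting the key into the script, one common approach is to read it from an environment variable. This is a minimal sketch: the helper load_api_key and the variable name OPENAI_API_KEY are assumptions made for this example, not something pandasai requires.

```python
import os


def load_api_key(var_name: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable.

    The helper name and the default variable name are assumptions
    for this sketch.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return key
```

With this in place, the earlier line could become llm = OpenAI(api_token=load_api_key()), which keeps the key out of source control.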
Because PandasAI works on ordinary pandas DataFrames, we are not limited to CSV files; we can also load data from relational databases such as PostgreSQL:
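# Create the connection URI for the database.
# Passing a URI string to pandas requires SQLAlchemy and a PostgreSQL driver to be installed.
pg_conn = "postgresql://YOUR URI HERE"
# Define the SQL query
query = """
SELECT *
FROM table_name
"""
# Run the query and load the result into a DataFrame named df
df = pd.read_sql(query, pg_conn)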
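If no PostgreSQL server is at hand, the same pd.read_sql pattern can be tried against an in-memory SQLite database from the Python standard library. This is only a sketch for experimenting: SQLite stands in for PostgreSQL, and the table contents are two rows borrowed from the sample data above.

```python
import sqlite3

import pandas as pd

# In-memory SQLite database used as a stand-in for a real PostgreSQL server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (country TEXT, happiness_index REAL)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [("Canada", 7.23), ("Japan", 5.87)],
)
conn.commit()

query = """
SELECT *
FROM table_name
"""

# pandas accepts a sqlite3 connection directly.
df = pd.read_sql(query, conn)
conn.close()

print(df)
```

The resulting df is a regular DataFrame, so it can be passed to PandasAI in exactly the same way as one built by hand.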
Then, as in the first example, we can query the DataFrame in natural language:
# Using pandas-ai!
pandas_ai = PandasAI(llm)
pandas_ai.run(df, prompt='Place your prompt here')
Of course, you can also let PandasAI run more complex queries. For example, PandasAI can be asked to sum the GDP of the 2 least happy countries:
pandas_ai.run(df, prompt='What is the sum of the GDPs of the 2 unhappiest countries?')
The above code will return the following:
19012600725504
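This figure can be cross-checked with plain pandas. The sketch below rebuilds the sample DataFrame and uses DataFrame.nsmallest to pick the two lowest happiness scores (China and Japan) before summing their GDP. It is a manual check, not the code PandasAI generates.

```python
import pandas as pd

# Same data as the sample DataFrame above.
df = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "gdp": [19294482071552, 2891615567872, 2411255037952, 3435817336832,
            1745433788416, 1181205135360, 1607402389504, 1490967855104,
            4380756541440, 14631844184064],
    "happiness_index": [6.94, 7.16, 6.66, 7.07, 6.38, 6.4, 7.23, 7.22, 5.87, 5.12],
})

# The two rows with the lowest happiness_index.
unhappiest = df.nsmallest(2, "happiness_index")

# Sum of their GDP values.
total = unhappiest["gdp"].sum()
print(total)  # 19012600725504
```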
You can also ask PandasAI to draw:
pandas_ai.run(
df,
"Plot the histogram of countries showing for each the gdp, using different colors for each bar",
)
Conclusion
ChatGPT and Pandas are powerful tools that, when combined, can change the way we interact with and analyze data. ChatGPT, with its natural language processing capabilities, enables more intuitive, human-like interaction with data, and PandasAI builds on it to enhance the Pandas data analysis experience. By converting complex data manipulation tasks into simple natural language queries, PandasAI makes it easier for users to extract valuable insights from data without writing a lot of code.
This is a new way of programming for those who are not yet familiar with Python or with pandas operations and transformations. Instead of programming the task ourselves, we talk to the AI agent and state the desired result explicitly; the agent converts that request into executable code and returns the result.