Python code to read and write a database

Preface: In the past I used pymysql directly, reading the database by operating cursors by hand, which was quite inconvenient.

The read and write methods introduced below require three packages: pandas, sqlalchemy, and pymysql, each installable with pip install. If you run into problems importing them, see my article on setting up a basic Python environment.

1. Use pd.read_sql() to read data

pandas.read_sql(sql, con, index_col=None, coerce_float=True, params=None, parse_dates=None, columns=None, chunksize=None)

In pandas, the read_sql method executes a specified SQL query (or reads an entire specified table) from the database and returns the result as a DataFrame.

The meaning of each parameter is as follows:

sql: the SQL statement to execute
con: the engine used to connect to the database, built with a database connection package such as SQLAlchemy or pymysql
index_col: which column to use as the index
coerce_float: convert numeric strings to float
params: parameters to pass to the SQL statement
parse_dates: columns whose date strings should be converted to datetime
columns: the columns to keep
chunksize: how many rows to return at a time

Sample code:

import pandas as pd
import sqlalchemy

engine = sqlalchemy.create_engine('mysql+pymysql://root:******@192.168.0.***:3306/test')

sql='''
select * from weather_test where
create_time between '2020-09-21' and '2020-09-22'
and city in ('杭州','上海')
'''
df = pd.read_sql(sql, engine)
df

The engine here uses pymysql under the hood, so pymysql must be installed in your Python environment; if it is not, just pip install it.
The query returns a DataFrame, which makes subsequent data manipulation easy.
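The chunksize parameter mentioned above is worth a quick sketch. The demo below uses an in-memory SQLite database as a stand-in for the MySQL server in the article, with a made-up weather_demo table, so it runs without any server:

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite as a stand-in for the MySQL server above;
# the table name and data are made up for this demo.
engine = create_engine('sqlite://')
pd.DataFrame({'city': ['a', 'b', 'c', 'd'],
              'temp': [20, 21, 19, 18]}).to_sql('weather_demo', engine, index=False)

# With chunksize set, read_sql returns an iterator of DataFrames
# instead of loading the whole result set into memory at once.
chunks = list(pd.read_sql('select * from weather_demo', engine, chunksize=2))
print([len(c) for c in chunks])  # → [2, 2]
```

Iterating chunk by chunk like this is how you process a table that is too large to fit in memory in one go.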

If your sqlalchemy version is 2.x (for example, my local Python 3.9 installs 2.0.16 by default), you need the style below; with older 1.4.x versions (for example 1.4.49 on Python 3.6), the style above works fine.

from sqlalchemy import create_engine, text
df = pd.read_sql(text(sql), engine.connect())
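Here is a self-contained sketch of that 2.x-compatible pattern, again using an in-memory SQLite database instead of the MySQL server above (the demo table is hypothetical):

```python
import pandas as pd
from sqlalchemy import create_engine, text

# In-memory SQLite stand-in for the MySQL server; 'demo' is a made-up table.
engine = create_engine('sqlite://')
pd.DataFrame({'id': [1, 2]}).to_sql('demo', engine, index=False)

# Wrapping the raw SQL string in text() and passing an explicit connection
# works under both SQLAlchemy 1.4 and 2.x.
with engine.connect() as conn:
    df = pd.read_sql(text('select * from demo'), conn)
print(df.shape)  # → (2, 1)
```

Using `with engine.connect() as conn:` also makes sure the connection is returned to the pool when the query finishes.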

2. Use to_sql() to write data in batches

to_sql() is a pandas function used to insert data into the database in batches. It takes the following parameters:

DataFrame.to_sql(name, con, schema=None, if_exists='fail', index=True, index_label=None, chunksize=None, dtype=None)

The meaning of each parameter is as follows:

name: the name of the table to operate on
con: the connector to the database
schema: the database schema to write to (None uses the default)
if_exists: what to do if the table already exists in the database (fail: raise an error and stop, the default; replace: drop and recreate the table; append: insert the data into the existing table)
index: whether to write the DataFrame's index as a column in the table; default True
index_label: the column label for the index column; default None
chunksize: how many rows to insert at a time; default None, meaning all rows are written in one batch
dtype: a dict mapping field names to data types, letting you specify the type of each column on insert; default None

Things to be aware of:

The pandas documentation recommends that the connector used by to_sql be one established with SQLAlchemy.
Before calling to_sql, make sure the DataFrame's column names match the column names of the target table.

Sample code:

import sqlalchemy
import pandas as pd

conn = sqlalchemy.create_engine('mysql+pymysql://****:******@192.168.0.***:3306/test')
dataframe.to_sql('table_name', con=conn, index=False, if_exists='append')
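The if_exists and dtype parameters described above can be sketched like this, once more against an in-memory SQLite database with made-up table and column names:

```python
import pandas as pd
import sqlalchemy
from sqlalchemy.types import String, Float

# In-memory SQLite stand-in; 'weather' and its columns are made up for the demo.
engine = sqlalchemy.create_engine('sqlite://')
df1 = pd.DataFrame({'city': ['a'], 'temp': [20.5]})
df2 = pd.DataFrame({'city': ['b'], 'temp': [19.0]})

# The first write creates the table; dtype fixes each column's SQL type.
df1.to_sql('weather', engine, index=False,
           dtype={'city': String(32), 'temp': Float()})
# if_exists='append' inserts into the existing table instead of raising an error.
df2.to_sql('weather', engine, index=False, if_exists='append')

total = pd.read_sql('select count(*) as n from weather', engine)['n'][0]
print(total)  # → 2
```

Had the second call used the default if_exists='fail', it would have raised a ValueError because the weather table already exists.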


Origin blog.csdn.net/keepandkeep/article/details/132259573