Problems encountered when inserting data into MySQL with pandas to_sql()

When a pymysql connection is passed directly to to_sql(), the following error occurs:

DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': not all arguments converted during string formatting

1. Problems encountered when inserting data from a pandas DataFrame into MySQL

1.1. pymysql driver interface problem

In my past experience, working with MySQL from Python was simple: just install the pymysql driver.

pip install pymysql

Then write the DataFrame to the database with pandas to_sql().

import datetime
import uuid  # used to generate unique primary keys

import pandas as pd
import pymysql

def mysql_db():
    # Connection parameters for the database
    conn = pymysql.connect(
        host="192.168.**.**",
        port=3306,
        database="M*****DB",
        charset="utf8",
        user="ty",
        passwd="****"
    )

    return conn

# df = (omitted)
# Add a unique uuid primary key, stored as a string
ids = [str(uuid.uuid1()) for _ in range(len(df))]

# Add the time the record was written
df['CreateTime'] = datetime.datetime.now()
df['id'] = ids

conn = mysql_db()
tablename = '******'
dd = df[colsname].copy()  # copy to avoid pandas' SettingWithCopyWarning
# Round float columns to the number of decimals the table allows
for k, v in cols_len.items():
    dd[k] = dd[k].round(v)
# This is the line that raises the error
dd.to_sql(tablename, conn, index=False, if_exists='append')

Running the program raises the following error:

DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': not all arguments converted during string formatting

Next, try running a read query over the same connection:

df0=pd.read_sql('select * from S*******y',conn)
df0

The query runs without problems and returns the contents of the table as a DataFrame.

1.2. About the database primary key

The primary key is usually a uuid. Python's uuid module generates uuids from information such as the MAC address, a timestamp, a namespace, or random numbers. The specific methods are as follows:

  • uuid.uuid1(): Generates a uuid from the MAC address, a timestamp, and a clock sequence. Collisions are extremely unlikely, but the resulting uuid exposes the MAC address of the machine that created it.

  • uuid.uuid2(): The algorithm is the same as uuid1, except that the first 4 bytes of the timestamp are replaced with the POSIX UID. Python does not implement this DCE-based algorithm, so there is no uuid2 function in Python's uuid module.

  • uuid.uuid3(namespace, name): Computes a uuid from the MD5 hash of a namespace and a name, so different names in the same namespace get different uuids, while the same name always gets the same uuid. The namespace is not an arbitrary string but a UUID object. The uuid module provides predefined ones: uuid.NAMESPACE_DNS, uuid.NAMESPACE_URL, uuid.NAMESPACE_OID, and uuid.NAMESPACE_X500.

  • uuid.uuid4(): Generates a uuid from random numbers. A repeated value is theoretically possible, but the probability is negligible in practice.

  • uuid.uuid5(namespace, name): Basically the same as uuid3, except that the hash algorithm is SHA-1.

In practice, generate a list of uuids in one pass, one for each record in the data set, and add it to the table as a column.
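As a rough illustration of the point above, the sketch below generates one uuid string per record and also shows that name-based uuids are deterministic. The helper name make_ids and the name "example.com" are made up for this example, and uuid4 is used here instead of uuid1 only as one possible choice.

```python
import uuid


def make_ids(n):
    """Return n random uuid strings, one per record (hypothetical helper)."""
    return [str(uuid.uuid4()) for _ in range(n)]


# One id per row; in the article's code this list would become df['id'].
ids = make_ids(5)
print(len(ids), len(set(ids)))

# Name-based uuids are deterministic: the same namespace and name
# always produce the same uuid.
a = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
b = uuid.uuid5(uuid.NAMESPACE_DNS, "example.com")
print(a == b)
```

Converting each uuid to a string up front keeps the column a plain text column, which avoids depending on how a particular driver handles UUID objects.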

2. A hidden pitfall in pandas to_sql()

Solution:

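I checked the pandas documentation and many other articles about to_sql(), but none of them mention this pitfall with the database connection. They do have one thing in common, though: the to_sql() examples in the pandas documentation and in the other articles all connect through the third-party library SQLAlchemy. The reference to sqlite_master in the error message suggests why: pandas treats any connection that is not a SQLAlchemy connectable as a sqlite3 connection, so it sends SQLite-specific SQL that pymysql cannot handle. Passing a SQLAlchemy engine to to_sql() resolves the error.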
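The wording of the error message can be reproduced without a database. The sketch below assumes, as a simplification, that pymysql fills in query parameters with Python %-formatting and expects %s placeholders, while the query in the error uses the ? placeholder of sqlite3. The function format_like_pymysql and the table name are hypothetical stand-ins, not part of either library.

```python
# The query from the error message. "?" is sqlite3's placeholder style.
query = "SELECT name FROM sqlite_master WHERE type='table' AND name=?;"
params = ("my_table",)  # hypothetical table name


def format_like_pymysql(sql, args):
    """Simplified stand-in for a driver that fills parameters via %-formatting."""
    return sql % args


try:
    format_like_pymysql(query, params)
    error_message = None
except TypeError as exc:
    # The string has no %s slot, so the one argument cannot be consumed.
    error_message = str(exc)

print(error_message)

# With a "%s" placeholder the same formatting succeeds.
# (A real driver also escapes and quotes the values; this sketch does not.)
formatted = format_like_pymysql("SELECT 1 FROM dual WHERE %s = %s", (1, 1))
print(formatted)
```

This also suggests why read_sql() worked over the same connection: the query passed to it carried no parameters, so no placeholder substitution was involved.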

First, install sqlalchemy.

pip install sqlalchemy

Then modify the code, adding a function that creates a SQLAlchemy engine:

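from sqlalchemy import create_engine

def mysql_engine_db():
    # Build a SQLAlchemy engine from the connection URL
    engine = create_engine(
        'mysql+pymysql://ty:***@192.168.**.**:3306/M*******DB'
        # mysql+pymysql://user:password@host:port/database
    )

    return engine

engine = mysql_engine_db()
tablename = '******'
dd = df[colsname].copy()  # copy to avoid pandas' SettingWithCopyWarning
# Round float columns to the number of decimals the table allows
for k, v in cols_len.items():
    dd[k] = dd[k].round(v)
# Pass the SQLAlchemy engine instead of the raw pymysql connection
dd.to_sql(tablename, engine, index=False, if_exists='append')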

Such an easy problem to solve!
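One detail worth watching in the connection URL: if the password contains characters such as @, : or /, it needs to be URL-encoded before it goes into the string. The sketch below is one way to do that with the standard library; the helper name build_mysql_url and all the sample values (user, password, host, database) are made up for illustration.

```python
from urllib.parse import quote_plus


def build_mysql_url(user, password, host, port, database):
    """Assemble a mysql+pymysql URL, escaping the user name and password."""
    return (
        f"mysql+pymysql://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{database}"
    )


# Hypothetical credentials; "@", ":" and "/" would otherwise break URL parsing.
url = build_mysql_url("ty", "p@ss:w/rd", "db.example.com", 3306, "exampledb")
print(url)
```

The resulting string can be passed to create_engine() in place of the hand-written URL.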

