Importing data into PostgreSQL in batches with pandas.to_sql

Key points:

  1. How to create a database connection
  2. Test whether the connection to the database is successful
  3. Relevant to_sql parameters

Version Information:

  • Python 3.6
  • pandas 0.24.2
  • PostgreSQL 11

Database connection creation

to_sql does not accept a raw psycopg2.connect connection; you need an SQLAlchemy engine built with create_engine (see the SQLAlchemy reference documentation):

    from sqlalchemy import create_engine

    # 'postgresql://' is the canonical scheme; the bare 'postgres://' alias
    # was removed in newer SQLAlchemy releases
    engine = create_engine('postgresql://'
                           + 'zentao' + ':'
                           + 'zentao' + '@'
                           + '127.0.0.1' + ':'
                           + '5432' + '/'
                           + 'test')
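The string concatenation above is easier to read as an f-string. A minimal sketch with the same placeholder credentials; the engine creation itself is commented out because it needs psycopg2 and a running server:

```python
# Build the same connection URL with an f-string; the credentials are the
# placeholders from the snippet above.
user, password = "zentao", "zentao"
host, port, dbname = "127.0.0.1", "5432", "test"

url = f"postgresql://{user}:{password}@{host}:{port}/{dbname}"
print(url)  # postgresql://zentao:zentao@127.0.0.1:5432/test

# from sqlalchemy import create_engine
# engine = create_engine(url)  # requires psycopg2 and a reachable server
```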

Test whether the connection to the database is successful

     from sqlalchemy.orm import sessionmaker

     DB_Session = sessionmaker(bind=engine)
     session = DB_Session()
     data = session.execute("SELECT * FROM table_name LIMIT 2")
     for row in data:
         # print each row's columns on one line (Python 3 syntax)
         print(*row)
     session.close()

Relevant to_sql parameters
(see the pandas DataFrame.to_sql reference documentation)

Correct code:

    df.to_sql(name='table_name', schema='schema_name', con=engine,
              if_exists='append', index=False)

The key point: when the target table lives in a schema, passing 'schema_name.table_name' directly as the name is not recognized.

Wrong code:

    df.to_sql(name='schema_name.table_name', con=engine,
              if_exists='append', index=False)

The code above raises no error, but no data appears in the target table.

Therefore, the schema and table names must be passed separately.
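Since the title mentions batch loading: to_sql's chunksize parameter controls how many rows go into each INSERT batch. A minimal sketch of the full append flow; it uses an in-memory SQLite engine so it runs without a PostgreSQL server (against PostgreSQL you would also pass schema='schema_name'), and the table and column names are illustrative:

```python
import pandas as pd
from sqlalchemy import create_engine

# In-memory SQLite stands in for the PostgreSQL engine in this sketch.
engine = create_engine("sqlite://")

df = pd.DataFrame({"id": range(10), "value": [i * i for i in range(10)]})

# chunksize=3 writes the 10 rows in batches of at most 3 rows each
df.to_sql(name="table_name", con=engine, if_exists="append",
          index=False, chunksize=3)

count = pd.read_sql("SELECT COUNT(*) AS n FROM table_name", engine)["n"][0]
print(count)  # 10
```

pandas 0.24 also added a method='multi' option, which packs multiple rows into a single INSERT statement and can speed up loads further.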


Origin blog.csdn.net/weixin_44325637/article/details/102815405