Python SQL and NoSQL Database Operations in Practice

1. Accessing and Operating Relational Databases with Python

1. Relational databases

Relational databases have long been the standard for storing and manipulating data. The technology is mature and ubiquitous.

Python can connect to a variety of relational databases, and it handles them all in roughly the same way. So here we will use one of them, SQLite (through the standard sqlite3 module), to demonstrate the basic principles, and then discuss the differences and precautions to keep in mind when choosing and using a relational database for data storage.

2. Using the sqlite3 database

Python provides modules for many databases, but the following examples use only sqlite3. While not suited to large, high-traffic applications, sqlite3 has two advantages:

  • Because sqlite3 is part of the standard library, it can be used anywhere a database is needed without worrying about adding dependencies.
  • sqlite3 stores all of its records in a local file, so it doesn't need both a client and a server, which is what PostgreSQL, MySQL, and other larger databases require.

These features make sqlite3 a convenient option for small applications and rapid prototypes.

To use a sqlite3 database, the first thing you need is a Connection object, which you get by calling the connect function with the name of the file that will be used to store the data:

>>> import sqlite3
>>> conn = sqlite3.connect("datafile.db")

You can also use ":memory:" as the file name, in which case the data is kept in memory. If you are storing Python integers, strings, and floats, no other parameters are required.

If you want sqlite3 to automatically convert the query results of some columns to other types, the detect_types parameter is useful. Setting it to sqlite3.PARSE_DECLTYPES|sqlite3.PARSE_COLNAMES directs the Connection object to parse the column types and names in queries and try to match them against converters you have registered.
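As a minimal hedged sketch of this mechanism (the BOOL converter and the flags table here are made up for illustration, not part of the running example), a registered converter is applied to any column whose declared type matches:

>>> import sqlite3
>>> sqlite3.register_converter("BOOL", lambda b: b != b"0")   # bytes in, bool out
>>> tconn = sqlite3.connect(":memory:",
...                         detect_types=sqlite3.PARSE_DECLTYPES)
>>> tcur = tconn.cursor()
>>> tcur.execute("create table flags (name text, enabled BOOL)")
>>> tcur.execute("insert into flags values (?, ?)", ("feature_x", "1"))
>>> tcur.execute("select * from flags").fetchone()   # enabled comes back as a bool
('feature_x', True)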

The second step is to create a Cursor object from the connection:

>>> cursor = conn.cursor()
>>> cursor
<sqlite3.Cursor object at 0xb7a12980>

At this point the database can be queried. But since the database has no tables or records yet, we first need to create a table and insert a couple of records:

>>> cursor.execute("create table people (id integer primary key, name text,
     count integer)")
>>> cursor.execute("insert into people (name, count) values ('Bob', 1)")
>>> cursor.execute("insert into people (name, count) values (?, ?)", 
...               ("Jill", 15))
>>> conn.commit()

The last insert statement demonstrates the recommended way to write queries that contain variables. Instead of building the query string yourself, you use a "?" placeholder for each variable, which is safer, and pass the values as a tuple to the execute method. The advantage is that you don't need to worry about escaping the values correctly; sqlite3 takes care of it.

You can also use variable names prefixed with ":" in the query, and pass in a dictionary containing the corresponding values to insert:

>>> cursor.execute("insert into people (name, count) values (:username, \
                   :usercount)", {"username": "Joe", "usercount": 10})

Once the table has data, you can query it with SQL commands, again using "?" for variable binding or variable names with a dictionary:

>>> result = cursor.execute("select * from people")
>>> print(result.fetchall())
[(1, 'Bob', 1), (2, 'Jill', 15), (3, 'Joe', 10)]
>>> result = cursor.execute("select * from people where name like :name",
...                         {"name": "bob"})
>>> print(result.fetchall())
[(1, 'Bob', 1)]
>>> cursor.execute("update people set count=? where name=?", (20, "Jill"))
>>> result = cursor.execute("select * from people")
>>> print(result.fetchall())
[(1, 'Bob', 1), (2, 'Jill', 20), (3, 'Joe', 10)]

In addition to the fetchall method, the fetchone method fetches one row of the query result, and fetchmany returns a given number of rows. For convenience, you can also iterate over the rows of a cursor object, similar to iterating over a file:

>>> result = cursor.execute("select * from people") 
>>> for row in result:
...     print(row) 
...
(1, 'Bob', 1)
(2, 'Jill', 20)
(3, 'Joe', 10)
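For comparison, here is a quick sketch of fetchone() and fetchmany() run against the same query, continuing the session above:

>>> result = cursor.execute("select * from people")
>>> result.fetchone()        # the first row only
(1, 'Bob', 1)
>>> result.fetchmany(2)      # the next two rows
[(2, 'Jill', 20), (3, 'Joe', 10)]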

By default, sqlite3 does not commit transactions immediately. This means you can roll back a transaction if it fails, but it also means you need to call the Connection object's commit method to make sure modifications are saved. Because the close method does not automatically commit any active transactions, it's particularly good practice to commit before closing a database connection:

>>> cursor.execute("update people set count=? where name=?", (20, "Jill"))
>>> conn.commit()
>>> conn.close()

Common sqlite3 database operations:

Operation                               sqlite3 command
Create a database connection            conn = sqlite3.connect(filename)
Create a cursor on the connection       cursor = conn.cursor()
Execute a query through the cursor      cursor.execute(query)
Fetch the query results                 cursor.fetchall(), cursor.fetchmany(num_rows), cursor.fetchone(),
                                        or: for row in cursor: ...
Commit a transaction to the database    conn.commit()
Close the database connection           conn.close()

Usually, these operations are all you need to work with a sqlite3 database.

3. Using MySQL, PostgreSQL, and other relational databases

As mentioned earlier, several other SQL databases have client libraries that follow the DB-API specification, so accessing them from Python is very similar. But there are a few differences to be aware of:

  • Unlike SQLite, these databases require a database server that clients connect to. The client and server may be on the same machine or on different machines, so the database connection needs more parameters, usually including host, account name, and password.
  • The way parameters are inserted into queries such as "select * from test where name like :name" may use a different format, such as ?, %s, or %(name)s.

These differences are not huge, but they tend to keep code from being completely portable between databases.
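For example, here is a hedged sketch of the same kind of DB-API code against PostgreSQL using the psycopg2 driver (pip install psycopg2-binary); the host, database name, and credentials are placeholders, and it assumes a people table like the one above already exists on the server. Note the %s parameter style:

>>> import psycopg2
>>> conn = psycopg2.connect(host="localhost", dbname="test",
...                         user="dbuser", password="secret")
>>> cursor = conn.cursor()
>>> cursor.execute("select * from people where name like %s", ("Bob",))
>>> cursor.fetchall()
[(1, 'Bob', 1)]
>>> conn.close()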

4. Using an ORM to simplify database operations

The DB-API client libraries just discussed have some problems, and the requirement to write raw SQL statements brings problems of its own:

  • Different databases implement SQL slightly differently, so the same SQL statement won't always work when you move from one database to another. This kind of migration can easily arise if, say, local development is done on sqlite3 while production runs MySQL or PostgreSQL. In addition, as mentioned above, different database products handle details such as passing parameters to queries differently.
  • The second disadvantage is the need to use raw SQL statements. SQL statements embedded in code are harder to maintain, especially as the codebase grows: some statements become boilerplate and routine, others become complicated and tricky, and all of them need to be tested, which can become cumbersome.
  • The requirement to write SQL means thinking in at least two languages: Python and a particular flavor of SQL. Raw SQL is worth the trouble in many cases, but not in many others.

Given these problems, people wanted a more manageable way to work with databases from Python, one that requires writing only ordinary Python code. The solution is an object-relational mapper (ORM), which converts, or maps, relational database types and structures to Python objects.

In the Python world, the two most common ORMs are the Django ORM and SQLAlchemy, but there are many others. The Django ORM is tightly integrated with the Django web framework and is rarely used on its own. The only thing to note here is that the Django ORM is the default option for Django applications and a good choice there, with well-developed tools and rich community support.

1. SQLAlchemy

SQLAlchemy is the other big-name ORM in the Python world. SQLAlchemy's goal is to automate redundant database tasks and provide a Python object-based interface to the data, while still letting developers control the database and access the underlying SQL. Here are some basic examples of storing data in a relational database and then retrieving it with SQLAlchemy.

SQLAlchemy can be installed in a Python environment with pip:

> pip install sqlalchemy

Note: when working with SQLAlchemy and related tools, it's convenient to open two shell windows in the same virtual environment, one running Python and the other at the system command line.

SQLAlchemy provides several ways to interact with databases and tables. Although the ORM lets you write SQL statements when necessary, its power is exactly what its name suggests: mapping relational database tables and columns to Python objects.

The following repeats the earlier operations using SQLAlchemy: create a table, add three rows, query the table, and update one row. There's a bit more configuration work with an ORM, but in larger projects it's well worth it.

First, several components need to be imported to connect to the database and map tables to Python objects. From the base sqlalchemy package you need the create_engine and select functions, as well as the MetaData and Table classes. And because schema information is specified when creating a Table object, you also need the Column class and the classes corresponding to each column's data type, Integer and String in this example.

You also need to import the sessionmaker function from the sqlalchemy.orm subpackage:

>>> from sqlalchemy import (create_engine, select, MetaData, Table, Column,
...                         Integer, String)
>>> from sqlalchemy.orm import sessionmaker

Now you can connect to the database:

>>> dbPath = 'datafile2.db'
>>> engine = create_engine('sqlite:///%s' % dbPath)
>>> metadata = MetaData(engine)
>>> people  = Table('people', metadata, 
...                 Column('id', Integer, primary_key=True),
...                 Column('name', String),
...                 Column('count', Integer),
...                )
>>> Session = sessionmaker(bind=engine)
>>> session = Session()
>>> metadata.create_all(engine)

To create and connect to a database, you first create an engine corresponding to that database. Then you need a MetaData object, a container for managing tables and their schemas. Next you create a Table object named people, giving it the table name in the database, the MetaData object just created, and the columns to create along with their data types. Finally, the sessionmaker function creates a Session class bound to the engine, and that class is instantiated to get the session object. At this point the database is connected, and the last step is to actually build the table with the create_all method.

With the table created, the next step is to insert some records. Again, SQLAlchemy offers multiple ways to do this, but this example is fairly explicit: create an insert object, then execute it:

>>> people_ins = people.insert().values(name='Bob', count=1)
>>> str(people_ins)
'INSERT INTO people (name, count) VALUES (?, ?)'
>>> session.execute(people_ins)
<sqlalchemy.engine.result.ResultProxy object at 0x7f126c6dd438>
>>> session.commit()

The insert() method is used here to create an insert object, with values() specifying the fields and values to insert. people_ins is the insert object, and applying str() to it shows that the correct SQL command is being created behind the scenes.

The session object's execute() method then performs the insert, and the commit() method commits it to the database:

>>> session.execute(people_ins, [
...     {'name': 'Jill', 'count':15},
...     {'name': 'Joe', 'count':10}
... ])
<sqlalchemy.engine.result.ResultProxy object at 0x7f126c6dd908>
>>> session.commit()
>>> result = session.execute(select([people])) 
>>> for row in result:
...     print(row) 
...
(1, 'Bob', 1)
(2, 'Jill', 15)
(3, 'Joe', 10)

The operation can be simplified by passing a list of dictionaries to execute(), inserting multiple records at once; each dictionary contains the field names and values of one record. The next example queries for specific records:

>>> result = session.execute(select([people]).where(people.c.name == 'Jill')) 
>>> for row in result:
...     print(row)
...
(2, 'Jill', 15)

The select() function can be combined with the where() method to find specific records. The example above finds all records whose name column equals 'Jill'. Note that the where expression uses people.c.name; the c indicates that name is a column of the people table:

>>> result = session.execute(people.update().values(count=20)
...                          .where(people.c.name == 'Jill'))
>>> session.commit()
>>> result = session.execute(select([people]).where(people.c.name == 'Jill'))
>>> for row in result:
...     print(row)
...
(2, 'Jill', 20)
>>>

The update() method can also be used in combination with the where() method to update a single record.
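Deleting rows through the table object follows the same pattern; a brief sketch for illustration only (it is not executed in the session that continues below):

>>> result = session.execute(people.delete().where(people.c.name == 'Bob'))
>>> session.commit()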

So far, the table object has been used directly. SQLAlchemy can also map tables to classes. The advantage of this mapping technique is that the data columns are mapped directly to class attributes.

Let's create a class People for demonstration:

>>> from sqlalchemy.ext.declarative import declarative_base
>>> Base = declarative_base()
>>> class People(Base):
...     __tablename__ = "people"
...     id = Column(Integer, primary_key=True)
...     name = Column(String)
...     count = Column(Integer) 
...
>>> results = session.query(People).filter_by(name='Jill') 
>>> for person in results:
...     print(person.id, person.name, person.count) 
...
2 Jill 20

To insert a record, just create a new instance of the mapped class and add it to the session:

>>> new_person = People(name='Jane', count=5)
>>> session.add(new_person)
>>> session.commit()
>>> 
>>> results = session.query(People).all() 
>>> for person in results:
...     print(person.id, person.name, person.count)
...
1 Bob 1
2 Jill 20
3 Joe 10
4 Jane 5

Updates are also fairly simple: retrieve the record that needs updating, modify the values on the mapped instance, and add the updated record to the session to write it back to the database:

>>> jill = session.query(People).filter_by(name='Jill').first()
>>> jill.name
'Jill'
>>> jill.count = 22
>>> session.add(jill)
>>> session.commit()
>>> results = session.query(People).all()
>>> for person in results:
...     print(person.id, person.name, person.count) 
...
1 Bob 1
2 Jill 22
3 Joe 10
4 Jane 5

Deleting is very similar to updating: first fetch the record to be deleted, then pass it to the session object's delete() method:

>>> jane = session.query(People).filter_by(name='Jane').first()
>>> session.delete(jane)
>>> session.commit()
>>> jane = session.query(People).filter_by(name='Jane').first()
>>> print(jane)
None

Using SQLAlchemy does involve a bit more configuration than raw SQL, but it brings real benefits. First, using an ORM means not having to worry about subtle differences in the SQL supported by different databases. The example above works equally well with sqlite3, MySQL, and PostgreSQL; apart from the connection string given to create_engine, and making sure a suitable database driver is available, no code changes are needed.
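For instance, switching databases is usually just a matter of changing the engine URL; two hedged examples with placeholder credentials and database names (the second assumes the PyMySQL driver is installed):

>>> engine = create_engine('postgresql://dbuser:secret@localhost/mydb')
>>> engine = create_engine('mysql+pymysql://dbuser:secret@localhost/mydb')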

Another advantage is that data interaction happens through Python objects, which may be easier for programmers without much SQL experience: they can use Python objects and their methods instead of constructing SQL statements.

2. Modifying the database structure with Alembic

During development against a relational database, it's often necessary to change the database structure after work has begun; if it isn't the norm, it's at least common. Fields need to be added, field types changed, and so on. You can, of course, modify the database tables and the ORM code that accesses them by hand, but that has drawbacks. First, such modifications are hard to roll back when necessary. Second, it's hard to track which database configuration goes with which version of the code.

The solution is to use a database migration tool to help make the changes and to record them. Migrations are written as code and should contain both the modification to perform and its inverse. That way each change can be recorded and applied, or rolled back, in the correct order, allowing the database to be reliably upgraded or downgraded to any state in its development.

As an example, here's a brief introduction to Alembic, a popular lightweight migration tool for SQLAlchemy. To get started with Alembic, switch to the system command-line window, change to the directory where the project lives, install Alembic, and create a generic environment with alembic init:

> pip install alembic
> alembic init alembic

These commands create the file structure Alembic needs for migrations, including an alembic.ini file that needs to be edited in at least one place.

The sqlalchemy.url line needs to be changed to match the current situation:

sqlalchemy.url = driver://user:pass@localhost/dbname

Change this line to:

sqlalchemy.url = sqlite:///datafile.db

Since a local sqlite file is used, no username or password is required.

The next step is to create a revision with Alembic's revision command:

> alembic revision -m "create an address table"
Generating /home/naomi/qpb_testing/alembic/versions/
     384ead9efdfd_create_an_address_table.py ... done

This creates the revision script 384ead9efdfd_create_an_address_table.py in the alembic/versions directory. The script file looks like this:

"""create an address table

Revision ID: 384ead9efdfd 
Revises: 
Create Date: 2017-07-26 21:03:29.042762

""" 
from alembic import op 
import sqlalchemy as sa

# revision identifiers, used by Alembic. 
revision = '384ead9efdfd' 
down_revision = None 
branch_labels = None 
depends_on = None

def upgrade():
    pass

def downgrade():
    pass

The file header includes the revision ID and date. The file also contains a down_revision variable, which guides the rollback of each revision. If you create a second revision, its down_revision variable should contain this revision's ID.
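For instance, the header of a hypothetical second revision would chain back to the first like this (the new ID here is made up):

# revision identifiers, used by Alembic.
revision = 'a1b2c3d4e5f6'        # hypothetical ID of the second revision
down_revision = '384ead9efdfd'   # points back to the first revision
branch_labels = None
depends_on = None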

To make the revision do something, update the revision script: put the code that performs the change in the upgrade() function and the code that reverses it in the downgrade() function:

def upgrade():
    op.create_table(
        'address',
        sa.Column('id', sa.Integer, primary_key=True),
        sa.Column('address', sa.String(50), nullable=False),
        sa.Column('city', sa.String(50), nullable=False),
        sa.Column('state', sa.String(20), nullable=False),
    )

def downgrade():
    op.drop_table('address')

With this code in place, the upgrade can be applied. But first, switch back to the Python shell window and check which tables exist in the database:

>>> print(engine.table_names())
['people']

As expected, only the table created earlier is there. Now Alembic's upgrade command can be run to apply the upgrade and add the new table. Switch to the system command-line window and run:

> alembic upgrade head
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running upgrade  -> 384ead9efdfd, create an
     address table

If you go back to the Python shell window to check, you will find that there are two more tables in the database:

>>> engine.table_names()
['alembic_version', 'people', 'address']

The first new table, 'alembic_version', is created by Alembic to record the database's current revision (for reference by future upgrades and downgrades). The second new table, 'address', is the one added by the upgrade operation, ready for use.

To roll the database back to its previous state, just run Alembic's downgrade command in the system command window. Give it the "-1" parameter to tell Alembic to downgrade one version:

> alembic downgrade -1
INFO  [alembic.runtime.migration] Context impl SQLiteImpl.
INFO  [alembic.runtime.migration] Will assume non-transactional DDL.
INFO  [alembic.runtime.migration] Running downgrade 384ead9efdfd -> , create
    an address table

Now, checking from the Python session, you are back in the starting state, except that the version-tracking table still exists:

>>> engine.table_names()
['alembic_version', 'people']

Of course, you can run the upgrade again whenever you like to bring the table back, add new revisions, apply them, and so on.

2. Accessing and Operating NoSQL Databases with Python

1. NoSQL databases

Despite their longevity, relational databases are not the only option for storing data. Relational databases normalize data into relational tables, whereas other systems look at data differently. Such databases are often called NoSQL databases because they generally do not follow the rows-columns-tables structure that SQL was created to describe.

Instead of treating data as collections of rows, columns, and tables, NoSQL databases may view the data they store as key/value pairs, as indexed documents, or even as graphs. There are many NoSQL databases available, and they all handle data in somewhat different ways. In general, the data is less likely to be strictly normalized, which can make retrieving information simpler and faster.

This section looks at how Python accesses two common NoSQL databases: Redis and MongoDB. It only scratches the surface of what NoSQL databases and Python can do together, but it should give a rough idea of what's possible.

Readers who already know Redis or MongoDB can pick up a little about how the Python client libraries work; for those new to NoSQL databases, it's useful to at least get a feel for how these databases behave.

2. Implementing key/value storage with Redis

Redis is a networked in-memory key/value store. Because values are kept in memory, lookups can be very fast, and the networked design makes it suitable for many situations. Redis is commonly used as a cache, a message broker, and a fast information-retrieval system. Redis takes its name from "remote dictionary server", which is in fact the best way to think of it: it behaves much like a Python dictionary turned into a network service.

The following examples demonstrate using Redis from Python. If you're familiar with the Redis command-line interface, or have used a Redis client in another programming language, these small routines should help you get started with Redis in Python.

Although there are several Python Redis clients available, according to the official Redis website, the recommended solution is redis-py, which can be installed with pip install redis.

To run the examples, a working Redis server is required. While cloud-based Redis services are available, for code testing it's better to use a Docker instance or to install the server on your own machine.

If Docker is already installed, a Docker instance is probably the quickest and easiest way to get a Redis server running: a command like docker run -p 6379:6379 redis starts a Redis instance at the command line.

On Linux systems, it should be fairly easy to install Redis with the system package manager; on a Mac, brew install redis should do the job.

With a working Redis server, here's a simple example of interacting with Redis through Python. First, you need to import the Redis library and create a Redis connection object:

>>> import redis
>>> r = redis.Redis(host='localhost', port=6379)

Several optional parameters can be given when creating a Redis connection, including host, port, password, or SSH credentials. If the Redis server is running on localhost's default port 6379, no options are needed. Once you have the connection object, you can use it to access the key/value store.
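For example, a connection to a remote, password-protected server might look like the following hedged sketch (the host and password are placeholders):

>>> r_remote = redis.Redis(host='redis.example.com', port=6379,
...                        password='secret', db=0)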

The first thing to try might be the keys() method, which returns a list of the keys currently stored in the database. Then you can set keys of a few different types and try retrieving their values in various ways:

>>> r.keys()
[]
>>> r.set('a_key', 'my value')
True
>>> r.keys()
[b'a_key']
>>> v = r.get('a_key')
>>> v
b'my value'
>>> r.incr('counter')
1
>>> r.get('counter')
b'1'
>>> r.incr('counter')
2
>>> r.get('counter')
b'2'

The example above shows how to get the list of keys in a Redis database, how to set a key with a value, and how to use a key as a counter and increment it.

The next example deals with storing arrays, or lists:

>>> r.rpush("words", "one")
1
>>> r.rpush("words", "two")
2
>>> r.lrange("words", 0, -1)
[b'one', b'two']
>>> r.rpush("words", "three")
3
>>> r.lrange("words", 0, -1)
[b'one', b'two', b'three']
>>> r.llen("words")
3
>>> r.lpush("words", "zero")
4
>>> r.lrange("words", 0, -1)
[b'zero', b'one', b'two', b'three']
>>> r.lrange("words", 2, 2)
[b'two']
>>> r.lindex("words", 1)
b'one'
>>> r.lindex("words", 2)
b'two'

When the key is first set, the list "words" does not yet exist in the database; pushing a value onto the end of the list creates the key, creates an empty list as its value, and appends the value 'one'. The r in rpush means pushing from the right. rpush is then used to keep adding words at the end. Values in the list are retrieved with the lrange() method, whose parameters are the key, a start index, and an end index, with -1 indicating the end of the list.

Also note that lpush() adds values at the beginning, or left, of the list. A single value is retrieved with lindex() in the same way as lrange(), except that the index of just that value is given.

Expiration of values: one Redis feature particularly useful for caching is the ability to set an expiration time on a key/value pair. After the timeout, both the key and the value are deleted. This technique is especially useful when Redis is used as a cache. When setting a value for a key, you can also set the timeout in seconds:

>>> r.setex("timed", "10 seconds", 10)
True
>>> r.pttl("timed")
7165
>>> r.pttl("timed")
5208
>>> r.pttl("timed")
1542
>>> r.pttl("timed")
-2

The code above sets the key "timed" to expire after 10 seconds, then uses the pttl() method to watch the remaining time to expiration in milliseconds. When the value expires, both the key and the value are automatically deleted from the database, and pttl() returns -2 for the now-missing key. This feature, and the fine-grained control Redis gives over it, is really useful; for simple caching applications it may solve the problem without writing any further code.

It's worth noting that Redis keeps its data in memory, so the data is not durable by default: if the server crashes, some data may be lost. To reduce the chance of data loss, Redis offers options for managing persistence: every modification can be written to disk, snapshots can be taken at scheduled times, or nothing need be persisted at all. You can also use the Python client's save() and bgsave() methods to force a snapshot programmatically: save() blocks until the save is complete, while bgsave() performs the save in the background.
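For example (continuing the session above), forcing snapshots from the Python client looks like this:

>>> r.save()      # blocks until the snapshot is written
True
>>> r.bgsave()    # starts the snapshot in the background
True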

3. Documents in MongoDB

Another popular NoSQL database is MongoDB, sometimes called a document-based database because it does not organize data into rows and columns but instead stores documents. MongoDB is designed to scale freely across many nodes in multiple clusters while handling billions of documents. MongoDB stores documents in a format called BSON (Binary JSON), so a document consists of key/value pairs and looks like a JSON object or a Python dictionary.

The following examples show how Python interacts with MongoDB collections and documents, along with some advice on when to use it. MongoDB is an excellent choice when the data needs to scale and be distributed, the insertion rate is high, and the schema is complex or uncertain. But in many cases MongoDB is not the best choice, so be sure to investigate the requirements and the options thoroughly before deciding.

Running a MongoDB server: as with Redis, testing MongoDB requires access to a MongoDB server. There are plenty of cloud-hosted Mongo services, but if you're just testing, it's probably best to run a Docker instance or install on a server you own.

As with Redis, the easiest solution is to run a Docker instance. If you already have Docker, just enter docker run -p 27017:27017 mongo at the command line.

On Linux, MongoDB should be installable via the package manager, and on a Mac via brew install mongodb. On Windows, visit the MongoDB website for the Windows version and installation instructions. As with Redis, search online for instructions on configuring and starting the server.

As with Redis, several Python client libraries can connect to MongoDB databases. To show how they work, here is pymongo. The first step is to install it, which can be done with pip:

> pip install pymongo

Once pymongo is installed, you can create a MongoClient instance with the usual connection information and connect to the MongoDB server:

>>> from pymongo import MongoClient
>>> mongo = MongoClient(host='localhost', port=27017)     ⇽---  host='localhost' and port=27017 are the defaults and needn't be specified

MongoDB's organizational structure is that a database holds collections, and each collection can hold many documents. Databases and collections don't need to be created before being accessed: if they don't exist, they are created automatically on insert, while retrieving records from them simply returns no results.

To exercise the client, create a sample document, for example a Python dictionary:

>>> import datetime
>>> a_document = {'name': 'Jane',
...               'age': 34,
...               'interests': ['Python', 'databases', 'statistics'],
...               'date_added': datetime.datetime.now()
... }
>>> db = mongo.my_data     ⇽---  selects a database that hasn't been created yet
>>> collection = db.docs   ⇽---  selects a collection in that database, also not yet created
>>> collection.find_one()  ⇽---  queries for the first record; no exception even though the collection and database don't exist
>>> db.list_collection_names()
[]

This connects to a database and a document collection. Neither exists yet; they are created on access. Note that no exception is raised even though the database and collection don't exist. And when the list of collections is requested, an empty list comes back, because nothing has been stored in the collection yet.

To store a document, use the collection's insert_one() method. If the operation succeeds, the result object's inserted_id attribute holds the new document's unique ObjectId:

>>> result = collection.insert_one(a_document)
>>> result.inserted_id
ObjectId('59701cc4f5ef0516e1da0dec')     ⇽---  the document's unique ObjectId
>>> db.list_collection_names()
['docs']

Now that a document is stored in the docs collection, the collection shows up when the database's collection names are requested. Once documents are stored in a collection, they can be queried, updated, replaced, and deleted:

>>> collection.find_one()     ⇽---  retrieves the first record
{'_id': ObjectId('59701cc4f5ef0516e1da0dec'), 'name': 'Jane', 'age': 34, 
     'interests': ['Python', 'databases', 'statistics'], 'date_added': 
     datetime.datetime(2017, 7, 19, 21, 59, 32, 752000)}
>>> from bson.objectid import ObjectId
>>> collection.find_one({"_id":ObjectId('59701cc4f5ef0516e1da0dec')})     ⇽---  获取符合指定条件的记录,这里是用了ObjectId
{'_id': ObjectId('59701cc4f5ef0516e1da0dec'), 'name': 'Jane', 
     'age': 34, 'interests': ['Python', 'databases', 
     'statistics'], 'date_added': datetime.datetime(2017, 
     7, 19, 21, 59, 32, 752000)}
>>> collection.update_one({"_id":ObjectId('59701cc4f5ef0516e1da0dec')},
...     {"$set": {"name":"Ann"}})     ⇽---  updates the record according to the contents of the $set object
<pymongo.results.UpdateResult object at 0x7f4ebd601d38>
>>> collection.find_one({"_id":ObjectId('59701cc4f5ef0516e1da0dec')})
{'_id': ObjectId('59701cc4f5ef0516e1da0dec'), 'name': 'Ann', 'age': 34, 
     'interests': ['Python', 'databases', 'statistics'], 'date_added': 
     datetime.datetime(2017, 7, 19, 21, 59, 32, 752000)}
>>> collection.replace_one({"_id":ObjectId('59701cc4f5ef0516e1da0dec')},
...     {"name":"Ann"})     ⇽---  replaces the record with a new object
<pymongo.results.UpdateResult object at 0x7f4ebd601750>
>>> collection.find_one({"_id":ObjectId('59701cc4f5ef0516e1da0dec')})
{'_id': ObjectId('59701cc4f5ef0516e1da0dec'), 'name': 'Ann'}
>>> collection.delete_one({"_id":ObjectId('59701cc4f5ef0516e1da0dec')})     ⇽---  deletes the record matching the condition
<pymongo.results.DeleteResult object at 0x7f4ebd601d80>
>>> collection.find_one()

Note first that MongoDB matches records against a dictionary of fields and values. Dictionaries are also used to express operators such as $lt (less than) and $gt (greater than), and commands such as $set for updating records.
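As a hedged sketch of an operator query (the Tom document, its ObjectId, and the object addresses below are made up for illustration; the document is removed again so the running example is unaffected):

>>> collection.insert_one({'name': 'Tom', 'age': 40})
<pymongo.results.InsertOneResult object at 0x7f4ebd601e48>
>>> collection.find_one({'age': {'$gt': 30}})     ⇽---  matches documents whose age is greater than 30
{'_id': ObjectId('59701d2bf5ef0516e1da0ded'), 'name': 'Tom', 'age': 40}
>>> collection.delete_one({'name': 'Tom'})
<pymongo.results.DeleteResult object at 0x7f4ebd601e80>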

Another thing to note is that even when its records are deleted and the collection is empty, the collection itself still exists until it is dropped:

>>> db.list_collection_names()
['docs']
>>> collection.drop()
>>> db.list_collection_names()
[]

Of course, MongoDB can do much more. Besides the single-record operations, most commands have versions that operate on multiple records, such as insert_many, update_many, and delete_many. MongoDB also supports indexes to improve performance and provides several methods for grouping, counting, and aggregating data, as well as a built-in map-reduce facility.
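For instance, two brief hedged examples of these features (continuing after the drop above, so the collection starts out empty):

>>> collection.create_index("name")     ⇽---  creates an ascending index on the name field
'name_1'
>>> collection.count_documents({})      ⇽---  counts all documents; zero since the collection was dropped
0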

Summary:

  • Python's database API (DB-API) gives the client libraries of several relational databases a roughly consistent interface.
  • Using an object-relational mapper (ORM) allows for more standardization of code across multiple databases.
  • Using ORM can also access relational databases through Python code and objects instead of SQL queries.
  • Tools like Alembic, combined with an ORM, can make reversible changes to a relational database's table structure in code.
  • Key/value storage systems such as Redis provide fast memory-based data access.
  • MongoDB provides a scalable data storage solution with a less rigid structure than relational databases.
