Recently, I was working on a neural network prediction project, and encountered some problems in the process.
The data is stored in MongoDB, and the project needs many operations on it, such as conditional queries and conditional updates. Relying only on MongoDB's built-in methods was too slow, so below I describe the approach I used instead.
1 Connecting to MongoDB from Python
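from pymongo import MongoClient  # if it is missing, install it with: pip install pymongo

# ip, username, pwd and collection_name are placeholders for your own values.
# Authenticate against the admin database; without authentication you cannot write.
# (The older db.authenticate(username, pwd) call was removed in PyMongo 4.0,
# so the credentials are passed to MongoClient instead.)
conn = MongoClient(ip, 27017, username=username, password=pwd, authSource="admin")  # 27017 is the port
db = conn.admin  # admin is the database to connect to; use your own database name here
collection = db[collection_name]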
2 Converting MongoDB data into a pandas DataFrame
import pandas as pd  # import pandas

# find() returns a cursor; list() pulls all documents into memory first
data = pd.DataFrame(list(collection.find()))
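To show what this conversion produces without needing a running MongoDB server, here is a minimal sketch. The list docs stands in for list(collection.find()), and its field names and values are made up for illustration. It then runs a conditional query and a conditional change, the two kinds of operation mentioned at the start.

```python
import pandas as pd

# Hypothetical documents standing in for list(collection.find());
# the field names and values are invented for this illustration.
docs = [
    {"_id": 1, "index": "a", "value": 10},
    {"_id": 2, "index": "b", "value": 25},
    {"_id": 3, "index": "c", "value": 40},
]

# Each document becomes a row, each field becomes a column.
data = pd.DataFrame(docs)

# MongoDB's _id field is often not needed for analysis, so drop it here.
data = data.drop(columns=["_id"])

# Conditional query: rows whose value is greater than 20.
large = data[data["value"] > 20]

# Conditional change: set value to 0 where index equals "a".
data.loc[data["index"] == "a", "value"] = 0

print(large)
print(data)
```

Both operations run in memory on the DataFrame, so nothing is written back to MongoDB unless you do that yourself.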
3 Merging two DataFrames with pandas, like a multi-table join in SQL
data3 = pd.merge(data1,data2,how = "left",on="index")
This merges the two DataFrames data1 and data2 and stores the result in data3. With how="left", the left table (data1) is the reference: every row of data1 is kept, and the columns of data2 are attached wherever the values in the column named by the on parameter (here "index") match. Rows of data2 with no match in data1 are discarded, and rows of data1 with no match in data2 get NaN in the columns that come from data2.
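To make the left join concrete, here is a small sketch with two made-up DataFrames. Only the "index" column name comes from the call above; the x and y columns and all values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical tables; x, y and all values are invented for this example.
data1 = pd.DataFrame({"index": [1, 2, 3], "x": ["a", "b", "c"]})
data2 = pd.DataFrame({"index": [2, 3, 4], "y": [20.0, 30.0, 40.0]})

# Left join on the shared "index" column.
data3 = pd.merge(data1, data2, how="left", on="index")

# data3 has one row for each row of data1 (index 1, 2, 3).
# index 1 has no match in data2, so its y value is NaN.
# index 4 exists only in data2, so it does not appear in data3.
print(data3)
```

If you need to keep the unmatched rows of data2 as well, how="outer" keeps the rows of both tables, and how="inner" keeps only the rows that match in both.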