https://blog.csdn.net/weixin_37226516/article/details/64137043
on: The name of the column used as the join key. When you use this parameter, the key column must have the same name in both the left and right tables.
left_on: The key column(s) of the left table. Can be column names or arrays with the same length as the DataFrame.
right_on: The key column(s) of the right table. Can be column names or arrays with the same length as the DataFrame.
left_index / right_index: If True, use the index of that table as the join key.
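how: The type of join to perform: 'inner', 'left', 'right', or 'outer'. The default is 'inner'.
sort: Whether to sort the merged DataFrame by the join keys in lexicographic order. Setting it to False skips the sort and can improve performance; in current pandas releases False is the default.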
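The parameters above are easier to compare side by side on a small example. The following is only a sketch: the tables `left` and `right` and their column names are made up for illustration and are not the MovieLens data used later.

```python
import pandas as pd

# Hypothetical toy tables, used only to illustrate the parameters.
left = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right = pd.DataFrame({"key": ["b", "c", "d"], "rval": [20, 30, 40]})

# on: both tables share the key column name.
inner = pd.merge(left, right, on="key", how="inner")

# how="outer": keep keys from both tables; unmatched cells become NaN.
outer = pd.merge(left, right, on="key", how="outer")

# left_on / right_on: the key columns have different names.
right_renamed = right.rename(columns={"key": "rkey"})
by_names = pd.merge(left, right_renamed, left_on="key", right_on="rkey")

# left_index / right_index: join on the index instead of a column.
by_index = pd.merge(
    left.set_index("key"),
    right.set_index("key"),
    left_index=True,
    right_index=True,
)

print(inner)
print(outer)
print(by_names)
print(by_index)
```

Note that with left_on / right_on both key columns are kept in the result, while with on the shared key column appears only once.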
import pandas as pd
rating_path = "./ant-learn-pandas-master/datas/movielens-1m/ratings.dat"
users_path = "./ant-learn-pandas-master/datas/movielens-1m/users.dat"
movies_path = "./ant-learn-pandas-master/datas/movielens-1m/movies.dat"
ratings = pd.read_csv(rating_path, sep='::', engine='python', names="UserID::MovieID::Ratings::TimeStamp".split("::"))
users = pd.read_csv(users_path, sep='::', engine='python', names="UserID::Gender::Age::Occupation::Zip-Code".split("::"))
movies = pd.read_csv(movies_path, sep='::', engine='python', names="MovieID::Titles::Genres".split("::"))
Merging two tables with merge
ratings_user = pd.merge(
ratings, users, left_on="UserID", right_on="UserID", how="inner"
)
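A merge result is an ordinary DataFrame, so it can be merged again with a third table. The sketch below assumes the same column layout as the MovieLens files but uses a few invented rows (the titles and values are placeholders, not real data) so that it runs without the data files.

```python
import pandas as pd

# Hypothetical rows that only mimic the MovieLens column layout.
ratings = pd.DataFrame({
    "UserID": [1, 1, 2],
    "MovieID": [10, 20, 10],
    "Ratings": [5, 3, 4],
})
users = pd.DataFrame({"UserID": [1, 2], "Gender": ["F", "M"]})
movies = pd.DataFrame({
    "MovieID": [10, 20],
    "Titles": ["Movie A", "Movie B"],
})

# Step 1: attach user attributes to each rating.
ratings_user = pd.merge(ratings, users, on="UserID", how="inner")

# Step 2: attach movie attributes to the result of step 1.
ratings_user_movies = pd.merge(ratings_user, movies, on="MovieID", how="inner")

print(ratings_user_movies)
```

Because the key column has the same name in both tables, on="UserID" is equivalent to passing the same name to both left_on and right_on.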
If the two tables contain a non-key column with the same name but different values, and you want to keep both columns after merging, use the suffixes parameter to append a suffix to the duplicate column names from each table.
result = pd.merge(left, right, on='k', suffixes=['_l', '_r'])
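To see what suffixes does, here is a minimal runnable sketch. The contents of `left` and `right` are assumed for illustration; only the key column `k` and the suffixes come from the call above.

```python
import pandas as pd

# Hypothetical tables: both have a value column named "v".
left = pd.DataFrame({"k": ["K0", "K1", "K2"], "v": [1, 2, 3]})
right = pd.DataFrame({"k": ["K0", "K0", "K3"], "v": [4, 5, 6]})

# The overlapping column "v" is renamed to "v_l" (left) and "v_r" (right).
result = pd.merge(left, right, on="k", suffixes=("_l", "_r"))

print(result.columns.tolist())
print(result)
```

If suffixes is not given, pandas falls back to its defaults "_x" and "_y" for the overlapping columns.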
Another way to combine DataFrames is concat