pandas data analysis reading notes (4)

pd.merge(left=df1, right=df2, on='key', how='inner', suffixes=('_left', '_right')) joins two DataFrames by matching key values and placing their columns side by side. The on parameter names the column to join on; pass a list to join on multiple columns. If the key columns have different names in the two DataFrames, specify them separately with left_on='lkey' and right_on='rkey' instead (on cannot be combined with left_on/right_on). how sets the join type: inner, outer, left, or right, meaning inner join, full outer join, left outer join, and right outer join. suffixes is appended to overlapping non-key column names so they can be told apart. left_index=True uses the row index of the left DataFrame as its join key, and right_index=True uses the row index of the right one. The two styles can be mixed, for example the index on the left with a column on the right (left_index=True, right_on='rkey').
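
The merge options described above can be sketched as follows. This is only a minimal illustration: the sample frames df1 and df2 and their values are made up for the example, and the column names lkey, rkey, and value are assumptions, not data from the original notes.

```python
import pandas as pd

# Hypothetical sample data; the key column has a different name on each side.
df1 = pd.DataFrame({"lkey": ["a", "b", "c"], "value": [1, 2, 3]})
df2 = pd.DataFrame({"rkey": ["a", "b", "d"], "value": [10, 20, 40]})

# Inner join: keep only keys present in both frames.
# The overlapping "value" column gets the given suffixes.
inner = pd.merge(df1, df2, left_on="lkey", right_on="rkey",
                 how="inner", suffixes=("_left", "_right"))

# Outer join: keep every key from either side; missing cells become NaN.
outer = pd.merge(df1, df2, left_on="lkey", right_on="rkey", how="outer")

# Mixed keys: the row index on the left matched against a column on the right.
df1_indexed = df1.set_index("lkey")
mixed = pd.merge(df1_indexed, df2, left_index=True, right_on="rkey", how="inner")

print(inner)
print(outer)
print(mixed)
```

With how='inner' only the keys a and b survive; with how='outer' the unmatched keys c and d are kept as well.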

 

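left2.join(right2, how='outer') merges by index: the row index of left2 is matched against the row index of right2. With on='key', the column key of the left DataFrame is matched against the index of right2 instead. The default join type is left.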
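
A small sketch of both forms of join, assuming two made-up frames (the names left2 and right2 follow the note above; the columns lval and rval and all values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data.
left2 = pd.DataFrame({"key": ["a", "b", "c"], "lval": [1, 2, 3]})
right2 = pd.DataFrame({"rval": [10, 30, 40]}, index=["a", "c", "d"])

# Index-on-index join: move "key" into the index first, then outer-join.
by_index = left2.set_index("key").join(right2, how="outer")

# Column-on-index join: match the left "key" column against right2's index.
# The default how="left" keeps every row of left2.
by_column = left2.join(right2, on="key")

print(by_index)
print(by_column)
```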

 

pd.concat([s1, s2], axis=0, join='outer', ignore_index=False, keys=['one', 'two']) stacks s1 and s2, which may be Series or DataFrames, row-wise by default, like laying bricks one on top of another. Pass axis=1 to place them side by side as columns instead. The join parameter controls how the labels on the other axis are combined: inner takes the intersection and outer (the default) takes the union. ignore_index=True discards the original row indexes of the inputs and generates a new index starting from 0. keys labels each input and produces a hierarchical index on the concatenation axis. Older pandas versions also accepted join_axes to name the exact labels to keep on the other axis; it was removed in pandas 1.0, and reindexing the result achieves the same effect.
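
The concat options can be tried out with a short sketch like the one below. The Series s1 and s2 and the frames df_a and df_b are invented sample data, not taken from the original notes.

```python
import pandas as pd

# Hypothetical sample data with non-overlapping indexes.
s1 = pd.Series([0, 1], index=["a", "b"])
s2 = pd.Series([2, 3, 4], index=["c", "d", "e"])

# Default: stack row-wise, keeping the original index labels.
stacked = pd.concat([s1, s2])

# Throw the original labels away and number the rows from 0.
renumbered = pd.concat([s1, s2], ignore_index=True)

# Label each input; the result has a two-level (hierarchical) index.
keyed = pd.concat([s1, s2], keys=["one", "two"])

# axis=1 places the inputs side by side; the keys become column names.
side_by_side = pd.concat([s1, s2], axis=1, keys=["one", "two"])

# join="inner" keeps only the columns shared by every input DataFrame.
df_a = pd.DataFrame({"a": [1, 2], "b": [3, 4]})
df_b = pd.DataFrame({"b": [5, 6], "c": [7, 8]})
common = pd.concat([df_a, df_b], join="inner", ignore_index=True)

print(keyed)
print(side_by_side)
print(common)
```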

 

np.where(pd.isnull(a), b, a) works like a vectorized if-else: wherever a is null it takes the value from b, otherwise it keeps the value from a.
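
For instance, with two made-up Series a and b of the same length (hypothetical values chosen only for illustration):

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: a has gaps, b supplies the fallback values.
a = pd.Series([1.0, np.nan, 3.0, np.nan])
b = pd.Series([10.0, 20.0, 30.0, 40.0])

# Element by element: if a is null take b, else take a.
# The result is a plain NumPy array, not a Series.
filled = np.where(pd.isnull(a), b, a)

print(filled)  # [ 1. 20.  3. 40.]
```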

 

df.stack() pivots the columns of the data into rows.

df.unstack() pivots rows of the data into columns. By default the innermost index level is pivoted; you can also pass the number or name of a level. When unstacking in a DataFrame, the level that is unstacked becomes the lowest level of the resulting columns.
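
A short round trip shows what the two methods do. The frame below is invented sample data, and the axis names state and number are assumptions made for the example.

```python
import pandas as pd

# Hypothetical sample data with named row and column axes.
df = pd.DataFrame(
    [[0, 1, 2], [3, 4, 5]],
    index=pd.Index(["Ohio", "Colorado"], name="state"),
    columns=pd.Index(["one", "two", "three"], name="number"),
)

# Columns -> rows: a Series indexed by (state, number).
stacked = df.stack()

# Rows -> columns: the innermost level ("number") moves back to the columns.
restored = stacked.unstack()

# Pick the level by name (or by position, unstack(0)) to pivot "state" instead.
by_state = stacked.unstack("state")

print(stacked)
print(by_state)
```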

 

import numpy as np
import matplotlib.pyplot as plt

fig = plt.figure()

ax1 = fig.add_subplot(2, 2, 1)

ax1.hist(np.random.randn(100), bins=20, color='k', alpha=0.5)

The code above first creates a Figure object, then adds a subplot to it (here the first cell of a 2 x 2 grid; more can be added the same way), and finally draws on the returned Axes object.

 

There is another way that creates the figure and returns a NumPy array of subplot objects at the same time:

fig, axes = plt.subplots(2, 3) returns the Figure object and a 2 x 3 NumPy array of Axes objects, which can be indexed like this: axes[0, 1]
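
As a minimal sketch of this second approach (the Agg backend line is an assumption added so the script runs without a display, and the random data is only a placeholder):

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

# One call creates the figure and a 2 x 3 grid of Axes objects.
fig, axes = plt.subplots(2, 3)

# Index the array like any 2-D NumPy array: row 0, column 1.
axes[0, 1].hist(np.random.randn(100), bins=20, color="k", alpha=0.5)

# Row 1, column 2: a dashed black line through a random walk.
axes[1, 2].plot(np.random.randn(50).cumsum(), "k--")

print(type(axes), axes.shape)
```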

 

axes[0, 1].plot(x, y, 'g--'): g is the color green and -- is the line style, meaning a dashed line

axes[0, 1].set_xticks([0, 250, 500, 750, 1000]) sets the positions of the ticks on the x axis

axes[0, 1].set_xticklabels(['one', 'two', 'three', 'four', 'five'], rotation=30) sets the text shown at each tick, rotated by 30 degrees

axes[0, 1].legend(loc='best') adds a legend; best means the position is chosen automatically. Only lines that were given a label (for example plot(x, y, 'g--', label='one')) appear in the legend.
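
Put together, the four calls might look like the sketch below. The data x and y, the label text, and the Agg backend line are assumptions made for the example, not part of the original notes.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np

fig, axes = plt.subplots(2, 2)
ax = axes[0, 1]

# Hypothetical data: a random walk with 1000 steps.
rng = np.random.default_rng(0)
x = np.arange(1000)
y = rng.standard_normal(1000).cumsum()

# Green dashed line; the label is what the legend will show.
ax.plot(x, y, "g--", label="random walk")

# Tick positions first, then the text for each position.
ax.set_xticks([0, 250, 500, 750, 1000])
ax.set_xticklabels(["one", "two", "three", "four", "five"], rotation=30)

# Let matplotlib choose the legend position.
ax.legend(loc="best")
```

Setting the tick positions before the labels matters, because each label is attached to the tick at the same position in the list.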


Origin blog.csdn.net/u012724887/article/details/107100505