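Data alignment is an important step in data cleaning. When two pandas objects are combined arithmetically, their values are matched by index label rather than by position. A label that exists in only one operand has no partner, so the result at that position is NaN, and those NaN values can in turn be replaced with a fill value.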
Series alignment operation
1. Series aligned by row index
Sample code:
import pandas as pd

# s1 has labels 0..9, s2 only has labels 0..4
s1 = pd.Series(range(10, 20), index = range(10))
s2 = pd.Series(range(20, 25), index = range(5))
print('s1: ' )
print(s1)
print('')
print('s2: ')
print(s2)
operation result:
s1:
0 10
1 11
2 12
3 13
4 14
5 15
6 16
7 17
8 18
9 19
dtype: int64
s2:
0 20
1 21
2 22
3 23
4 24
dtype: int64
2. Series alignment operation
Sample code:
# Series alignment operation
s1 + s2
operation result:
0 30.0
1 32.0
2 34.0
3 36.0
4 38.0
5 NaN
6 NaN
7 NaN
8 NaN
9 NaN
dtype: float64
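Alignment is driven by index labels, not by position, which is easy to miss when both Series use the default integer index. The following is a minimal sketch with made-up string labels (the names x, y, and total are illustrative and are not part of the example above):

```python
import pandas as pd

# Two Series whose labels overlap only partially and appear in a different order.
x = pd.Series([1, 2, 3], index=['a', 'b', 'c'])
y = pd.Series([10, 20, 30], index=['c', 'b', 'd'])

# Values are paired by label: 'b' with 'b', 'c' with 'c'.
# 'a' and 'd' exist in only one operand, so their result is NaN.
total = x + y
print(total)

# NaN cannot be held in a default integer Series, so the result is upcast to float64.
print(total.dtype)
```

This also explains why the result of s1 + s2 above is float64 even though both inputs are int64.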
DataFrame alignment operation
1. DataFrame aligned by row and column index
Sample code:
import numpy as np
import pandas as pd

# df1 is 2x2 (columns a, b); df2 is 3x3 (columns a, b, c)
df1 = pd.DataFrame(np.ones((2,2)), columns = ['a', 'b'])
df2 = pd.DataFrame(np.ones((3,3)), columns = ['a', 'b', 'c'])
print('df1: ')
print(df1)
print('')
print('df2: ')
print(df2)
operation result:
df1:
     a    b
0  1.0  1.0
1  1.0  1.0
df2:
     a    b    c
0  1.0  1.0  1.0
1  1.0  1.0  1.0
2  1.0  1.0  1.0
2. DataFrame alignment operation
Sample code:
# DataFrame alignment operation
df1 + df2
operation result:
     a    b   c
0  2.0  2.0 NaN
1  2.0  2.0 NaN
2  NaN  NaN NaN
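To see what the alignment step does before the addition happens, the two frames can be aligned explicitly. The sketch below uses DataFrame.align, which by default performs an outer join on both the row index and the columns; the names left, right, and total are illustrative only:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.ones((2, 2)), columns=['a', 'b'])
df2 = pd.DataFrame(np.ones((3, 3)), columns=['a', 'b', 'c'])

# Reindex both frames to the union of their row and column labels.
# Cells that did not exist in the original frame become NaN.
left, right = df1.align(df2)
print(left)   # df1 expanded to 3 rows x 3 columns, with NaN in row 2 and column c
print(right)  # df2 already covers every label, so its values are unchanged

# Adding the aligned frames gives the same result as df1 + df2.
total = left + right
print(total.equals(df1 + df2))
```

Any cell that is NaN in either aligned frame stays NaN in the sum, which is why only the four cells shared by both frames hold numbers.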
Fill unaligned data to perform operations
1. fill_value
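When using methods such as add, sub, div, and mul, a fill value can be specified through the fill_value parameter. A value that is missing in one operand because of alignment is replaced by the fill value before the operation is performed.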
Sample code:
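print(s1)
print(s2)
# labels 5..9 are missing from s2, so -1 is used in their place
print(s1.add(s2, fill_value = -1))
print(df1)
print(df2)
# cells missing from df1 are treated as 2.0 before subtracting df2
print(df1.sub(df2, fill_value = 2.))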
operation result:
# print(s1)
0 10
1 11
2 12
3 13
4 14
5 15
6 16
7 17
8 18
9 19
dtype: int64
# print(s2)
0 20
1 21
2 22
3 23
4 24
dtype: int64
# s1.add(s2, fill_value = -1)
0 30.0
1 32.0
2 34.0
3 36.0
4 38.0
5 14.0
6 15.0
7 16.0
8 17.0
9 18.0
dtype: float64
# print(df1)
     a    b
0  1.0  1.0
1  1.0  1.0
# print(df2)
     a    b    c
0  1.0  1.0  1.0
1  1.0  1.0  1.0
2  1.0  1.0  1.0
# df1.sub(df2, fill_value = 2.)
     a    b    c
0  0.0  0.0  1.0
1  0.0  0.0  1.0
2  1.0  1.0  1.0
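One detail worth knowing: fill_value only replaces a value that is missing in exactly one of the two operands. If a cell is missing in both, the result stays NaN. The sketch below uses two small made-up frames (left, right, and result are illustrative names) that share no labels at all:

```python
import pandas as pd

# left only has cell (0, 'a'); right only has cell (1, 'b').
left = pd.DataFrame({'a': [1.0]}, index=[0])
right = pd.DataFrame({'b': [10.0]}, index=[1])

result = left.add(right, fill_value=0)
print(result)

# (0, 'a') and (1, 'b') exist in one operand, so the other side is filled with 0.
# (0, 'b') and (1, 'a') exist in neither operand, so they remain NaN.
print(result.isna())
```

If the remaining NaN values should also be replaced, calling fillna on the result is one option.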
Arithmetic method table: