Pandas: Find maximum values & position in columns or rows of a Dataframe

https://thispointer.com/pandas-find-maximum-values-position-in-columns-or-rows-of-a-dataframe/

df.max(axis=0)
时间 2019-06-16 23:55:00
流入峰值 4.07758e+06
流出峰值 2.03839e+06
max_val 4.07758e+06
dtype: object

df.max()
时间 2019-06-16 23:55:00
流入峰值 4.07758e+06
流出峰值 2.03839e+06
max_val 4.07758e+06
dtype: object

df.max()[1]
4077577.6974

df.max()[0]
Timestamp('2019-06-16 23:55:00')

df.max()[2]
2038393.402
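
The lookups above use integer keys (`[0]`, `[1]`, `[2]`) on a Series whose index holds column names, and newer pandas releases deprecate treating such keys as positions. A minimal sketch of the explicit alternatives follows; the DataFrame is stand-in data, and the column names `time`, `inflow_peak` and `outflow_peak` are assumptions made for illustration, not the original dataset.

```python
import pandas as pd

# Hypothetical stand-in data; names and values are assumptions for illustration.
df = pd.DataFrame({
    "time": pd.to_datetime(["2019-06-16 23:50:00", "2019-06-16 23:55:00"]),
    "inflow_peak": [3.5e6, 4.0e6],
    "outflow_peak": [2.0e6, 1.5e6],
})

col_max = df.max()                  # Series indexed by column name

latest_time = col_max.iloc[0]       # explicit positional access
by_position = col_max.iloc[1]       # second entry, regardless of its label
by_label = col_max["inflow_peak"]   # label-based access to the same entry
```

Label-based access is usually the safer choice, because it keeps working if the column order changes.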

In this article we will discuss how to find the maximum value in the rows and columns of a Dataframe, along with its index position.

DataFrame.max()
The Pandas library provides a DataFrame member function that finds the maximum value along a given axis:

Python

DataFrame.max(axis=None, skipna=None, level=None, numeric_only=None, **kwargs)
Important Arguments:

axis : Axis along which the maximum elements are searched. Use 0 to search along the index (one result per column) and 1 to search along the columns (one result per row).
skipna : (bool) Whether NaN or NULL values are skipped. By default they are skipped, i.e. if the argument is not provided it behaves as True.
It returns the maximum value along the given axis i.e. either in rows or columns.
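
As a small illustration of the axis argument, the sketch below uses the string names 'index' and 'columns', which pandas accepts as equivalents of 0 and 1. The tiny DataFrame is made up for this example only.

```python
import pandas as pd

# Made-up two-by-two frame, used only to show the direction of each axis.
df = pd.DataFrame({"x": [1, 5], "y": [7, 3]}, index=["a", "b"])

col_max = df.max(axis="index")     # same as axis=0: one value per column
row_max = df.max(axis="columns")   # same as axis=1: one value per row
```

Here col_max is indexed by the column names and row_max by the row labels.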

Let’s use this to find the maximum value among rows and columns,

Suppose we have a Dataframe i.e.

Python

import numpy as np
import pandas as pd

# List of Tuples
matrix = [(22, 16, 23),
          (33, np.nan, 11),
          (44, 34, 11),
          (55, 35, np.nan),
          (66, 36, 13)
          ]

# Create a DataFrame object
dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
Contents of the dataframe object dfObj are,

Vim

    x     y     z
a  22  16.0  23.0
b  33   NaN  11.0
c  44  34.0  11.0
d  55  35.0   NaN
e  66  36.0  13.0
Get maximum values in every row & column of the Dataframe
Get maximum values of every column
To find the maximum value of every column, call max() on the DataFrame object without any argument:

Python

# Get a series containing maximum value of each column
maxValuesObj = dfObj.max()
print('Maximum value in each column : ')
print(maxValuesObj)
Output:

Vim

Maximum value in each column :
x 66.0
y 36.0
z 23.0
dtype: float64
It returned a series with the column names as index labels and the maximum value of each column as values. Similarly, we can find the maximum value in every row too.

Get maximum values of every row
To find the maximum value of every row, call max() on the DataFrame object with the argument axis=1:

Python

# Get a series containing maximum value of each row
maxValuesObj = dfObj.max(axis=1)
print('Maximum value in each row : ')
print(maxValuesObj)
Output:

Vim

Maximum value in each row :
a 23.0
b 33.0
c 44.0
d 55.0
e 66.0
dtype: float64
It returned a series with row index label and maximum value of each row.

As we can see, NaN values were skipped while finding the maximum. We can include NaN too if we want:

Get maximum values of every column without skipping NaN
Python

# Get a series containing maximum value of each column without skipping NaN
maxValuesObj = dfObj.max(skipna=False)
print('Maximum value in each column including NaN: ')
print(maxValuesObj)
Output:

Vim

Maximum value in each column including NaN:
x 66.0
y NaN
z NaN
dtype: float64
Because we passed skipna=False to max(), NaN values were included while searching for the maximum. If a column contains any NaN, the result for that column is NaN.
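
One way to check this behaviour is to compare the NaN results against the columns that actually contain missing values. The following is a minimal sketch that rebuilds the same dfObj; the variable names strict_max, has_nan and same are introduced here for illustration.

```python
import numpy as np
import pandas as pd

dfObj = pd.DataFrame(
    [(22, 16, 23), (33, np.nan, 11), (44, 34, 11), (55, 35, np.nan), (66, 36, 13)],
    index=list("abcde"),
    columns=list("xyz"),
)

strict_max = dfObj.max(skipna=False)   # NaN wherever a column has a missing value
has_nan = dfObj.isna().any()           # True for columns with at least one NaN

# The NaN results line up exactly with the columns that contain missing values.
same = strict_max.isna().equals(has_nan)
```

For this data only column x has no missing value, so it is the only column that keeps a numeric maximum.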

Get maximum values of a single column or selected columns
To get the maximum value of a single column, select that column from the dataframe and call max() on it:

Python

# Get maximum value of a single column 'y'
maxValue = dfObj['y'].max()
print("Maximum value in column 'y': " , maxValue)
Output:

Vim

Maximum value in column 'y': 36.0
There is another way too:

Python

# Get maximum value of a single column 'y'
maxValue = dfObj.max()['y']
It will give the same result.

Instead of a single column name, we can pass a list of column names to get the maximum value of each of those columns only:

Python

# Get maximum value of the selected columns 'y' and 'z'
maxValue = dfObj[['y', 'z']].max()
print("Maximum value in column 'y' & 'z': ")
print(maxValue)
Output:

Vim

Maximum value in column 'y' & 'z':
y 36.0
z 23.0
dtype: float64
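
If a single overall maximum is wanted rather than one value per column, the per-column result can be reduced once more. This is a minimal sketch assuming the same dfObj as above; the names max_yz and max_all are introduced here for illustration.

```python
import numpy as np
import pandas as pd

dfObj = pd.DataFrame(
    [(22, 16, 23), (33, np.nan, 11), (44, 34, 11), (55, 35, np.nan), (66, 36, 13)],
    index=list("abcde"),
    columns=list("xyz"),
)

# First max() gives one value per column, second max() reduces those to a scalar.
max_yz = dfObj[["y", "z"]].max().max()   # largest value across columns y and z
max_all = dfObj.max().max()              # largest value in the whole DataFrame
```

Both calls skip NaN by default, in the same way as the per-column examples.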
Get row index label or position of maximum values of every column
DataFrame.idxmax()
We got the maximum value of each column or row, but what if we want to know exactly where in every column or row this maximum value occurs? To get the index label of the maximum value along rows or columns, the pandas library provides the function:

Python

DataFrame.idxmax(axis=0, skipna=True)
Depending on the value of axis, it returns the row index label of the maximum value in each column (axis=0) or the column name of the maximum value in each row (axis=1).
Let’s see how to use it.

Get row index label of Maximum value in every column
Python

# Get the row index label of the max value in every column
maxValueIndexObj = dfObj.idxmax()
print("Max values of columns are at row index position :")
print(maxValueIndexObj)
Output:

Vim

Max values of columns are at row index position :
x e
y e
z a
dtype: object
It’s a series with the column names as index and, as values, the row index labels where the maximum value of each column occurs.
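
Since max() and idxmax() both return a series indexed by column name, the two results can be placed side by side. The following is a minimal sketch, assuming the same dfObj; the summary frame and its column names max_value and row_label are made up for this example.

```python
import numpy as np
import pandas as pd

dfObj = pd.DataFrame(
    [(22, 16, 23), (33, np.nan, 11), (44, 34, 11), (55, 35, np.nan), (66, 36, 13)],
    index=list("abcde"),
    columns=list("xyz"),
)

# One row per column of dfObj: its maximum value and the row label where it occurs.
summary = pd.DataFrame({
    "max_value": dfObj.max(),
    "row_label": dfObj.idxmax(),
})
print(summary)
```

Each row of summary then answers both questions for one column: what the maximum is and where it sits.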

Get Column names of Maximum value in every row
Python

# Get the column name of the max value in every row
maxValueIndexObj = dfObj.idxmax(axis=1)
print("Max values of row are at following columns :")
print(maxValueIndexObj)
Output:

Vim

Max values of row are at following columns :
a z
b x
c x
d x
e x
dtype: object
It’s a series with the row index labels as index and, as values, the column names where the maximum value of each row occurs.
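
idxmax() returns labels, not integer positions. If zero-based positions are needed, one option is to translate the labels with Index.get_indexer(). This is a minimal sketch assuming the same dfObj; the names row_pos and col_pos are introduced here for illustration.

```python
import numpy as np
import pandas as pd

dfObj = pd.DataFrame(
    [(22, 16, 23), (33, np.nan, 11), (44, 34, 11), (55, 35, np.nan), (66, 36, 13)],
    index=list("abcde"),
    columns=list("xyz"),
)

# Row position of the maximum in each column (labels 'e', 'e', 'a' -> positions).
row_pos = dfObj.index.get_indexer(dfObj.idxmax().tolist())

# Column position of the maximum in each row (labels 'z', 'x', ... -> positions).
col_pos = dfObj.columns.get_indexer(dfObj.idxmax(axis=1).tolist())
```

This assumes the index and column labels are unique, which get_indexer() requires.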

The complete example is as follows:

Python

import pandas as pd
import numpy as np


def main():
    # List of Tuples
    matrix = [(22, 16, 23),
              (33, np.nan, 11),
              (44, 34, 11),
              (55, 35, np.nan),
              (66, 36, 13)
              ]

    # Create a DataFrame object
    dfObj = pd.DataFrame(matrix, index=list('abcde'), columns=list('xyz'))
    print('Original Dataframe Contents :')
    print(dfObj)

    print('***** Get Maximum value in every column ***** ')
    # Get a series containing maximum value of each column
    maxValuesObj = dfObj.max()
    print('Maximum value in each column : ')
    print(maxValuesObj)

    print('***** Get Maximum value in every row ***** ')
    # Get a series containing maximum value of each row
    maxValuesObj = dfObj.max(axis=1)
    print('Maximum value in each row : ')
    print(maxValuesObj)

    print('***** Get Maximum value in every column without skipping NaN ***** ')
    # Get a series containing maximum value of each column without skipping NaN
    maxValuesObj = dfObj.max(skipna=False)
    print('Maximum value in each column including NaN: ')
    print(maxValuesObj)

    print('***** Get Maximum value in a single column ***** ')
    # Get maximum value of a single column 'y'
    maxValue = dfObj['y'].max()
    print("Maximum value in column 'y': " , maxValue)

    # Get maximum value of a single column 'y' from the per-column result
    maxValue = dfObj.max()['y']
    print("Maximum value in column 'y': " , maxValue)

    print('***** Get Maximum value in certain columns only ***** ')
    # Get maximum value of the selected columns 'y' and 'z'
    maxValue = dfObj[['y', 'z']].max()
    print("Maximum value in column 'y' & 'z': ")
    print(maxValue)

    print('***** Get row index label of Maximum value in every column *****')
    # Get the row index label of the max value in every column
    maxValueIndexObj = dfObj.idxmax()
    print("Max values of columns are at row index position :")
    print(maxValueIndexObj)

    print('***** Get Column name of Maximum value in every row *****')
    # Get the column name of the max value in every row
    maxValueIndexObj = dfObj.idxmax(axis=1)
    print("Max values of row are at following columns :")
    print(maxValueIndexObj)


if __name__ == '__main__':
    main()
Output:

Vim

Original Dataframe Contents :
x y z
a 22 16.0 23.0
b 33 NaN 11.0
c 44 34.0 11.0
d 55 35.0 NaN
e 66 36.0 13.0
***** Get Maximum value in every column *****
Maximum value in each column :
x 66.0
y 36.0
z 23.0
dtype: float64
***** Get Maximum value in every row *****
Maximum value in each row :
a 23.0
b 33.0
c 44.0
d 55.0
e 66.0
dtype: float64
***** Get Maximum value in every column without skipping NaN *****
Maximum value in each column including NaN:
x 66.0
y NaN
z NaN
dtype: float64
C:\Users\varun\AppData\Local\Programs\Python\Python37-32\lib\site-packages\numpy\core_methods.py:28: RuntimeWarning: invalid value encountered in reduce
return umr_maximum(a, axis, None, out, keepdims, initial)
***** Get Maximum value in a single column *****
Maximum value in column 'y': 36.0
Maximum value in column 'y': 36.0
***** Get Maximum value in certain columns only *****
Maximum value in column 'y' & 'z':
y 36.0
z 23.0
dtype: float64
***** Get row index label of Maximum value in every column *****
Max values of columns are at row index position :
x e
y e
z a
dtype: object
***** Get Column name of Maximum value in every row *****
Max values of row are at following columns :
a z
b x
c x
d x
e x
dtype: object


Reposted from blog.csdn.net/llrraa2010/article/details/92905460