Python Pandas Conditional Calculation row by row

fleshstorm :

I have a dataset with a column A that holds a score. I want to calculate a column B that shows for how many consecutive rows the score has been equal to or higher than the score in the previous row. If the score is lower than in the previous row, the value in column B for that row should be reset to 0.

I tried the following:

import pandas as pd

df = pd.DataFrame({'A': [140, 145, 148, 150, 100, 105, 106]})

df['B'] = 0
df.loc[df['A'].diff() >= 0, 'B'] = df['B'].shift() + 1

The result is the following:

    A       B
0   140     0.0
1   145     1.0
2   148     1.0
3   150     1.0
4   100     0.0
5   105     1.0
6   106     1.0

I understand that the condition is checked for each row, but the right-hand side seems to be calculated for all rows at once. B is therefore not incremented as expected: when df['B'].shift() + 1 is evaluated, every value in B is still 0, so each matching row gets 0 + 1 = 1.

How can I get the following result?

    A       B
0   140     0
1   145     1
2   148     2
3   150     3
4   100     0
5   105     1
6   106     2

Quang Hoang :

You need cumsum to identify the blocks, and groupby().cumcount() to enumerate the rows within each block:

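# True where the score is equal to or higher than in the previous row
s = df['A'].diff().ge(0)
# every False starts a new block; cumcount() numbers the rows in each block from 0
df['B'] = s.groupby((~s).cumsum()).cumcount()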

Output:

     A  B
0  140  0
1  145  1
2  148  2
3  150  3
4  100  0
5  105  1
6  106  2
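
To make the two steps easier to follow, here is a small sketch that keeps the intermediate Series and prints them side by side. The names block_id, steps and streak_loop are only illustrative and are not part of the answer above; streak_loop is a plain Python loop, added here as an assumption-free cross-check of the vectorized result rather than as a recommended approach.

```python
import pandas as pd

df = pd.DataFrame({'A': [140, 145, 148, 150, 100, 105, 106]})

# True where the score did not drop relative to the previous row.
# The first row has no predecessor: diff() is NaN there, so ge(0) is False.
s = df['A'].diff().ge(0)

# Each False marks the start of a new block, so the running count of
# False values gives every row the id of the block it belongs to.
block_id = (~s).cumsum()

# cumcount() numbers the rows inside each block, starting from 0.
df['B'] = s.groupby(block_id).cumcount()

# Show the intermediate values next to the final column.
steps = pd.DataFrame({'A': df['A'], 's': s, 'block_id': block_id, 'B': df['B']})
print(steps)


def streak_loop(values):
    """Row-by-row version of the same logic (illustrative helper)."""
    out = []
    prev = None
    count = 0
    for v in values:
        # Extend the streak if the score did not drop, otherwise reset to 0.
        count = count + 1 if prev is not None and v >= prev else 0
        out.append(count)
        prev = v
    return out


# Both approaches should agree on this data.
print(streak_loop(df['A']) == df['B'].tolist())
```

The loop is usually much slower on large frames, but it mirrors the "row by row" wording of the question and can help when checking the vectorized version on your own data.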
