Jailbone :
I have data that looks like this:
Col1
aaa1
bbb1
ccc1
1
2
3
aaa2
bbb2
ccc2
4
5
6
I want to add a new column to make it look like this:
Col1 Col2
aaa1 1
bbb1 1
ccc1 1
1 1
2 1
3 1
aaa2 2
bbb2 2
ccc2 2
4 2
5 2
6 2
So every time a row of the form aaa1, aaa2, ... is reached, the number in the new column should increase by one. After that, I want to delete all rows of the form aaa, bbb and ccc, so that the result looks like this:
Col1 Col2
1 1
2 1
3 1
4 2
5 2
6 2
I appreciate your help!
anky_91 :
IIUC, you can check with str.contans whether Col1 contains aaa and take the cumsum of that to build Col2, then use pd.to_numeric to check whether each value in Col1 is numeric and remove the rows where it is not:
df['Col2'] = df['Col1'].str.contains('aaa').cumsum()
out = df[pd.to_numeric(df['Col1'],errors='coerce').notna()]
print(out)
Col1 Col2
3 1 1
4 2 1
5 3 1
9 4 2
10 5 2
11 6 2
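For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idea. The frame is rebuilt from the sample data in the question. The names is_group_start and numeric are just illustrative. Using str.startswith instead of str.contains and resetting the index at the end are my own assumptions about what is wanted, not part of the answer above:

```python
import pandas as pd

# Sample data from the question; every value is a string.
df = pd.DataFrame({
    "Col1": ["aaa1", "bbb1", "ccc1", "1", "2", "3",
             "aaa2", "bbb2", "ccc2", "4", "5", "6"]
})

# Assumption: a new group starts at each row that begins with "aaa".
# Summing the boolean mask cumulatively gives a running group number.
is_group_start = df["Col1"].str.startswith("aaa")
df["Col2"] = is_group_start.cumsum()

# Non-numeric strings become NaN with errors="coerce",
# so notna() marks the rows to keep.
numeric = pd.to_numeric(df["Col1"], errors="coerce")
out = df[numeric.notna()].reset_index(drop=True)

print(out)
```

Note that Col1 still holds strings after filtering. If you need real numbers there, you could assign the result of pd.to_numeric back to the column.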