giorgio-p :
Suppose I have the following lists:
cond_1 = [1,2]
cond_2 = [3,5]
And also the following dataframe df:
|----------|
| Column_1 |
|----------|
| x |
|----------|
| y |
|----------|
| y |
|----------|
| x |
|----------|
What I want to do is add a second column, Column_2, following these criteria:
1) if Column_1 contains an x, add a value in Column_2 from cond_1;
2) if Column_1 contains a y, add a value in Column_2 from cond_2.
The desired output should be like this:
|----------|----------|
| Column_1 | Column_2 |
|----------|----------|
| x | 1 |
|----------|----------|
| y | 3 |
|----------|----------|
| y | 5 |
|----------|----------|
| x | 2 |
|----------|----------|
I have been trying to do this using pd.Series:
df_x = df.loc[df['Column_1'] == "x"] #first I create a dataframe only with the x values
df_x['Column_2'] = pd.Series(cond_1)
Then I would repeat the same thing for the y values, obtaining df_y.
However, this doesn't succeed. I would then also need to append the two dataframes (df_x and df_y) again, and I would lose the original index from df, which I want to keep.
Jon Clements :
You can create a helper class and use it in an .apply, e.g.:
class ReplaceWithNext:
    def __init__(self, **kwargs):
        # One iterator per key, so each call consumes the next unused value
        self.lookup = {k: iter(v) for k, v in kwargs.items()}

    def __call__(self, value):
        return next(self.lookup[value])
Then use it as:
df['Column_2'] = df['Column_1'].apply(ReplaceWithNext(x=cond_1, y=cond_2))
Which'll give you:
  Column_1  Column_2
0        x         1
1        y         3
2        y         5
3        x         2
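As for why the pd.Series attempt in the question fails: pd.Series(cond_1) is indexed 0 and 1, while the x rows sit at index 0 and 3, and pandas aligns on index labels when assigning a Series, so the row at index 3 ends up as NaN. A possible alternative that stays within plain pandas indexing is to assign each list through a boolean mask on the original df, which also keeps the original index. This is only a sketch: the 0 placeholder and the loop over a dict are illustrative choices, and it assumes every row matches one of the keys and that each list has exactly as many values as there are matching rows.

```python
import pandas as pd

cond_1 = [1, 2]
cond_2 = [3, 5]
df = pd.DataFrame({"Column_1": ["x", "y", "y", "x"]})

# Placeholder column; assumes every row is overwritten below.
df["Column_2"] = 0

# Assigning a plain list through a boolean mask fills the matching rows
# in order of appearance, without any index-label alignment.
for key, values in {"x": cond_1, "y": cond_2}.items():
    mask = df["Column_1"] == key
    df.loc[mask, "Column_2"] = values

print(df)
```

Because the assignment happens on df itself, there is no need to split it into df_x and df_y and append them back together.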