Add list as a column to a dataframe

giorgio-p :

Suppose I have the following lists:

cond_1 = [1,2]
cond_2 = [3,5]

And also the following dataframe df:

|----------|
| Column_1 |
|----------|
|     x    |
|----------|
|     y    |
|----------|
|     y    |
|----------|
|     x    |
|----------|

What I want to do is add a second column, Column_2, following these criteria:

1) if Column_1 contains an x, fill Column_2 with the next unused value from cond_1;

2) if Column_1 contains a y, fill Column_2 with the next unused value from cond_2.

The desired output should be like this:

|----------|----------|
| Column_1 | Column_2 |
|----------|----------|
|     x    |     1    |
|----------|----------|
|     y    |     3    |
|----------|----------|
|     y    |     5    |
|----------|----------|
|     x    |     2    |
|----------|----------|

I have been trying to do this using pd.Series:

df_x = df.loc[df['Column_1'] == "x"]  # first, create a dataframe with only the x rows
df_x['Column_2'] = pd.Series(cond_1)

Then I would repeat the same thing for the y values, obtaining df_y.

However, this doesn't work. I would then also need to append the two dataframes (df_x and df_y) back together, and I would lose the original index from df, which I want to keep.

Jon Clements :

You can create a helper class and use it in an .apply, e.g.:

class ReplaceWithNext:
    def __init__(self, **kwargs):
        self.lookup = {k: iter(v) for k, v in kwargs.items()}
    def __call__(self, value):
        return next(self.lookup[value])

Then use it as:

df['Column_2'] = df['Column_1'].apply(ReplaceWithNext(x=cond_1, y=cond_2))

Which'll give you:

  Column_1  Column_2
0        x         1
1        y         3
2        y         5
3        x         2
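
As a side note, the attempt in the question most likely fails because pd.Series(cond_1) carries its own index (0, 1), and pandas aligns on index labels when assigning, so the x row at index 3 finds no matching label and ends up as NaN. A minimal sketch of an alternative that avoids this is to assign the plain lists through boolean masks with .loc, which keeps the original index of df. This is not the approach from the answer above; the name values_by_label is a hypothetical helper introduced only for illustration, and the sketch assumes each list has exactly as many values as there are matching rows.

```python
import pandas as pd

cond_1 = [1, 2]
cond_2 = [3, 5]
df = pd.DataFrame({"Column_1": ["x", "y", "y", "x"]})

# Hypothetical helper mapping (not from the original post):
# pairs each label in Column_1 with the list that supplies its values.
values_by_label = {"x": cond_1, "y": cond_2}

# Create the target column up front so every row starts as missing.
df["Column_2"] = float("nan")

for label, values in values_by_label.items():
    mask = df["Column_1"] == label
    # A plain list has no index, so it is written positionally into the
    # selected rows, in order. Assumes len(values) == mask.sum().
    df.loc[mask, "Column_2"] = values

# Every row was filled, so the column can be converted back to integers.
df["Column_2"] = df["Column_2"].astype(int)
print(df)
```

If a list is shorter or longer than the number of matching rows, pandas should raise an error here rather than fill the column silently, which may or may not be the behaviour you want.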
