Checking one column and returning another column in a Pandas DataFrame

yodish :

I have a dataframe like this:

   Title                Participants
0  ShowA            B. Smith,C. Ball
1  ShowB                   T. Smooth
2  ShowC  K. Dulls,L. Allen,B. Smith

I'm splitting the Participants column on commas to create a list for each cell. Next, I check each list for specific participants. In this example, I'm checking for either B. Smith or K. Dulls:

for item in df['Participants']:
    listX = item.split(',')
    if 'B. Smith' in listX or 'K. Dulls' in listX:
        print(listX)

This returns:

['B. Smith', 'C. Ball']
['K. Dulls', 'L. Allen', 'B. Smith']

1) I'm guessing there is a cleaner way to check for multiple participants in my if statement. I'd love any suggestions.

2) This is where I've been spinning in circles: how do I return the Title associated with the list(s) I return?

In this example, I'd like to return:

ShowA
ShowC

Setup code:

import pandas as pd

df = pd.DataFrame(data={'Title': ['ShowA', 'ShowB', 'ShowC'],
                        'Participants': ['B. Smith,C. Ball', 'T. Smooth', 'K. Dulls,L. Allen,B. Smith']})

target_participants = ['B. Smith', 'K. Dulls']

piRSquared :

get_dummies

You can use pandas.Series.str.get_dummies to create a dataframe with one column per name, then convert it to booleans so that each value says whether that name appears in the row.

dummies = df.Participants.str.get_dummies(',').astype(bool)
dummies

   B. Smith  C. Ball  K. Dulls  L. Allen  T. Smooth
0      True     True     False     False      False
1     False    False     False     False       True
2      True    False      True      True      False

Then we can select the titles of the rows where either name is present:

df.loc[dummies['B. Smith'] | dummies['K. Dulls'], 'Title']

0    ShowA
2    ShowC
Name: Title, dtype: object
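
If the names to look for are held in a list, such as target_participants from the question's setup code, the same idea can be written without naming each column by hand. This is only a sketch: the reindex step is my own addition, there so that a target name that never appears in the data becomes an all-False column instead of raising a KeyError.

```python
import pandas as pd

df = pd.DataFrame(data={'Title': ['ShowA', 'ShowB', 'ShowC'],
                        'Participants': ['B. Smith,C. Ball', 'T. Smooth', 'K. Dulls,L. Allen,B. Smith']})
target_participants = ['B. Smith', 'K. Dulls']

# One boolean column per participant name
dummies = df['Participants'].str.get_dummies(',').astype(bool)

# Keep only the target columns; a target missing from the data is filled with False
target_dummies = dummies.reindex(columns=target_participants, fill_value=False)

# A row matches if any of the target columns is True
mask = target_dummies.any(axis=1)
titles = df.loc[mask, 'Title'].tolist()
print(titles)  # ['ShowA', 'ShowC']
```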

contains

Alternatively, you can use pandas.Series.str.contains. First put the people you are looking for in a list, then join them into a single string to use as a regular expression. Each name is passed through re.escape so that the '.' in it is matched literally rather than as a regex wildcard.

import re

people_to_look_for = ['B. Smith', 'K. Dulls']
pattern = '|'.join(map(re.escape, people_to_look_for))
mask = df.Participants.str.contains(pattern)
df.loc[mask, 'Title']

0    ShowA
2    ShowC
Name: Title, dtype: object
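
Note that str.contains matches substrings, so a target such as 'B. Smith' would also match a cell containing 'AB. Smith'. If exact name matches matter, one possible approach (a sketch, not part of the answer above) is to split each cell into a list and test it against a set of targets, which is also a tidier form of the if statement in the question. The variable names here are illustrative.

```python
import pandas as pd

df = pd.DataFrame(data={'Title': ['ShowA', 'ShowB', 'ShowC'],
                        'Participants': ['B. Smith,C. Ball', 'T. Smooth', 'K. Dulls,L. Allen,B. Smith']})
target_participants = ['B. Smith', 'K. Dulls']
targets = set(target_participants)

# Split each cell into a list of names, then flag rows that share
# at least one name with the target set (exact, whole-name comparison)
name_lists = df['Participants'].str.split(',')
mask = name_lists.apply(lambda names: not targets.isdisjoint(names))
titles = df.loc[mask, 'Title'].tolist()
print(titles)  # ['ShowA', 'ShowC']

# Same result with one row per (Title, participant) pair and isin
exploded = df.assign(Participant=name_lists).explode('Participant')
titles_exploded = exploded.loc[exploded['Participant'].isin(targets), 'Title'].unique().tolist()
```

The explode variant needs pandas 0.25 or later; unique() is there because ShowC matches on two names and would otherwise be listed twice.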
