Python Pandas: Search for substring in entire dataframe then output the name of the column(s) where the substring was found

minnymate :
import pandas as pd

key_words_to_search = ['hello', 'goodbye']
df = pd.DataFrame({
    'col1': ['hello', 'hi', 'ciao'],
    'col2': ['hello panda', 'goodbye', 'bonjour'],
    'col3': ['ni hao', 'hola', 'hello']})

I've been using something like the code below, but I'm not sure how to get the actual names of the columns where a match was found. Thanks!

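for word in key_words_to_search:
    # True where the cell (lower-cased) contains word
    mask = df.applymap(lambda x: word in str(x).lower())
    # rows that have a match in at least one column
    temp = df[mask.any(axis=1)].copy()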


YOLO :

Here's one way to do it:

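d = []

for k in key_words_to_search:
    # Boolean frame: True where the cell contains the keyword
    # (applymap is spelled DataFrame.map in pandas >= 2.1)
    found = df.applymap(lambda x: k in x)
    # Put the column name wherever a match occurred, leave 0 elsewhere,
    # then join the non-numeric entries (the column names) for each row
    names = (found.astype(int)
                  .mask(found, found.columns.to_series(), axis=1)
                  .astype(str)
                  .agg(lambda row: ','.join(v for v in row if not v.isdigit()), axis=1))
    d.append(names)

df[key_words_to_search] = pd.concat(d, axis=1)

print(df)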

    col1         col2    col3      hello goodbye
0  hello  hello panda  ni hao  col1,col2
1     hi      goodbye    hola               col2
2   ciao      bonjour   hello       col3
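
If the mask/agg chain above is hard to follow, the same idea can be sketched with the .str accessor: build one boolean column per original column, then collect the names of the columns that are True in each row. This is only a sketch under a few assumptions: columns_containing is a made-up helper name (not a pandas function), the match is a literal, case-insensitive substring test, and every cell is cast to str first so non-string values don't raise.

```python
import pandas as pd

key_words_to_search = ['hello', 'goodbye']
df = pd.DataFrame({
    'col1': ['hello', 'hi', 'ciao'],
    'col2': ['hello panda', 'goodbye', 'bonjour'],
    'col3': ['ni hao', 'hola', 'hello'],
})


def columns_containing(frame, keyword):
    """Hypothetical helper: for each row, comma-join the names of the
    columns whose value contains keyword (literal, case-insensitive)."""
    needle = keyword.lower()
    # One boolean column per original column; regex=False keeps the match literal
    hits = frame.apply(
        lambda col: col.astype(str).str.lower().str.contains(needle, regex=False)
    )
    # For each row, keep the names of the columns flagged True
    return hits.apply(
        lambda row: ','.join(name for name, hit in zip(hits.columns, row) if hit),
        axis=1,
    )


# One result column per keyword, kept separate from df so the new
# columns are not searched on the next pass
result = pd.DataFrame({k: columns_containing(df, k) for k in key_words_to_search})
print(df.join(result))
```

Building result as a separate frame and joining it at the end avoids searching the freshly added keyword columns. If you only need to know which columns contain a keyword anywhere (rather than per row), calling .any() on the boolean frame inside the helper gives that directly.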


Origin http://10.200.1.11:23101/article/api/json?id=8086&siteId=1