Python | Count words in a DataFrame that match a prespecified list of words

Geomariachi :

I'm trying to count the words in a DataFrame column consisting of speeches. I have created lists of words associated with different themes, for example:

Care = ['safe', 'peace', 'compassion', 'empath', 'care', 'caring', 'protect', 'shield', 'shelter']

Now I would like to count how many times, in total, the words in the "Care" list occur in each speech, and then add a new column at the end of the df holding that count for each row.

I'm using this code right now.

df = df.assign(Care=df['speech'].str.count('|'.join(Care)))

But I suspect that it gives me partial matches as well. I would like to get a match only when a word in the speech matches a whole word in my list. Any ideas?

Sajan :

Assuming that the speech is free of punctuation marks, this might work:

df['count'] = df['speech'].apply(lambda x: len([val for val in x.split() if val in Care]))
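
If the speeches do contain punctuation or mixed case, another option is to keep the original str.count call and wrap the alternation in regex word boundaries. The following is only a sketch under assumptions: the sample speeches are made up for illustration, and case-insensitive matching is a choice the question does not specify.

```python
import re
import pandas as pd

# Hypothetical sample data; only the 'speech' column name comes from the question.
df = pd.DataFrame({
    "speech": [
        "We care about peace, and we protect the unsafe.",
        "Caring leaders shelter people. Peace! Careful planning helps.",
        "Nothing relevant here.",
    ]
})

Care = ["safe", "peace", "compassion", "empath", "care", "caring",
        "protect", "shield", "shelter"]

# \b on both sides forces whole-word matches, so "care" is not counted
# inside "Careful" and "safe" is not counted inside "unsafe".
# re.escape guards against list entries that contain regex metacharacters.
pattern = r"\b(?:" + "|".join(re.escape(word) for word in Care) + r")\b"

# Assumption: matching should ignore case ("Peace" counts as "peace").
df["Care"] = df["speech"].str.count(pattern, flags=re.IGNORECASE)

print(df)
```

With this sample data the new column holds 3, 3 and 0. Note that whole-word matching also means the stem "empath" no longer matches "empathy" or "empathetic", so each form that should count would have to be listed explicitly.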
