How to remove certain values from a pandas dataframe, which are not in a list?

Usman Syed :

The following code creates a dataframe:

import pandas as pd

data = [['A', 'B', 'D'], ['A', 'D'], ['F', 'G', 'C', 'B', 'A']]
df = pd.DataFrame(data)
df

My goal is to remove the values from the dataframe that are not in the list below.

list_items = ['A','B','C']

My expected output is as follows:

[image: expected output]

I have tried looping over the values and checking them one by one, but suppose the dataframe is very large, with shape (9108, 1616), and the list has over 130 items to check. In that case the code takes too long to run. Please suggest the most efficient way to achieve the expected output.

fmarm :

I don't think doing it in pandas is a good idea, as the columns don't matter here. It's easier to do it with lists, which you can convert back to a pandas dataframe at the end if you really need one.

# convert df to list of lists
data = df.values.tolist()
# filter each element of the list to contain only list_items values
data_filtered = [[el for el in row if el in list_items] for row in data]
# convert back to dataframe
df_filtered = pd.DataFrame(data_filtered) 
print(df_filtered)
#   0   1    2
#0  A   B    None
#1  A   None None
#2  C   B    A
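
For a large frame with a long list, a possible variation on the approach above is to turn list_items into a set first, so that each membership check is a hash lookup rather than a scan of the list. The sketch below is only an illustration: the helper name keep_allowed is made up here, not part of pandas or the original answer. It also shows df.where(df.isin(...)), which masks unwanted values in place without shifting the remaining values left, in case the original column positions matter.

```python
import pandas as pd

def keep_allowed(df, allowed):
    """Hypothetical helper: keep only values found in `allowed`, packed left per row."""
    allowed_set = set(allowed)  # set membership is a hash lookup
    rows = [[v for v in row if v in allowed_set] for row in df.to_numpy().tolist()]
    return pd.DataFrame(rows)

data = [['A', 'B', 'D'], ['A', 'D'], ['F', 'G', 'C', 'B', 'A']]
df = pd.DataFrame(data)
list_items = ['A', 'B', 'C']

# Same result shape as the list-based answer: kept values are shifted to the left
df_filtered = keep_allowed(df, list_items)

# Alternative that keeps positions: values not in list_items become missing
masked = df.where(df.isin(list_items))
```

Which of the two is appropriate depends on whether the kept values should be packed to the left (df_filtered) or stay in their original columns (masked).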
