DuckSaucer :
I am just getting started with pandas and was wondering whether there is a function that gives me the common categories of items in a DataFrame. To illustrate:
This is the data I have (a highly simplified example, obviously):
   Discipline   Person
0    football   Alanis
1    football  Bernard
2    football  Delilah
3  basketball  Charlie
4  basketball  Delilah
5      tennis  Charlie
I'd like to find out which pairs of people share a discipline, ideally in the form of a matrix like so:
         Alanis  Bernard  Charlie  Delilah
Alanis     True     True    False     True
Bernard    True     True    False     True
Charlie   False    False     True     True
Delilah    True     True     True     True
Alternatively, it could be a function returning a list of common categories.
I don't even know whether pandas is the best tool for a task like this (probably not); as I said, I'm still quite a noob. I do appreciate your help, though.
Thanks!
yatu :
One approach is to build a graph in which every pair of people who share a discipline is connected, and then take its adjacency matrix:
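import networkx as nx
from itertools import combinations, chain

# One list of people per discipline
groups = df.groupby('Discipline').Person.agg(list)
# Add an edge for every pair of people within the same discipline
G = nx.Graph()
pairs = [combinations(people, 2) for people in groups]
G.add_edges_from(chain.from_iterable(pairs))
# Adjacency matrix as a boolean DataFrame, people sorted by name
out = nx.to_pandas_adjacency(G, nodelist=sorted(G.nodes())).astype(bool)
print(out)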
         Alanis  Bernard  Charlie  Delilah
Alanis    False     True    False     True
Bernard    True    False    False     True
Charlie   False    False    False     True
Delilah    True     True     True    False
If you want the diagonal values set to True, you could just add:
import numpy as np
out[:] = out.values + np.eye(out.shape[1], dtype=bool)
print(out)
         Alanis  Bernard  Charlie  Delilah
Alanis     True     True    False     True
Bernard    True     True    False     True
Charlie   False    False     True     True
Delilah    True     True     True     True
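If you would rather stay within pandas, here is a minimal sketch of a different technique (not the networkx approach above): cross-tabulate people against disciplines and multiply the table by its transpose. The names membership, shares_discipline and common_disciplines are made up for this example, and the DataFrame is rebuilt from the sample in the question. The helper at the end covers the "list of common categories" variant that was asked about.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "Discipline": ["football", "football", "football",
                   "basketball", "basketball", "tennis"],
    "Person": ["Alanis", "Bernard", "Delilah",
               "Charlie", "Delilah", "Charlie"],
})

# Person x Discipline table: how many rows exist for each pair.
membership = pd.crosstab(df["Person"], df["Discipline"])

# Entry (a, b) of the product counts the disciplines a and b have in common,
# so any value above zero means the two people share at least one.
shares_discipline = membership.dot(membership.T) > 0


def common_disciplines(a, b):
    """Return the sorted list of disciplines both a and b take part in."""
    both = (membership.loc[a] > 0) & (membership.loc[b] > 0)
    return sorted(both[both].index)


print(shares_discipline)
print(common_disciplines("Charlie", "Delilah"))  # ['basketball']
```

With this sketch the diagonal comes out True without an extra step, and a person who is alone in their only discipline still gets a row and column, since the table is built from the rows of df rather than from pairs.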