Adjacency matrix using grouping column

DuckSaucer :

I am just getting started with pandas and was wondering whether there is a function that gives me the common categories of items in a DataFrame. To illustrate:

This is the data I have (a highly simplified example, obviously):

   Discipline   Person
0    football   Alanis
1    football  Bernard
2    football  Delilah
3  basketball  Charlie
4  basketball  Delilah
5      tennis  Charlie

I'd like to find out which pairs of people share a discipline, ideally in the form of a matrix like this:

         Alanis  Bernard  Charlie  Delilah
Alanis     True     True    False     True
Bernard    True     True    False     True
Charlie   False    False     True     True
Delilah    True     True     True     True

Alternatively, it could be a function returning a list of common categories.
I don't even know whether pandas is the best tool for a task like this (probably not); as I said, I'm still quite new to it. I do appreciate your help though. Thanks!

yatu :

One approach could be to build a network and obtain the adjacency matrix from it:

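import networkx as nx
from itertools import combinations, chain

# df is the DataFrame from the question; collect one list of people per discipline
L = df.groupby('Discipline').Person.agg(list)

G = nx.Graph()
# Add everyone as a node so that people who share no discipline still appear
G.add_nodes_from(df.Person)
# Connect every pair of people within the same discipline
L = [list(combinations(i, 2)) for i in L.values.tolist()]
G.add_edges_from(chain.from_iterable(L))

out = nx.to_pandas_adjacency(G, nodelist=sorted(G.nodes())).astype(bool)
print(out)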

          Alanis  Bernard  Charlie  Delilah
Alanis    False     True    False     True
Bernard    True    False    False     True
Charlie   False    False    False     True
Delilah    True     True     True    False

If you want the diagonal values set to True, you could just add:

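import numpy as np
# out is the boolean adjacency DataFrame computed above;
# OR-ing it with an identity matrix sets the diagonal to True
out[:] = out.values | np.eye(out.shape[1], dtype=bool)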

print(out)

         Alanis  Bernard  Charlie  Delilah
Alanis     True     True    False     True
Bernard    True     True    False     True
Charlie   False    False     True     True
Delilah    True     True     True     True
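
For comparison, here is a pandas-only sketch of the same idea that skips the graph. It is a different technique from the networkx approach above: it builds a Person x Discipline membership table with pd.crosstab and multiplies it by its own transpose, so the diagonal comes out True without extra work. It assumes the sample data from the question; the names membership, shared and shared_disciplines are made up for illustration, and the helper covers the "list of common categories" alternative mentioned in the question.

```python
import pandas as pd

# Sample data from the question
df = pd.DataFrame({
    "Discipline": ["football", "football", "football",
                   "basketball", "basketball", "tennis"],
    "Person": ["Alanis", "Bernard", "Delilah",
               "Charlie", "Delilah", "Charlie"],
})

# Person x Discipline table: how often each person appears in each discipline
membership = pd.crosstab(df["Person"], df["Discipline"])

# Entry (a, b) of the product counts the disciplines a and b have in common;
# anything above zero means they share at least one.
shared = membership.dot(membership.T) > 0


def shared_disciplines(a, b):
    """Hypothetical helper: list the disciplines two people have in common."""
    both = (membership.loc[a] > 0) & (membership.loc[b] > 0)
    return sorted(both[both].index)


print(shared)
print(shared_disciplines("Charlie", "Delilah"))
```

With the sample data, shared matches the matrix requested in the question, and shared_disciplines("Charlie", "Delilah") returns ['basketball']. For a large number of people and disciplines the dense crosstab may get big, in which case the graph approach or a sparse matrix is probably the better fit.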
