Graph Convolutional Networks for Text Classification: source code walkthrough [PyTorch]

Preface

Ah, I went through the original TensorFlow code before; see my earlier post, Graph Convolutional Networks for Text Classification: source code walkthrough [TensorFlow].

Project address

https://github.com/iworldtong/text_gcn.pytorch

Environment configuration

I reused the environment from the TensorFlow version and simply installed PyTorch 1.7.1+cu101 on top of it.

Code analysis

remove_words.py

Same as the original code.

build_graph.py

Also unchanged from the original code.
One thing I was curious about: since SciPy's csr_matrix is used here, does PyTorch have similar sparse matrix operations?
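
For the record, yes: PyTorch has sparse tensors (torch.sparse). A minimal sketch, not from this repo, of converting a SciPy CSR matrix into a torch sparse COO tensor and doing a sparse-dense matmul (assuming PyTorch >= 1.7):

import numpy as np
import scipy.sparse as sp
import torch

def csr_to_torch_sparse(mx):
    """Convert a SciPy CSR matrix to a torch sparse COO tensor."""
    coo = mx.tocoo()
    indices = torch.from_numpy(np.vstack((coo.row, coo.col)).astype(np.int64))
    values = torch.from_numpy(coo.data)
    return torch.sparse_coo_tensor(indices, values, coo.shape)

adj = sp.csr_matrix(np.eye(3, dtype=np.float32))
t_adj = csr_to_torch_sparse(adj)
print(torch.sparse.mm(t_adj, torch.ones(3, 1)))  # sparse x dense -> dense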

train.py

Some tensor shapes & sample outputs:

adj (61603, 61603)
features (61603, 300)
y_train (61603, 20)
y_val (61603, 20)
y_test (61603, 20)
train_mask (61603,)
val_mask (61603,)
test_mask (61603,)
train_size 11314
test_size 7532
tm_train_mask torch.Size([61603, 20])
t_support[0] torch.Size([61603, 61603])
pre_sup torch.Size([61603, 200])
support0 torch.Size([61603, 61603])
out torch.Size([61603, 200])
(logits * tm_train_mask)[0]: tensor([ 0.0521, -0.0080,  0.2177, -0.1337,  0.1672, -0.0428,  0.0664, -0.1221,
         0.0376,  0.0709, -0.3589,  0.2038,  0.0118, -0.1365, -0.2384, -0.1432,
         0.0838,  0.1781,  0.2771,  0.1930], grad_fn=<SelectBackward>)
t_y_train[0]:tensor([0., 0., 0., 0., 0., 0., 0., 0., 1., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
        0., 0.], dtype=torch.float64)
torch.max(t_y_train, 1)[1][0]: tensor(8)
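
The pre_sup and out shapes come straight from the graph convolution itself. A minimal sketch (toy sizes; in the real run N = 61603) of the two matmuls behind them:

import torch

N, in_dim, hidden = 5, 300, 200     # real run: N = 61603
X = torch.randn(N, in_dim)          # node features
W = torch.randn(in_dim, hidden)     # layer weight
A_hat = torch.eye(N)                # stand-in for t_support[0], the normalized adjacency

pre_sup = torch.mm(X, W)            # (N, 200), matches pre_sup above
out = torch.mm(A_hat, pre_sup)      # (N, 200), matches out above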

features = preprocess_features(features)

Inside preprocess_features, the return value was changed to a dense array:

# return sparse_to_tuple(features)
return features.A

That is, the call to sparse_to_tuple (which the TensorFlow version used to feed its sparse placeholders) was removed:

import numpy as np
import scipy.sparse as sp

def sparse_to_tuple(sparse_mx):
    """Convert sparse matrix to tuple representation."""
    def to_tuple(mx):
        if not sp.isspmatrix_coo(mx):
            mx = mx.tocoo()
        coords = np.vstack((mx.row, mx.col)).transpose()
        values = mx.data
        shape = mx.shape
        return coords, values, shape

    if isinstance(sparse_mx, list):
        for i in range(len(sparse_mx)):
            sparse_mx[i] = to_tuple(sparse_mx[i])
    else:
        sparse_mx = to_tuple(sparse_mx)

    return sparse_mx
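
Since PyTorch can build a sparse tensor directly from COO indices and values (as sketched under build_graph.py above), this tuple representation is no longer needed; returning the dense .A array keeps the downstream code simple, at the cost of memory on large graphs.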

preprocess_adj(adj)

The same change was made here:

# return sparse_to_tuple(adj_normalized)
return adj_normalized.A
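
For reference, the normalization that preprocess_adj applies is the symmetric D^(-1/2) (A + I) D^(-1/2) from Kipf & Welling. A sketch following the original TensorFlow utilities (the dense return at the end is the modification shown above):

import numpy as np
import scipy.sparse as sp

def normalize_adj(adj):
    """Symmetrically normalize an adjacency matrix."""
    adj = sp.coo_matrix(adj)
    rowsum = np.array(adj.sum(1)).flatten()
    d_inv_sqrt = np.power(rowsum, -0.5)
    d_inv_sqrt[np.isinf(d_inv_sqrt)] = 0.
    d_mat_inv_sqrt = sp.diags(d_inv_sqrt)
    return adj.dot(d_mat_inv_sqrt).transpose().dot(d_mat_inv_sqrt).tocoo()

def preprocess_adj(adj):
    """Add self-loops, normalize, and return a dense array."""
    adj_normalized = normalize_adj(adj + sp.eye(adj.shape[0]))
    return adj_normalized.A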

tm_train_mask = torch.transpose(torch.unsqueeze(t_train_mask, 0), 1, 0).repeat(1, y_train.shape[1])

This expands the original mask of shape (real_train_size + valid_size + vocab_size + test_size,) into a tensor of shape (real_train_size + valid_size + vocab_size + test_size, number of labels), so it can be multiplied elementwise with the logits; a simpler equivalent is sketched below.
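
A minimal sketch (assumed sizes and node layout) showing a simpler, equivalent expansion:

import torch

num_nodes, num_labels = 61603, 20
t_train_mask = torch.zeros(num_nodes)
t_train_mask[:11314] = 1.           # assumed: training doc nodes come first

# Same result as the unsqueeze/transpose/repeat above:
tm_train_mask = t_train_mask.unsqueeze(1).repeat(1, num_labels)
print(tm_train_mask.shape)          # torch.Size([61603, 20])

The expanded mask is what gets multiplied elementwise with the logits (see the (logits * tm_train_mask)[0] output above), presumably so that only training-document rows carry signal in the training loss.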

Training

Ah, omitted.

Evaluation

from sklearn import metrics
print_log("Test Precision, Recall and F1-Score...")
print_log(metrics.classification_report(test_labels, test_pred, digits=4))
print_log("Macro average Test Precision, Recall and F1-Score...")
print_log(metrics.precision_recall_fscore_support(test_labels, test_pred, average='macro'))
print_log("Micro average Test Precision, Recall and F1-Score...")
print_log(metrics.precision_recall_fscore_support(test_labels, test_pred, average='micro'))
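
For context, test_labels and test_pred here are just the argmax over the label dimension of the one-hot targets and the model's outputs. A toy sketch (made-up data, assumed variable names):

import torch
from sklearn import metrics

logits = torch.tensor([[2.0, 0.1], [0.3, 1.5], [1.2, 0.4]])   # outputs on test nodes
y_test = torch.tensor([[1., 0.], [0., 1.], [0., 1.]])         # one-hot ground truth
test_pred = torch.max(logits, 1)[1].numpy()
test_labels = torch.max(y_test, 1)[1].numpy()
print(metrics.classification_report(test_labels, test_pred, digits=4))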

Building my own dataset: wiki80

I took an 80/10/10 split of wiki_727K to play with:

adj (7002, 7002)
features (7002, 300)
y_train (7002, 2)
y_val (7002, 2)
y_test (7002, 2)
train_mask (7002,)
val_mask (7002,)
test_mask (7002,)
train_size 3927
test_size 352

(results figure omitted)

Building my own dataset: wiki800


adj (109772, 109772)
features (109772, 300)
y_train (109772, 2)
y_val (109772, 2)
y_test (109772, 2)
train_mask (109772,)
val_mask (109772,)
test_mask (109772,)
train_size 83497
test_size 4564

(results figure omitted)

Building my own dataset: wiki8000

(results figure omitted)

Original post: blog.csdn.net/jokerxsy/article/details/112853808