Reading notes: Text Level Graph Neural Network for Text Classification

[ Title ]
Text Level Graph Neural Network for Text Classification

[ Code address ]
https://github.com/LindgeW/TextLevelGNN
(A PyTorch reproduction by another author; the reproduced results are not good.)

[ Prerequisites ]
What is TextGCN?

1. Background and overview

1.1 Related research

  • CNN
  • RNN
  • GNN
    • Builds one graph over the entire corpus that contains all word nodes
    • Large sliding window -> large memory footprint
    • Edge weights are fixed, which limits representational power
    • Model structure and parameters depend heavily on the corpus

1.2 Contribution points

  • Builds a separate graph for each text
  • Small sliding window -> small memory footprint
  • Node representations and edge weights are shared globally and are updated during training
  • Not dependent on the corpus

1.3 Related work

Essentially the same content as the related research above.

2. The model

[Figure: model overview]

2.1 Building the graph

[Figure: graph construction]

For a text:

  • l is its length
  • N is the set of all word nodes in the graph
  • E is the set of all edges in the graph: node i is connected to the neighbouring nodes inside a window of size 2p+1 centred on it
  • Both the node representations (N) and the edge weights (E) are taken from shared global matrices
  • An edge that appears fewer than k times in the corpus is mapped to a shared "public" edge (a construction sketch follows this list)
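
Roughly how I picture the construction, as a minimal sketch: the names `build_edge_vocab`, `build_text_graph` and `PUBLIC_EDGE` are my own illustration, not the paper's or the repo's code.

```python
from collections import Counter

PUBLIC_EDGE = 0  # shared id for edges that appear fewer than k times in the corpus


def build_edge_vocab(corpus_ids, p=2, k=2):
    """Count (word, neighbour) pairs within a +/-p window over the whole corpus
    and give every pair seen at least k times its own edge id (id 0 is reserved)."""
    counts = Counter()
    for ids in corpus_ids:
        for i, w in enumerate(ids):
            for j in range(max(0, i - p), min(len(ids), i + p + 1)):
                counts[(w, ids[j])] += 1
    frequent = [pair for pair, c in counts.items() if c >= k]
    return {pair: idx + 1 for idx, pair in enumerate(frequent)}


def build_text_graph(ids, edge2id, p=2):
    """For one text: connect every node i to the nodes inside a 2p+1 window
    centred on it; rare pairs fall back to the shared PUBLIC_EDGE id."""
    edges = []  # (source position, target position, edge id)
    for i, w in enumerate(ids):
        for j in range(max(0, i - p), min(len(ids), i + p + 1)):
            edges.append((i, j, edge2id.get((w, ids[j]), PUBLIC_EDGE)))
    return ids, edges  # word ids are the node features, edge ids index the shared weights


# tiny usage example on two "texts" given as word-id lists
corpus = [[5, 9, 9, 2, 7], [9, 2, 7, 3]]
edge2id = build_edge_vocab(corpus, p=2, k=2)
nodes, edges = build_text_graph(corpus[0], edge2id, p=2)
```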

2.2 Message passing mechanism (MPM)

[Figure: formulas (3) and (4) of the paper]

For the n-th word:

  • (3): take, in each dimension, the maximum over the 2p+1 neighbouring nodes (including the node itself), each weighted by its edge weight; this gives the message M
  • (4): a learnable parameter decides how much of the original node representation to keep and how much of the message M to take, completing the update (both formulas are reconstructed below)
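
My reconstruction of formulas (3) and (4) from the descriptions above, not copied verbatim from the paper: r_a is the representation of node a, e_{an} the learnable weight of the edge from a to n, and η_n the learnable gate.

```latex
% (3): element-wise max over the window neighbours, weighted by edge weights
\[
  M_n = \max_{a \in \mathcal{N}_p^{n}} e_{an}\, r_a
\]
% (4): gated combination of the message and the original representation
\[
  r_n' = (1 - \eta_n)\, M_n + \eta_n\, r_n
\]
```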

2.3 Training objective

[Equation: readout and classification]

For a sentence, sum all of its word-node representations, map the sum to the output dimension, and train with the cross-entropy loss.
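
A minimal PyTorch sketch of this readout plus cross-entropy loss; the `Readout` class and its names are mine, not the repo's.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class Readout(nn.Module):
    """Sum the word-node representations of one text and map the sum to the label space."""

    def __init__(self, hidden_dim, num_classes):
        super().__init__()
        self.fc = nn.Linear(hidden_dim, num_classes)

    def forward(self, r):        # r: (num_words, hidden_dim)
        doc = r.sum(dim=0)       # sum over all word nodes of the text
        return self.fc(doc)      # logits, shape (num_classes,)


readout = Readout(hidden_dim=300, num_classes=8)
r = torch.randn(17, 300)                                   # updated representations of a 17-word text
loss = F.cross_entropy(readout(r).unsqueeze(0), torch.tensor([3]))
```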

3. Experiments and evaluation

  • window size p = 2
  • lr = 1e-3
  • L2 weight decay = 1e-4
  • dropout_p = 0.5
  • bs = 32
  • early stop = 10
  • GloVe word embeddings

[Table: experimental results]
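
Wired into a standard PyTorch setup, these settings would look roughly like this; Adam is my assumption (the notes do not name the optimizer) and the stand-in model is only illustrative.

```python
import torch
import torch.nn as nn

model = nn.Linear(300, 8)   # stand-in; the real model is the text-level GNN described above

# lr = 1e-3, L2 weight decay = 1e-4
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
dropout = nn.Dropout(p=0.5)
batch_size = 32
patience = 10               # early stop after 10 epochs without validation improvement
```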

4. Conclusion and personal summary

Conclusion of the paper

  • A window size of p = 3 works best
  • With a small window (p = 2?), memory consumption is much lower than TextGCN's, which uses a window size of 20
  • Fixing the edge weights with PMI computed over a window of size 20 (as in TextGCN) performs worse
  • Replacing the per-text max-pooling with mean-pooling performs worse
  • Randomly initializing the representations of all word nodes performs worse

Personal summary

  • Like TextING, it can also be processed in batches
  • As with word embeddings, does it need an edge embedding, i.e. should every unique edge get a unique index? (see the sketch after this list)
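
One way I imagine the edge embedding could be realized (my own sketch, not the repo's implementation): keep all edge weights in a single nn.Embedding and look them up by the unique edge index built during graph construction.

```python
import torch
import torch.nn as nn


class EdgeWeights(nn.Module):
    """One learnable scalar per unique edge, shared across all text graphs."""

    def __init__(self, num_edges):
        super().__init__()
        self.weight = nn.Embedding(num_edges + 1, 1)   # index 0 reserved for the shared public edge

    def forward(self, edge_ids):                        # edge_ids: (E,) long tensor of edge indices
        return self.weight(edge_ids).squeeze(-1)        # (E,) learnable edge weights


edge_weights = EdgeWeights(num_edges=50_000)
w = edge_weights(torch.tensor([0, 17, 4233]))           # weights for three edges
```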

5. References

None

6. Extensions

The following are my understanding of, and improvement ideas for, https://github.com/LindgeW/TextLevelGNN .

Source: blog.csdn.net/jokerxsy/article/details/113789860