[Title]
Text Level Graph Neural Network for Text Classification
[ Code address ]
https://github.com/LindgeW/TextLevelGNN
(A PyTorch reproduction by another author; its results are not good)
[ Prerequisite knowledge ]
What is TextGCN?
Table of Contents
1. Background and overview
1.1 Related research
- CNN
- RNN
- GNN (TextGCN)
    - Builds one graph over the entire corpus, containing all word nodes
    - Large sliding window -> high memory consumption
    - Edge weights are fixed, which limits their expressive power
    - Model structure and parameters depend heavily on the corpus
1.2 Contributions
- Builds a separate graph for each document
- Small sliding window -> low memory consumption
- Node representations and edge weights are globally shared and updated during training
- Independent of the corpus
1.3 Related work
Largely covered under 1.1 Related research above
2. The model
2.1 Graph construction
For a text (formalized below):
- l is its length
- N is the set of all word nodes in the graph
- E is the set of all edges in the graph: node i is connected to its neighboring nodes within a window of size 2p+1 centered on it
- Both N and E are looked up from shared global matrices
- An edge that appears fewer than k times in the corpus is mapped to a shared "public" edge
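In symbols (my reconstruction from the bullets above, with $r_i$ the representation of word i and $e_{ij}$ the weight of the edge between words i and j):

$$N = \{\, r_i \mid i \in [1, l] \,\}, \qquad E = \{\, e_{ij} \mid i \in [1, l],\ j \in [i - p,\ i + p] \,\}$$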
2.2 Message Passing Mechanism (MPM)
For the n-th word:
- (3): take the dimension-wise max over the 2p+1 neighboring nodes (including the word itself), each weighted by its edge weight, to get the message M
- (4): a learnable parameter determines how much of the original representation (r_n) to keep and how much of the updated message (M_n) to take, completing the update (both reconstructed below)
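My reconstruction of (3) and (4) from the descriptions above, where $\mathcal{N}_n^p$ is the set of the 2p+1 nodes in the window around word n (including n itself), the max is taken per dimension, and $\eta_n$ is the learnable retention parameter:

$$M_n = \max_{a \in \mathcal{N}_n^p} e_{an}\, r_a \tag{3}$$

$$r_n \leftarrow (1 - \eta_n)\, M_n + \eta_n\, r_n \tag{4}$$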
2.3 Training objective
For a sentence, sum all of its word vectors, map the sum to the output dimension, and train with the cross-entropy loss (sketched in code below)
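A minimal PyTorch sketch of Sections 2.2 and 2.3, assuming the graph inputs (neighbor ids and edge ids) are pre-built per document; all module and argument names are my own, not the paper's or the linked repo's:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TextLevelGNN(nn.Module):
    def __init__(self, vocab_size, num_edges, embed_dim, num_classes):
        super().__init__()
        self.node_embed = nn.Embedding(vocab_size, embed_dim)  # shared word-node representations
        self.edge_weight = nn.Embedding(num_edges, 1)          # one learnable scalar per unique edge
        self.eta = nn.Embedding(vocab_size, 1)                 # gate: how much of the old representation to keep
        self.fc = nn.Linear(embed_dim, num_classes)

    def forward(self, nodes, neighbors, edge_ids, mask):
        # nodes: (B, L) word ids; neighbors: (B, L, 2p+1) neighbor word ids
        # edge_ids: (B, L, 2p+1) indices into the shared edge-weight table
        # mask: (B, L) 1.0 for real tokens, 0.0 for padding
        r = self.node_embed(nodes)                             # (B, L, D)
        rn = self.node_embed(neighbors)                        # (B, L, 2p+1, D)
        e = self.edge_weight(edge_ids)                         # (B, L, 2p+1, 1)
        m = (e * rn).max(dim=2).values                         # eq. (3): dimension-wise max
        eta = torch.sigmoid(self.eta(nodes))                   # sigmoid keeps the gate in [0, 1] (my assumption)
        r = (1 - eta) * m + eta * r                            # eq. (4): gated update
        doc = (r * mask.unsqueeze(-1)).sum(dim=1)              # sum word vectors per document
        return self.fc(doc)                                    # logits

# usage: loss = F.cross_entropy(model(nodes, neighbors, edge_ids, mask), labels)
```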
3. Experiments and evaluation
- window size p = 2
- lr = 1e-3
- L2 weight decay = 1e-4
- dropout_p = 0.5
- bs = 32
- early stop = 10
- pre-trained GloVe word embeddings (collected into a config sketch below)
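For reference, the settings above gathered into one config dict; the key names are my own, not the repo's:

```python
# Hyperparameters from the list above; key names are assumptions.
config = {
    "p": 2,                  # half window size (window covers 2p+1 words)
    "lr": 1e-3,
    "weight_decay": 1e-4,    # L2 regularization
    "dropout": 0.5,
    "batch_size": 32,
    "patience": 10,          # early stopping
    "embeddings": "glove",   # pre-trained GloVe vectors
}
```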
4. Conclusion and personal summary
Conclusions of the paper
- window size p = 3 works best
- With a small window (p = 2?), memory consumption is far lower than TextGCN's, which uses a window size of 20
- Fixing the edge weights with PMI computed over a window of size 20 (as in TextGCN), instead of learning them, hurts performance
- Replacing the max pooling over a single sentence with mean pooling hurts performance
- Randomly initializing the vector representations of all word nodes hurts performance
Personal summary
- Like TextING, it can be batch-processed
- As with word embeddings, an edge embedding table seems necessary, i.e. every unique edge gets a unique idx (see the sketch below)
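A minimal sketch of that idea, assuming we count word pairs within a +/-p window over the training corpus and map pairs seen fewer than k times to one shared "public" edge idx (cf. Section 2.1); all names here are mine:

```python
from collections import Counter

def build_edge_vocab(corpus, p=2, k=2):
    """Assign a unique idx to each word pair seen within a +/-p window;
    pairs seen fewer than k times all share the 'public' edge idx 0."""
    counts = Counter()
    for words in corpus:                                   # corpus: list of token lists
        for i in range(len(words)):
            lo, hi = max(0, i - p), min(len(words), i + p + 1)
            for j in range(lo, hi):
                counts[(words[i], words[j])] += 1
    edge2idx = {"<public>": 0}
    for pair, c in counts.items():
        if c >= k:
            edge2idx[pair] = len(edge2idx)
    return edge2idx

# edge2idx = build_edge_vocab(train_docs, p=2, k=2)
# idx = edge2idx.get(("text", "graph"), 0)   # rare/unseen pairs fall back to the public edge
```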
5. References
None
6. Extensions
The following are my understanding of, and improvement ideas for, https://github.com/LindgeW/TextLevelGNN .