[Paper Reading] Neural Pinyin-to-Chinese Character Converter


2020-03-24 17:09:34

https://github.com/Kyubyong/neural_chinese_transliterator

The task is handled in a seq2seq manner: a pinyin sequence is converted into a sequence of Chinese characters. The model structure:

1. Prepare the training data

zho_news_2007-2009_1M-sentences.txt: 1 million sentences, word-segmented, although the word segmentation information is not actually used.

1 博客园

2. Build the pinyin-hanzi parallel corpus, zh.tsv. Each Chinese character is followed by underscores so that it spans the length of its pinyin p: [char] + ["_"] * (len(p) - 1)

1 bokeyuan 博_客_园___
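
The alignment rule in step 2 can be sketched as follows. This is a minimal illustration, not the repository's code: the `CHAR2PINYIN` table and the `align` function are hypothetical names, and a real pipeline would get the pinyin from a conversion library or lexicon.

```python
# Hypothetical character-to-pinyin table, for illustration only.
CHAR2PINYIN = {"博": "bo", "客": "ke", "园": "yuan"}


def align(sentence):
    """Return (pinyin_string, padded_hanzi_string), both of equal length."""
    pinyins, hanzis = [], []
    for char in sentence:
        p = CHAR2PINYIN[char]
        pinyins.append(p)
        # One hanzi, then "_" for each remaining letter of its pinyin.
        hanzis.extend([char] + ["_"] * (len(p) - 1))
    return "".join(pinyins), "".join(hanzis)


pnyn, hanzi = align("博客园")
print(pnyn)   # bokeyuan
print(hanzi)  # 博_客_园___
```

Because both strings have the same length, the model can predict exactly one output symbol per input letter.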

3. Build the vocabularies and save them as a pkl file

pnyn2idx, idx2pnyn, hanzi2idx, idx2hanzi
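
A rough sketch of how the four lookup tables might be built and pickled is given below. The special tokens `<pad>` and `<unk>`, the `min_count` parameter, and the `build_vocab` helper are assumptions for illustration and are not taken from the post.

```python
import pickle
from collections import Counter

PAD, UNK = "<pad>", "<unk>"  # assumed special tokens, not from the post


def build_vocab(sequences, min_count=1):
    """Map each symbol seen at least min_count times to an integer id."""
    counts = Counter(sym for seq in sequences for sym in seq)
    symbols = [PAD, UNK] + sorted(s for s, c in counts.items() if c >= min_count)
    sym2idx = {s: i for i, s in enumerate(symbols)}
    idx2sym = {i: s for s, i in sym2idx.items()}
    return sym2idx, idx2sym


pairs = [("bokeyuan", "博_客_园___")]  # toy (pinyin, hanzi) rows
pnyn2idx, idx2pnyn = build_vocab(p for p, _ in pairs)
hanzi2idx, idx2hanzi = build_vocab(h for _, h in pairs)

# Serialize all four tables together. pickle.dumps keeps the demo in memory;
# pickle.dump with a file opened in "wb" mode would write the pkl file instead.
blob = pickle.dumps((pnyn2idx, idx2pnyn, hanzi2idx, idx2hanzi))
restored = pickle.loads(blob)
```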

4. Training

Load the vocabularies and the training set (x = [pinyin ids], y = [hanzi ids])

x -> model -> y, then compute the cross-entropy loss
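
To make the loss concrete, here is a small pure-Python sketch of per-position cross-entropy. Skipping padded positions (`pad_id=0`) is an assumption on my part; the post does not say how padding is treated, and the actual project computes this inside a deep learning framework.

```python
import math


def cross_entropy(logits, targets, pad_id=0):
    """Mean negative log-likelihood over non-padding positions.

    logits: list of per-position score lists (one score per hanzi id).
    targets: list of gold hanzi ids, same length as logits.
    """
    total, count = 0.0, 0
    for scores, y in zip(logits, targets):
        if y == pad_id:
            continue  # assumed: padded positions do not contribute
        # Numerically stable log-sum-exp: subtract the max before exp.
        m = max(scores)
        log_z = m + math.log(sum(math.exp(s - m) for s in scores))
        total += log_z - scores[y]
        count += 1
    return total / max(count, 1)


# Toy example: 2 positions, 3 classes, uniform scores, so the loss is ln(3).
loss = cross_entropy([[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]], [1, 2])
```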

5. Prediction

Overall, the predictions look fairly reliable. Some characters are confused with homophones, and the model tends to predict whichever character is more frequent in the training corpus, for example the surname Wu ("Ng") versus the wu in wujing (armed police). Since "armed police" should be the more frequent of the two, I cannot explain why xinjiangwujing (Xinjiang armed police) is predicted with the wrong wu.
With misspelled input: if the original spelling does not exist, the model generates a candidate for a fairly similar spelling (wuo -> duo). Even a correct spelling can produce a candidate for a similar spelling (jiao -> tiao).
Proper nouns and some phrases are not handled well.
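
For reference, turning per-position predictions back into text could look like the sketch below. It assumes the model outputs one hanzi id per input letter and that the "_" fillers from the corpus format are simply dropped; the `decode` helper and the toy `idx2hanzi` table are hypothetical.

```python
def decode(pred_ids, idx2hanzi, input_len):
    """Turn per-position predicted ids into a hanzi string."""
    chars = [idx2hanzi[i] for i in pred_ids[:input_len]]  # drop padded tail
    return "".join(chars).replace("_", "")                # drop alignment fillers


idx2hanzi = {0: "<pad>", 1: "_", 2: "博", 3: "客", 4: "园"}  # toy table
pred = [2, 1, 3, 1, 4, 1, 1, 1, 0, 0]  # pretend argmax output for "bokeyuan"
print(decode(pred, idx2hanzi, len("bokeyuan")))  # 博客园
```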

Advantages: the model is end-to-end and needs no large lexicon or hand-crafted features; given a Chinese corpus, the parallel corpus can be generated from it. Disadvantages: the deep learning model is not interpretable, and it does not learn some contexts well.

Question: if the training corpus were expanded and adjusted, I do not know whether the model would perform well enough for industrial application.


Origin www.cnblogs.com/AliceYing/p/12559949.html
