Advanced deep learning - reading notes

1. Image Processing

1.1 Style Transfer

How to describe an image: a texture representation characterizes the style; feature maps characterize the content.

How to weight content against style: balance the two loss terms with scalar weights, as in the sketch below.
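A minimal sketch of the content/style weighting, assuming PyTorch and VGG-style feature maps (both my additions; the notes do not name a framework). The Gram matrix serves as the texture representation:

```python
import torch.nn.functional as F

def gram_matrix(feat):
    # feat: (C, H, W) feature map from one CNN layer; the Gram matrix
    # of the channel activations is the texture/style statistic.
    c, h, w = feat.shape
    f = feat.view(c, h * w)
    return (f @ f.t()) / (c * h * w)

def style_transfer_loss(gen_content, ref_content,
                        gen_style_feats, ref_style_feats,
                        alpha=1.0, beta=1e3):
    # alpha vs. beta is the content/style trade-off.
    content_loss = F.mse_loss(gen_content, ref_content)
    style_loss = sum(F.mse_loss(gram_matrix(g), gram_matrix(s))
                     for g, s in zip(gen_style_feats, ref_style_feats))
    return alpha * content_loss + beta * style_loss
```

Raising beta relative to alpha pushes the output toward the style image's textures; raising alpha preserves more of the content layout.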

1.2 Image Retrieval

Content-based image retrieval: retrieving images based on color, texture, and category.

Hash-based image retrieval architecture

Image feature representation: hand-crafted features → CNN-based features

Hash code learning methods: a. concept of hash coding (a low-dimensional representation of high-dimensional data); b. advantages of hash coding (less memory, higher speed); c. two phases of hash coding (learning phase, encoding phase).

Supervised deep hashing: image feature extraction layers + a hash code learning layer.

Supervised deep hashing for multi-label image retrieval: the input is a tuple of images; CONV and FC layers generate a feature vector; a hash coding layer learns the image's hash code; a multi-level contrastive loss guides the Hamming distance between codes to match semantic similarity (sketched below).
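A hedged PyTorch-style sketch of the hash coding layer plus a contrastive loss over a relaxed Hamming distance; layer sizes and the margin are illustrative assumptions, not values from the notes:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class HashHead(nn.Module):
    """Hash coding layer on top of the CONV+FC feature vector: a linear
    map plus tanh gives a relaxed binary code in (-1, 1); sign()
    binarizes it at retrieval time."""
    def __init__(self, feat_dim=4096, bits=48):
        super().__init__()
        self.fc = nn.Linear(feat_dim, bits)

    def forward(self, feats):
        return torch.tanh(self.fc(feats))

def contrastive_hash_loss(h1, h2, similar, margin=24.0):
    # Relaxed Hamming distance for +/-1 codes: (bits - dot) / 2.
    # Similar pairs are pulled together; dissimilar pairs are pushed
    # past the margin, so distance tracks semantic similarity.
    dist = 0.5 * (h1.size(1) - (h1 * h2).sum(dim=1))
    return (similar * dist + (1 - similar) * F.relu(margin - dist)).mean()
```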

1.3 Image Caption Generation

What is image caption generation: input an image, output a textual description of the image.

Image captioning, simplest encoder-decoder version: a CNN encoder extracts image features; an RNN decoder generates the description.

Image captioning, MS Captivator: detect words → generate sentences → re-rank sentences.

Image captioning, attention-based model: a. context vectors are obtained from the CONV feature maps; b. an LSTM generates the words from the contexts, as in the sketch below.
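A rough PyTorch sketch of one decoding step of such an attention model; the dimensions and module choices are my assumptions, not the exact published architecture:

```python
import torch
import torch.nn as nn

class AttnCaptionStep(nn.Module):
    """One decoding step: soft attention over conv locations builds a
    context vector, then an LSTM cell emits the next-word logits."""
    def __init__(self, feat_dim, hid_dim, vocab_size, emb_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.attn = nn.Linear(feat_dim + hid_dim, 1)
        self.lstm = nn.LSTMCell(emb_dim + feat_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, feats, word, h, c):
        # feats: (L, feat_dim) conv locations; word: scalar token id;
        # h, c: (hid_dim,) previous LSTM state.
        scores = self.attn(torch.cat([feats, h.expand(feats.size(0), -1)], dim=1))
        alpha = torch.softmax(scores.squeeze(1), dim=0)  # where to look
        z = (alpha.unsqueeze(1) * feats).sum(dim=0)      # context vector
        h, c = self.lstm(torch.cat([self.embed(word), z]).unsqueeze(0),
                         (h.unsqueeze(0), c.unsqueeze(0)))
        return self.out(h).squeeze(0), h.squeeze(0), c.squeeze(0)
```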

2. Natural Language Processing

2.1 Technical Overview

NLP overview: basic NLP technology → core NLP technology → NLP+.

The concept of word vectors: the medium that turns natural language into symbols a machine can understand.

Applications of word vectors: computing similarity (see the sketch below), serving as neural network input, and sentence/document representation.
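For the similarity application, a tiny NumPy sketch (the input vectors are hypothetical word embeddings):

```python
import numpy as np

def cosine_similarity(u, v):
    # Words with similar meaning should have vectors with cosine near 1.
    return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
```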

Word vector learning models, neural network language model: can assign a probability to a string of natural language.

Word vector learning models, CBOW and Skip-gram: a. CBOW: use a word's context as input to predict the word itself; b. Skip-gram: use a single word as input to predict its context.

Word vector learning models, hierarchical softmax: an optimization strategy for the output layer; probabilities are computed along a Huffman tree at the output layer.

Word vector learning models, negative sampling: maximize the probability of positive samples while minimizing the probability of negative samples, as in the sketch below.
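A compact sketch of the skip-gram negative-sampling objective (PyTorch is my framework choice; the vectors would come from embedding tables not shown here):

```python
import torch
import torch.nn.functional as F

def sgns_loss(center_vec, context_vec, neg_vecs):
    """Skip-gram with negative sampling: maximize the probability of the
    observed (center, context) pair while minimizing it for sampled
    negative words. center_vec, context_vec: (dim,); neg_vecs: (k, dim)."""
    pos = F.logsigmoid(context_vec @ center_vec)
    neg = F.logsigmoid(-(neg_vecs @ center_vec)).sum()
    return -(pos + neg)  # minimize the negative log-likelihood
```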

2.2 Sentiment Analysis

Sentiment analysis and artificial intelligence

Sentiment analysis technology system: construction of a sentiment knowledge base → sentiment classification models → applied sentiment analysis research.

Meaning of sentiment word vectors: words with similar syntax and semantics lie close together in the vector space.

Sentiment word vector learning model: a word vector model that introduces sentence-level sentiment information as supervision.

Document-level sentiment classification model: judging the global sentiment polarity of an entire document (words → sentences → document).

Sentence-level sentiment classification model: judging the sentiment polarity of a single sentence (commonly CNN, RNN, Recursive NN, BERT; see the sketch after this list).

Aspect-level sentiment classification model: judging sentiment polarity toward a specific aspect of the thing described (fine-grained sentiment analysis); two families of methods (representing the parts separately vs. representing the whole jointly).
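A minimal TextCNN sketch for the sentence-level case, assuming PyTorch; the hyperparameters are illustrative, not from the notes:

```python
import torch
import torch.nn as nn

class TextCNN(nn.Module):
    """CNN text classifier for single-sentence polarity; multiple kernel
    widths pool n-gram evidence into one feature vector."""
    def __init__(self, vocab_size, emb_dim=128, n_filters=100,
                 kernel_sizes=(3, 4, 5), n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb_dim, n_filters, k) for k in kernel_sizes])
        self.fc = nn.Linear(n_filters * len(kernel_sizes), n_classes)

    def forward(self, tokens):                    # tokens: (batch, seq)
        x = self.embed(tokens).transpose(1, 2)    # (batch, emb, seq)
        pooled = [conv(x).relu().max(dim=2).values for conv in self.convs]
        return self.fc(torch.cat(pooled, dim=1))  # polarity logits
```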

2.3 Machine Reading Comprehension

What is machine reading: a. AI, in place of a human, automatically reads text and answers questions about it; b. the "crown jewel" of NLP, involving complex technologies such as semantic understanding.

Difficulties and challenges of machine reading: semantic reasoning is hard, semantic association is hard, semantic representation is hard.

Machine reading dataset: MCTest

Machine reading dataset: CNN / Daily Mail

Machine reading dataset: SQuAD

Machine reading dataset: Quasar-T

Machine reading model (BiDAF, Bi-Directional Attention Flow for Machine Comprehension): input a question X and an article Y; for each word in the article, output the probability that it is the start or the end of the answer (sketched below).
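A small sketch of the output step only: per-token start/end probabilities via softmax. PyTorch is assumed, and `w_start`/`w_end` are hypothetical learned scoring vectors standing in for BiDAF's output layer:

```python
import torch

def span_probabilities(token_reps, w_start, w_end):
    # token_reps: (n_tokens, dim) per-word vectors from the modelling
    # layers; w_start / w_end: (dim,) learned scoring vectors.
    p_start = torch.softmax(token_reps @ w_start, dim=0)
    p_end = torch.softmax(token_reps @ w_end, dim=0)
    return p_start, p_end  # probability each word begins / ends the answer
```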

Main steps of machine reading: text representation, semantic matching, comprehension and reasoning, answer recommendation.

2.4 QA

What is a question answering system: a. considered the original form of the Turing test; b. the basic form of the next generation of search engines.

Question answering based on knowledge graphs

Knowledge-graph question answering with deep learning: three key issues (representing the question, representing the answer, and the semantic association between question and answer).

Deep representation of text and knowledge: a. word vectorization; b. sentence (text) vectorization; c. knowledge (facts, propositions) vectorization.

Knowledge-graph QA model: identify the entity in the question → generate candidate answers from that entity → represent the question → represent the answers → compute matching scores, as sketched below.
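A toy sketch of the final scoring step, assuming the question and candidate answers are already embedded into a shared space (the embedding networks are omitted):

```python
import torch

def answer_scores(question_vec, candidate_vecs):
    # question_vec: (dim,); candidate_vecs: (n_candidates, dim).
    # Both live in a shared embedding space; the dot product is the
    # question-answer matching score.
    return candidate_vecs @ question_vec

# best answer: answer_scores(q, cands).argmax()
```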

Reasoning-based question answering: use known knowledge to infer unknown knowledge.

Attentive Reader: bidirectional LSTMs are used to model the document and the query.

3. Multimodal Fusion

3.1 Multimodal Classification

What is multimodal data: information composed of several modalities such as text, speech, images, and video.

What is multimodal sentiment analysis: single-modality information is often incomplete or ambiguous; multimodal data lets the modalities supplement one another from multiple angles.

Traditional multimodal fusion: combine several individual learners into an ensemble; each individual learner is a single-view classifier for text, images, speech, etc., and can be an SVM, a decision tree, a neural network, or another learning algorithm.

When is ensemble learning effective: the individual learners should be "good and different", i.e., each reasonably accurate and mutually diverse.
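A late-fusion sketch with scikit-learn: one single-view learner per modality, combined by averaging class probabilities. The feature matrices and the SVM + decision tree pairing are illustrative:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

def train_late_fusion(text_X, image_X, y):
    # One single-view learner per modality ("good"); different algorithm
    # families help keep them "different".
    text_clf = SVC(probability=True).fit(text_X, y)
    image_clf = DecisionTreeClassifier().fit(image_X, y)
    return text_clf, image_clf

def predict_fused(text_clf, image_clf, text_X, image_X):
    # Combine the single-modality results by averaging class probabilities.
    p = (text_clf.predict_proba(text_X) + image_clf.predict_proba(image_X)) / 2
    return np.argmax(p, axis=1)
```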

Multimodal sentiment classification with deep learning: a. two key points of a fusion-based multimodal classification model (how to classify sentiment effectively within a single modality; how to combine the results of several single-modality classifiers); b. the image classifier is trained using transfer learning.

How early fusion works: a. early fusion learns the semantic relations between data of different modalities, making feature extraction and fusion a joint process; b. in late fusion, feature extraction is independent of the fusion step.

What an autoencoder (AutoEncoder) is: a. a feed-forward neural network whose goal is to make the output match the input as closely as possible; b. trained by backpropagation as an unsupervised model, used for dimensionality reduction or feature extraction.

Autoencoder principle: a. it consists of an encoder and a decoder; b. encoders/decoders with several layers perform better; c. the AutoEncoder objective minimizes the difference between input and output.

What a sparse autoencoder is: a. a Sparse AutoEncoder constrains the intermediate representation to be sparse in order to learn more useful features; b. adding an L1 regularizer to a plain AutoEncoder yields a Sparse AutoEncoder, as sketched below.
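A PyTorch sketch covering both notes above: a plain autoencoder, turned sparse by an L1 penalty on the code (layer sizes are illustrative assumptions):

```python
import torch.nn as nn
import torch.nn.functional as F

class AutoEncoder(nn.Module):
    """Feed-forward autoencoder: trained by backprop, without labels, to
    reproduce its input; the bottleneck code is the learned feature."""
    def __init__(self, in_dim=784, code_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                     nn.Linear(256, code_dim))
        self.decoder = nn.Sequential(nn.Linear(code_dim, 256), nn.ReLU(),
                                     nn.Linear(256, in_dim))

    def forward(self, x):
        code = self.encoder(x)
        return self.decoder(code), code

def ae_loss(x, recon, code, l1_weight=0.0):
    # l1_weight = 0: plain AutoEncoder (minimize input/output difference);
    # l1_weight > 0: Sparse AutoEncoder via an L1 penalty on the code.
    return F.mse_loss(recon, x) + l1_weight * code.abs().mean()
```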

3.2 Cross-Modal Retrieval

What is multimodal retrieval: for example, searching images by image (or sound) plus searching images by text.

Bimodal DBN

Correspondence autoencoder (Correspondence Autoencoder): composed of two single-modality autoencoders, each responsible for learning its corresponding modality.

Correspondence cross-modal autoencoder (Correspondence Cross-Modal Autoencoder): the left and right parts are cross-modal autoencoders; the representation of each modality is learned with both the image and the text modality taken into account.

Correspondence full-modal autoencoder (Correspondence Full-Modal Autoencoder): the left and right sides each take a single-modality input and reconstruct both the image and the text as output; it combines the correspondence autoencoder and the correspondence cross-modal autoencoder (a common sketch of these objectives follows).
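A hedged sketch of the shared idea behind these correspondence objectives; the exact losses in the papers differ per variant, so treat this as the general shape only:

```python
import torch.nn.functional as F

def correspondence_loss(img_x, img_recon, txt_x, txt_recon,
                        img_code, txt_code, lam=0.5):
    # Each single-modality branch reconstructs its own input, while the
    # correspondence term pulls the hidden codes of a matched image/text
    # pair together so the two modalities share a representation space.
    recon = F.mse_loss(img_recon, img_x) + F.mse_loss(txt_recon, txt_x)
    corr = F.mse_loss(img_code, txt_code)
    return recon + lam * corr
```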

What makes a good multimodal neural network

3.3 NER

Text-image hybrid NER

4. Application and Practice

4.1 Optimization

What is optimization

Applications of optimization in deep learning

Problems and solutions

Introduction to the various types of optimization methods

Comparison of the methods in applications

4.2 Hyperparameter Tuning

Hyperparameter tuning techniques

Grid search (Grid Search); see the sketch after this list

Optimal solutions

Underfitting and overfitting

Preventing overfitting

Advanced tuning techniques
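A grid-search sketch using scikit-learn's GridSearchCV (the estimator and parameter grid are illustrative assumptions):

```python
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Try every combination on a small grid; cross-validation scores guard
# against picking a setting that merely over-fits the training split.
param_grid = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1]}
search = GridSearchCV(SVC(), param_grid, cv=5)
# search.fit(X_train, y_train); search.best_params_, search.best_score_
```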

4.3 Course Practice


Source: www.cnblogs.com/Kobaayyy/p/11346868.html