Course catalog:
1. Fundamentals of Algorithms and Machine Learning
Hands-on cases:
- Implementing stock portfolio optimization strategies based on Sparse Quadratic Programming
- Short text similarity calculation based on Earth Mover's Distance
- Word vector learning based on Projected Gradient Descent and non-negative matrix factorization
- Ticket pricing system based on Linear Programming
- Analysis of text similarity based on DTW
Core knowledge points:
- Time complexity, space complexity analysis
- Master Theorem, recursive complexity analysis
- Dynamic programming and Dynamic Time Warping
- Earth Mover’s Distance
- Viterbi Algorithm
- LR, decision tree, random forest, XGBoost
- Gradient descent, stochastic gradient descent, Newton's method
- Projected Gradient Descent
- L0, L1, L2, L-Infinity Norm
- Grid Search, Bayesian Optimization
- Convex function, convex set, Duality, KKT condition
- Linear SVM, Dual of SVM
- Kernel Trick, Mercer’s Theorem
- Kernelized Linear Regression, Kernelized KNN
- Linear/Quadratic Programming
- Integer/Semi-definite Programming
- NP-completeness/NP-hard/P/NP
- Constraint Relaxation, Approximation Algorithms
- Convergence Analysis of Iterative Algorithms
2. Language model and sequence labeling
Hands-on cases:
- Building a question answering system based on unsupervised learning methods
- Building an Aspect-Based Sentiment Analysis system based on supervised learning
- Named entity recognition application based on CRF, LSTM-CRF, BERT-CRF
- Spelling error correction based on language model and Noisy Channel Model
Core knowledge points:
- Text preprocessing techniques (TF-IDF, stemming, etc.)
- Feature Engineering in the Text Domain
- Inverted index, information retrieval techniques
- Noisy Channel Model
- N-gram models, introduction to word vectors
- Common Smoothing Techniques
- Learning to Rank
- Latent Variable Model
- EM algorithm and Local Optimality
- Convergence of EM
- EM and K-Means, GMM
- Variational Autoencoder and Text Disentangling
3. Information extraction, word vector and knowledge graph
4. Deep learning and NLP
5. Bayesian model and NLP
6. Capstone Open Project (Optional)
All resources are complete. If you need them, add WeChat (daydayit) and mention "Greedy NLP" in the note.