Artificial intelligence covers so much ground that it is easy to get lost the more you learn. Recently I came across a German website with a very detailed roadmap for becoming an AI expert. With a map in hand, you can clearly see which skills you are missing and what to learn next.
https://i.am.ai/roadmap/
Below is a brief translated summary. I particularly agree with several of its points:
1. What is popular or trendy is not necessarily the best choice for a project
You should grow some understanding of why one tool would be better suited for some cases than the other and remember hip and trendy never means best suited for the job.
2. Before entering the field of deep learning, it is best to be familiar with big data analysis and traditional machine learning
3. Basic knowledge
3.1 Basics
Matrices and linear algebra basics, database basics (relational and non-relational databases, SQL and NoSQL operations), tabular data, importing, exporting, and converting data formats, regular expressions...
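To make two of these basics concrete (SQL operations and regular expressions), here is a minimal sketch using Python's built-in sqlite3 and re modules. The table name, column names, sample rows, and the deliberately simple email pattern are all my own assumptions for illustration; they do not come from the roadmap.

```python
import re
import sqlite3

# Deliberately simple pattern (an assumption); real email validation is more involved.
EMAIL_RE = re.compile(r"^[\w.+-]+@[\w-]+\.[\w.-]+$")

def load_users(rows):
    """Create an in-memory SQLite table and insert (name, email) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, email TEXT)")
    conn.executemany("INSERT INTO users VALUES (?, ?)", rows)
    return conn

def valid_emails(conn):
    """Query all users with SQL, then keep rows whose email matches the regex."""
    cursor = conn.execute("SELECT name, email FROM users ORDER BY name")
    return [(name, email) for name, email in cursor if EMAIL_RE.match(email)]

# Hypothetical sample data.
sample = [
    ("ann", "ann@example.com"),
    ("bob", "not-an-email"),
    ("cy", "cy@example.org"),
]
conn = load_users(sample)
result = valid_emails(conn)
print(result)
```

The SQL query does the selecting and sorting, and the regular expression does the text validation, which is roughly how the two skills divide up in everyday data work.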
3.2 Python
Basic syntax (expressions, variables, data structures, functions, installing packages, coding style);
NumPy for scientific computing, pandas for tabular data;
Virtual environments, Jupyter, etc...
3.3 Data source
Data mining, web scraping, public datasets, Kaggle competitions
3.4 Exploratory data analysis (EDA)
Principal component analysis (PCA), dimensionality reduction, normalization, data cleaning, handling missing values, unbiased estimators, feature extraction, denoising, sampling...
It turns out that data scientist and big data engineer are two separate career paths.
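Two of the items above, handling missing values and normalization, can be sketched with the standard library alone. This is a minimal illustration: mean imputation and z-score standardization are just one common choice among many, and the sample values are made up.

```python
from statistics import mean, pstdev

def impute_mean(values):
    """Replace missing entries (None) with the mean of the known entries."""
    known = [v for v in values if v is not None]
    fill = mean(known)
    return [fill if v is None else v for v in values]

def standardize(values):
    """Z-score normalization: shift to zero mean and scale to unit variance."""
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - mu) / sigma for v in values]

# Hypothetical column with one missing value.
raw = [2.0, None, 4.0, 6.0]
filled = impute_mean(raw)
scaled = standardize(filled)
print(filled)
print(scaled)
```

In practice pandas and scikit-learn provide these operations ready-made. Writing them by hand once is mostly useful for seeing what they actually do.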
4. Data scientist route
4.1 Statistics
Probability theory (randomness, probability distributions, conditional probability and Bayes' theorem), continuous distributions (probability density functions), cumulative distribution functions, summary statistics, estimation, confidence intervals, Monte Carlo methods.
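As a small worked example tying together Bayes' theorem and Monte Carlo methods, the sketch below computes a posterior probability exactly and then approximates the same number by simulation. The scenario (a test for a rare condition) and all three rates are hypothetical numbers I picked for illustration.

```python
import random

def bayes_posterior(prior, sensitivity, false_positive_rate):
    """P(condition | positive test) via Bayes' theorem."""
    evidence = sensitivity * prior + false_positive_rate * (1 - prior)
    return sensitivity * prior / evidence

def simulate_posterior(prior, sensitivity, false_positive_rate, n, seed=0):
    """Monte Carlo estimate of the same quantity from n simulated people."""
    rng = random.Random(seed)
    positives = 0
    true_positives = 0
    for _ in range(n):
        has_condition = rng.random() < prior
        p_positive = sensitivity if has_condition else false_positive_rate
        if rng.random() < p_positive:
            positives += 1
            if has_condition:
                true_positives += 1
    # Assumes n is large enough that at least one positive test occurred.
    return true_positives / positives

# Hypothetical rates: 1% prevalence, 95% sensitivity, 5% false positives.
exact = bayes_posterior(0.01, 0.95, 0.05)
approx = simulate_posterior(0.01, 0.95, 0.05, n=200_000)
print(exact, approx)
```

The exact answer is about 0.16, far lower than the 95% sensitivity might suggest, and the simulated estimate should land close to it.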
4.2 Visualization
Chart suggestions (which chart type to use when), Python visualization libraries (Matplotlib, seaborn, ipyvolume), web visualization (D3.js, Dash), business intelligence (BI) tools (Tableau, Power BI)
5. Machine learning field
5.1 Overview
Concepts, inputs and attributes, cost functions and gradient descent, overfitting and underfitting, training, validation, and test sets, precision and recall, bias and variance, lift
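Cost functions and gradient descent are easier to grasp with a tiny example. The sketch below fits a line y = w*x + b by gradient descent on a mean squared error cost. The data, learning rate, and step count are assumptions chosen so that the example converges; it is not a general-purpose implementation.

```python
def mse_cost(w, b, xs, ys):
    """Mean squared error of the line y = w*x + b on the data."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n

def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Minimize the MSE cost by repeatedly stepping against its gradient."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n  # d(cost)/dw
        grad_b = 2 * sum(errors) / n                              # d(cost)/db
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical noise-free data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]
w, b = gradient_descent(xs, ys)
print(w, b, mse_cost(w, b, xs, ys))
```

Try raising learning_rate to 0.2 on this data and the updates diverge instead of converging, which is a quick way to see why the learning rate matters.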
5.2 Method
Supervised learning (regression, classification), unsupervised learning (clustering, association rule learning, dimensionality reduction), ensemble learning (boosting, bagging, stacking), reinforcement learning (Q-learning)
5.3 Usage scenarios
Sentiment analysis, collaborative filtering, labeling, prediction
5.4 Libraries
scikit-learn, spaCy
Only after machine learning does the roadmap finally move on to deep learning.
6. Deep learning field
6.1 Related papers
6.2 Neural Network
Neural network concepts, loss functions, activation functions, weight initialization, vanishing and exploding gradients
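Why do gradients vanish? A rough way to see it: backpropagation multiplies one local derivative per layer, and the sigmoid's derivative is never larger than 0.25. The sketch below is a deliberate simplification that ignores the weights entirely and just multiplies identical activation derivatives, so treat it as intuition rather than a faithful model of a real network.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid; its maximum is 0.25 at z = 0."""
    s = sigmoid(z)
    return s * (1.0 - s)

def relu_grad(z):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if z > 0 else 0.0

def chained_gradient(grad_fn, z, depth):
    """Product of `depth` identical local derivatives (weights ignored)."""
    factor = 1.0
    for _ in range(depth):
        factor *= grad_fn(z)
    return factor

# Even in the sigmoid's best case (z = 0) the signal shrinks quickly with depth.
for depth in (1, 5, 10):
    print(depth, chained_gradient(sigmoid_grad, 0.0, depth),
          chained_gradient(relu_grad, 1.0, depth))
```

With ten sigmoid layers the factor is already below one in a million, while the ReLU factor stays at 1 for positive inputs. That is one reason ReLU-style activations and careful weight initialization show up in the same list.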
6.3 Architecture
Feedforward neural network, autoencoder, convolutional neural network, recurrent neural network, Transformer (encoder, decoder, attention module), Siamese network, generative adversarial network (GAN), residual network
6.4 Training
Optimizers, learning rate, batch normalization, batch size, regularization, multi-task training, transfer learning, curriculum learning
6.5 Tools
Deep learning libraries (TensorFlow, PyTorch), TensorBoard, MLflow
6.6 Model optimization
Model distillation, model quantization, neural architecture search
Once you have worked through this route, you will be a data scientist!
7. Data Engineer Route
Overview of data formats, data discovery, data sources and acquisition, data integration, data fusion, data transformation and enrichment, data exploration, OpenRefine, using ETL, data lakes, Docker
8. Big data engineer route
8.1 Big data architecture
8.2 Principle
Vertical and horizontal scaling, MapReduce, data replication, name nodes and data nodes, job and task trackers
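Of these principles, MapReduce is the easiest to sketch in a few lines. The classic word-count example below runs the map, shuffle, and reduce phases in a single Python process. Real frameworks such as Hadoop distribute these phases across many machines; the function names here are my own, not any framework's API.

```python
from collections import defaultdict

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values of each key into a single count."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(documents):
    pairs = []
    for document in documents:  # each document could be mapped on a different node
        pairs.extend(map_phase(document))
    return reduce_phase(shuffle_phase(pairs))

# Hypothetical input documents.
counts = word_count(["big data", "Big deal"])
print(counts)
```

Because each map call looks at only one document and each reduce call at only one key, both phases can be spread over many machines, which is what makes horizontal scaling possible.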
8.3 Tools
The Awesome Big Data list, Hadoop, Spark, ONNX, MLflow, cloud services...
The cloud deployment tools that come after this are beyond my current knowledge... In short, the overall layout above is quite clear. Let's slowly fill in the missing pieces.