[GitHub] NLPer-Interview: Interview Questions for NLP Algorithm Engineers

For this weekend, I recommend the GitHub project songyingxin/NLPer-Interview. The repository mainly collects interview questions for NLP algorithm engineers:


https://github.com/songyingxin/NLPer-Interview




Lao Song is currently an algorithm engineer at Baidu and the author of "Lao Song's Tea Book Club". What follows is mostly Lao Song's own description of the repository. Starring it is recommended; the content is quite rich.




This repository mainly records the NLP-related knowledge I have accumulated. I took a lot of notes earlier. Now that autumn recruitment has arrived, I will gradually work through this material as I review and organize the notes into topics, which helps me review better.

I am also releasing it as open source. I hope everyone can help me fill in the related technology stack and see where I am weak, and that it helps everyone going through autumn recruitment review better. If you would like to collaborate, you can contact me; it is a bit much for one person to do alone. Fortunately, I took a lot of notes earlier.


Opening the notes in the Typora editor is recommended; it is a what-you-see-is-what-you-get editor.

Contents

1. Basic Programming Language

This folder mainly records language details of Python and C++. These two languages are mainstream and are basically required. I am still filling in the gaps.

  • C++ interview questions

  • Python interview questions

2. Mathematical foundation

This folder mainly records mathematics-related knowledge, including advanced mathematics, linear algebra, probability theory, and information theory. In Lao Song's personal experience, interviewers do ask about these. I am still filling in the gaps.

  • Probability theory

  • Advanced mathematics

  • Linear algebra

  • Information Theory

3. Computer Science Fundamentals

This part is generally not tested much, so I did not focus on it; so far I have been asked almost nothing about it. Interestingly, when I applied for an NLP algorithm position in a certain department at Alibaba, I was interviewed by someone who did not understand NLP, and the whole interview was about development, which was rather absurd.

4. Machine Learning Fundamentals

This is where the main subject begins. Experience shows that some large companies do ask about basic machine learning algorithms, so I think it is necessary to know several core models in this part.

  • Machine learning project process

  • Discriminative model vs. generative model

  • Frequentist vs. Bayesian

  • Data preprocessing

  • Feature engineering

  • Feature Engineering-Association Rules

  • Model-SVM

  • Model-clustering algorithm

  • Model-Decision Tree

  • Model-Logistic Regression

  • Model-Naive Bayes

  • Model-Random Forest

  • Model-linear regression

5. Basics of Deep Learning

This part covers the basics of deep learning, which is the core. Many interviewers ask essentially the same questions, but I personally think it is still worthwhile to have a complete, overall knowledge framework.

  • Deep learning project process

5.1 Basic theory

  • Basic Theory-Multi-Task Learning

  • Basic Theory-Ensemble Learning

  • Basic Theory-Evaluation Metrics for Classification Problems

  • Basic Theory-Distance Metrics

  • Basic theory-objective function, loss function, cost function

  • Basic Theory-Bias vs. Variance, Underfitting vs Overfitting

  • Basic Theory-Deep Learning from a Data Perspective

  • Basic theory-vanishing gradient, exploding gradient problem

  • Basic Theory-Curse of Dimensionality

  • Basic Theory-Exponentially Weighted Average

  • Basic theory-local minimum, saddle point

5.2 Basic unit

  • Basic unit-CNN

  • Basic unit-MLP

  • Basic unit-RNN

5.3 Tuning related

  • Tuning-hyperparameter tuning

  • Tuning-activation function

  • Tuning-weight initialization scheme

  • Tuning-optimization algorithm

5.4 Tricks

  • Trick-Dropout

  • Trick-Normalization

  • Trick-Fusion training set, validation set, test set

  • Trick-early stopping

  • Trick-learning rate decay

  • Trick-regularization

6. Statistical Natural Language Processing

I have few earlier notes for this part, so I have not made much progress on it yet.

7. Deep learning natural language processing

This part is the core knowledge and still needs to be improved gradually; time is a bit tight.

  • Text data preprocessing

  • Evaluation metrics for major tasks

  • Some ideas for improving the NLP model

7.1 Trilogy of Word Vectors

  • Word Vector-Word2Vec

  • Word Vector-GloVe

  • Word Vector-fastText

7.2 Pre-trained language model

  • Pre-trained language model-research on BERT improvements

  • Pre-trained language model-integrating knowledge graphs

  • Pre-trained language model-natural language generation

7.3 Attention mechanism

7.4 Text classification

7.5 Semantic matching

7.6 Reading Comprehension

8. Source code reading

This part recommends source code that I have read. Some is related to NLP and some to deep learning. Where I have annotated the code myself, the annotated versions will be listed.

9. Lao Song's humble algorithm interview experience

This part is mainly about insights from the interview process. Sigh, the interviews have left me quite demoralized.

Reference

[1] DeepLearning-500-questions - a good repository

[2] Algorithm_Interview_Notes-Chinese - the material is somewhat dated, but it is also very good

The rest comes mainly from my own day-to-day accumulation and from reading papers.



About AINLP

AINLP is a fun, AI-focused natural language processing community that shares technology related to AI, NLP, machine learning, deep learning, recommendation algorithms, and more. Topics include text summarization, intelligent question answering, chatbots, machine translation, automatic generation, knowledge graphs, pre-trained models, recommender systems, computational advertising, job postings, and job-hunting experience. Welcome to follow! To join the technical discussion group, add AINLPer (id: ainlper) and include a note with your work/research direction and your reason for joining.





Origin blog.51cto.com/15060464/2675663