How can undergraduates teach themselves machine learning?

Foreword

To help students who are interested in machine learning, I read a lot of material online, combined it with my own experience in graduate school, and put together the following mind map of how to learn machine learning systematically.

[Image: mind map of how to learn machine learning systematically]

1 Steps

Before setting out on the machine learning journey, we need to understand the steps involved. So what prerequisite knowledge do we need?

1.1 Prerequisite knowledge

English listening and reading ability

First of all, you need to be able to listen to and read English. Almost all courses and papers in the field of machine learning are in English, so understanding them requires a certain level of English. If that is really difficult, translation tools such as Google Translate and DeepL can help.

Math

As we all know, computer science cannot do without mathematics. Machine learning naturally requires a certain foundation in university mathematics: calculus, linear algebra, and statistics.

  1. Advanced Mathematics: Derivatives, Differentials and Integrals, Taylor Expansion (these are also required for postgraduate entrance examinations)

  2. Linear algebra: matrices and vectors. I recommend the series Essence of Linear Algebra by the well-known science video creator 3Blue1Brown, which is also available on Bilibili.

  3. Statistics and probability theory: conditional probability, expectation, variance, regression and fitting problems, Bayes' rule

A commonly quoted estimate is that about 80% of the time in a machine learning project is spent collecting and cleaning data. Statistics is the field that deals with the collection, analysis, and presentation of data.
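As a small worked example of Bayes' rule from the list above, the sketch below computes the probability of having a condition given a positive test. The function name and all the numbers (1% prevalence, 90% sensitivity, 5% false positives) are assumptions made up for illustration.

```python
def posterior(prior: float, likelihood: float, false_positive_rate: float) -> float:
    """Return P(condition | positive test) using Bayes' rule."""
    # P(positive), by the law of total probability
    evidence = likelihood * prior + false_positive_rate * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers: 1% prevalence, 90% sensitivity, 5% false-positive rate
p = posterior(prior=0.01, likelihood=0.90, false_positive_rate=0.05)
print(f"P(condition | positive) = {p:.3f}")
```

Even with a fairly accurate test, the posterior stays low because the condition is rare. This kind of reasoning comes up again and again when interpreting classifier output.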

1.2 Programming skills

Choose a programming language: Python

Next, to actually practice machine learning rather than stopping at textbooks and written words, you must master a programming language. More and more universities now offer Python programming courses, which shows how popular the language is in research. Some university engineering programs teach Matlab, and statistics and mathematics majors often learn R or Scala.

It is important to stress that the language itself is not the key: you can get started with whichever language you like. However, Python has a concise syntax and a wealth of tools and resources for data science, and using them avoids reinventing the wheel. Many third-party Python libraries are particularly useful for artificial intelligence and machine learning, such as Keras, TensorFlow, and scikit-learn, so Python is still the recommended choice.

Book and Course Recommendations

There are many introductory Python books available online, so only more advanced Python books that are helpful for machine learning are listed here: "Fluent Python" and "Python for Data Analysis".

1.3 What is machine learning

Now that we have completed the prerequisites, we can move on to the core parts of machine learning! Of course, start from the basics, and then learn more complex things.

Machine Learning Concepts

IBM's Arthur Samuel (known as the "Father of Machine Learning") coined the term "machine learning" in 1959 and defined it as: a field of study that gives computers the ability to learn without being explicitly programmed.

Machine learning is the part of artificial intelligence that combines data with statistical tools to predict outputs that can be used to craft actionable insights.

Machine Learning Terminology

Model: A model is a specific representation learned from data by applying some machine learning algorithm. Models are also known as hypotheses.

Feature: A feature is a single measurable attribute of the data. A set of numerical features can be conveniently described by a feature vector, and feature vectors are used as model input. For example, to predict the type of a fruit, the features might be color, smell, taste, and so on.

Target (label): The target variable, or label, is the value our model should predict. For the fruit example above, the label for each set of inputs is the name of the fruit, such as apple, orange, or banana.

Training: We provide a set of inputs (features) together with their expected outputs (labels). After training we have a model (hypothesis) that maps new data to one of the classes it was trained on.

Prediction: Once our model is trained, given a set of inputs, it will give a predicted output (label).
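To make these terms concrete, here is a minimal sketch in plain Python using the fruit example. It assumes two made-up numeric features per fruit (a color score and an elongation score, both between 0 and 1) and uses a nearest-centroid classifier, chosen here only because it is short. The names `train` and `predict` are illustrative and do not come from any particular library.

```python
import math
from collections import defaultdict


def train(features, labels):
    """Training: learn a model, here one mean feature vector per label."""
    grouped = defaultdict(list)
    for x, y in zip(features, labels):
        grouped[y].append(x)
    return {
        label: tuple(sum(col) / len(rows) for col in zip(*rows))
        for label, rows in grouped.items()
    }


def predict(model, x):
    """Prediction: return the label whose centroid is closest to x."""
    return min(model, key=lambda label: math.dist(model[label], x))


# Features: (color score, elongation score); the values are invented.
features = [(0.9, 0.1), (1.0, 0.2), (0.5, 0.1), (0.4, 0.0), (0.0, 0.9), (0.1, 0.8)]
# Targets (labels): the fruit each feature vector belongs to.
labels = ["apple", "apple", "orange", "orange", "banana", "banana"]

model = train(features, labels)
print(predict(model, (0.95, 0.15)))  # close to the apple examples
print(predict(model, (0.05, 0.85)))  # close to the banana examples
```

Here the model is just a dictionary of centroids. Real libraries hide this behind similar fit and predict steps.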

Types of machine learning

Supervised learning: This involves using classification and regression models to learn from a training dataset with labeled data. The learning process continues until the desired level of performance is reached.

Unsupervised learning: This involves using unlabeled data and finding the underlying structure in it, for example with factor analysis and cluster analysis models, in order to learn more about the data itself.

Semi-supervised learning: This combines a large amount of unlabeled data, as in unsupervised learning, with a small amount of labeled data. The labeled data can greatly improve accuracy, and the approach is more cost-effective than fully supervised learning because less labeling is needed.

Reinforcement learning: This involves learning the best action through trial and error. The next action is chosen from the learned behavior based on the current state, so as to maximize future reward.
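As an illustration of unsupervised learning, the sketch below clusters unlabeled one-dimensional values with plain k-means. The data, the function name `kmeans_1d`, and the choice to spread the initial centroids evenly between the minimum and maximum are all assumptions made for this example.

```python
def kmeans_1d(values, k=2, iterations=10):
    """Cluster 1-D values into k >= 2 groups with plain k-means."""
    lo, hi = min(values), max(values)
    # Simplified initialisation: spread centroids evenly across the range.
    centroids = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Unlabeled data with two visible groups (invented values).
data = [1.0, 1.2, 0.8, 5.0, 5.2, 4.8]
centroids, clusters = kmeans_1d(data, k=2)
print(centroids)
print(clusters)
```

No labels are given anywhere. The two groups emerge from the structure of the data alone, which is the point of contrast with supervised learning.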

7 Steps to Machine Learning

  1. Data collection: The most time-consuming part of machine learning may be data collection, and obtaining high-quality data from a large amount of raw data often takes many steps. Some well-known dataset collections are recommended here: Awesome Public Datasets, Kaggle datasets, and so on.

  2. Data processing: This is where our statistical knowledge comes in handy: data integration, cleaning, and optimization.

  3. Model selection: Learn various models and practice on real datasets. This helps us understand which type of model is appropriate in which situation.

  4. Training: Different models may produce different results, and a lot of parameter tuning and training is required.

  5. Evaluation: Measure how good the trained model is, ideally on data it did not see during training.

  6. Hyperparameter tuning: If the evaluation is successful, proceed to hyperparameter tuning. This step tries to improve on the positive results obtained in the evaluation step.

  7. Prediction: The final step in the machine learning process is prediction. At this stage, we consider the model ready for practical use.
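The seven steps above can be sketched end to end in a few lines of plain Python. This is only an illustration under stated assumptions: the data is invented (including one deliberately noisy label and two rows with missing values), the model is a small hand-written k-nearest-neighbours classifier, and the helper names `predict` and `accuracy` are hypothetical.

```python
import math
from collections import Counter

# 1. Data collection: (feature_1, feature_2, label); all values are made up.
raw = [
    (1.0, 1.2, "a"), (1.5, 0.8, "a"), (None, 1.0, "a"), (0.9, 1.1, "a"),
    (1.05, 1.25, "b"),  # deliberately noisy label
    (8.0, 8.5, "b"), (8.2, 7.9, "b"), (7.8, None, "b"), (7.8, 8.3, "b"),
    (1.1, 1.4, "a"),
]

# 2. Data processing: drop rows with missing values.
clean = [row for row in raw if None not in row]

# Hold out every 4th row for evaluation; the rest is training data.
validation = clean[::4]
train = [row for i, row in enumerate(clean) if i % 4 != 0]


# 3. Model selection: k-nearest neighbours.
# 4. Training: for k-NN this simply means keeping the training rows.
def predict(train_rows, point, k):
    nearest = sorted(train_rows, key=lambda r: math.dist(r[:2], point))[:k]
    return Counter(label for *_, label in nearest).most_common(1)[0][0]


# 5. Evaluation: accuracy on the held-out rows.
def accuracy(k):
    hits = sum(predict(train, row[:2], k) == row[2] for row in validation)
    return hits / len(validation)


# 6. Hyperparameter tuning: choose the k with the best validation accuracy.
best_k = max((1, 3, 5), key=accuracy)

# 7. Prediction on a new, unseen point.
print(best_k, predict(train, (8.1, 8.2), best_k))
```

With k = 1 the noisy label misleads the classifier on one validation row, which is why tuning moves to a larger k. In a real project the final check would use a separate test set that played no part in tuning.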

Resource recommendations

• Machine Learning by Andrew Ng (Wu Enda): Andrew Ng's machine learning course at Stanford University is very popular. It focuses on machine learning, data mining, and statistical pattern recognition, and its instructional videos are very helpful for understanding the theory and core concepts behind ML.

• Stanford CS231n computer vision course by Fei-Fei Li (Li Feifei)

• Hung-yi Lee (Li Hongyi): Machine Learning

• Hsuan-Tien Lin (Lin Xuantian): Machine Learning Foundations and Machine Learning Techniques

Book recommendations

• Watermelon Book: "Machine Learning" by Mr. Zhou Zhihua

• Pumpkin Book: [Detailed Explanation of Machine Learning Formulas]

• The second edition of "Statistical Learning Methods" by Mr. Li Hang, together with the accompanying code on GitHub (https://github.com/fengdu78/lihang-code, 15.6k stars). The two are best studied together!

• "Machine Learning in Action": the k-nearest neighbor algorithm, naive Bayes, logistic regression, support vector machines, the AdaBoost ensemble method, tree-based regression, the classification and regression tree (CART) algorithm, and more are all covered here.

• "Pattern Recognition and Machine Learning"

• "The Elements of Statistical Learning"

• "Dive into Deep Learning": The distinguishing feature of this book is that each section is a Jupyter notebook that can be downloaded and run, combining text, formulas, images, code, and output. There is also an online version.

• The Flower Book, "Deep Learning": recommended for readers who still have energy to spare.

If these books are not enough, I recommend browsing the machine learning topic on Douban.

2 Learn good tools

As you become more familiar with the field of machine learning, good tools help you practice faster. The idea behind machine learning is that a machine can learn from data (i.e. examples) to produce accurate results. Machine learning is closely related to data mining and Bayesian predictive modeling: the machine takes data as input and uses algorithms to come up with answers. It is therefore recommended to learn the common tools of data science.

Data science tools

If you develop in Python, it is recommended to learn Anaconda and these tools:

• Jupyter

• numpy

• pandas

• matplotlib

• seaborn

• …

Frameworks in the field of machine learning

• scikit-learn

• PyTorch

• TensorFlow

• Caffe

• Theano

• …

3 Choose a field

Machine learning covers a wide range of fields, such as data mining, ad recommendation, image recognition, fraud detection, portfolio optimization, task automation, and autonomous driving. Every field is worth digging into. Choose one you are interested in and keep exploring:

• Data analysis and mining

• Natural Language Processing NLP

• Computer Vision CV

• Recommendation algorithm

• …

4 Problems and Practice

Finally, once we understand the basics of machine learning and its application scenarios, we can use it to solve practical problems.

For example, you can strengthen your practical skills by entering competitions or by studying outstanding solutions from previous years. This process combines theoretical knowledge with hands-on implementation, so you become more proficient in machine learning and reap rich rewards along the way.

• HUAWEI CLOUD competition platform

• Kaggle

• …

You can also follow the latest conferences to see what kinds of problems your peers have solved and whether you can reproduce their results, improve on them, or even publish an excellent peer-reviewed paper of your own. These are some top conferences in the AI field to follow:

• AIIMS

• SIGKDD

• IJCAI

• AISTATS

• …

If you can finish these competitions and other such simple challenges, or if you have published multiple top conference papers...

Congratulations!!! You are becoming a full-fledged machine learning engineer, and you can continue to build on your skills by taking on more and more challenges, and ultimately creating increasingly creative and difficult machine learning projects.

If you still want to continue to work in the field of AI and become an algorithm engineer or AI engineer, I recommend "Bai Mian Machine Learning".

5 Summary

5.1 Why is machine learning so popular?

Machine learning is popular because it is having a huge impact on the way software is designed, allowing software to keep up with the pace of business change. What makes machine learning so compelling is that it helps you use data to drive business rules and logic.

In the traditional model of software development, programmers write logic based on the current state of the business and then add the relevant data. However, business change has become the norm, and with the traditional development model it is almost impossible to predict which changes will reshape the market. Machine learning lets us train computers to learn continuously from data and predict the future. This powerful set of algorithms and models is being used across industries to improve processes and to gain insight into patterns and anomalies in data.

5.2 Why is machine learning difficult?

That is because there is a lot of knowledge behind it and the barrier to entry is very high, so many people are discouraged into giving up at some stage. There are many recommended learning paths and videos on the Internet, and the major platforms also offer their own courses. However, while putting together this systematic learning method, I found that everyone agrees on several core points:

• Have a machine learning mindset: this includes appreciation of and enthusiasm for machine learning, and the perseverance to keep learning.

• Emphasize hands-on, practical skills: once the hype around machine learning fades, the ones who remain are the engineers who actually solve practical problems.

Today, machine learning affects our lives all the time: intelligent ad recommendation, face recognition, autonomous driving... Its influence will continue. If you also want to learn machine learning systematically, start now. I hope this article helps you.

Finally, I would like to thank everyone who has read this article carefully. As a small thank-you, here are some materials. They are not especially valuable, but take them if you need them:

1. Introduction to Python

The following content is the basic knowledge required for every application area of Python. Whether you want to work on crawlers, data analysis, or artificial intelligence, you must learn it first. Anything tall is built on a solid foundation, and with one the road ahead will be steadier. All materials are free at the end of the article!

Include:

Computer Basics

Python basics

Python introductory video 600 episodes:

Watching beginner-level videos is the fastest and most effective way to learn. By following the teacher's line of thought in the videos, it is easy to progress from the basics to more advanced topics.

2. Python crawler

Web crawling is a popular direction and a good choice, whether as a part-time job or as an auxiliary skill that improves work efficiency.

With crawler technology, relevant content can be collected, analyzed, and filtered to extract the information we really need.

This kind of information collection, analysis, and integration can be applied in a wide range of fields. Whether it is local services, travel, financial investment, or product market demand in various manufacturing industries, crawler technology can be used to obtain more accurate and useful information.
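The collect, analyze, and filter idea can be sketched with the standard library alone. The example below assumes a hard-coded HTML page in place of a real download, and the page content, the class name `LinkCollector`, and the `/jobs/` URL pattern are all hypothetical. A real crawler would also need to fetch pages over the network and respect each site's terms and robots.txt.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a href=...> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


# A hard-coded page stands in for a downloaded one (hypothetical content).
page = """
<html><body>
  <a href="/jobs/1">Data analyst</a>
  <a href="/about">About us</a>
  <a href="/jobs/2">ML engineer</a>
</body></html>
"""

collector = LinkCollector()
collector.feed(page)  # collect every link on the page

# Filter: keep only the links we actually need.
jobs = [(href, text) for href, text in collector.links if href.startswith("/jobs/")]
print(jobs)
```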

Python crawler video material

3. Data analysis

According to the report "Digital Transformation of China's Economy: Talents and Employment" released by the School of Economics and Management of Tsinghua University, the gap in data analysis talents is expected to reach 2.3 million in 2025.

With such a big talent gap, data analysis is like a vast blue ocean! A starting salary of 10K is really commonplace.

4. Database and ETL data warehouse

Enterprises need to regularly transfer cold data from the business database and store it in a warehouse dedicated to storing historical data. Each department can provide unified data services based on its own business characteristics. This warehouse is a data warehouse.

The traditional integration architecture for a data warehouse is ETL, built on the capabilities of an ETL platform: E = extract data from the source database; T = transform, that is, clean the data (remove data that does not conform to the rules) and compute tables at different dimensions and granularities according to business needs and business rules; L = load the processed tables into the data warehouse incrementally, in full, or at different times.
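A minimal sketch of the three ETL stages is shown below, with an in-memory SQLite database standing in for the warehouse. The source rows, the `daily_sales` table, and the function names are assumptions made for illustration. A production pipeline would normally run on a dedicated ETL platform.

```python
import sqlite3

# Extract: rows as they might arrive from a source system (hypothetical data).
source_rows = [
    ("2024-01-01", "north", "120.50"),
    ("2024-01-01", "south", "n/a"),  # does not conform to the rules
    ("2024-01-02", "north", "80.00"),
    ("2024-01-02", "south", "99.50"),
]


def transform(rows):
    """Transform: drop invalid rows, then aggregate sales per day."""
    totals = {}
    for day, _region, amount in rows:
        try:
            value = float(amount)
        except ValueError:
            continue  # cleaning step: skip rows with a non-numeric amount
        totals[day] = totals.get(day, 0.0) + value
    return sorted(totals.items())


def load(conn, rows):
    """Load: write the transformed rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_sales (day TEXT PRIMARY KEY, total REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO daily_sales VALUES (?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(source_rows))
result = warehouse.execute(
    "SELECT day, total FROM daily_sales ORDER BY day"
).fetchall()
print(result)
```

Using `INSERT OR REPLACE` keyed on the day means the load can be rerun for the same day without creating duplicates, which is one simple way to support incremental loads.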

5. Machine Learning

Machine learning means having a computer learn from part of the data and then make predictions and judgments about other data.

At its core, machine learning is "using algorithms to parse data, learn from it, and then make decisions or predictions about new data." That is to say, a computer uses the obtained data to obtain a certain model, and then uses this model to make predictions. This process is somewhat similar to the human learning process. For example, people can predict new problems after obtaining certain experience.
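The "obtain a model from data, then use it to predict" loop can be shown with ordinary least squares for a straight line. The data (hours studied against exam score) and the function name `fit_line` are invented for this sketch.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    return slope, mean_y - slope * mean_x


# Hypothetical experience: hours studied -> exam score
hours = [1.0, 2.0, 3.0, 4.0]
scores = [52.0, 54.0, 56.0, 58.0]

# Learn the model from the data we have...
slope, intercept = fit_line(hours, scores)

# ...then use it to predict for an input we have not seen.
predicted = slope * 5.0 + intercept
print(slope, intercept, predicted)
```

The two numbers `slope` and `intercept` are the whole model here. Larger models have more parameters, but the learn-then-predict pattern is the same.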

Machine Learning Materials:

6. Advanced Python

From basic syntax, through many in-depth advanced topics, to an understanding of programming language design: after this part, you will have covered essentially all the knowledge points from Python beginner to advanced.

At this point, you can basically meet companies' hiring requirements. If you still do not know where to find interview materials and resume templates, I have also compiled a set for you. It really can be called a nanny-level systematic learning route.

But learning programming is not achieved overnight, but requires long-term persistence and training. In organizing this learning route, I hope to make progress together with everyone, and I can review some technical points myself. Whether you are a novice in programming or an experienced programmer who needs to be advanced, I believe that everyone can gain something from it.

Recommended articles

The outlook for Python: https://blog.csdn.net/SpringJavaMyBatis/article/details/127194835

Part-time side jobs with Python: https://blog.csdn.net/SpringJavaMyBatis/article/details/127196603

Origin blog.csdn.net/weixin_49892805/article/details/132057128