Things you should know before diving headfirst into machine learning

Abstract:  This article briefly summarizes several major tasks of machine learning and their corresponding methods, so that beginners can choose the appropriate method according to their own tasks. Once you master the basics of machine learning and understand the tasks you are dealing with, applying machine learning is not that difficult.

Machine learning has long been an active research field, and the rise of deep learning has drawn even more attention to it, prompting many people to take an interest and consider working in the area. So what should anyone who wants to pursue a career in machine learning know first? This article briefly introduces the basics of machine learning.
Machine learning is the process of enabling computer systems to learn from data using statistical techniques, without being explicitly programmed for each task. It relies on algorithms that learn from data and then make predictions. Machine learning is closely related to computational statistics, mathematical optimization, and data mining, and is often used for tasks such as prediction and analysis. It is generally used to handle two types of tasks:

  • Supervised learning: The examples given to the computer are labeled with the expected output, and the model is adjusted based on those labels so that it learns the mapping from input to output.
  • Unsupervised learning: The examples given to the computer have no labels, and the model must learn to produce output on its own. Unsupervised learning involves discovering hidden patterns in the data, including feature learning.

The term "machine learning" sounds advanced to most people outside the field, but it need not be intimidating. Once you understand the basic concepts and the related methods, applying machine learning comes down to choosing the method that suits the task and letting the machine learn from the features to complete it. Therefore, before learning and applying machine learning, we should first clarify what the task is and which method suits it.

If we want to understand the theory behind the algorithms and how they work, we need to be proficient in probability and statistics, linear algebra, and calculus. In addition, knowing a programming language such as Python makes it easy to implement the algorithms. With both the theoretical foundation and programming skills in place, machine learning is within reach. It also helps to understand how the relevant mathematics is applied. Whether you study on your own or take online courses, you must practice: practice deepens your understanding of the basics and exercises your programming ability.
Before learning machine learning, it is necessary to have the following knowledge:

  • Linear algebra
  • Calculus
  • Probability theory
  • Programming
  • Optimization theory

Below are some of the most common machine learning tasks and the methods used for them. Once you understand them, you can apply them readily in later projects.

Regression

Regression is mainly concerned with estimating continuous or numerical variables, such as house prices, stock prices, or product prices. A regression curve is fitted to the available data and then used to predict values for new data. The following machine learning methods are used to solve regression problems:

  • Kernel regression
  • Support vector regression
  • Gaussian process regression
  • Linear regression
  • LASSO regression (least absolute shrinkage and selection operator)
  • Regression tree
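
As a rough illustration of the simplest method in the list, linear regression, here is a minimal sketch that fits a straight line by ordinary least squares using only the Python standard library. The function names `fit_line` and `predict` and the toy area and price figures are made up for this example and are not from the original article.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = covariance(x, y) / variance(x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Estimate y for a new x from a fitted (slope, intercept) pair."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical toy data: floor area -> price, constructed so that price = 3 * area.
areas = [50.0, 60.0, 80.0, 100.0, 120.0]
prices = [150.0, 180.0, 240.0, 300.0, 360.0]

model = fit_line(areas, prices)
estimate = predict(model, 90.0)  # estimated price for an unseen area
```

Because the toy prices lie exactly on a line, the fitted slope is 3 and the estimate for an area of 90 is 270. Real data is noisy, and the other methods in the list trade this simplicity for more flexible curves.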

Classification

Classification is concerned with predicting discrete variables, that is, the category a data point belongs to. Tasks such as identifying spam, determining which disease a patient has, and deciding whether a transaction is fraudulent are handled with classification methods. The following methods can be used to solve classification problems:

  • Kernel discriminant analysis
  • Artificial neural networks
  • K-nearest neighbors
  • Boosted trees
  • Random forests
  • Logistic regression
  • Support vector machine
  • Deep learning
  • Naive Bayes
  • Decision trees
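
To make one of these methods concrete, the following is a minimal sketch of k-nearest neighbors classification in plain Python. The function name `knn_predict`, the two features, and the tiny spam/ham data set are assumptions made for illustration only.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    # Rank training indices by Euclidean distance to the query.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features per message: (number of links, number of exclamation marks).
messages = [(0, 1), (1, 0), (1, 1), (8, 9), (9, 7), (7, 8)]
labels = ["ham", "ham", "ham", "spam", "spam", "spam"]

label_a = knn_predict(messages, labels, (8, 8))  # lands among the spam examples
label_b = knn_predict(messages, labels, (0, 0))  # lands among the ham examples
```

The labeled examples play the role of the supervised training data described earlier: the prediction for a new point is derived entirely from the labels of the points closest to it.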

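Clustering

Clustering is generally used to find natural groupings in data. Tasks such as product feature identification and customer segmentation are typical applications. The following machine learning methods are used for clustering problems:

  • Mean-shift
  • K-means
  • Topic models
  • Hierarchical clustering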
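
As an illustration of clustering, here is a minimal sketch of K-means (Lloyd's algorithm) in plain Python. The function name `kmeans`, the fixed initial centroids, and the toy customer data are assumptions for this example. Practical implementations usually choose the starting centroids randomly or with a smarter seeding scheme and stop once the assignments no longer change.

```python
import math


def kmeans(points, initial_centroids, iterations=10):
    """Lloyd's algorithm: alternate between assigning points and moving centroids."""
    centroids = [tuple(c) for c in initial_centroids]
    assignment = []
    for _ in range(iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(len(centroids)):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignment


# Hypothetical customers: (visits per month, average spend), with no labels attached.
customers = [(1, 2), (2, 1), (1, 1), (9, 9), (8, 10), (10, 8)]
centroids, assignment = kmeans(customers, initial_centroids=[customers[0], customers[3]])
```

No labels are supplied here. The two customer segments emerge from the distances alone, which is what makes this an unsupervised method.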

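Multivariate querying

Multivariate querying is about finding similar objects. The following methods can be used for problems related to multivariate querying:

  • Nearest neighbors
  • Farthest neighbors
  • Range search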
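
The three query types can be sketched with a brute-force scan over all points, as below. The function names and sample points are illustrative assumptions. For large data sets, spatial index structures such as k-d trees are typically used instead of scanning every point.

```python
import math


def nearest_neighbors(points, query, k=1):
    """Return the k points closest to `query` (Euclidean distance)."""
    return sorted(points, key=lambda p: math.dist(p, query))[:k]


def farthest_neighbors(points, query, k=1):
    """Return the k points farthest from `query`."""
    return sorted(points, key=lambda p: math.dist(p, query), reverse=True)[:k]


def range_search(points, query, radius):
    """Return every point within `radius` of `query`, in input order."""
    return [p for p in points if math.dist(p, query) <= radius]


# Hypothetical 2-D objects and a query point.
objects = [(0, 0), (1, 1), (2, 2), (5, 5), (9, 9)]
query = (1.0, 1.5)

closest = nearest_neighbors(objects, query, k=2)
farthest = farthest_neighbors(objects, query, k=1)
nearby = range_search(objects, query, radius=2.0)
```

All three functions share the same distance measure and differ only in how they select from the ranked points, which is why they are usually grouped under one task.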

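Dimensionality reduction

Dimensionality reduction means reducing the number of random variables under consideration. It can be divided into feature extraction and feature selection. Commonly used dimensionality reduction methods are:

  • Manifold learning / kernel PCA (KPCA)
  • Independent component analysis
  • Principal component analysis
  • Non-negative matrix factorization
  • Compressed sensing
  • Gaussian graphical models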
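
As an example of feature extraction, the sketch below finds the first principal component of a small data set by power iteration on the sample covariance matrix, then projects each row onto it, turning two features into one. The function name, the starting vector, and the toy data are assumptions for illustration. Numerical libraries normally compute PCA with an eigendecomposition or SVD, and this sketch assumes the data has nonzero variance and that the starting vector is not orthogonal to the principal direction.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (direction, scores) for the first principal component of `rows`."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly apply the covariance matrix and renormalize.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Project each centered row onto the direction: one number per row.
    scores = [sum(r[j] * v[j] for j in range(d)) for r in centered]
    return v, scores


# Hypothetical data lying exactly on the line y = 2x, so one dimension suffices.
data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, scores = first_principal_component(data)
```

For this data the direction found is proportional to (1, 2), and the single score per row preserves the ordering of the original points. Feature selection, by contrast, would keep a subset of the original columns rather than constructing a new one.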

Origin http://43.154.161.224:23101/article/api/json?id=324682991&siteId=291194637