Machine learning and mathematical analysis notes

  While studying machine learning algorithms I keep running into mathematics. The connection builds slowly, but to strengthen memory and understanding, and to deepen my grasp of machine learning, I am following the teacher's lessons on the specific kinds of math involved. Readers who have just finished studying mathematics, or whose mathematical background is solid, can skip what follows.

First, what is machine learning?

  For a given task T, under a reasonable performance measure P, a computer program can learn the task from experience E: as it is supplied with suitable, high-quality, plentiful experience E, the program's performance on task T gradually improves.

  The most important objects in machine learning are:

    1. Task (T), one or more
    2. Experience (E)
    3. Performance (P)

  In other words: for a given task, accumulating experience improves the program's performance.

Another way to describe machine learning

  Machine learning is a branch of artificial intelligence. We use a computer to design a system that learns, in some way, from the training data provided; as the number of training rounds grows, the system keeps learning and improving its performance; the model whose parameters were optimized by learning can then be used to predict outputs for related problems.

Intension and extension of machine learning

  What machine learning can solve

    Prediction problems for given data: data cleaning / feature selection, choosing the algorithm and model / parameter optimization, result prediction

  What it cannot solve

    Large-scale data storage / parallel computing, building a robot

Machine learning modeling process:

  1. Training data and the corresponding labels are given; the training data may include text, sound, images, transactions, and so on.
  2. Clean the data / select features.
  3. Select an appropriate machine learning algorithm, train it on the data, and validate it on a validation set.
  4. Obtain the corresponding regression or classification model.
  5. Given a new set of data (text, sound, images, transactions, etc.), use the model to predict or classify.

In general, the machine learning process can be outlined as: data collection → data cleaning → feature engineering → data modeling
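
The modeling steps above can be sketched on a toy problem. This is only a minimal illustration under my own assumptions: the data, the helper names (clean, fit_line, mse) and the choice of a least-squares line are all made up here and are not from the course.

```python
# Toy pipeline: collect -> clean -> hold out validation data -> fit -> predict.
# All data and helper names are hypothetical, for illustration only.

def clean(rows):
    """Step 2: drop records with a missing input or a missing label."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def fit_line(rows):
    """Step 3: ordinary least squares for y = w * x + b on (x, y) pairs."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    var = sum((x - mean_x) ** 2 for x, _ in rows)
    w = cov / var
    return w, mean_y - w * mean_x

def mse(model, rows):
    """Mean squared error of the fitted line on a set of (x, y) pairs."""
    w, b = model
    return sum((w * x + b - y) ** 2 for x, y in rows) / len(rows)

# Step 1: labelled data, including two incomplete records.
raw = [(0, 1.0), (1, 3.1), (2, None), (3, 6.9),
       (4, 9.0), (None, 2.0), (5, 11.1), (6, 12.9)]

data = clean(raw)
train, valid = data[:4], data[4:]       # hold out the last records for validation
model = fit_line(train)                 # step 4: the fitted regression model
valid_error = mse(model, valid)         # how well it does on unseen data
prediction = model[0] * 10 + model[1]   # step 5: predict for a new input x = 10
```

The point is the order of the steps, not the model: any regression or classification algorithm could take the place of fit_line.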

 

What follows covers the mathematics. The course has three lessons, so the notes are divided into three parts:

First, look at an example: find the value of S

Problem analysis:

Setup: take the derivative of f(x)

This raises the following question:

Construct the sequence {x_n}

Then: the sequence is monotonically increasing and has an upper bound, so its limit must exist; denote it e.

By the squeeze theorem, bounding from both sides, the corresponding function limit also exists and equals e.
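
The formulas for this example did not survive in these notes. Assuming the sequence is the standard x_n = (1 + 1/n)^n (my assumption, not confirmed by the text), a small numeric sketch shows the terms increasing, staying below 3, and approaching e:

```python
import math

def x(n):
    """Assumed sequence term x_n = (1 + 1/n)^n (not confirmed by the notes)."""
    return (1 + 1 / n) ** n

terms = [x(n) for n in (1, 10, 100, 1000, 10000)]

# The sampled terms increase and stay below 3, consistent with a bounded
# monotone sequence that has a limit.
increasing = all(a < b for a, b in zip(terms, terms[1:]))
bounded = all(t < 3 for t in terms)

# Distance from the last sampled term to math.e.
gap = math.e - terms[-1]
```

A numeric check like this proves nothing; it only makes the claim that the limit exists and equals e easier to believe.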

Derivatives are especially common in machine learning

  Briefly, the derivative is the slope of the curve; it reflects how fast the curve changes.

  The second derivative reflects how fast the slope changes, and characterizes whether the curve is convex or concave.
 A curve with a continuous second derivative is often called "smooth".
 Remember what the high school physics teacher often said: the acceleration always points toward the concave side of the trajectory.

  Starting from the derivative of f(x) = ln x, the derivatives of the other elementary functions can be obtained with the change-of-base formula, the inverse-function derivative rule, and the like.

Derivatives of commonly used functions
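
The table for this heading is missing from these notes. As a stand-in, here is a small sketch that checks a few standard derivative formulas against a central-difference approximation. The selection of functions and the evaluation point x0 are my own choices.

```python
import math

def derivative(f, x, h=1e-6):
    """Central-difference approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)

# Each entry pairs a function with its textbook derivative.
pairs = {
    "x^3":    (lambda x: x ** 3, lambda x: 3 * x ** 2),
    "e^x":    (math.exp,         math.exp),
    "ln x":   (math.log,         lambda x: 1 / x),
    "sin x":  (math.sin,         math.cos),
    "cos x":  (math.cos,         lambda x: -math.sin(x)),
    "2^x":    (lambda x: 2 ** x, lambda x: 2 ** x * math.log(2)),
    "log2 x": (math.log2,        lambda x: 1 / (x * math.log(2))),
}

x0 = 1.3
errors = {name: abs(derivative(f, x0) - df(x0)) for name, (f, df) in pairs.items()}
max_error = max(errors.values())
```

The last two rows are the change-of-base idea in action: 2^x = e^(x ln 2) and log2 x = ln x / ln 2, so both follow from the derivatives of e^x and ln x.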

Applications of derivatives

Given the function f(x) = x^x, x > 0, find the minimum value of f(x)

To differentiate a power-exponential function, it is more convenient to take logarithms first, or to rewrite it directly as an exponential with base e.
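
Following that hint: x^x = e^(x ln x), so f'(x) = x^x (ln x + 1), which is zero at x = 1/e, and the minimum value is (1/e)^(1/e), roughly 0.6922. A minimal numeric check of this, with a grid spacing chosen arbitrarily:

```python
import math

def f(x):
    """x^x rewritten as e^(x ln x), valid for x > 0."""
    return math.exp(x * math.log(x))

def f_prime(x):
    """Derivative by the chain rule: x^x * (ln x + 1)."""
    return f(x) * (math.log(x) + 1)

x_star = 1 / math.e          # the only zero of f'(x) for x > 0
min_value = f(x_star)        # (1/e)^(1/e)

# Compare against a coarse grid on (0, 3]; no grid point should do better.
grid = [0.01 * k for k in range(1, 301)]
grid_min = min(f(x) for x in grid)
```

Because ln x + 1 is negative for x < 1/e and positive for x > 1/e, f decreases and then increases, so the stationary point is the global minimum on x > 0.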

Application of integrals: as N → ∞, ln N! → N(ln N − 1)
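
The idea is that ln N! = ln 1 + ln 2 + … + ln N is close to the integral of ln x from 1 to N, which equals N ln N − N + 1. A short sketch comparing the two sides numerically; the sample values of N are arbitrary:

```python
import math

def log_factorial(n):
    """ln(n!) via the log-gamma function, since n! = Γ(n + 1)."""
    return math.lgamma(n + 1)

def approximation(n):
    """The approximation from the notes: N (ln N - 1)."""
    return n * (math.log(n) - 1)

# Ratio of the approximation to the exact value for growing N.
ratios = {n: approximation(n) / log_factorial(n) for n in (10, 1000, 1000000)}
```

The ratio moves toward 1 as N grows, which is the sense in which the arrow should be read: the two quantities agree to leading order, while their difference still grows slowly.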

Taylor formula and Maclaurin formula

 Taylor formula, application 1

Numerical computation: values of elementary functions (expansion at the origin)
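
As an illustration of this kind of numerical computation, here is a sketch that evaluates e^x from its Maclaurin series 1 + x + x²/2! + x³/3! + …. The function name and the number of terms are my own choices.

```python
import math

def exp_maclaurin(x, terms=20):
    """Sum the first `terms` terms of the Maclaurin series of e^x."""
    total, term = 0.0, 1.0          # term holds x^k / k!, starting at k = 0
    for k in range(terms):
        total += term
        term *= x / (k + 1)         # next term: multiply by x / (k + 1)
    return total

approx_e = exp_maclaurin(1.0)
approx_sqrt_e = exp_maclaurin(0.5)
```

Twenty terms are plenty near the origin. For large |x| the series converges slowly, so practical implementations first reduce x to a small range.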

Taylor formula, application 2

Examine the graphs of the Gini coefficient, the entropy, and the classification error rate, and the relationship among the three.
Expanding f(x) = −ln x to first order at x = 1 and ignoring the higher-order infinitesimal gives f(x) ≈ 1 − x
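
Substituting −ln p ≈ 1 − p into the entropy −Σ p ln p gives Σ p(1 − p), which is the Gini coefficient as used for decision trees. The sketch below compares the three measures for a two-class problem. The function names and sample probabilities are mine, and the entropy uses natural logarithms.

```python
import math

def entropy(p):
    """Entropy (in nats) of a two-class distribution (p, 1 - p)."""
    return -sum(q * math.log(q) for q in (p, 1 - p) if q > 0)

def gini(p):
    """Gini coefficient: the entropy with -ln q replaced by 1 - q."""
    return sum(q * (1 - q) for q in (p, 1 - p))

def error_rate(p):
    """Error rate when always predicting the majority class."""
    return min(p, 1 - p)

# One row per class probability: (p, entropy, gini, error rate).
rows = [(p, entropy(p), gini(p), error_rate(p)) for p in (0.1, 0.3, 0.5, 0.7, 0.9)]
```

All three measures are zero for a pure node and largest at p = 0.5. Since −ln q ≥ 1 − q for q in (0, 1], the Gini coefficient never exceeds the entropy.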

These conclusions will be discussed further in the section on decision trees

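Directional derivative

If the function z = f(x, y) is differentiable at the point P(x, y), then its directional derivative at that point along any direction L exists, and:

  ∂f/∂l = (∂f/∂x) cos ψ + (∂f/∂y) sin ψ

where ψ is the angle of rotation from the x axis to the direction L.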

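Gradient

Suppose the function z = f(x, y) has continuous first-order partial derivatives in a plane region D. Then for every point P(x, y) ∈ D, the vector

  (∂f/∂x, ∂f/∂y)

is the gradient of z = f(x, y) at the point P, written grad f(x, y)

The direction of the gradient is the direction in which the function changes fastest at that point

  Consider a mountain described by z = H(x, y): the gradient at (x0, y0) is the direction in which the slope is steepest at that point.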
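
To make this concrete, the sketch below uses the standard formula f_x cos ψ + f_y sin ψ for the directional derivative and searches over angles ψ for the largest value. The mountain H, the point (1, 1) and the one-degree search grid are hypothetical choices of mine.

```python
import math

def H(x, y):
    """A hypothetical mountain with its peak at the origin."""
    return -(x ** 2) - 2 * y ** 2

def gradient(f, x, y, h=1e-6):
    """Numerical gradient (df/dx, df/dy) by central differences."""
    fx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    fy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return fx, fy

def directional_derivative(f, x, y, psi):
    """Rate of change of f at (x, y) along the direction at angle psi."""
    fx, fy = gradient(f, x, y)
    return fx * math.cos(psi) + fy * math.sin(psi)

x0, y0 = 1.0, 1.0
gx, gy = gradient(H, x0, y0)

# Try every whole-degree direction and keep the steepest one.
angles = [2 * math.pi * k / 360 for k in range(360)]
best = max(angles, key=lambda a: directional_derivative(H, x0, y0, a))

# Angle of the gradient vector itself, mapped into [0, 2*pi).
grad_angle = math.atan2(gy, gx) % (2 * math.pi)
```

The steepest direction found by the search agrees with the gradient's own angle to within the grid spacing, and the gradient points back toward the peak at the origin.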

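Γ function

 The Γ function generalizes the factorial to the real numbers: Γ(x) = ∫₀^∞ t^(x−1) e^(−t) dt, with Γ(n) = (n−1)! for positive integers n.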
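
Assuming the usual definition Γ(x) = ∫₀^∞ t^(x−1) e^(−t) dt, the link to the factorial is Γ(n) = (n − 1)! together with the recurrence Γ(x + 1) = x Γ(x). A quick check with Python's standard math module:

```python
import math

# Γ(n) equals (n - 1)! for positive integers n.
factorial_matches = all(
    math.isclose(math.gamma(n), math.factorial(n - 1)) for n in range(1, 15)
)

# Between the integers the function is still defined, e.g. Γ(1/2) = sqrt(pi).
half = math.gamma(0.5)

# The recurrence Γ(x + 1) = x Γ(x) also holds for non-integer x.
recurrence_gap = math.gamma(4.5) - 3.5 * math.gamma(3.5)
```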

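Convex functions

If the domain dom f of a function f is a convex set, and for all x, y ∈ dom f and 0 ≤ θ ≤ 1 it satisfies

  f(θx + (1−θ)y) ≤ θ f(x) + (1−θ) f(y)

then f is a convex function.

 First-order differentiable

If f is first-order differentiable, then f is convex if and only if its domain dom f is a convex set and, for all x, y ∈ dom f,

  f(y) ≥ f(x) + ∇f(x)ᵀ (y − x)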

 

 

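 Second-order differentiable

If f is twice differentiable, then f is convex if and only if dom f is a convex set and

  ∇²f(x) ⪰ 0 for all x ∈ dom f

If f is a function of one variable, this says the second derivative is greater than or equal to 0

If f is a function of several variables, this says the Hessian matrix of second derivatives is positive semidefinite.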
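
For functions of one variable, both conditions can be probed numerically. The sketch below tests the defining inequality f(θx + (1−θ)y) ≤ θ f(x) + (1−θ) f(y) on random points and estimates the second derivative by finite differences. The test functions and intervals are my own picks, and a passing random test is evidence of convexity, not a proof.

```python
import math
import random

def second_derivative(f, x, h=1e-4):
    """Finite-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)

def looks_convex(f, lo, hi, trials=1000, seed=0):
    """Check the convexity inequality at random x, y in [lo, hi] and random theta."""
    rng = random.Random(seed)
    for _ in range(trials):
        x, y, theta = rng.uniform(lo, hi), rng.uniform(lo, hi), rng.random()
        lhs = f(theta * x + (1 - theta) * y)
        rhs = theta * f(x) + (1 - theta) * f(y)
        if lhs > rhs + 1e-9:        # small tolerance for rounding error
            return False
    return True

convex_exp = looks_convex(math.exp, -2, 2)
convex_neg_log = looks_convex(lambda x: -math.log(x), 0.1, 5)
convex_sin = looks_convex(math.sin, 0, math.pi)   # sin is concave on [0, pi]
```

Here e^x and −ln x pass and have positive second derivatives, while sin x on [0, π] fails, as expected for a concave function.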

Examples of convex functions:

 

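Probability theory

Understanding probability: P(x) ∈ [0, 1]

  P = 0: the event has probability 0, but that does not necessarily mean it cannot happen

  If x is a discrete / continuous variable, then P(x = x0) denotes the probability / probability density of x0

Cumulative distribution function: Φ(x0) = P(x ≤ x0)

  Φ(x) must be a monotonically increasing function

 min(Φ(x)) = 0, max(Φ(x)) = 1

Think about it: a monotonically increasing function y = F(x) with range [0, 1] can be viewed as the cumulative probability function of some event X

  If y = F(x) is differentiable, then f(x) = F'(x) is a probability density function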
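
As an illustration of this thought, take the logistic function F(x) = 1 / (1 + e^(−x)). It is increasing and its values lie in (0, 1), approaching 0 and 1 only in the limit. The function, the step sizes and the integration range below are my own choices; the sketch differentiates F numerically and checks that the result integrates to about 1.

```python
import math

def F(x):
    """Logistic function, used here as a cumulative distribution function."""
    return 1 / (1 + math.exp(-x))

def f(x, h=1e-5):
    """Density obtained as the numerical derivative of F."""
    return (F(x + h) - F(x - h)) / (2 * h)

# Midpoint-rule integral of the density over [-20, 20].
step = 0.01
area = sum(f(-20 + (k + 0.5) * step) * step for k in range(4000))

peak = f(0.0)    # analytically F'(0) = F(0) * (1 - F(0)) = 0.25
```

The same construction works for any differentiable increasing F whose limits are 0 and 1; the logistic case is just the density of the logistic distribution.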

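Classical probability model

  Example: place n distinct balls into N (N ≥ n) boxes, assuming each box has unlimited capacity. Find the probability of the event A = {each box contains at most 1 ball}.

  Solution:

Total number of elementary events:

  Ball 1 can be placed in N ways;

  Ball 2 can be placed in N ways;

  ……

   In total: N^n ways.

Number of outcomes in which each box holds at most 1 ball:

   Ball 1 can be placed in N ways;

   Ball 2 can be placed in N−1 ways;

   Ball 3 can be placed in N−2 ways;

   ……

  In total: N(N−1)(N−2)…(N−n+1) = N!/(N−n)! ways, so P(A) = N! / ((N−n)! · N^n).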
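
The counting argument translates directly into code. In this sketch the function name is my own, and math.perm (Python 3.8+) supplies the product N(N−1)…(N−n+1). The birthday problem is the case N = 365: with n = 23 people, the probability that no two share a birthday is roughly 0.4927.

```python
import math

def prob_at_most_one(n, N):
    """P(each of N boxes holds at most one of n distinct balls)."""
    favourable = math.perm(N, n)    # N (N-1) ... (N-n+1)
    total = N ** n                  # every ball chooses among N boxes
    return favourable / total

p_small = prob_at_most_one(2, 3)        # 3 * 2 / 3^2
p_birthday = prob_at_most_one(23, 365)  # 23 people, 365 possible birthdays
```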

The secret behind combinatorial numbers: as N → ∞, ln N! → N(ln N − 1)

Origin www.cnblogs.com/yang901112/p/11628739.html