Andrew Ng deeplearning.ai Specialization, Course 1, Week 2 Notes: Neural Networks and Deep Learning

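In keeping with the Coursera honor code, the assignment code is not posted here.
What follows is a partial translation of the assignment together with my own summary, offering notes and some approaches to solving it.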

Assignment 1: Python Basics with numpy (optional)

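In this assignment you will:

Learn how to use numpy
Implement some basic deep learning functions: softmax, sigmoid, dsigmoid, and so on
Learn how to handle data by normalizing inputs and reshaping images
Appreciate the importance of vectorization
Understand how Python (numpy) broadcasting works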

1.1 - Building basic functions with numpy

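Goal of this section: understand why np.exp() is preferable to math.exp().

Task:
Write a sigmoid function with each of these two functions, then compare them.
Section summary: np.exp() accepts vector (array) input; math.exp() does not.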

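Exercise: build a function that returns the sigmoid of a real number x, using math.exp(x).
Note: $sigmoid(x) = \frac{1}{1+e^{-x}}$ is sometimes also known as the logistic function. It is a non-linear function used both in machine learning (logistic regression) and in deep learning.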

If you pass a matrix to math.exp(x), it raises an error (a TypeError for any array with more than one element).
This is one reason why we use "numpy" instead of "math" in Deep Learning.

In fact, if $x = (x_1, x_2, ..., x_n)$ is a row vector, $np.exp(x)$ applies the exponential function to every element of x. The output is: $np.exp(x) = (e^{x_1}, e^{x_2}, ..., e^{x_n})$
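The comparison described in this section can be sketched as follows. This is only an illustrative sketch, not the assignment's solution; the function names sigmoid_scalar and sigmoid_array are hypothetical and chosen here for clarity.

```python
import math
import numpy as np


def sigmoid_scalar(x):
    """Sigmoid of a single real number, built on math.exp (hypothetical name)."""
    return 1 / (1 + math.exp(-x))


def sigmoid_array(x):
    """Sigmoid applied element-wise, built on np.exp (hypothetical name)."""
    return 1 / (1 + np.exp(-x))


x = np.array([1.0, 2.0, 3.0])

# math.exp only understands scalars, so a multi-element array is rejected.
try:
    sigmoid_scalar(x)
    math_accepts_arrays = True
except TypeError:
    math_accepts_arrays = False

# np.exp works element-wise, so the same formula handles the whole vector.
vector_result = sigmoid_array(x)
print(math_accepts_arrays)
print(vector_result)
```

Both versions agree on scalar input; only the numpy version generalizes to vectors and matrices.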

1.2 - Sigmoid gradient

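Task:

Assign the result of the sigmoid function written in 1.1 to s.

Compute $\sigma'(x) = s(1-s)$ (note: it is needed here as well)
Why does the derivative of sigmoid take the form a(1-a)? See "Proof of the derivatives of activation functions"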
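For reference, the identity follows from differentiating directly: $\sigma'(x) = \frac{e^{-x}}{(1+e^{-x})^2} = \frac{1}{1+e^{-x}} \cdot \frac{e^{-x}}{1+e^{-x}} = s(1-s)$, since $1 - s = \frac{e^{-x}}{1+e^{-x}}$. A minimal sketch that checks this numerically is given below; the names sigmoid_array and sigmoid_grad, and the step size eps, are assumptions made for illustration and are not taken from the assignment.

```python
import numpy as np


def sigmoid_array(x):
    """Element-wise sigmoid (hypothetical helper name)."""
    return 1 / (1 + np.exp(-x))


def sigmoid_grad(x):
    """Gradient of sigmoid computed as s * (1 - s) (hypothetical name)."""
    s = sigmoid_array(x)
    return s * (1 - s)


x = np.array([1.0, 2.0, 3.0])
analytic = sigmoid_grad(x)

# Central finite difference as an independent check of the formula.
eps = 1e-6
numeric = (sigmoid_array(x + eps) - sigmoid_array(x - eps)) / (2 * eps)

print(analytic)
print(np.max(np.abs(analytic - numeric)))
```

The two results should agree to within the error of the finite-difference approximation.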

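Functions used:
Polynomial derivative: np.polyder(poly)  # returns the coefficients of the derivative polynomial
For numpy polynomials and their derivatives, see "Numpy polynomial functions and derivatives"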
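As a small illustration of np.polyder, the sketch below differentiates a polynomial given by its coefficients. The polynomial $x^2 + 2x + 3$ is an arbitrary example chosen here. Note that np.polyder only handles polynomials; sigmoid is not a polynomial, so its gradient still comes from $s(1-s)$.

```python
import numpy as np

# Coefficients in decreasing order of power: x^2 + 2x + 3 (arbitrary example).
coeffs = [1, 2, 3]

# Derivative polynomial: 2x + 2, returned as a coefficient array.
deriv = np.polyder(coeffs)

# Evaluate the derivative at x = 1.0 with np.polyval.
value_at_one = np.polyval(deriv, 1.0)

print(deriv)
print(value_at_one)
```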

1.3 - Reshaping arrays

The np.shape and np.reshape() functions
1. X.shape gives the shape (dimensions) of X
2. X.reshape(…) reshapes X into an array of different dimensions
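A minimal sketch of the two calls, flattening an image-like array into a column vector. The array here is a made-up 3×3×2 stand-in for an image, and the variable names are hypothetical.

```python
import numpy as np

# Hypothetical tiny "image" with shape (height, width, depth) = (3, 3, 2).
image = np.arange(18).reshape(3, 3, 2)

# .shape reports the dimensions as a tuple.
height, width, depth = image.shape

# .reshape turns the 3-D array into a single column of height*width*depth rows.
vector = image.reshape(height * width * depth, 1)

print(image.shape)
print(vector.shape)
```

Reading the sizes from image.shape rather than hard-coding them keeps the reshape valid for images of other sizes.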

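① The np.reshape() function
See the official documentation (in English)
numpy.reshape(a, newshape, order='C')

a: the array to reshape

newshape: the shape of the new array, given as an int or a tuple of ints
For example: 5, (2, 3)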

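order: one of {'C', 'F', 'A'}; the default is 'C'
C: read row by row, write row by row
The elements of a are read in C-like index order (last axis changing fastest) and placed into the reshaped array in the same order
F: read column by column, write column by column (Fortran-like index order, first axis changing fastest)
A: behaves like F if a is Fortran-contiguous in memory, and like C otherwise
The effect of reshaping a 3×2 array to shape (2, 3) is shown below

# original array (C-contiguous, the numpy default)
   [0, 1],
   [2, 3],
   [4, 5]
# C:
   [0, 1, 2],
   [3, 4, 5]
# F:
    [0, 4, 3],
    [2, 1, 5]
# A (same as C here, because the original array is C-contiguous):
    [0, 1, 2],
    [3, 4, 5]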
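A quick way to see how each order value behaves is to run it. The sketch below reshapes a 3×2 array to (2, 3) under each setting; the variable names are hypothetical. It also builds a Fortran-contiguous copy with np.asfortranarray to show that 'A' depends on the memory layout of the input.

```python
import numpy as np

# 3x2 array, C-contiguous by default: [[0, 1], [2, 3], [4, 5]]
a = np.arange(6).reshape(3, 2)

c_order = np.reshape(a, (2, 3), order='C')
f_order = np.reshape(a, (2, 3), order='F')
a_order = np.reshape(a, (2, 3), order='A')  # a is C-contiguous, so acts like 'C'

# Same values, but stored column-major in memory.
a_fortran = np.asfortranarray(a)
a_order_fortran = np.reshape(a_fortran, (2, 3), order='A')  # acts like 'F'

print(c_order)
print(f_order)
print(a_order)
print(a_order_fortran)
```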

1.4 - Normalizing rows

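Row normalization replaces each row $x$ of the matrix with $\frac{x}{\| x\|}$, so that every row has unit length.

The np.linalg.norm(x, ord=None, axis=None, keepdims=False) function

ord: which norm to compute (for example, ord=2 for the Euclidean norm)

axis: the axis along which the vector norms are computed (axis=1 gives one norm per row)

keepdims: whether the reduced axis is kept as a dimension of size 1, so that the result broadcasts against x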
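Row normalization with these parameters can be sketched as below. This is an illustrative sketch rather than the assignment's solution; the function name normalize_rows_sketch and the sample matrix are assumptions made here.

```python
import numpy as np


def normalize_rows_sketch(x):
    """Divide each row of x by its Euclidean norm (hypothetical name)."""
    # keepdims=True gives shape (m, 1), which broadcasts against x of shape (m, n).
    norms = np.linalg.norm(x, ord=2, axis=1, keepdims=True)
    return x / norms


x = np.array([[0.0, 3.0, 4.0],
              [1.0, 6.0, 4.0]])
normalized = normalize_rows_sketch(x)

# Every row of the result should have length 1.
row_lengths = np.linalg.norm(normalized, axis=1)
print(normalized)
print(row_lengths)
```

Without keepdims=True the norms would have shape (m,), which does not broadcast row-wise against an (m, n) matrix in general.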

1.5 - Broadcasting and the softmax function

$\text{for } x \in \mathbb{R}^{1\times n} \text{, } softmax(x) = softmax(\begin{bmatrix} x_1 & x_2 & ... & x_n \end{bmatrix}) = \begin{bmatrix} \frac{e^{x_1}}{\sum_{j}e^{x_j}} & \frac{e^{x_2}}{\sum_{j}e^{x_j}} & ... & \frac{e^{x_n}}{\sum_{j}e^{x_j}} \end{bmatrix}$
$\text{for a matrix } x \in \mathbb{R}^{m \times n} \text{, } x_{ij} \text{ is the element in the } i^{th} \text{ row and } j^{th} \text{ column of } x$
$softmax(x) = softmax\begin{bmatrix} x_{11} & x_{12} & x_{13} & \dots & x_{1n} \\ x_{21} & x_{22} & x_{23} & \dots & x_{2n} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ x_{m1} & x_{m2} & x_{m3} & \dots & x_{mn} \end{bmatrix} = \begin{bmatrix} \frac{e^{x_{11}}}{\sum_{j}e^{x_{1j}}} & \frac{e^{x_{12}}}{\sum_{j}e^{x_{1j}}} & \frac{e^{x_{13}}}{\sum_{j}e^{x_{1j}}} & \dots & \frac{e^{x_{1n}}}{\sum_{j}e^{x_{1j}}} \\ \frac{e^{x_{21}}}{\sum_{j}e^{x_{2j}}} & \frac{e^{x_{22}}}{\sum_{j}e^{x_{2j}}} & \frac{e^{x_{23}}}{\sum_{j}e^{x_{2j}}} & \dots & \frac{e^{x_{2n}}}{\sum_{j}e^{x_{2j}}} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \frac{e^{x_{m1}}}{\sum_{j}e^{x_{mj}}} & \frac{e^{x_{m2}}}{\sum_{j}e^{x_{mj}}} & \frac{e^{x_{m3}}}{\sum_{j}e^{x_{mj}}} & \dots & \frac{e^{x_{mn}}}{\sum_{j}e^{x_{mj}}} \end{bmatrix} = \begin{pmatrix} softmax\text{(first row of x)} \\ softmax\text{(second row of x)} \\ ... \\ softmax\text{(last row of x)} \\ \end{pmatrix}$
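The row-wise definition above maps naturally onto broadcasting. Below is a minimal sketch, not the assignment's solution; the function name softmax_rows and the sample input are hypothetical.

```python
import numpy as np


def softmax_rows(x):
    """Apply softmax independently to each row of a 2-D array (hypothetical name)."""
    x_exp = np.exp(x)                              # shape (m, n)
    x_sum = np.sum(x_exp, axis=1, keepdims=True)   # shape (m, 1), one sum per row
    return x_exp / x_sum                           # (m, n) / (m, 1) broadcasts per row


x = np.array([[1.0, 2.0, 3.0],
              [0.0, 0.0, 0.0]])
s = softmax_rows(x)

print(s)
print(s.sum(axis=1))
```

Each row of the output sums to 1. For very large inputs np.exp can overflow; a common remedy, not shown here, is to subtract each row's maximum before exponentiating, which leaves the result unchanged.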