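CS224n Assignment 1 Summary

It has been a while since I finished watching CS224n, so I am going back to work through the assignments and summarize them.

Official site: http://web.stanford.edu/class/cs224n/

The related course materials can be downloaded there.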

1 Softmax (10 points)
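This part consists of one proof and one implementation.

My implementation (fine for a single vector, but np.exp can overflow for large inputs, and it does not normalize a matrix row by row):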

np.exp(x) / np.sum(np.exp(x))

The reference answer on GitHub:
import numpy as np

def softmax(x):
    """
    Arguments:
    x -- A N dimensional vector or M x N dimensional numpy matrix.

    Return:
    x -- You are allowed to modify x in-place
    """
    orig_shape = x.shape

    if len(x.shape) > 1:
        # Matrix
        ### YOUR CODE HERE
        exp_minmax = lambda x: np.exp(x - np.max(x))
        denom = lambda x: 1.0 / np.sum(x)
        x = np.apply_along_axis(exp_minmax, 1, x)
        denominator = np.apply_along_axis(denom, 1, x)

        if len(denominator.shape) == 1:
            denominator = denominator.reshape((denominator.shape[0], 1))

        x = x * denominator
        ### END YOUR CODE
    else:
        # Vector
        ### YOUR CODE HERE
        x_max = np.max(x)
        x = x - x_max
        numerator = np.exp(x)
        denominator = 1.0 / np.sum(numerator)
        x = numerator.dot(denominator)
        ### END YOUR CODE

    assert x.shape == orig_shape
    return x

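After reading this post: https://www.aliyun.com/jiaocheng/524269.html, I roughly understood it.

apply_along_axis applies a function to 1-D slices of an array. The first argument is the function, the second is the axis (0 or 1) along which the slices are taken, and the third, x, is the input data. With axis 1 on a 2-D array, the function is called once per row.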
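For comparison, here is a minimal sketch of the same row-wise computation using keepdims broadcasting instead of apply_along_axis. The name softmax_rows is a hypothetical helper chosen for illustration; it is not part of the assignment code or the GitHub answer.

```python
import numpy as np

def softmax_rows(x):
    """Numerically stable softmax over the last axis (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    # Subtracting the per-row max does not change the result,
    # but it keeps np.exp from overflowing on large inputs.
    shifted = x - np.max(x, axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / np.sum(exps, axis=-1, keepdims=True)

# np.apply_along_axis(func, 1, arr) calls func once per row of a 2-D array.
row_sums = np.apply_along_axis(np.sum, 1, np.array([[1.0, 2.0], [3.0, 4.0]]))

# Both rows differ only by a constant shift, so their softmax values match.
probs = softmax_rows(np.array([[1001.0, 1002.0], [3.0, 4.0]]))
```

Because the reduction uses axis=-1 with keepdims=True, the same function handles both a vector and a matrix, so no separate branch is needed.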

2 Neural Network Basics (30 points)

part a: differentiate the sigmoid function

Original function: $\sigma(x) = \frac{1}{1 + e^{-x}}$

Derivative: $\sigma'(x) = \sigma(x)\,(1 - \sigma(x))$
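A quick way to convince yourself of the derivative is to compare it with a finite difference. The sketch below assumes the standard logistic sigmoid; the names sigmoid and sigmoid_grad are my own choices here, and sigmoid_grad is assumed to take the sigmoid output rather than the raw input.

```python
import numpy as np

def sigmoid(x):
    """Standard logistic sigmoid."""
    return 1.0 / (1.0 + np.exp(-x))

def sigmoid_grad(s):
    """Derivative expressed through the sigmoid output s = sigmoid(x)."""
    return s * (1.0 - s)

x = np.array([-2.0, 0.0, 3.0])
eps = 1e-6

# Central finite difference as an independent estimate of the derivative.
numeric = (sigmoid(x + eps) - sigmoid(x - eps)) / (2 * eps)
analytic = sigmoid_grad(sigmoid(x))
```

Writing the gradient in terms of the output is convenient in backpropagation, since the forward pass has already computed sigmoid(x).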

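part b: derive the gradient of the softmax cross-entropy loss with respect to theta.

softmax: $\hat{y}_i = \mathrm{softmax}(\theta)_i = \frac{e^{\theta_i}}{\sum_j e^{\theta_j}}$

Cross entropy: $CE(y, \hat{y}) = -\sum_i y_i \log \hat{y}_i$

Derivative: $\frac{\partial CE}{\partial \theta} = \hat{y} - y$, where $y$ is a one-hot label vector.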
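Since this one is worth deriving by hand, here is one way the steps can go. This is a sketch assuming $y$ is one-hot (so $\sum_i y_i = 1$), $\hat{y} = \mathrm{softmax}(\theta)$, and $\delta_{ik}$ is the Kronecker delta; the index names are my own.

```latex
\begin{align*}
\frac{\partial \hat{y}_i}{\partial \theta_k}
  &= \hat{y}_i \,(\delta_{ik} - \hat{y}_k) \\
\frac{\partial CE}{\partial \theta_k}
  &= -\sum_i y_i \,\frac{1}{\hat{y}_i}\,\frac{\partial \hat{y}_i}{\partial \theta_k}
   = -\sum_i y_i \,(\delta_{ik} - \hat{y}_k) \\
  &= -y_k + \hat{y}_k \sum_i y_i
   = \hat{y}_k - y_k
\end{align*}
```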

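Mark this part: it is a must-know interview question, and you should be able to derive it by hand. Here is a link:
https://blog.csdn.net/qian99/article/details/78046329
I also need to work through the derivation by hand myself.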
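After deriving the gradient by hand, it can be checked numerically. The sketch below assumes a one-hot label and compares the closed form softmax(theta) - y against central finite differences; softmax_vec and cross_entropy are hypothetical helper names, and the values of theta and y are made up for illustration.

```python
import numpy as np

def softmax_vec(theta):
    """Numerically stable softmax of a 1-D vector."""
    exps = np.exp(theta - np.max(theta))
    return exps / np.sum(exps)

def cross_entropy(y, theta):
    """CE(y, softmax(theta)) for a one-hot label y."""
    return -np.sum(y * np.log(softmax_vec(theta)))

theta = np.array([0.5, -1.0, 2.0])
y = np.array([0.0, 0.0, 1.0])  # one-hot label (illustrative)

# Closed-form gradient from the hand derivation.
analytic = softmax_vec(theta) - y

# Central finite differences, one coordinate at a time.
eps = 1e-6
numeric = np.zeros_like(theta)
for i in range(theta.size):
    step = np.zeros_like(theta)
    step[i] = eps
    numeric[i] = (cross_entropy(y, theta + step)
                  - cross_entropy(y, theta - step)) / (2 * eps)
```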
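part c: derivatives of a single-hidden-layer neural network with respect to its parameters.

This part is harder. The key idea is that the values computed in the forward pass are reused as intermediate values in the backward pass. I recommend reading the slides; Andrew Ng's videos also explain this part very well, deriving everything step by step, including the matrix form of the computation.
I remember that I first read a well-known blog post, Implementing a Neural Network from Scratch in Python – An Introduction.
This derivation is very detailed: https://blog.csdn.net/luoganttcc/article/details/63251234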
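To make the "forward values are reused in the backward pass" idea concrete, here is a minimal sketch. It assumes a network of the form h = sigmoid(x W1 + b1), y_hat = softmax(h W2 + b2) with a summed cross-entropy cost. The function name forward_backward, the variable names, and the layer sizes are all my own assumptions for illustration, not the assignment's starter code.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def softmax_rows(z):
    exps = np.exp(z - np.max(z, axis=1, keepdims=True))
    return exps / np.sum(exps, axis=1, keepdims=True)

def forward_backward(x, y, W1, b1, W2, b2):
    # Forward pass: h and y_hat are kept because the backward pass needs them.
    h = sigmoid(x @ W1 + b1)
    y_hat = softmax_rows(h @ W2 + b2)
    cost = -np.sum(y * np.log(y_hat))

    # Backward pass: each step reuses a value computed above.
    delta2 = y_hat - y                        # gradient w.r.t. softmax input
    grad_W2 = h.T @ delta2
    grad_b2 = np.sum(delta2, axis=0)
    delta1 = (delta2 @ W2.T) * h * (1.0 - h)  # sigmoid gradient via its output h
    grad_W1 = x.T @ delta1
    grad_b1 = np.sum(delta1, axis=0)
    return cost, (grad_W1, grad_b1, grad_W2, grad_b2)

# Small random problem (sizes are arbitrary, for illustration only).
rng = np.random.default_rng(0)
N, Dx, H, Dy = 4, 3, 5, 2
x = rng.normal(size=(N, Dx))
y = np.eye(Dy)[rng.integers(0, Dy, size=N)]  # one-hot labels
W1, b1 = rng.normal(size=(Dx, H)), rng.normal(size=H)
W2, b2 = rng.normal(size=(H, Dy)), rng.normal(size=Dy)

cost, grads = forward_backward(x, y, W1, b1, W2, b2)
```

Note that delta1 never recomputes the sigmoid: it only needs h from the forward pass, which is exactly the reuse described above.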
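part d: count the number of parameters in the neural network

With input dimension $D_x$, hidden size $H$, and output dimension $D_y$, the count is $(D_x + 1) \cdot H + (H + 1) \cdot D_y$.

The 1 here is the bias term.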
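As a small sanity check on the counting, the sketch below adds up the weights and biases of a one-hidden-layer network. The function name count_params and the layer sizes 10, 5, 10 are illustrative assumptions.

```python
def count_params(d_x, h, d_y):
    """Weights plus one bias per unit, for input -> hidden -> output."""
    hidden_layer = (d_x + 1) * h   # d_x weights + 1 bias per hidden unit
    output_layer = (h + 1) * d_y   # h weights + 1 bias per output unit
    return hidden_layer + output_layer

n_params = count_params(d_x=10, h=5, d_y=10)
```

For these sizes the same total comes from the array shapes directly: W1 is 10 x 5, b1 has 5 entries, W2 is 5 x 10, and b2 has 10 entries.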

References:

[email protected]:zzb5233/CS224n.git

https://blog.csdn.net/han_xiaoyang/article/details/51760923

https://blog.csdn.net/u012416045/article/details/78237060

https://www.aliyun.com/jiaocheng/524269.html
