Caffe: Activation Layers

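An activation layer applies an activation operation (in effect, a function transformation) to its input, element by element. It takes one blob as input from bottom and, after the computation, produces one blob as output on top. The computation does not change the size of the data: the input and the output have the same dimensions.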

Input: n * c * h * w

Output: n * c * h * w
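
To make the "element-wise, same size" property concrete, here is a minimal sketch in plain Python. It is not Caffe code: the nested-list blob and the helper names `apply_elementwise` and `shape` are assumptions made up for illustration.

```python
def apply_elementwise(blob, f):
    """Apply f to every scalar in a nested list, keeping the nesting intact."""
    if isinstance(blob, list):
        return [apply_elementwise(item, f) for item in blob]
    return f(blob)


def shape(blob):
    """Return the dimensions of a regularly nested list."""
    dims = []
    while isinstance(blob, list):
        dims.append(len(blob))
        blob = blob[0]
    return tuple(dims)


# A hypothetical blob with n=1, c=2, h=2, w=3.
bottom = [[[[-1.0, 0.0, 2.0], [3.0, -4.0, 5.0]],
           [[0.5, -0.5, 1.5], [-2.5, 2.5, 0.0]]]]

# Use the standard ReLU as the activation; any scalar function would do.
top = apply_elementwise(bottom, lambda x: max(x, 0.0))

print(shape(bottom), shape(top))
```

Whatever scalar function is passed in, the output keeps the n * c * h * w layout of the input.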
For readers unfamiliar with the exact form of each activation function, a reference figure:
[Figure: activation functions]
1) ReLU / Rectified-Linear and Leaky-ReLU / PReLU: these belong to one family and differ only slightly, so they are not listed one by one.
Example:

layer {
  name: "relu1"
  type: "ReLU"
  bottom: "pool1"
  top: "pool1"
}

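ReLU is currently the most widely used activation function, mainly because training converges faster with it while the final accuracy stays comparable.

The standard ReLU function is max(x, 0): when x > 0 the output is x; when x <= 0 the output is 0.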

f(x)=max(x,0)

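Layer type: ReLU

Optional parameters:

    negative_slope: defaults to 0. Modifies the standard ReLU function: if this value is set, negative inputs are no longer set to 0 but are multiplied by negative_slope instead.

The ReLU layer supports in-place computation: bottom and top can name the same blob (as in the example above, where both are "pool1"), so no extra memory is needed for the output.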
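
The effect of negative_slope can be sketched for a single value as follows. This is an illustrative Python function, not Caffe's implementation; the function name `relu` and its signature are assumptions.

```python
def relu(x, negative_slope=0.0):
    """ReLU with an optional slope for the negative part.

    With negative_slope == 0 this is the standard max(x, 0);
    with a small positive slope it behaves like a leaky ReLU.
    """
    return max(x, 0.0) + negative_slope * min(x, 0.0)


for value in (-2.0, 0.0, 3.0):
    print(value, relu(value), relu(value, negative_slope=0.1))
```

Positive inputs pass through unchanged in both cases; only the treatment of negative inputs differs.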
2) Sigmoid
Example:

layer {
  name: "conv_1"
  bottom: "pool_1"
  top: "conv_1"
  type: "Sigmoid"
}

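Sigmoid was used often in early deep learning but is now gradually being abandoned. The layer applies the sigmoid function to each input element. It is simple to configure and takes no extra parameters. Its main drawbacks: it easily causes vanishing gradients, and the exponential is expensive to compute.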
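
The vanishing-gradient drawback can be seen numerically. The sketch below is plain Python for illustration only (the names `sigmoid` and `sigmoid_grad` are assumptions, not Caffe functions). It uses the identity that the derivative of sigmoid is s * (1 - s), which peaks at 0.25 and shrinks quickly as |x| grows.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x)), written to avoid overflow."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def sigmoid_grad(x):
    """Derivative of the sigmoid: s * (1 - s)."""
    s = sigmoid(x)
    return s * (1.0 - s)


for value in (0.0, 2.0, 5.0, 10.0):
    print(value, sigmoid(value), sigmoid_grad(value))
```

Because every sigmoid layer multiplies the backpropagated gradient by a factor of at most 0.25, stacking many such layers tends to drive the gradient toward zero.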

3) TanH / Hyperbolic Tangent

Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "TanH"
}

Its properties are very similar to those of sigmoid.
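
One way to see the similarity is the identity tanh(x) = 2 * sigmoid(2x) - 1: tanh is a sigmoid rescaled to the range (-1, 1) and centered at zero. The check below is an illustrative Python sketch; the helper names are assumptions.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + exp(-x)); adequate for moderate |x|."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh_via_sigmoid(x):
    """tanh expressed through sigmoid: 2 * sigmoid(2x) - 1."""
    return 2.0 * sigmoid(2.0 * x) - 1.0


for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(value, math.tanh(value), tanh_via_sigmoid(value))
```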
4) Absolute Value
Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "AbsVal"
}

f(x) = Abs(x)
5) Power
Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "Power"
  power_param {
    power: 2
    scale: 1
    shift: 0
  }
}

Raises each input element to a power.

f(x)= (shift + scale * x) ^ power

Layer type: Power

Optional parameters:

    power: defaults to 1

    scale: defaults to 1

    shift: defaults to 0
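
A scalar sketch of the formula with the defaults listed above. The function name `power_layer` is an assumption for illustration, and the sketch does not handle a negative base combined with a fractional power.

```python
def power_layer(x, power=1.0, scale=1.0, shift=0.0):
    """Compute (shift + scale * x) ** power for a single value."""
    return (shift + scale * x) ** power


print(power_layer(3.0))                                    # defaults: identity
print(power_layer(3.0, power=2.0))                         # square
print(power_layer(2.0, power=2.0, scale=2.0, shift=1.0))   # (1 + 2*2) ** 2
```

With all defaults the layer is the identity; with power set to 1 it reduces to a linear transform scale * x + shift.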
6) Exp
Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "Exp"
  exp_param {
    base: 2
    scale: 1
    shift: 0
  }
}

Layer type: Exp
Optional parameters:

    base: defaults to -1, which selects the natural base e

    scale: defaults to 1

    shift: defaults to 0

f(x) = base ^ (shift + scale * x)
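
A scalar sketch of the Exp formula, treating the default base of -1 as "use the natural base e". The function name `exp_layer` and the error handling are assumptions for illustration, not Caffe's code.

```python
import math


def exp_layer(x, base=-1.0, scale=1.0, shift=0.0):
    """Compute base ** (shift + scale * x) for a single value.

    base == -1 is treated as a sentinel meaning the natural base e.
    """
    exponent = shift + scale * x
    if base == -1.0:
        return math.exp(exponent)
    if base <= 0:
        raise ValueError("base must be positive, or -1 for the natural base")
    return base ** exponent


print(exp_layer(1.0))             # natural base: e ** 1
print(exp_layer(3.0, base=2.0))   # 2 ** 3
```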
7) BNLL
Example:

layer {
  name: "layer"
  bottom: "in"
  top: "out"
  type: "BNLL"
}

No parameters.
Function: f(x) = log(1 + exp(x))
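
Evaluating log(1 + exp(x)) directly overflows for large x. A common workaround, sketched here in plain Python, rewrites the positive branch as x + log(1 + exp(-x)), which is mathematically the same value. The function name `bnll` is used only for illustration; this is a sketch under that assumption, not Caffe's source.

```python
import math


def bnll(x):
    """Numerically stable log(1 + exp(x)) for a single value."""
    if x > 0:
        # exp(-x) stays in (0, 1), so it cannot overflow.
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))


for value in (-1000.0, -1.0, 0.0, 1.0, 1000.0):
    print(value, bnll(value))
```

For large positive x the result approaches x, and for large negative x it approaches 0, so BNLL behaves like a smooth version of ReLU.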