Background reading
1. First, review what overfitting and underfitting are: see "What are overfitting and underfitting?" by Panda.X.
2. This post briefly introduces LRN, short for Local Response Normalization: a method for reducing overfitting, usually applied after an activation layer. It is rarely used today, having been largely replaced by methods such as Dropout. For more details, see "[Deep learning techniques] LRN — Local Response Normalization" by CrazyVertigo.
The official TensorFlow documentation for the tf.nn.lrn function cites the paper in which local response normalization was introduced:
See http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks
LRN formula
The formula from the paper above is

    b[x, y, i] = a[x, y, i] / (k + α · Σ_{j = max(0, i − n/2)}^{min(N−1, i + n/2)} a[x, y, j]²)^β

The variables are explained below, together with the corresponding parameters of tf.nn.lrn().
- a: the input Tensor in the formula, i.e. the input parameter: a 4-D Tensor of shape [batch, height, width, channel]; the data type may be half, bfloat16, or float32.
- k corresponds to the bias parameter: an offset that keeps the denominator away from 0. The default value is 1; it is usually a positive float and may be set to None (which uses the default). If its value is <= 0, NaN may appear in the output.
- α corresponds to the alpha parameter: a scale factor. The default value is 1; it is usually a positive float and may be set to None (default value).
- β corresponds to the beta parameter: the exponent. The default value is 0.5 (float type); may be set to None (default value).
- n/2 corresponds to the depth_radius parameter: the half-width of the normalization neighborhood along the channel axis. The default value is 5 (positive int type); may be set to None (default value).
- i: the channel index; the summation runs along the channel dimension of the input Tensor.
- N: the total number of channels, i.e. the size of the channel dimension.
For each position, the squared values within depth_radius channels before and after it (along the channel axis) are summed, i.e.

sqr_sum[a, b, c, d] = sum(input[a, b, c, d - depth_radius : d + depth_radius + 1] ** 2)
output = input / (bias + alpha * sqr_sum) ** beta
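To make the pseudocode above concrete, here is a minimal NumPy re-implementation (a sketch for illustration only; lrn_numpy is my own name, not a TensorFlow API):

```python
import numpy as np

def lrn_numpy(x, depth_radius=5, bias=1.0, alpha=1.0, beta=0.5):
    """Local Response Normalization over the last (channel) axis.

    x: 4-D array of shape [batch, height, width, channel].
    Mirrors the pseudocode: sqr_sum is the sum of squares over a
    window of 2 * depth_radius + 1 channels, clipped at the edges.
    """
    out = np.empty_like(x)
    n_channels = x.shape[-1]
    for d in range(n_channels):
        lo = max(0, d - depth_radius)               # window start, clipped at 0
        hi = min(n_channels, d + depth_radius + 1)  # window end, clipped at N
        sqr_sum = np.sum(x[..., lo:hi] ** 2, axis=-1)
        out[..., d] = x[..., d] / (bias + alpha * sqr_sum) ** beta
    return out

# All-ones input with 3 channels and depth_radius=1: the middle channel
# sums squares over 3 channels (sqr_sum = 3), the edge channels over
# only 2 (sqr_sum = 2), because the window is clipped at the edges.
x = np.ones((1, 1, 1, 3), dtype=np.float32)
y = lrn_numpy(x, depth_radius=1)
print(y[0, 0, 0])  # [1/sqrt(3), 0.5, 1/sqrt(3)] ≈ [0.5774, 0.5, 0.5774]
```

Note how the window is clipped at the channel boundaries rather than padded, which matches the max/min bounds in the summation of the paper's formula.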
def lrn(input,
depth_radius=5,
bias=1,
alpha=1,
beta=0.5,
name=None
)
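A minimal usage sketch with the default parameters (assuming TensorFlow 2.x with eager execution): for an all-ones input with a single channel, sqr_sum is 1 everywhere, so each output value is 1 / (1 + 1·1)^0.5 = 1/√2 ≈ 0.7071.

```python
import tensorflow as tf

# A tiny 4-D input: [batch=1, height=2, width=2, channel=1], all ones.
x = tf.ones([1, 2, 2, 1])

# With the defaults (depth_radius=5, bias=1, alpha=1, beta=0.5) the
# window covers the single channel, so sqr_sum = 1 everywhere and
# every output element is 1 / (1 + 1 * 1) ** 0.5 = 1 / sqrt(2).
y = tf.nn.lrn(x)

print(y.shape)               # (1, 2, 2, 1): LRN preserves the input shape
print(float(y[0, 0, 0, 0]))  # ≈ 0.7071
```

Because LRN only normalizes across channels, the output always has the same shape as the input, which is why it can be dropped into a network after any activation layer without further changes.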