tf.train.MomentumOptimizer

Optimizer that implements the Momentum algorithm. When use_nesterov = False, it computes:

accumulation = momentum * accumulation + gradient
variable -= learning_rate * accumulation

Note that in the dense version of this algorithm, the accumulator is updated and applied regardless of the gradient's value, whereas in the sparse version (when the gradient is an IndexedSlices, typically because of tf.gather or an embedding lookup) only the slices of the variable that were used in the forward pass have their variable entries and corresponding accumulator entries updated.
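
To make the update rule concrete, here is a minimal plain-Python sketch of two optimization steps; the names mirror the pseudocode above, and the numeric values are arbitrary examples rather than anything taken from the original page.

# Plain-Python illustration of the dense momentum update (use_nesterov=False).
# The names mirror the pseudocode above; the numbers are arbitrary examples.
learning_rate = 0.1
momentum = 0.9

variable = 1.0        # current parameter value
accumulation = 0.0    # momentum accumulator, starts at zero

for gradient in [0.5, 0.5]:                      # two example gradient values
    accumulation = momentum * accumulation + gradient
    variable -= learning_rate * accumulation

print(variable, accumulation)                    # roughly 0.855 and 0.95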

 

__init__

__init__(
    learning_rate,
    momentum,
    use_locking=False,
    name='Momentum',
    use_nesterov=False
)

Construct a new momentum optimizer.

 

Parameters:

learning_rate: A Tensor or a floating point value. The learning rate.

momentum: A Tensor or a floating point value. The momentum.

use_locking: If True, use locks for update operations.

name: Optional name prefix for the operations created when applying gradients. Defaults to "Momentum".

use_nesterov: If True, use Nesterov momentum. See Sutskever et al., 2013. This implementation always computes gradients at the value of the variable(s) passed to the optimizer. Using Nesterov momentum makes the variable(s) track the values called theta_t + mu*v_t in the paper. This implementation is an approximation of the original formula, valid for high values of momentum. It computes the "adjusted gradient" in NAG by assuming that the new gradient will be estimated by the current average gradient plus the product of momentum and the change in the average gradient.
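
The page does not spell out the update used when use_nesterov = True; as a rough sketch (an assumption based on the approximation described above, not quoted from this page), it can be written in the same style as the dense rule:

# assumed Nesterov-style update (not from the original page)
accumulation = momentum * accumulation + gradient
variable -= learning_rate * (gradient + momentum * accumulation)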

Eager Compatibility:

When eager execution is enabled, learning_rate and momentum can each be a callable that takes no arguments and returns the actual value to use. This can be useful for changing these values across different invocations of optimizer functions.
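
As a minimal sketch (assuming TensorFlow 1.x with eager execution enabled), the hyperparameters can be supplied as zero-argument callables so their values can be changed between optimizer calls; the hparams dictionary below is just an illustrative way to hold mutable values.

import tensorflow as tf

tf.enable_eager_execution()  # TF 1.x style eager mode

hparams = {"lr": 0.1, "momentum": 0.9}  # hypothetical mutable container

# Each callable is re-evaluated when the optimizer uses the value,
# so editing hparams changes the values used by later steps.
optimizer = tf.train.MomentumOptimizer(
    learning_rate=lambda: hparams["lr"],
    momentum=lambda: hparams["momentum"])

hparams["lr"] = 0.01  # later optimizer calls will use the new learning rate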

 

Methods:

 

apply_gradients

apply_gradients(
    grads_and_vars,
    global_step=None,
    name=None
)

Apply gradients to variables. This is the second part of minimize(). It returns an Operation that applies the gradients.

 

Parameters:

  • grads_and_vars: List of (gradient, variable) pairs as returned by compute_gradients().
  • global_step: Optional Variable to increment by one after the variables have been updated.
  • name: Optional name for the returned operation. Defaults to the name passed to the Optimizer constructor.

 

Returns:

  • An Operation that applies the specified gradients. If global_step was not None, that operation also increments global_step.

 

Raises:

  • TypeError: If grads_and_vars is malformed.
  • ValueError: If none of the variables have gradients.
  • RuntimeError: If you should use _distributed_apply() instead.
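
As a minimal graph-mode sketch (assuming TF 1.x; the scalar model below is hypothetical and stands in for whatever is being trained), the typical pairing of compute_gradients() and apply_gradients() looks like:

import tensorflow as tf

# Hypothetical scalar model: fit w so that w * 3.0 is close to 6.0.
w = tf.Variable(0.0)
loss = tf.square(w * 3.0 - 6.0)

global_step = tf.train.get_or_create_global_step()
optimizer = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)

grads_and_vars = optimizer.compute_gradients(loss)       # first part of minimize()
train_op = optimizer.apply_gradients(grads_and_vars,
                                     global_step=global_step)  # second part

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_op)  # also increments global_step
    print(sess.run([w, global_step]))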

 


compute_gradients 

compute_gradients(
    loss,
    var_list=None,
    gate_gradients=GATE_OP,
    aggregation_method=None,
    colocate_gradients_with_ops=False,
    grad_loss=None
)

Compute the gradients of loss for the variables in var_list. This is the first part of minimize(). It returns a list of (gradient, variable) pairs, where "gradient" is the gradient for "variable". Note that "gradient" can be a Tensor, an IndexedSlices, or None if there is no gradient for the given variable.

 

Parameters:

  • loss: A Tensor containing the value to minimize, or a callable taking no arguments which returns the value to minimize. When eager execution is enabled it must be a callable.
  • var_list: Optional list or tuple of tf.Variable objects to update to minimize loss. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE_VARIABLES.
  • gate_gradients: How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
  • aggregation_method: Specifies the method used to combine gradient terms. Valid values are defined in the class AggregationMethod.
  • colocate_gradients_with_ops: If True, try colocating gradients with the corresponding op.
  • grad_loss: Optional. A Tensor holding the gradient computed for loss.

 

Returns:

  • A list of (gradient, variable) pairs. A variable is always present, but its gradient can be None.

 

Raises:

  • TypeError: If var_list contains anything else than Variable objects.
  • ValueError: If some arguments are invalid.
  • RuntimeError: If called with eager execution enabled and loss is not callable.

 

Eager Compatibility:

When eager execution is enabled, gate_gradients, aggregation_method, and colocate_gradients_with_ops are ignored.
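
As a sketch (assuming TF 1.x graph mode and the same hypothetical scalar model as above), the pairs returned by compute_gradients() can be inspected or transformed, for example clipped, before being handed to apply_gradients(); entries whose gradient is None are skipped:

import tensorflow as tf

w = tf.Variable(0.0)
loss = tf.square(w * 3.0 - 6.0)

optimizer = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)
grads_and_vars = optimizer.compute_gradients(loss)

# Clip each gradient before applying it; None means the variable has no
# gradient with respect to this loss and is left untouched.
clipped = [(tf.clip_by_norm(g, 5.0), v)
           for g, v in grads_and_vars if g is not None]
train_op = optimizer.apply_gradients(clipped)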

 

get_name

get_name()

get_slot

get_slot(
    var,
    name
)

Return a slot named name created for var by the Optimizer. Some Optimizer subclasses use additional variables; for example, Momentum and Adagrad use variables to accumulate updates. This method gives access to these Variable objects if for some reason you need them. Use get_slot_names() to get the list of slot names created by the Optimizer.

 

Parameters:

  • var: A variable passed to minimize() or apply_gradients().
  • name: A string.

 

Returns:

  • The Variable for the slot if it was created, None otherwise.

 

get_slot_names

get_slot_names()

Return a list of the names of the slots created by the Optimizer.

 

Returns:

  • List of strings.
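
As an illustrative sketch (reusing the hypothetical graph-mode model from above; the slot name "momentum" is the one MomentumOptimizer is expected to create, but treat it as an assumption and check get_slot_names() first):

import tensorflow as tf

w = tf.Variable(0.0)
loss = tf.square(w * 3.0 - 6.0)

optimizer = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)
train_op = optimizer.minimize(loss)

print(optimizer.get_slot_names())          # e.g. ['momentum'] (assumed slot name)
accum = optimizer.get_slot(w, "momentum")  # the accumulator variable for w
print(accum)                               # None if no such slot was created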

 

minimize

minimize(
    loss,
    global_step=None,
    var_list=None,
    gate_gradients=GATE_OP,
    aggregation_method=None,
    colocate_gradients_with_ops=False,
    name=None,
    grad_loss=None
)

Add operations that minimize loss by updating var_list. This method simply combines calls to compute_gradients() and apply_gradients(). If you want to process the gradients before applying them, call compute_gradients() and apply_gradients() explicitly instead of using this function.

 

Parameters:

  • loss: A Tensor containing the value to minimize.
  • global_step: Optional Variable to increment by one after the variables have been updated.
  • var_list: Optional list or tuple of Variable objects to update to minimize loss. Defaults to the list of variables collected in the graph under the key GraphKeys.TRAINABLE_VARIABLES.
  • gate_gradients: How to gate the computation of gradients. Can be GATE_NONE, GATE_OP, or GATE_GRAPH.
  • aggregation_method: Specifies the method used to combine gradient terms. Valid values are defined in the class AggregationMethod.
  • colocate_gradients_with_ops: If True, try colocating gradients with the corresponding op.
  • name: Optional name for the returned operation.
  • grad_loss: Optional. A Tensor holding the gradient computed for loss.

 

Returns:

  • An Operation that updates the variables in var_list. If global_step was not None, that operation also increments global_step.

 

Raises:

  • ValueError: If some of the variables are not Variable objects.

Eager Compatibility:

When eager execution is enabled, loss should be a Python function that takes no arguments and computes the value to minimize. Minimization (and gradient computation) is done with respect to the elements of var_list if it is not None, otherwise with respect to any trainable variables created during the execution of the loss function. gate_gradients, aggregation_method, colocate_gradients_with_ops, and grad_loss are ignored when eager execution is enabled.
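
As a minimal graph-mode sketch (assuming TF 1.x and the same hypothetical scalar model used earlier), minimize() collapses the two-step compute_gradients()/apply_gradients() pattern into one call:

import tensorflow as tf

w = tf.Variable(0.0)
loss = tf.square(w * 3.0 - 6.0)

global_step = tf.train.get_or_create_global_step()
optimizer = tf.train.MomentumOptimizer(learning_rate=0.01, momentum=0.9)
train_op = optimizer.minimize(loss, global_step=global_step)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(100):
        sess.run(train_op)
    print(sess.run(w))  # should approach 2.0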

 

variables

variables()

A list of variables that encode the current state of the Optimizer. Includes slot variables and additional global variables created by the optimizer in the current default graph.

Returns:

  • List of variables.

 

Original link: https://tensorflow.google.cn/versions/r1.14/api_docs/python/tf/train/MomentumOptimizer
