In this section, the author summarizes some intuitive interpretations of the proximal operator in terms of the Moreau-Yosida regularization.
Infimal convolution:
\[ (f \,\Box\, g)(v) = \inf_x \big( f(x) + g(v - x) \big) \]
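As a quick numerical sketch of this definition, we can evaluate the infimum on a grid. The choices \(f(x) = |x|\) and \(g(x) = \tfrac{1}{2}x^2\) below are my own illustration (not from the text); their infimal convolution is known to be the Huber function.

```python
import numpy as np

# Numerically evaluate (f □ g)(v) = inf_x ( f(x) + g(v - x) ) on a grid,
# for the illustrative choices f(x) = |x| and g(x) = (1/2) x^2.
xs = np.linspace(-5.0, 5.0, 2001)

f = lambda x: np.abs(x)
g = lambda x: 0.5 * x**2

def inf_conv(v):
    return np.min(f(xs) + g(v - xs))

# (|.| □ (1/2)(.)^2)(v) is the Huber function:
# v^2/2 for |v| <= 1, and |v| - 1/2 otherwise.
for v in [0.3, 2.0, -3.5]:
    huber = 0.5 * v**2 if abs(v) <= 1 else abs(v) - 0.5
    print(v, inf_conv(v), huber)
```

The grid minimum matches the closed-form Huber value at each test point, which is a cheap sanity check on the definition.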
The Moreau envelope, or Moreau-Yosida regularization, is:
\[ M_{\lambda f} = f \,\Box\, \frac{1}{2\lambda} \|\cdot\|_2^2 \]
In fact, we already mentioned this in the previous section. As there, it can be shown that:
\[ M_f(x) = f(\mathbf{prox}_f(x)) + \frac{1}{2} \|x - \mathbf{prox}_f(x)\|_2^2 \]
and:
\[ \nabla M_{\lambda f}(x) = \frac{1}{\lambda} \left( x - \mathbf{prox}_{\lambda f}(x) \right) \]
Although, I do not know how this is proved when \(f\) is non-differentiable.
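The two formulas above can at least be checked numerically for a concrete non-smooth case. The example below uses \(f(x) = |x|\), whose proximal operator is soft-thresholding; that choice, and the convention \(M_{\lambda f}(x) = \min_u f(u) + \frac{1}{2\lambda}\|u - x\|^2\), are my assumptions for the sketch.

```python
import numpy as np

lam = 0.7

def prox_abs(x, lam):
    # prox of lam*|.| : soft-thresholding
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def envelope(x, lam):
    # M_{lam f}(x) = f(prox(x)) + (1/(2 lam)) |x - prox(x)|^2, with f = |.|
    p = prox_abs(x, lam)
    return np.abs(p) + (x - p)**2 / (2 * lam)

def grad_envelope(x, lam):
    # the gradient formula: (1/lam) (x - prox_{lam f}(x))
    return (x - prox_abs(x, lam)) / lam

# compare the gradient formula against a central finite difference
x, h = 1.3, 1e-6
fd = (envelope(x + h, lam) - envelope(x - h, lam)) / (2 * h)
print(fd, grad_envelope(x, lam))
```

Note that the envelope of \(|\cdot|\) is the (smooth) Huber function, so the finite difference is well defined even though \(f\) itself is not differentiable at 0.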
Then there is the same result as in the previous section. It boils down to this: the proximal operator is itself a gradient mapping, namely the gradient of the Moreau envelope of the conjugate \(f^*\), i.e.:
\[ \mathbf{prox}_f(x) = \nabla M_{f^*}(x) \]
This can be obtained via the Moreau decomposition.
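A small numerical illustration of the Moreau decomposition \(x = \mathbf{prox}_f(x) + \mathbf{prox}_{f^*}(x)\): take \(f = |\cdot|\), whose conjugate \(f^*\) is the indicator of \([-1, 1]\), so \(\mathbf{prox}_{f^*}\) is the projection onto that interval. The particular \(f\) is my illustrative choice.

```python
import numpy as np

def prox_f(x):
    # prox of f = |.| with lambda = 1: soft-thresholding at 1
    return np.sign(x) * np.maximum(np.abs(x) - 1.0, 0.0)

def prox_fstar(x):
    # prox of the indicator of [-1, 1]: projection (clipping)
    return np.clip(x, -1.0, 1.0)

x = np.array([-2.5, -0.4, 0.0, 1.2, 3.0])
# Moreau decomposition: the two proximal operators sum back to x
print(prox_f(x) + prox_fstar(x))
# With lambda = 1 the gradient formula gives
# grad M_{f*}(x) = x - prox_{f*}(x), which equals prox_f(x):
print(x - prox_fstar(x))
```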
Connection with the subgradient: \(\mathbf{prox}_{\lambda f} = (I + \lambda \partial f)^{-1}\)
One issue with the formula above is whether the mapping is single-valued, since \(\partial f\) is in general set-valued (the paper speaks more appropriately in terms of relations); the paper seems to leave this point aside, but it does not affect the proof.
Modified gradient step
As mentioned in the first section, and in the earlier discussion of the Moreau envelope representation:
\[ \mathbf{prox}_{\lambda f} (x) = x - \lambda \nabla M_{\lambda f}(x) \]
In fact, \(\mathbf{prox}_{\lambda f}\) can be viewed as one step on a path minimizing the Moreau envelope, with step size \(\lambda\). There are some similar interpretations as well.
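Iterating this step (the proximal point method) therefore drives the Moreau envelope, and hence \(f\), to its minimum. A minimal sketch, with the non-smooth test function \(f(x) = |x - 2|\) as my own illustrative choice:

```python
import numpy as np

# Proximal point method: repeatedly apply prox_{lam f}, i.e. take gradient
# steps of size lam on the Moreau envelope of f(x) = |x - 2|.
lam = 0.3

def prox(x, lam):
    # prox of lam*|. - 2|: soft-thresholding shifted to 2
    return 2.0 + np.sign(x - 2.0) * max(abs(x - 2.0) - lam, 0.0)

x = -4.0
for _ in range(30):
    x = prox(x, lam)
print(x)  # converges to 2, the minimizer of |x - 2|
```

Each iteration shrinks the distance to the minimizer by exactly \(\lambda\) until it reaches 0, which matches the step-size interpretation above.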
Suppose \(f\) is twice differentiable and \(\nabla^2 f(x) \succ 0\) (positive definite). Then, as \(\lambda \rightarrow 0\):
\[ \mathbf{prox}_{\lambda f} (x) = (I + \lambda \nabla f)^{-1} (x) = x - \lambda \nabla f(x)+o(\lambda) \]
I believe the proof of this uses some calculus of variations:
\[ \delta(I+\lambda \nabla f)^{-1}\Big|_{\lambda=0}=-\frac{\nabla f}{(I+\lambda \nabla f)^{2}}\Big|_{\lambda =0}= -\nabla f \]
So the above is a first-order characterization.
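The expansion \(\mathbf{prox}_{\lambda f}(x) = x - \lambda \nabla f(x) + o(\lambda)\) can be checked numerically. The test function \(f(u) = u^4\) is my own choice; in one dimension the prox solves the optimality condition \(u + \lambda f'(u) = x\), which can be found by bisection since the left side is increasing.

```python
import numpy as np

# Check that (x - prox_{lam f}(x)) / lam -> f'(x) as lam -> 0, for f(u) = u^4.
def prox(x, lam):
    # solve u + lam * 4 u^3 = x by bisection (the LHS is strictly increasing)
    lo, hi = -abs(x) - 1.0, abs(x) + 1.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if mid + 4 * lam * mid**3 < x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x = 1.5
print("f'(x) =", 4 * x**3)   # 13.5
for lam in [1e-2, 1e-3, 1e-4]:
    print(lam, (x - prox(x, lam)) / lam)
```

The ratio approaches \(f'(1.5) = 13.5\) as \(\lambda\) shrinks, consistent with the first-order expansion.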
Let us first look at the first-order Taylor approximation \(\hat{f}_v^{(1)}\) of \(f\). Taking its proximal operator, the gradient step is in fact \(\mathbf{prox}_{\lambda \hat{f}_v^{(1)}}\); that is, \(\mathbf{prox}_{\lambda \hat{f}_v^{(1)}}(v) = v - \lambda \nabla f(v)\).
Correspondingly, for the second-order approximation, the resulting update is the Levenberg-Marquardt modification of Newton's method, although I am not familiar with that method.
The proof of the above is easy; it can be derived directly from the definition.
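Indeed, since the second-order model plus the proximal penalty is a quadratic, minimizing it directly recovers the damped-Newton (Levenberg-Marquardt) step \(v - (\nabla^2 f(v) + \frac{1}{\lambda} I)^{-1} \nabla f(v)\). The particular smooth \(f\) below is my own illustrative choice.

```python
import numpy as np

# prox of the second-order Taylor model of f at v gives the
# Levenberg-Marquardt step v - (H + (1/lam) I)^{-1} g.
lam = 0.5
v = np.array([1.0, -2.0])

# illustrative f(x) = x0^4 + x0*x1 + x1^2; g, H = gradient and Hessian at v
g = np.array([4 * v[0]**3 + v[1], v[0] + 2 * v[1]])
H = np.array([[12 * v[0]**2, 1.0], [1.0, 2.0]])

x_lm = v - np.linalg.solve(H + np.eye(2) / lam, g)

def model_pen(x):
    # quadratic model (up to the constant f(v)) plus the proximal penalty
    d = x - v
    return g @ d + 0.5 * d @ H @ d + (d @ d) / (2 * lam)

# x_lm should minimize the penalized model: spot-check against perturbations
rng = np.random.default_rng(0)
print(all(model_pen(x_lm) <= model_pen(x_lm + 0.1 * rng.standard_normal(2))
          for _ in range(100)))  # prints True
```

Because \(H + \frac{1}{\lambda} I\) is positive definite here, the penalized model has a unique global minimizer, which is exactly the closed-form step.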
Trust region problems
The proximal operator can also be explained via trust region problems. The trust region problem
\[ \text{minimize} \; f(x) \quad \text{subject to} \; \|x - v\|_2 \le \rho \]
is compared with the usual proximal problem
\[ \text{minimize} \; f(x) + \frac{1}{2\lambda} \|x - v\|_2^2, \]
which turns the constraint into a penalty term. The paper also points out that, by choosing the parameters \(\rho\) and \(\lambda\) appropriately, each problem can attain the other's solution.
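A one-dimensional sketch of that correspondence, with the quadratic \(f(x) = (x-3)^2\) and \(v = 0\) as my own illustrative choice: solve the penalized problem in closed form, then set \(\rho\) equal to the distance that solution moved, and observe that the trust region problem has the same answer.

```python
# Trust region vs. proximal penalty in 1-D, with f(x) = (x - 3)^2 and v = 0.
# Penalized: minimize (x - 3)^2 + (1/(2 lam)) x^2  ->  x = 6 lam / (2 lam + 1)
# Trust region: minimize (x - 3)^2 s.t. |x| <= rho  ->  x = min(3, rho)
lam = 0.25
x_pen = 6 * lam / (2 * lam + 1)

# match the trust region radius to the penalized solution's displacement
rho = abs(x_pen)
x_tr = min(3.0, rho)
print(x_pen, x_tr)  # identical: the two problems share a solution
```

The same matching works in the other direction: given \(\rho\), one can pick the \(\lambda\) whose penalized solution lands on the trust region boundary.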