SCA (Successive Convex Approximation) Study Notes

Reference 1: https://www.zhihu.com/question/424944253

successive: the optimization is completed through repeated iterations, one approximation after another.

convex: in each iteration, the non-convex function is replaced by a convex approximate function.

Reference 2: https://zhuanlan.zhihu.com/p/164539842

The first three requirements of the two methods, SCA and MM, are the same, namely:

  1. The approximate function is continuous.
  2. The approximate function and the original function take the same value at the approximation point.
  3. The approximate function and the original function have the same first-order (directional) derivative at the approximation point.

The fourth requirement is where they differ:
SCA requires the approximate function to be convex, while MM requires the approximate function to be an upper bound of the original function around the approximation point (i.e., to lie "above" the original function).
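In symbols, writing $f$ for the original objective and $\tilde f(\cdot \mid x^t)$ for the approximate function built at the current point $x^t$, the four requirements can be sketched as follows (common surrogate notation; the exact technical conditions vary slightly across the referenced papers):

$$
\begin{aligned}
&\text{(1)}\quad \tilde f(\cdot \mid x^t)\ \text{is continuous;}\\
&\text{(2)}\quad \tilde f(x^t \mid x^t) = f(x^t);\\
&\text{(3)}\quad \tilde f'(x^t; d \mid x^t) = f'(x^t; d)\ \text{for every feasible direction } d;\\
&\text{(4)}\quad \text{SCA: } \tilde f(\cdot \mid x^t)\ \text{is convex;} \qquad \text{MM: } \tilde f(x \mid x^t) \ge f(x)\ \text{for all } x.
\end{aligned}
$$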

Part II A. SCA algorithm

SCA emerged to address the difficulty, in practical applications, of finding an approximate function that satisfies the MM conditions (mainly the fourth one: it is hard to find a surrogate that is both an upper bound and easy to minimize). However, by the no-free-lunch principle, the effort saved in constructing the surrogate must be repaid when using it. If the approximate function is not an upper bound, then jumping directly to its minimizer can make the step too large and overshoot. As shown in Figure 1, the minimizer $y^{t+1}$ of the approximate function "jumps past" the local minimum of the objective function. The step size therefore needs to be damped. The adjustment is very simple: take a moving average of the current iterate and the surrogate minimizer, with the following update
$$x^{t+1} = x^t + \gamma^t \left(y^{t+1} - x^t\right), \qquad \gamma^t \in (0, 1].$$
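To make the damped iteration concrete, below is a minimal Python sketch of SCA on a one-dimensional non-convex toy problem. The surrogate is a partial linearization of $f$ plus a proximal quadratic term: it is convex and matches $f$'s value and slope at $x^t$, but it is not an upper bound. The toy objective, the parameter values, and the diminishing rule for $\gamma^t$ are all illustrative assumptions, not taken from the referenced papers.

```python
# Minimal SCA sketch on a 1-D non-convex toy problem (illustrative only).
# The surrogate at x_t linearizes f and adds a proximal quadratic term:
# convex and first-order-matching at x_t, but NOT an upper bound on f,
# hence the damped (moving-average) update below.

def f(x):
    """Toy non-convex objective."""
    return x**4 - 3.0 * x**2 + x

def grad_f(x):
    """Derivative of the toy objective."""
    return 4.0 * x**3 - 6.0 * x + 1.0

def sca(x0, tau=20.0, gamma0=0.9, eps=0.1, iters=200):
    x, gamma = x0, gamma0
    for _ in range(iters):
        # Closed-form minimizer of the convex surrogate
        #   f(x) + f'(x) * (y - x) + (tau / 2) * (y - x)**2
        y = x - grad_f(x) / tau
        # Moving-average (damped) update: x_{t+1} = x_t + gamma_t (y - x_t)
        x = x + gamma * (y - x)
        # One possible diminishing step-size rule (illustrative)
        gamma = gamma * (1.0 - eps * gamma)
    return x

x_star = sca(x0=1.5)
print(f"stationary point ~ {x_star:.4f}, f(x*) = {f(x_star):.4f}")
```

Starting from $x^0 = 1.5$, the iterates settle at the nearby local minimum. Because the surrogate is not an upper bound, removing the damping (fixing $\gamma^t \equiv 1$ with a small $\tau$) can make the iterates overshoot, exactly the failure mode described above.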
Block Successive Convex Approximation

In the algorithm listing, the red box marks the step that searches for the step-size parameter $\alpha$.
Compared with MM, SCA adds a step-size search and a moving-average update.
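The post does not reproduce the search rule itself; one common diminishing step-size choice in the Scutari-Facchinei line of work cited at the end of this post (e.g., the parallel selective algorithms paper) is

$$\gamma^{t+1} = \gamma^t \left(1 - \varepsilon\, \gamma^t\right), \qquad \gamma^0 \in (0, 1],\ \varepsilon \in (0, 1),$$

which decays to zero while remaining non-summable ($\sum_t \gamma^t = \infty$), the standard requirement for convergence.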

Convergence of SCA Algorithm

In brief, under the surrogate conditions listed above and a suitably diminishing step size, the standard SCA analyses show that every limit point of the iterate sequence is a stationary point of the original problem; see the article [2] and Reference 4 below for formal statements and proofs.
The article [2]: "Regularized Robust Estimation of Mean and Covariance Matrix Under Heavy-Tailed Distributions," https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=7069228

Reference 4: "Stochastic Successive Convex Approximation for Non-Convex Constrained Stochastic Optimization," https://arxiv.org/pdf/1801.08266.pdf

Thesis: Meisam Razaviyayn, "Successive Convex Approximation: Analysis and Applications," PhD dissertation, University of Minnesota.

https://conservancy.umn.edu/bitstream/handle/11299/163884/Razaviyayn_umn_0130E_14988.pdf?sequence=1

Block coordinate descent (BCD) is widely used for minimizing a continuous function f of several block variables. At each iteration of this method, a single block of variables is optimized while the remaining variables are held fixed. To ensure the convergence of the BCD method, the subproblem in each block variable needs to be solved to its unique global optimal solution. Unfortunately, this requirement is often too restrictive for many practical scenarios. In this thesis, an alternative inexact BCD approach is first studied, which updates a block of variables by successively minimizing a sequence of approximations of f that are either locally tight upper bounds of f or strictly convex local approximations of f. Different block selection rules are considered, such as cyclic (Gauss-Seidel), greedy (Gauss-Southwell), randomized, or even multiple simultaneous (parallel) blocks. The convergence conditions and iteration complexity bounds of such methods are characterized, especially for the cases where the objective function is non-differentiable or non-convex. In addition, using the ideas of the alternating direction method of multipliers (ADMM), the linearly constrained case is briefly studied. Beyond the deterministic case, the problem of minimizing the expected value of a cost function parameterized by a random variable is also investigated. Based on the idea of successive convex approximation, an inexact sample average approximation (SAA) method is proposed and its convergence is studied. The analysis unifies and generalizes the existing convergence results of many classical algorithms, such as the BCD method, the difference-of-convex-functions (DC) method, the expectation maximization (EM) algorithm, and the classical stochastic (sub)gradient (SG) method, all of which are popular for large-scale optimization problems.
In the second part of the thesis, the proposed framework is applied to two practical problems: interference management in wireless networks and dictionary learning for sparse representation. First, the computational complexity of these problems is studied. Then, using the successive convex approximation framework, new algorithms for solving these practical problems are proposed and evaluated through extensive numerical experiments on real data.
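As a concrete illustration of the classical requirement (each block subproblem solved exactly to its global optimum), here is a small Python sketch of cyclic (Gauss-Seidel) BCD on a two-block strongly convex quadratic. The objective and the block split are made-up illustrations, not an example from the thesis.

```python
# Cyclic (Gauss-Seidel) block coordinate descent on a 2-block toy quadratic:
#   f(x, y) = (x - 1)^2 + (y + 2)^2 + x*y
# Each block subproblem is solved exactly (here in closed form), which is
# the strong requirement that the thesis relaxes via surrogate minimization.

def bcd(x=0.0, y=0.0, iters=50):
    for _ in range(iters):
        x = 1.0 - y / 2.0   # argmin over x with y fixed: 2(x - 1) + y = 0
        y = -2.0 - x / 2.0  # argmin over y with x fixed: 2(y + 2) + x = 0
    return x, y

x, y = bcd()
print(f"x = {x:.4f}, y = {y:.4f}")  # converges to (8/3, -10/3)
```

Each update is a closed-form exact block minimization; the inexact BCD studied in the thesis replaces these exact minimizations with minimization of a surrogate function, which is what makes the framework usable when f is non-convex or the block subproblems have no closed form.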


https://www.cnblogs.com/kailugaji/p/11731217.html

G. Scutari, F. Facchinei, P. Song, D. P. Palomar, and J.-S. Pang, "Decomposition by partial linearization: Parallel optimization of multiuser systems," IEEE Trans. on Signal Processing, vol. 62, no. 3, pp. 641–656, Feb. 2014.

F. Facchinei, G. Scutari, and S. Sagratella, “Parallel selective algorithms for nonconvex big data optimization,” IEEE Trans. on Signal Processing, vol. 63, no. 7, pp. 1874–1889, April 2015.


Source: https://blog.csdn.net/qq_45542321/article/details/128699170