Study notes: Approximating Wasserstein distances with PyTorch

https://github.com/dfdazac/wassdistance/tree/master

Prerequisite knowledge

See the study notes on Computational Optimal Transport.
In particular, see the part on coordinate ascent for the entropic dual.

$$
\mathrm{L}_{\mathbf{C}}^{\varepsilon}(\mathbf{a}, \mathbf{b}) \stackrel{\text{def.}}{=} \min_{\mathbf{P} \in \mathbf{U}(\mathbf{a}, \mathbf{b})} \langle \mathbf{P}, \mathbf{C} \rangle - \varepsilon \mathbf{H}(\mathbf{P})
$$

where

$$
\mathbf{U}(\mathbf{a}, \mathbf{b}) \stackrel{\text{def.}}{=} \left\{ \mathbf{P} \in \mathbb{R}_{+}^{n \times m} : \mathbf{P} \mathbf{1}_m = \mathbf{a} \quad \text{and} \quad \mathbf{P}^{\mathrm{T}} \mathbf{1}_n = \mathbf{b} \right\}
$$

The dual problem is

$$
\mathrm{L}_{\mathbf{C}}^{\varepsilon}(\mathbf{a}, \mathbf{b}) = \max_{\mathbf{f} \in \mathbb{R}^n,\ \mathbf{g} \in \mathbb{R}^m} \langle \mathbf{f}, \mathbf{a} \rangle + \langle \mathbf{g}, \mathbf{b} \rangle - \varepsilon \left\langle e^{\mathbf{f} / \varepsilon}, \mathbf{K} e^{\mathbf{g} / \varepsilon} \right\rangle
$$

With the change of variables

$$
(\mathbf{u}, \mathbf{v}) = \left( e^{\mathbf{f} / \varepsilon}, e^{\mathbf{g} / \varepsilon} \right)
$$

the optimal coupling has the form

$$
\mathbf{P} = \operatorname{diag}(\mathbf{u}) \, \mathbf{K} \operatorname{diag}(\mathbf{v}), \quad \mathbf{K} = \exp\left( -\frac{\mathbf{C}}{\varepsilon} \right)
$$
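The factorization above suggests alternately rescaling u and v until the row and column sums of P match a and b, which is the standard Sinkhorn iteration. The following is a minimal sketch of that idea, not code from the linked repository; the sizes, the value of `eps`, the random cost matrix, and the iteration count are all illustrative assumptions.

```python
import torch

torch.manual_seed(0)
n, m, eps = 4, 5, 0.5  # illustrative sizes and regularization (assumed)

a = torch.full((n,), 1.0 / n, dtype=torch.float64)  # source marginal
b = torch.full((m,), 1.0 / m, dtype=torch.float64)  # target marginal
C = torch.rand(n, m, dtype=torch.float64)           # arbitrary cost matrix

K = torch.exp(-C / eps)  # Gibbs kernel K = exp(-C / eps)
u = torch.ones(n, dtype=torch.float64)
v = torch.ones(m, dtype=torch.float64)

for _ in range(500):
    u = a / (K @ v)    # rescale rows so that P 1_m = a
    v = b / (K.T @ u)  # rescale columns so that P^T 1_n = b

P = torch.diag(u) @ K @ torch.diag(v)  # P = diag(u) K diag(v)

# Largest deviation of the marginals of P from a and b
row_err = (P.sum(dim=1) - a).abs().max().item()
col_err = (P.sum(dim=0) - b).abs().max().item()
print(row_err, col_err)
```

For small `eps` the entries of K underflow, which is the usual motivation for working with f and g in the log domain instead.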

Block coordinate ascent on the dual gives the Sinkhorn updates in the log domain:

$$
\begin{aligned}
\mathbf{f}^{(\ell+1)} &= \varepsilon \log \mathbf{a} - \varepsilon \log \left( \mathbf{K} e^{\mathbf{g}^{(\ell)} / \varepsilon} \right), \\
\mathbf{g}^{(\ell+1)} &= \varepsilon \log \mathbf{b} - \varepsilon \log \left( \mathbf{K}^{\mathrm{T}} e^{\mathbf{f}^{(\ell+1)} / \varepsilon} \right).
\end{aligned}
$$
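As a rough illustration of these updates, the sketch below evaluates log(K e^{g/ε}) and log(K^T e^{f/ε}) with `torch.logsumexp`, so K is never formed explicitly. The function name `sinkhorn_log`, the zero initialization, and the test problem are assumptions made for this example only.

```python
import torch

def sinkhorn_log(a, b, C, eps, n_iter=500):
    """Hypothetical helper: log-domain updates of the potentials f and g."""
    f = torch.zeros_like(a)
    g = torch.zeros_like(b)
    for _ in range(n_iter):
        # log(K e^{g/eps})_i = logsumexp_j((g_j - C_ij) / eps)
        f = eps * torch.log(a) - eps * torch.logsumexp((g[None, :] - C) / eps, dim=1)
        # log(K^T e^{f/eps})_j = logsumexp_i((f_i - C_ij) / eps)
        g = eps * torch.log(b) - eps * torch.logsumexp((f[:, None] - C) / eps, dim=0)
    return f, g

torch.manual_seed(0)
n, m, eps = 4, 5, 0.5  # illustrative values (assumed)
a = torch.full((n,), 1.0 / n, dtype=torch.float64)
b = torch.full((m,), 1.0 / m, dtype=torch.float64)
C = torch.rand(n, m, dtype=torch.float64)

f, g = sinkhorn_log(a, b, C, eps)

# Recover P = diag(e^{f/eps}) K diag(e^{g/eps}) entrywise
P = torch.exp((f[:, None] + g[None, :] - C) / eps)
row_err = (P.sum(dim=1) - a).abs().max().item()
col_err = (P.sum(dim=0) - b).abs().max().item()
print(row_err, col_err)
```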

The code uses a slightly rewritten form of these updates.

Let $\mathbf{C} \in \mathbb{R}^{n \times m}$, $\mathbf{f} \in \mathbb{R}^n$, $\mathbf{g} \in \mathbb{R}^m$.

$$
\begin{aligned}
& \log \left( \mathbf{K} e^{\mathbf{g} / \varepsilon} \right) \\
={} & \log \left( \left[ \sum_{j} e^{-\frac{C_{i,j} - g_j}{\varepsilon}} \right]_i \right) \\
={} & \log \left( \left[ \sum_{j} e^{-\frac{C_{i,j} - g_j}{\varepsilon}} e^{\frac{f_i}{\varepsilon}} e^{-\frac{f_i}{\varepsilon}} \right]_i \right) \\
={} & \log \left( \left[ \sum_{j} e^{-\frac{C_{i,j} - f_i - g_j}{\varepsilon}} \right]_i \odot e^{-\frac{\mathbf{f}}{\varepsilon}} \right) \\
={} & \log \left( \left[ \sum_{j} e^{-\frac{C_{i,j} - f_i - g_j}{\varepsilon}} \right]_i \right) - \frac{\mathbf{f}}{\varepsilon} \\
={} & \operatorname{logsumexp} \left( -\frac{\mathbf{C} - \mathbf{f}^{\mathrm{T}} - \mathbf{g}}{\varepsilon},\ \mathrm{dim} = -1 \right) - \frac{\mathbf{f}}{\varepsilon}
\end{aligned}
$$
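In the last step, the vectors are added to the matrix by broadcasting: $\mathbf{f}^{\mathrm{T}}$ is treated as an $n \times 1$ column and $\mathbf{g}$ as a $1 \times m$ row, so entry $(i, j)$ of $\mathbf{C} - \mathbf{f}^{\mathrm{T}} - \mathbf{g}$ is $C_{i,j} - f_i - g_j$.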
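The identity can be checked numerically. The snippet below is only a sanity check on random data (sizes and `eps` are arbitrary choices); it compares the direct evaluation of log(K e^{g/ε}) with the broadcast `logsumexp` form.

```python
import torch

torch.manual_seed(0)
n, m, eps = 3, 4, 0.1  # arbitrary sizes and regularization (assumed)
C = torch.rand(n, m, dtype=torch.float64)
f = torch.randn(n, dtype=torch.float64)
g = torch.randn(m, dtype=torch.float64)

# Left-hand side: direct evaluation of log(K e^{g/eps}), shape (n,)
K = torch.exp(-C / eps)
lhs = torch.log(K @ torch.exp(g / eps))

# Right-hand side: f[:, None] has shape (n, 1) and g[None, :] has shape (1, m),
# so broadcasting gives M[i, j] = -(C[i, j] - f[i] - g[j]) / eps.
M = -(C - f[:, None] - g[None, :]) / eps
rhs = torch.logsumexp(M, dim=-1) - f / eps

print(torch.allclose(lhs, rhs))
```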

$$
\begin{aligned}
& \log \left( \mathbf{K}^{\mathrm{T}} e^{\mathbf{f} / \varepsilon} \right) \\
={} & \log \left( \left[ \sum_{i} e^{-\frac{C_{i,j} - f_i}{\varepsilon}} \right]_j \right) \\
={} & \log \left( \left[ \sum_{i} e^{-\frac{C_{i,j} - f_i}{\varepsilon}} e^{\frac{g_j}{\varepsilon}} e^{-\frac{g_j}{\varepsilon}} \right]_j \right) \\
={} & \log \left( \left[ \sum_{i} e^{-\frac{C_{i,j} - f_i - g_j}{\varepsilon}} \right]_j \odot e^{-\frac{\mathbf{g}}{\varepsilon}} \right) \\
={} & \log \left( \left[ \sum_{i} e^{-\frac{C_{i,j} - f_i - g_j}{\varepsilon}} \right]_j \right) - \frac{\mathbf{g}}{\varepsilon} \\
={} & \operatorname{logsumexp} \left( -\frac{\mathbf{C} - \mathbf{f}^{\mathrm{T}} - \mathbf{g}}{\varepsilon},\ \mathrm{dim} = -2 \right) - \frac{\mathbf{g}}{\varepsilon} \\
={} & \operatorname{logsumexp} \left( -\frac{\left( \mathbf{C} - \mathbf{f}^{\mathrm{T}} - \mathbf{g} \right)^{\mathrm{T}}}{\varepsilon},\ \mathrm{dim} = -1 \right) - \frac{\mathbf{g}}{\varepsilon}
\end{aligned}
$$
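Substituting these two identities into the log-domain updates turns each one into a correction of the current potential, for example f ← ε (log a − logsumexp(M, dim=−1)) + f with M = (−C + f^T + g) / ε. The sketch below follows that form; it is written in the spirit of the linked repository but is not copied from it, and the names `M` and `sinkhorn_step`, the zero initialization, and the test data are assumptions.

```python
import torch

def M(C, f, g, eps):
    # M[i, j] = (-C[i, j] + f[i] + g[j]) / eps, built by broadcasting
    return (-C + f.unsqueeze(-1) + g.unsqueeze(-2)) / eps

def sinkhorn_step(a, b, C, f, g, eps):
    """Hypothetical helper: one sweep of the rewritten updates."""
    # f <- eps * (log a - logsumexp(M, dim=-1)) + f
    f = eps * (torch.log(a) - torch.logsumexp(M(C, f, g, eps), dim=-1)) + f
    # g <- eps * (log b - logsumexp(M^T, dim=-1)) + g, using the updated f
    Mt = M(C, f, g, eps).transpose(-2, -1)
    g = eps * (torch.log(b) - torch.logsumexp(Mt, dim=-1)) + g
    return f, g

torch.manual_seed(0)
n, m, eps = 4, 5, 0.5  # illustrative values (assumed)
a = torch.full((n,), 1.0 / n, dtype=torch.float64)
b = torch.full((m,), 1.0 / m, dtype=torch.float64)
C = torch.rand(n, m, dtype=torch.float64)

f = torch.zeros(n, dtype=torch.float64)
g = torch.zeros(m, dtype=torch.float64)
for _ in range(500):
    f, g = sinkhorn_step(a, b, C, f, g, eps)

P = torch.exp(M(C, f, g, eps))  # transport plan, P[i, j] = exp(M[i, j])
cost = (P * C).sum().item()     # transport cost <P, C> of the regularized plan
row_err = (P.sum(dim=-1) - a).abs().max().item()
col_err = (P.sum(dim=-2) - b).abs().max().item()
print(cost, row_err, col_err)
```

Because `M` only uses `unsqueeze` on the last two dimensions, the same functions should also accept batched inputs with a leading batch dimension, though that is not exercised here.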


Origin blog.csdn.net/qq_39942341/article/details/131751760