C# Numerical Computing: Computational Method and Source Code for the Downhill Simplex Method in Multidimensions

1 Downhill Simplex Method

A simplex is defined as a body in n dimensions consisting of n+1 vertices. Specifying the location of each vertex fully defines the simplex. In two dimensions, the simplex is a triangle. In three dimensions, it is a tetrahedron. As the algorithm proceeds, the simplex makes its way downward toward the location of the minimum through a series of steps. These steps can be divided into reflections, expansions, and contractions. Most steps are reflections, which consist of moving the vertex of the simplex where the objective function is largest (worst) through the opposite face of the simplex to a lower (better) point. Reflections maintain the volume of the simplex. When possible, an expansion can accompany the reflection to increase the size of the simplex and speed convergence by allowing larger steps. Conversely, contractions “shrink” the simplex, allowing it to settle into a minimum or pass through a small opening like the neck of an hourglass. This method has the highest probability of finding the global minimum when it is started with big initial steps. The initial simplex will span a greater fraction of the design space, and the chances of getting trapped in a local minimum are smaller. However, for complex hyper-dimensional topographies, the method can break down.

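2 Geometric Meaning of the Simplex

Geometrically, a simplex in n dimensions is a body with n+1 vertices, and fixing the position of every vertex fixes the simplex: a triangle in two dimensions, a tetrahedron in three. The algorithm moves this body downhill toward the minimum using three kinds of steps. A reflection moves the vertex with the largest (worst) objective value through the opposite face to a lower (better) point and preserves the volume of the simplex; most steps are of this kind. An expansion, applied together with a reflection when possible, enlarges the simplex so that larger steps speed up convergence. A contraction shrinks the simplex so that it can settle into a minimum or squeeze through a narrow opening such as the neck of an hourglass. Starting with large initial steps gives the best chance of reaching the global minimum, because the initial simplex then spans a larger part of the design space and is less likely to be trapped in a local minimum. For complicated high-dimensional landscapes, however, the method can fail.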
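The moves described above can be illustrated with a small sketch. It is not taken from the article's listing: the class name `SimplexMoveSketch` and the method `MoveVertex` are hypothetical, and the factor convention (-1 reflects, 0.5 contracts) is an assumption chosen to mirror the factors passed to `amotry` in the source code of section 5.

```csharp
using System;

public static class SimplexMoveSketch
{
    // Moves vertex `worst` along the line through the centroid of the other
    // vertices (the opposite face). fac = -1 reflects the vertex through that
    // face; fac = 0.5 pulls it halfway toward the face (contraction).
    public static double[] MoveVertex(double[][] simplex, int worst, double fac)
    {
        int n = simplex[0].Length;              // number of dimensions
        int others = simplex.Length - 1;        // vertices on the opposite face
        double[] moved = new double[n];
        for (int j = 0; j < n; j++)
        {
            double centroid = 0.0;
            for (int i = 0; i < simplex.Length; i++)
            {
                if (i != worst) centroid += simplex[i][j];
            }
            centroid /= others;
            moved[j] = (1.0 - fac) * centroid + fac * simplex[worst][j];
        }
        return moved;
    }

    public static void Main()
    {
        // A triangle in 2D; vertex 0 is assumed to have the worst objective value.
        double[][] tri =
        {
            new[] { 0.0, 0.0 },
            new[] { 1.0, 0.0 },
            new[] { 0.0, 1.0 },
        };
        double[] reflected = MoveVertex(tri, 0, -1.0);
        double[] contracted = MoveVertex(tri, 0, 0.5);
        Console.WriteLine($"reflected:  ({reflected[0]}, {reflected[1]})");
        Console.WriteLine($"contracted: ({contracted[0]}, {contracted[1]})");
    }
}
```

For this triangle the opposite face has centroid (0.5, 0.5), so the reflected vertex lands on the far side of that face and the contracted vertex lands halfway between the original vertex and the face.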

3 Downhill Simplex Algorithm

The downhill simplex algorithm was invented by Nelder and Mead. It finds the minimum of a function of more than one independent variable, that is, it solves multidimensional optimization problems. The method requires only function evaluations, not derivatives.

The simplex method is a method for solving problems in linear programming. This method, invented by George Dantzig in 1947, tests adjacent vertices of the feasible set (which is a polytope) in sequence so that at each new vertex the objective function improves or is unchanged. The simplex method is very efficient in practice, generally taking 2m to 3m iterations at most (where m is the number of equality constraints), and converging in expected polynomial time for certain distributions of random inputs (Nocedal and Wright 1999, Forsgren 2002). However, its worst-case complexity is exponential, as can be demonstrated with carefully constructed examples (Klee and Minty 1972).

Interior point methods are a different class of methods for linear programming problems; their complexity is polynomial in both the average and the worst case. These methods construct a sequence of strictly feasible points (i.e., lying in the interior of the polytope but never on its boundary) that converges to the solution. Research on interior point methods was spurred by a paper from Karmarkar (1984). In practice, one of the best interior point methods is the predictor-corrector method of Mehrotra (1992), which is competitive with the simplex method, particularly for large-scale problems.

Dantzig's simplex method should not be confused with the downhill simplex method (Spendley 1962, Nelder and Mead 1965, Press et al. 1992). The latter method solves an unconstrained minimization problem in n dimensions by maintaining at each iteration n+1 points that define a simplex. At each iteration, this simplex is updated by applying certain transformations to it so that it "rolls downhill" until it finds a minimum.
 


4 Nelder-Mead (Downhill Simplex Method)

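The Nelder-Mead (downhill simplex) algorithm was first proposed by John Nelder and Roger Mead in 1965. It is an optimization algorithm built on heuristic rules, similar in spirit to the well-known genetic algorithm (GA) and particle swarm optimization (PSO): starting from an initial guess, it iterates toward an optimum by following a set of hand-designed rules. Like many heuristic algorithms, Nelder-Mead does not need to know the functional form of the objective and uses no gradient information. For the same reason it cannot guarantee that the result is optimal; it can only find a reasonably good feasible solution.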

5 C# Source Code for the Downhill Simplex Algorithm
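The listing in this section refers to two things it does not define: a `RealValueFun` type with a `funk` method, and a `Globals.SWAP` helper. Both presumably live elsewhere in the author's library. If that library is not at hand, stand-ins along the following lines may be enough to compile the listing. Everything here is an assumption inferred from how the names are used, and `ShiftedBowl` is a hypothetical test objective, not part of the original.

```csharp
using System;

namespace Legalsoft.Truffer
{
    // Hypothetical stand-in: the listing only calls func.funk(double[]).
    public interface RealValueFun
    {
        double funk(double[] x);
    }

    // Hypothetical stand-in for the swap helper used on convergence.
    public static class Globals
    {
        public static void SWAP(ref double a, ref double b)
        {
            double t = a;
            a = b;
            b = t;
        }
    }

    // Example objective (an assumption): a bowl with its minimum at (1, -2).
    public class ShiftedBowl : RealValueFun
    {
        public double funk(double[] x)
        {
            double dx = x[0] - 1.0;
            double dy = x[1] + 2.0;
            return dx * dx + dy * dy;
        }
    }

    public static class SupportDemo
    {
        public static void Main()
        {
            // Exercise the stand-ins on their own.
            double a = 1.0, b = 2.0;
            Globals.SWAP(ref a, ref b);
            Console.WriteLine($"a = {a}, b = {b}");
            Console.WriteLine(new ShiftedBowl().funk(new[] { 1.0, -2.0 }));
        }
    }
}
```

Compiled together with the `Amoeba` class, a call such as `new Amoeba(1.0e-8).minimize(new double[] { 0.0, 0.0 }, 1.0, new ShiftedBowl())` should return a point close to (1, -2), with the function value available afterwards in `fmin`.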

using System;

namespace Legalsoft.Truffer
{
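    /// <summary>
    /// Downhill simplex method in multidimensions (Nelder-Mead).
    /// </summary>
    public class Amoeba
    {
        // Number of function evaluations counted so far
        private int nfunc { get; set; }
        // Number of simplex vertices (ndim + 1)
        private int mpts { get; set; }
        // Number of dimensions
        private int ndim { get; set; }
        // Function value at the minimum that was found
        public double fmin { get; set; }
        // Fractional convergence tolerance on the function value
        private double ftol { get; }
        // Function values at the simplex vertices
        private double[] y { get; set; }
        // Current simplex, one vertex per row
        private double[,] p { get; set; }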

        public Amoeba(double ftoll)
        {
            this.ftol = ftoll;
        }

        public double[] minimize(double[] point, double del, RealValueFun func)
        {
            double[] dels = new double[point.Length];
            for (int i = 0; i < dels.Length; i++)
            {
                dels[i] = del;
            }
            return minimize(point, dels, func);
        }

        public double[] minimize(double[] point, double[] dels, RealValueFun func)
        {
            int ndim = point.Length;
            double[,] pp = new double[ndim + 1, ndim];
            for (int i = 0; i < ndim + 1; i++)
            {
                for (int j = 0; j < ndim; j++)
                {
                    pp[i, j] = point[j];
                }
                if (i != 0)
                {
                    pp[i, i - 1] += dels[i - 1];
                }
            }
            return (minimize(pp, func));
        }

        public double[] minimize(double[,] pp, RealValueFun func)
        {
            const int NMAX = 5000;          // maximum number of function evaluations
            const double TINY = 1.0e-10;    // guards against division by zero in rtol

            mpts = pp.GetLength(0);
            ndim = pp.GetLength(1);
            double[] psum = new double[ndim];
            double[] pmin = new double[ndim];
            double[] x = new double[ndim];
            p = pp;
            y = new double[mpts];
            // Evaluate the function at every vertex of the starting simplex.
            for (int i = 0; i < mpts; i++)
            {
                for (int j = 0; j < ndim; j++)
                {
                    x[j] = p[i, j];
                }
                y[i] = func.funk(x);
            }
            nfunc = 0;
            get_psum(p, psum);
            for (; ; )
            {
                // Find the best (ilo), worst (ihi) and second-worst (inhi) vertices.
                int ilo = 0;
                int ihi = y[0] > y[1] ? (0) : (1);
                int inhi = y[0] > y[1] ? (1) : (0);
                for (int i = 0; i < mpts; i++)
                {
                    if (y[i] <= y[ilo])
                    {
                        ilo = i;
                    }
                    if (y[i] > y[ihi])
                    {
                        inhi = ihi;
                        ihi = i;
                    }
                    else if (y[i] > y[inhi] && i != ihi)
                    {
                        inhi = i;
                    }
                }
                // Fractional range from worst to best value; converged once it drops below ftol.
                double rtol = 2.0 * Math.Abs(y[ihi] - y[ilo]) / (Math.Abs(y[ihi]) + Math.Abs(y[ilo]) + TINY);
                if (rtol < ftol)
                {
                    Globals.SWAP(ref y[0], ref y[ilo]);
                    for (int i = 0; i < ndim; i++)
                    {
                        Globals.SWAP(ref p[0, i], ref p[ilo, i]);
                        pmin[i] = p[0, i];
                    }
                    fmin = y[0];
                    return (pmin);
                }
                if (nfunc >= NMAX)
                {
                    throw new Exception("NMAX exceeded");
                }
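                nfunc += 2;
                // Reflect the worst vertex through the opposite face.
                double ytry = amotry(p, y, psum, ihi, -1.0, func);
                if (ytry <= y[ilo])
                {
                    // Better than the best vertex: try extending the step by a factor of 2.
                    ytry = amotry(p, y, psum, ihi, 2.0, func);
                }
                else if (ytry >= y[inhi])
                {
                    // Still worse than the second-worst vertex: try a one-dimensional contraction.
                    double ysave = y[ihi];
                    ytry = amotry(p, y, psum, ihi, 0.5, func);
                    if (ytry >= ysave)
                    {
                        // The worst vertex did not improve: shrink every vertex toward the best one.
                        for (int i = 0; i < mpts; i++)
                        {
                            if (i != ilo)
                            {
                                for (int j = 0; j < ndim; j++)
                                {
                                    p[i, j] = psum[j] = 0.5 * (p[i, j] + p[ilo, j]);
                                }
                                y[i] = func.funk(psum);
                            }
                        }
                        nfunc += ndim;
                        get_psum(p, psum);
                    }
                }
                else
                {
                    // Only the reflection was evaluated, so correct the evaluation count.
                    --nfunc;
                }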
            }
        }

        // Computes psum[j], the sum of coordinate j over all simplex vertices.
        public void get_psum(double[,] p, double[] psum)
        {
            for (int j = 0; j < ndim; j++)
            {
                double sum = 0.0;
                for (int i = 0; i < mpts; i++)
                {
                    sum += p[i, j];
                }
                psum[j] = sum;
            }
        }

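        // Extrapolates by a factor fac through the face opposite the worst vertex ihi,
        // evaluates the function there, and replaces the worst vertex if the new point is better.
        public double amotry(double[,] p, double[] y, double[] psum, int ihi, double fac, RealValueFun func)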
        {
            double[] ptry = new double[ndim];
            double fac1 = (1.0 - fac) / ndim;
            double fac2 = fac1 - fac;
            for (int j = 0; j < ndim; j++)
            {
                ptry[j] = psum[j] * fac1 - p[ihi, j] * fac2;
            }
            double ytry = func.funk(ptry);
            if (ytry < y[ihi])
            {
                y[ihi] = ytry;
                for (int j = 0; j < ndim; j++)
                {
                    psum[j] += ptry[j] - p[ihi, j];
                    p[ihi, j] = ptry[j];
                }
            }
            return ytry;
        }
    }
}
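The coefficients `fac1` and `fac2` in `amotry` can look opaque. The short derivation below shows where they come from. The symbols $S$, $x_h$, $c$ and $f$ are notation introduced here for `psum`, `p[ihi, ·]`, the centroid of the opposite face and `fac`; they do not appear in the original.

```latex
% n = ndim, S = sum of all n+1 vertices (psum), x_h = worst vertex (p[ihi]), f = fac
\begin{align*}
c &= \frac{S - x_h}{n}
  && \text{centroid of the face opposite } x_h \\
x_{\text{try}} &= c + f\,(x_h - c) = (1 - f)\,c + f\,x_h \\
  &= \frac{1 - f}{n}\,S - \left(\frac{1 - f}{n} - f\right) x_h \\
  &= \mathtt{fac1}\cdot S - \mathtt{fac2}\cdot x_h,
  \qquad \mathtt{fac1} = \frac{1 - f}{n},\quad \mathtt{fac2} = \mathtt{fac1} - f .
\end{align*}
```

With $f = -1$ this gives $x_{\text{try}} = 2c - x_h$, the mirror image of the worst vertex in the opposite face. With $f = 0.5$ it gives the midpoint between $x_h$ and $c$. The call with $f = 2$ is made only after a successful reflection, when `p[ihi]` already holds the reflected point, so it pushes that point twice as far from the face.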
 


Reposted from blog.csdn.net/beijinghorn/article/details/132128374