Algorithm 06-Analysis and Implementation of LCS Algorithm, KMP Algorithm and Floyd Algorithm

Algorithm 06 - Dynamic Programming

1. Introduction

Dynamic programming is a branch of operations research and a mathematical method for optimizing multi-stage decision processes. It transforms a multi-stage process into a series of single-stage problems, which are solved one by one using the relationships between the stages.

Applications of dynamic programming: the Fibonacci sequence, the LCS algorithm, the KMP algorithm, and Floyd's algorithm.

2. The Fibonacci sequence

1. Description

When explaining recursion, we implemented the Fibonacci sequence recursively. Now we use the idea of dynamic programming to implement it.

2. Code implementation

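public static double fibonacciSequence(int n) {
    // n is 1-based: there is no term for n <= 0
    if (n <= 0) {
        throw new IllegalArgumentException("n must be positive");
    }
    // The first two terms are both 1, so return directly
    if (n <= 2) {
        return 1;
    }
    double[] array = new double[n];
    array[0] = 1;
    array[1] = 1;
    // Every later term is the sum of the previous two terms
    for (int i = 2; i < array.length; i++) {
        array[i] = array[i - 1] + array[i - 2];
    }
    return array[n - 1];
}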
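
The array stores every term, although each step only reads the previous two. As a minimal sketch (the class name FibonacciSketch and the method name fib are hypothetical, not part of the original demo), the same recurrence can keep two rolling values instead. It uses long rather than double, so results stay exact, at the cost of only supporting n up to 92 before long overflows.

```java
final class FibonacciSketch {
    // Returns the n-th Fibonacci number, 1-based: fib(1) = fib(2) = 1.
    static long fib(int n) {
        if (n <= 0 || n > 92) {
            throw new IllegalArgumentException("n must be in 1..92");
        }
        long prev = 1; // term i - 1
        long curr = 1; // term i
        for (int i = 3; i <= n; i++) {
            long next = prev + curr;
            prev = curr;
            curr = next;
        }
        return curr;
    }

    public static void main(String[] args) {
        // Print the first ten terms
        for (int n = 1; n <= 10; n++) {
            System.out.print(fib(n) + " ");
        }
        System.out.println();
    }
}
```

The trade-off is that earlier terms are no longer available afterwards; if several terms are needed, the array version is the better fit.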

3. LCS algorithm

1. Basic idea

If a new sequence B is obtained from a sequence A by deleting any number of characters (without reordering the rest), then B is called a subsequence of A.

LCS stands for Longest Common Subsequence. Among all common subsequences of two sequences X and Y, the longest one is defined as the longest common subsequence of X and Y.
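
To make the subsequence definition concrete, here is a small sketch that checks whether one string can be obtained from another by deleting characters. The names SubsequenceSketch and isSubsequence are hypothetical and only serve this illustration.

```java
final class SubsequenceSketch {
    // True if sub can be obtained from seq by deleting zero or more characters.
    static boolean isSubsequence(String sub, String seq) {
        int i = 0; // next character of sub that still has to be found
        for (int j = 0; j < seq.length() && i < sub.length(); j++) {
            if (sub.charAt(i) == seq.charAt(j)) {
                i++;
            }
        }
        return i == sub.length();
    }

    public static void main(String[] args) {
        System.out.println(isSubsequence("ace", "abcde"));
        System.out.println(isSubsequence("aec", "abcde"));
    }
}
```

A common subsequence of X and Y is then any string for which this check succeeds against both X and Y.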

Operation steps of LCS:

  1. Declare the sequences X and Y. Declare a two-dimensional array f, where f[i][j] is the length of the longest common subsequence of X[0..i] and Y[0..j]. Declare a function same(a,b) that returns 1 when X[a] and Y[b] are the same character, and 0 otherwise;
  2. Populate the 2D array f. When i=0 and j=0, f[0][0] = same(0,0); otherwise, f[i][j] = max{ f[i-1][j-1] + same(i,j), f[i-1][j], f[i][j-1] }, where any term with a negative index counts as 0;
  3. Backtrack through the array. Declare a sequence Z to hold the longest common subsequence of X and Y, start with i and j at the last positions of X and Y, and compare X[i] with Y[j]:
    • If they are equal, the character belongs to Z: add it to Z, then i--, j--, and continue to compare X[i] and Y[j];
    • If they are not equal and f[i][j - 1] > f[i - 1][j], then Y[j] does not belong to Z: j--, and continue to compare X[i] and Y[j];
    • If they are not equal and f[i][j - 1] < f[i - 1][j], then X[i] does not belong to Z: i--, and continue to compare X[i] and Y[j];
    • If they are not equal and f[i][j - 1] = f[i - 1][j], then either X[i] or Y[j] can be dropped: i-- or j--, and continue to compare X[i] and Y[j];
    • When i<0 or j<0, the comparison is complete. Z was built back to front, so reversing it gives the longest common subsequence of X and Y.
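
The fill step can be sketched on its own. The version below is an illustration under one stated assumption rather than the demo's code: the table is padded with an extra row and column of zeros, so f[i][j] covers the first i characters of X and the first j characters of Y, which removes the special cases at the borders. The names LcsLengthSketch and lcsLength are hypothetical.

```java
final class LcsLengthSketch {
    // Returns only the length of the longest common subsequence of x and y.
    static int lcsLength(String x, String y) {
        // Row 0 and column 0 stay 0: an empty prefix has no common subsequence
        int[][] f = new int[x.length() + 1][y.length() + 1];
        for (int i = 1; i <= x.length(); i++) {
            for (int j = 1; j <= y.length(); j++) {
                int same = x.charAt(i - 1) == y.charAt(j - 1) ? 1 : 0;
                f[i][j] = Math.max(f[i - 1][j - 1] + same,
                        Math.max(f[i - 1][j], f[i][j - 1]));
            }
        }
        return f[x.length()][y.length()];
    }

    public static void main(String[] args) {
        System.out.println(lcsLength("ABCBDAB", "BDCABA"));
    }
}
```

For "ABCBDAB" and "BDCABA" the length is 4; "BCBA" is one subsequence of that length.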

Applications of LCS: similarity comparison (for example, paternity testing and image similarity comparison).

2. Code implementation

/**
 * Finds a longest common subsequence of two strings.
 *
 * @param x sequence X
 * @param y sequence Y
 * @return a longest common subsequence of X and Y, or null if either input is null or empty
 */
public static String getLCS(String x, String y) {
    if (x == null || y == null || x.isEmpty() || y.isEmpty()) {
        return null;
    }
    char[] a = x.toCharArray();
    char[] b = y.toCharArray();
    int[][] arr = new int[a.length][b.length];
    // Fill the table by dynamic programming: on a match take upper-left + 1, otherwise max(left, top)
    for (int i = 0; i < arr.length; i++) {
        for (int j = 0; j < arr[i].length; j++) {
            // If a[i] == b[j] the common subsequence grows by 1, otherwise by 0
            int length = a[i] == b[j] ? 1 : 0;
            if (i == 0 && j == 0) {
                arr[i][j] = length;
            } else {
                // LCS length at the neighbouring cell (i - 1, j)
                int left = i == 0 ? 0 : arr[i - 1][j];
                // LCS length at the neighbouring cell (i, j - 1)
                int top = j == 0 ? 0 : arr[i][j - 1];
                // LCS length at the diagonal cell (i - 1, j - 1)
                int lt = (i == 0 || j == 0) ? 0 : arr[i - 1][j - 1];
                arr[i][j] = Math.max(left, top);
                arr[i][j] = Math.max(lt + length, arr[i][j]);
            }
        }
    }

    // The longest common subsequence, collected back to front
    StringBuilder sb = new StringBuilder();
    int i = a.length - 1;
    int j = b.length - 1;
    while (i >= 0 && j >= 0) {
        // LCS length at the neighbouring cell (i - 1, j)
        int left = i == 0 ? 0 : arr[i - 1][j];
        // LCS length at the neighbouring cell (i, j - 1)
        int top = j == 0 ? 0 : arr[i][j - 1];
        if (a[i] == b[j]) {
            sb.append(a[i]);
            i--;
            j--;
            // If the (i, j - 1) cell is larger, continue from there
        } else if (top > left) {
            j--;
            // Otherwise continue from the (i - 1, j) cell
        } else {
            i--;
        }
    }
    return sb.reverse().toString();
}

4. KMP algorithm

1. Basic idea

Definition

The KMP algorithm is an improved string matching algorithm. It uses the information gained when a match fails to minimize the number of comparisons between the pattern string and the main string, which makes matching fast. For each pattern string, KMP precomputes the pattern's internal matching information, and when a match fails it shifts the pattern as far as that information allows, reducing the number of comparisons.

Explanation

After reading a lot of material online, I felt that none of the explanations were clear. I then racked my brains over it until it made sense to me, so I am sharing my understanding here.

If you like it, please give it a thumbs up; if you don't, criticism is welcome too.

Analysis

  1. Declare the target string T and the pattern string W, and declare an array M to store the internal matching information of W;
  2. Matching against W starts at position i of T, with i=0 at the beginning. If position k of W does not match T, then i+=M[k], and matching continues from position i of T until W is found in T or T is exhausted. After the shift, the first k-M[k] characters of W are already known to match, so they need not be compared again;
  3. The step i+=M[k] is the core of the KMP algorithm, so it is explained last. Suppose position k of W does not match T. The brute-force method moves the starting position in T right by 1, that is i++, and starts matching again, which is inefficient. A mismatch at position k means that the first k characters of W (W[0..k-1]) did match T. If shifting these k characters right by x positions makes them disagree with their own unshifted copy where the two overlap, then shifting W right by x positions along T cannot produce a match either, so that shift can be skipped;
  4. So we look for the largest shift that skips only such impossible positions, which is the smallest x≥1 at which the shifted prefix agrees with itself on the overlap. M[k] denotes this shift for a mismatch at position k of W. Equivalently, M[k]=k-j, where j is the length of the longest proper prefix of W[0..k-1] that is also its suffix. When k=0, the start must move one position right, so M[0]=1; when k=1, M[k] cannot exceed k, so M[1]=1; when k>1, declare m and j, let m=1 and j=0, and repeatedly compare W[m] with W[j]:
    1. If W[m]=W[j], the matching prefix grows by one character: j++, m++, and M[m]=m-j, which equals M[m-1];
    2. If W[m]!=W[j] and j>0, fall back to the next shorter candidate prefix, j=j-M[j], and compare W[m] with W[j] again;
    3. If W[m]!=W[j] and j=0, no prefix can be extended: m++, and M[m]=m;
    4. When m = the length of W, the comparison is complete.
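
As a sketch of how the table M could be built, the method below computes, for each position k, the shift to apply after a mismatch at k. It relies on the standard prefix-function idea: the shift is k minus the length of the longest proper prefix of W[0..k-1] that is also its suffix. The names KmpShiftSketch and shiftTable are hypothetical, and this is one possible formulation rather than the only way to build the table.

```java
import java.util.Arrays;

final class KmpShiftSketch {
    // m[k] = how far to shift the pattern after a mismatch at position k.
    static int[] shiftTable(String w) {
        int[] m = new int[w.length()];
        if (w.isEmpty()) {
            return m;
        }
        m[0] = 1;
        // j = length of the longest proper prefix of w[0..i-1] that is also its suffix
        for (int i = 1, j = 0; i < w.length(); i++) {
            m[i] = i - j;
            // Update j so that it describes w[0..i]; j - m[j] is the next shorter candidate
            while (j > 0 && w.charAt(i) != w.charAt(j)) {
                j -= m[j];
            }
            if (w.charAt(i) == w.charAt(j)) {
                j++;
            }
        }
        return m;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(shiftTable("abab")));
        System.out.println(Arrays.toString(shiftTable("abcabd")));
    }
}
```

For the pattern "abab" the table is [1, 1, 2, 2]: after a mismatch at position 3 the prefix "aba" has matched, its last "a" can serve as the first "a" of the next attempt, and so the pattern moves by 2.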

2. Code implementation

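/**
 * @param T target string
 * @param W pattern string
 * @return the index in T where the first match of W starts, or -1 if there is none
 */
public static int KMP(String T, String W) {
    if (T == null || W == null || W.isEmpty()) {
        return -1;
    }

    // M stores the internal matching information of W:
    // M[k] is how far W is shifted along T when W[k] fails to match
    int[] M = new int[W.length()];
    M[0] = 1;
    // j is the length of the longest proper prefix of W[0..i-1] that is also its suffix
    for (int i = 1, j = 0; i < M.length; i++) {
        M[i] = i - j;
        // Update j so that it describes W[0..i]
        while (j > 0 && W.charAt(i) != W.charAt(j)) {
            j -= M[j];
        }
        if (W.charAt(i) == W.charAt(j)) {
            j++;
        }
    }

    // i is where the current attempt starts in T, k is how many characters of W have matched
    int i = 0;
    int k = 0;
    while (i + W.length() <= T.length()) {
        if (T.charAt(i + k) == W.charAt(k)) {
            k++;
            if (k == W.length()) {
                return i;
            }
        } else {
            // Shift W by M[k]; its first k - M[k] characters are already known to match
            int shift = M[k];
            i += shift;
            k = Math.max(0, k - shift);
        }
    }
    return -1;
}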
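
For comparison, the brute-force method mentioned in the analysis can be sketched as follows. It always moves the starting position by one after a mismatch, so it is a handy baseline for cross-checking a KMP implementation on small inputs. The names BruteForceMatchSketch and indexOfBruteForce are hypothetical, and returning -1 for an empty pattern is a choice made for this sketch.

```java
final class BruteForceMatchSketch {
    // Returns the first index in t where w occurs, or -1 if there is none.
    static int indexOfBruteForce(String t, String w) {
        if (t == null || w == null || w.isEmpty()) {
            return -1;
        }
        for (int i = 0; i + w.length() <= t.length(); i++) {
            int k = 0;
            // Compare from scratch at every starting position i
            while (k < w.length() && t.charAt(i + k) == w.charAt(k)) {
                k++;
            }
            if (k == w.length()) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(indexOfBruteForce("aaab", "aab"));
        System.out.println(indexOfBruteForce("abc", "abcd"));
    }
}
```

In the worst case this performs on the order of |T| × |W| character comparisons, which is the cost that the shift table is meant to avoid.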

5. Floyd algorithm

1. Basic idea

Definition

Floyd's algorithm (also called the Floyd-Warshall algorithm, or the point-insertion method) uses the idea of dynamic programming to find the shortest paths between every pair of vertices in a given weighted graph. It is related to Dijkstra's algorithm, which finds the shortest paths from a single source vertex. With an additional path matrix, Floyd's algorithm can also recover the route of each shortest path.

Floyd's algorithm uses intermediate points to update the shortest path between two points. Once every vertex has been tried as an intermediate point, the shortest paths of the graph are obtained.

The time complexity of the Floyd algorithm is high (O(n^3) for n vertices), so it suits scenarios where the result is computed once and then reused for a long time, such as finding the shortest paths in a bus system.

Analysis

First, get the shortest path:

  1. There is a weighted graph G(V, E), where V is the vertex set and E is the adjacency matrix of G. Declare a two-dimensional array path to record the route. Initially path(j,k) = k, which means that V[j] reaches V[k] directly.
  2. The first time, take V[0] as the intermediate point and compare E(j,k) with E(j,0)+E(0,k). If E(j,k) > E(j,0)+E(0,k), then set path(j,k) = 0 and E(j,k) = E(j,0)+E(0,k). After this pass, E(j,k) is the shortest distance between V[j] and V[k] using only the three points V[j], V[0], V[k]. From now on it is treated as if V[j] and V[k] were directly connected with length E(j,k).
  3. The second time, take V[1] as the intermediate point and compare E(j,k) with E(j,1)+E(1,k). If E(j,k) > E(j,1)+E(1,k), then set path(j,k) = 1 and E(j,k) = E(j,1)+E(1,k). After this pass, E(j,k) is the shortest distance between V[j] and V[k] using only the four points V[j], V[0], V[1], V[k].
  4. By analogy, after every vertex has served as the intermediate point, E holds the shortest path between every pair of vertices of G.
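
The passes above amount to a triple loop with the intermediate point in the outermost position. Here is a minimal sketch under two assumptions that the original does not state: missing edges are marked with a sentinel INF, and the computation runs on a copy so that the adjacency matrix is left unchanged. The names FloydSketch, INF and shortestDistances are hypothetical.

```java
final class FloydSketch {
    // Assumption of this sketch: INF marks "no edge between these two vertices"
    static final int INF = Integer.MAX_VALUE;

    // Returns the all-pairs shortest distances for adjacency matrix e.
    static int[][] shortestDistances(int[][] e) {
        int n = e.length;
        int[][] d = new int[n][];
        for (int i = 0; i < n; i++) {
            d[i] = e[i].clone();
        }
        for (int i = 0; i < n; i++) {         // intermediate point
            for (int j = 0; j < n; j++) {     // start point
                if (d[j][i] == INF) {
                    continue;                 // j cannot reach i, nothing to improve
                }
                for (int k = 0; k < n; k++) { // end point
                    if (d[i][k] == INF) {
                        continue;             // skipping INF also avoids int overflow
                    }
                    int via = d[j][i] + d[i][k];
                    if (via < d[j][k]) {
                        d[j][k] = via;
                    }
                }
            }
        }
        return d;
    }

    public static void main(String[] args) {
        int[][] e = {{0, 4, 11}, {6, 0, 2}, {3, INF, 0}};
        for (int[] row : shortestDistances(e)) {
            System.out.println(java.util.Arrays.toString(row));
        }
    }
}
```

The intermediate point must be the outermost loop: only then does each pass build on distances that already account for all earlier intermediate points.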

Then, by tracing back through path, the route corresponding to each shortest path of G is obtained:

  • If path(j,k) = k, it means that V[j] can directly reach V[k];
  • If e = path(j,k) and e != k, it means that the route passes through the intermediate point V[e];
  • Then examine path(j,e) and path(e,k) in the same way until only direct routes remain. Connecting all of these pieces gives the shortest route from V[j] to V[k].
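
The trace-back can be sketched as a short recursion that collects vertex indices instead of building a string. It assumes the convention described above, where path[j][k] == k means a direct connection and any other value names an intermediate point. The names FloydRouteSketch, route and collect are hypothetical, and the sample matrix in main was worked out by hand for a small three-vertex graph.

```java
import java.util.ArrayList;
import java.util.List;

final class FloydRouteSketch {
    // Returns the vertices visited on the shortest route from 'from' to 'to'.
    static List<Integer> route(int[][] path, int from, int to) {
        List<Integer> result = new ArrayList<>();
        result.add(from);
        collect(path, from, to, result);
        return result;
    }

    // Appends every vertex after 'from' on the route from 'from' to 'to'.
    private static void collect(int[][] path, int from, int to, List<Integer> out) {
        int e = path[from][to];
        if (e == to) {
            out.add(to);                 // direct connection
        } else {
            collect(path, from, e, out); // first half: from -> e
            collect(path, e, to, out);   // second half: e -> to
        }
    }

    public static void main(String[] args) {
        int[][] path = {{0, 1, 1}, {2, 1, 2}, {0, 0, 2}};
        System.out.println(route(path, 0, 2));
        System.out.println(route(path, 1, 0));
    }
}
```

Returning a list keeps the route usable by other code; formatting it as text can then be done separately.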

2. Code implementation

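// Requires: import java.util.ArrayList; import java.util.List;

/**
 * @param matrix adjacency matrix of the graph; it is overwritten with the shortest paths
 * @return the final routes
 */
public static List<String> floyd(int[][] matrix) {
    if (matrix == null || matrix.length == 0) {
        return null;
    }
    // Route matrix: holds intermediate points only, not the final routes
    int[][] path = new int[matrix.length][matrix.length];
    // Initialize path: every vertex reaches every other vertex directly
    for (int i = 0; i < matrix.length; i++) {
        for (int j = 0; j < matrix.length; j++) {
            path[i][j] = j;
        }
    }

    // Take i as the intermediate point and update the shortest paths.
    // Assumption: weights are small enough that the sum below cannot overflow int,
    // so missing edges must not be encoded as Integer.MAX_VALUE.
    for (int i = 0; i < matrix.length; i++) {
        for (int j = 0; j < matrix.length; j++) {
            for (int k = 0; k < matrix.length; k++) {
                if (matrix[j][k] > matrix[j][i] + matrix[i][k]) {
                    // Save the shorter path
                    matrix[j][k] = matrix[j][i] + matrix[i][k];
                    // Save the intermediate point of the route
                    path[j][k] = i;
                }
            }
        }
    }

    // The final routes
    return getPathFloyd(matrix, path);
}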

/**
 * Traces back the final routes from the shortest-path matrix and the route matrix.
 * @param matrix shortest-path matrix
 * @param path   route matrix: holds intermediate points only, not the final routes
 * @return the final routes
 */
public static List<String> getPathFloyd(int[][] matrix, int[][] path) {
    // The final routes
    List<String> pathList = new ArrayList<>();
    for (int i = 0; i < path.length; i++) {
        for (int j = 0; j < path.length; j++) {
            // Length of the shortest path from i to j
            int m = matrix[i][j];
            String s = "From " + i + " to " + j + ": distance " + m + ", route " + i;
            // Shortest route from i to j
            s += getPathFloyd(path, i, j) + "\n";
            pathList.add(s);
        }
    }
    return pathList;
}

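/**
 * Traces back the route between two points.
 * @param path route matrix: holds intermediate points only, not the final routes
 * @param from start point
 * @param to   end point
 * @return the route after the start point, in the form " -> a -> b"
 */
public static String getPathFloyd(int[][] path, int from, int to) {
    // Return an empty route so that callers never concatenate the text "null"
    if (path == null || path.length == 0) {
        return "";
    }
    // Intermediate point between from and to
    int m = path[from][to];
    String s = "";
    // Direct connection
    if (m == to) {
        s += " -> " + to;
        // Connection through an intermediate point
    } else {
        s += getPathFloyd(path, from, m);
        s += getPathFloyd(path, m, to);
    }
    return s;
}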

The demo has been uploaded to Gitee, and readers who need it can download it!

Previous: Algorithm 05-Reduce and Conquer

Next: Algorithm 07-A-star
