Getting Started with Suffix Arrays

The author is off to sleep; to be continued tomorrow.

Suffix array definition:

A suffix array is the array obtained by sorting all suffixes of a string in dictionary (lexicographic) order. The array stores only the starting position of each suffix, not the suffix itself.

Suffix: a substring that runs from some starting position to the end of the string. This includes the original string itself and the empty string.

Example: the suffixes of {ABC} are {ABC}, {BC}, {C}, {}

Dictionary order: by default, ascending (smallest to largest)
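The definition can be turned into code almost word for word. The following is only a minimal sketch: the function name naive_suffix_array is made up for this illustration, and it simply sorts the starting positions 0..n by comparing the suffixes directly (position n stands for the empty suffix).

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns the starting
// positions of all suffixes of s, including the empty suffix at s.size(),
// ordered by ascending dictionary order of the suffixes.
std::vector<int> naive_suffix_array(const std::string& s) {
    std::vector<int> sa(s.size() + 1);
    std::iota(sa.begin(), sa.end(), 0);  // fill with 0, 1, ..., n
    std::sort(sa.begin(), sa.end(), [&s](int a, int b) {
        // Compare the suffix starting at a with the suffix starting at b.
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    return sa;
}
```

For the string ABC this gives the positions 3, 0, 1, 2, that is {}, {ABC}, {BC}, {C}: the empty suffix sorts first, and only the positions are stored.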

Suffix array construction:

  • Naive approach: sort the n suffixes directly as strings. Each comparison can cost \(O(n)\), so the time complexity is \(O(n^2 \log_2 n)\).

  • Doubling method: invented by Manber and Myers. It needs \(\log_2 n\) rounds of sorting; with a comparison sort each round costs \(O(n \log_2 n)\), so the total time complexity is \(O(n \log_2^2 n)\). If the sort in each round is replaced by a radix sort, the total drops to \(O(n \log_2 n)\).

  • In general the doubling method is fast enough. Faster algorithms exist, for example the SA-IS algorithm.
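As a rough illustration of the radix-sort optimization mentioned above, here is a sketch of the doubling method where each round sorts the pairs (rank of the first k characters, rank of the next k characters) with two stable counting-sort passes. The names counting_sort and build_sa_radix are assumptions made for this example, and ranks are shifted so that all keys are non-negative (0 means "past the end"), which differs from the -1 convention used in the unoptimized code.

```cpp
#include <string>
#include <vector>

// Stable counting sort of the positions in `order` by key(pos).
// Assumes every key lies in [0, buckets).
template <typename Key>
void counting_sort(std::vector<int>& order, int buckets, Key key) {
    std::vector<int> start(buckets + 1, 0);
    for (int pos : order) ++start[key(pos) + 1];
    for (int b = 0; b < buckets; ++b) start[b + 1] += start[b];
    std::vector<int> sorted(order.size());
    for (int pos : order) sorted[start[key(pos)]++] = pos;
    order.swap(sorted);
}

// Hypothetical radix-sort variant of the doubling method. Returns n + 1
// starting positions; position n is the empty suffix.
std::vector<int> build_sa_radix(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> sa(n + 1), rank(n + 1), tmp(n + 1);
    for (int i = 0; i <= n; ++i) {
        sa[i] = i;
        // Length-1 ranks: character code + 1, and 0 for the empty suffix.
        rank[i] = i < n ? static_cast<unsigned char>(s[i]) + 1 : 0;
    }
    int buckets = 257;  // ranks currently lie in [0, 256]
    for (int k = 1; k <= n; k *= 2) {
        auto first = [&](int i) { return rank[i]; };
        // Second half of the 2k-long prefix; 0 if it starts past the end.
        auto second = [&](int i) { return i + k <= n ? rank[i + k] + 1 : 0; };
        // LSD radix sort: less significant key first, then the main key.
        counting_sort(sa, buckets + 1, second);
        counting_sort(sa, buckets, first);
        // Re-rank: equal (first, second) pairs share a rank.
        tmp[sa[0]] = 0;
        for (int i = 1; i <= n; ++i) {
            const bool differs = first(sa[i - 1]) != first(sa[i]) ||
                                 second(sa[i - 1]) != second(sa[i]);
            tmp[sa[i]] = tmp[sa[i - 1]] + (differs ? 1 : 0);
        }
        rank = tmp;
        buckets = rank[sa[n]] + 1;
        if (buckets == n + 1) break;  // every suffix already has its own rank
    }
    return sa;
}
```

Each round costs time proportional to n plus the number of buckets, and there are at most about \(\log_2 n\) rounds. For the string banana the sketch returns the positions 6, 5, 3, 1, 0, 4, 2.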

Doubling method code:

Unoptimized code

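#include <algorithm>
#include <cstdio>
#include <cstring>
#define MAXN 1000
// No "using namespace std;" here: the global array rank would clash with std::rank.

char str[MAXN];      // input string
int sa[MAXN + 1];    // suffix array; +1 leaves room for the empty suffix
int rank[MAXN + 1];  // rank[i]: rank of the suffix starting at position i (0..n)
int tmp[MAXN + 1];   // scratch buffer for the next round's ranks
int k, n;

// Compare suffixes i and j by (rank[i], rank[i+k]), i.e. by their first 2k characters.
bool cmp_sa(const int &i, const int &j) {
    if (rank[i] != rank[j]) return rank[i] < rank[j];
    // A second half that starts past the end of the string sorts first, so it gets -1.
    int l = i + k <= n ? rank[i + k] : -1;
    int r = j + k <= n ? rank[j + k] : -1;
    return l < r;
}

void build_sa(const char *str, int *sa) {
    n = std::strlen(str);
    // Initial state for length 1: the rank is the character code.
    // The empty suffix sorts first, so it gets -1.
    for (int i = 0; i <= n; i++) {
        sa[i] = i;
        rank[i] = i < n ? str[i] : -1;
    }
    // Use the ranks for length k to compute the ranks for length 2k.
    for (k = 1; k <= n; k *= 2) {
        std::sort(sa, sa + n + 1, cmp_sa);
        tmp[sa[0]] = 0;
        for (int i = 1; i <= n; i++) {  // compute the new ranks
            tmp[sa[i]] = tmp[sa[i - 1]] + (cmp_sa(sa[i - 1], sa[i]) ? 1 : 0);
        }
        for (int i = 0; i <= n; i++) {
            rank[i] = tmp[i];
        }
    }
}

int main() {
    // Read at most MAXN - 1 characters so the terminating '\0' still fits.
    if (std::scanf("%999s", str) != 1) return 1;
    build_sa(str, sa);
    return 0;
}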
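The program above builds the array but prints nothing, so a small checker can help when experimenting with it. The helper below is hypothetical (the name is_valid_suffix_array is not from the original code); it takes the same kind of arguments as build_sa and checks that the positions 0..n each appear once and that the suffixes are in strictly ascending order. Note that strcmp compares characters as unsigned char, so it matches the code above for plain ASCII input.

```cpp
#include <cstring>
#include <vector>

// Hypothetical checker: returns true if sa[0..n] contains every suffix start
// 0..n exactly once and the suffixes are in strictly ascending dictionary order.
bool is_valid_suffix_array(const char* str, const int* sa) {
    const int n = static_cast<int>(std::strlen(str));
    std::vector<bool> seen(n + 1, false);
    for (int i = 0; i <= n; ++i) {
        if (sa[i] < 0 || sa[i] > n || seen[sa[i]]) return false;
        seen[sa[i]] = true;
    }
    for (int i = 1; i <= n; ++i) {
        // Distinct suffixes have distinct lengths, so equality never occurs.
        if (std::strcmp(str + sa[i - 1], str + sa[i]) >= 0) return false;
    }
    return true;
}
```

Calling it as is_valid_suffix_array(str, sa) right after build_sa(str, sa) in main would be one way to use it.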

Problems:

Reference blogs and literature:

https://www.cnblogs.com/jinkun113/p/4743694.html
https://www.cnblogs.com/victorique/p/8480093.html
Programming Contest Challenge Book (2nd Edition)

Origin www.cnblogs.com/--zz/p/11144860.html