Suffix array - POJ 1743


Description

A musical melody is represented as a sequence of N (1 <= N <= 20000) notes that are integers in the range 1..88, each representing a key on the piano. It is unfortunate but true that this representation of melodies ignores the notion of musical timing; but, this programming task is about notes and not timings.
Many composers structure their music around a repeating "theme", which, being a subsequence of an entire melody, is a sequence of integers in our representation. A subsequence of a melody is a theme if it:
  • is at least five notes long 
  • appears (potentially transposed -- see below) again somewhere else in the piece of music 
  • is disjoint from (i.e., non-overlapping with) at least one of its other appearance(s)

Transposed means that a constant positive or negative value is added to every note value in the theme subsequence. 
Given a melody, compute the length (number of notes) of the longest theme. 
One second time limit for this problem's solutions! 

Input

The input contains several test cases. The first line of each test case contains the integer N. The following N integers represent the sequence of notes.
The last test case is followed by one zero. 

Output

For each test case, the output file should contain a single line with a single integer that represents the length of the longest theme. If there are no themes, output 0.

Sample Input

30
25 27 30 34 39 45 52 60 69 79 69 60 52 45 39 34 30 26 22 18
82 78 74 70 66 67 64 60 65 80
0

Sample Output

5

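In short: the melody is given as a sequence of numbers, and we have to find the longest contiguous run of notes that satisfies the following requirements:
1: its length is at least 5
2: it appears more than once in the melody
3: at least two of its occurrences do not overlap
Because a theme may be transposed, what actually has to match is not the notes themselves but the differences between adjacent notes. For example, in the melody 2 4 6 8 10 22 24 26 28 30 the two halves 2 4 6 8 10 and 22 24 26 28 30 satisfy the requirements: they rise in the same way, every adjacent difference being 2.
So the problem is essentially the longest non-overlapping repeated substring problem, applied to the difference sequence.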
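The reduction to differences can be sketched as follows. The function name `toDifferences` is only illustrative and does not appear in the solution further down, which builds the same sequence inline in `main`.

```cpp
#include <cstddef>
#include <vector>

// Convert a melody into the sequence of differences between adjacent notes.
// Two stretches of notes are transpositions of each other exactly when
// their difference sequences are identical.
// (toDifferences is a hypothetical helper name used for illustration.)
std::vector<int> toDifferences(const std::vector<int>& notes) {
    std::vector<int> diff;
    for (std::size_t i = 1; i < notes.size(); ++i) {
        diff.push_back(notes[i] - notes[i - 1]);
    }
    return diff;
}
```

For the melody 2 4 6 8 10 22 24 26 28 30 this yields 2 2 2 2 12 2 2 2 2. The run 2 2 2 2 occurs at index 0 and again at index 5, which corresponds to the two five-note themes.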
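If overlapping occurrences were allowed, the longest repeated substring would be found as follows:
The approach is fairly simple: just take the maximum value in the height array. Finding the longest repeated substring is equivalent to finding the maximum, over all pairs of suffixes, of their longest common prefix. The longest common prefix of any two suffixes is the minimum over some range of the height array, so it can never exceed the maximum of the height array. Therefore the length of the longest repeated substring is exactly the maximum value in the height array. Once the height array is available, this takes O(n) time.
This is taken from Luo Suiqian (罗穗骞)'s IOI2009 Chinese National Training Team paper, "处理字符串的有力工具" ("A Powerful Tool for Processing Strings").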
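As a small illustration of the "maximum of height" idea, here is a sketch that deliberately builds the suffix array the slow way (sorting suffixes directly, roughly O(n^2 log n)) instead of using the doubling algorithm from the solution below. The function name `longestRepeatedOverlapping` is an assumption made for this example.

```cpp
#include <algorithm>
#include <numeric>
#include <string>
#include <vector>

// Length of the longest substring that occurs at least twice in s,
// with overlapping occurrences allowed.
// Naive construction for illustration only: suffixes are sorted by direct
// comparison, and adjacent LCPs (the "height" values) are computed by scanning.
int longestRepeatedOverlapping(const std::string& s) {
    const int n = static_cast<int>(s.size());
    std::vector<int> sa(n);
    std::iota(sa.begin(), sa.end(), 0);
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return s.compare(a, std::string::npos, s, b, std::string::npos) < 0;
    });
    int best = 0;
    for (int i = 1; i < n; ++i) {
        int a = sa[i - 1], b = sa[i], k = 0;
        // k becomes height[i]: the LCP of two suffixes adjacent in sorted order
        while (a + k < n && b + k < n && s[a + k] == s[b + k]) ++k;
        best = std::max(best, k);
    }
    return best;
}
```

For "banana" the sorted suffixes are a, ana, anana, banana, na, nana, the adjacent LCPs are 1, 3, 0, 0, 2, and the result is 3 (the substring "ana", whose two occurrences overlap).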

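Finding the longest repeated substring without overlap is slightly more work. The approach here is to binary search the answer over 0 to n/2 and, once the best value is found, check whether the resulting theme has at least 5 notes. Each step of the binary search fixes a candidate length mid and asks whether a repeated substring of length mid exists whose two occurrences do not overlap. To decide this, walk through the suffixes in sa order and split them into groups at every position where height is less than mid; all suffixes inside one group then share a common prefix of length at least mid. For each group, take the difference between the largest and the smallest sa value, which is the distance between the two occurrences that lie furthest apart. On a plain string the occurrences are disjoint when this distance is at least mid. Here the string is the difference sequence, and mid differences span mid + 1 notes, so the distance must be strictly greater than mid; otherwise the two themes would share a note. If any group passes this test, mid is feasible.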
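The grouping test and the binary search can be sketched in isolation like this. It is a minimal sketch, not the solution below: the suffix array is built by naive sorting, it is 0-indexed with no sentinel, and the names `feasible` and `longestTheme` are assumptions made for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// True if some block of `len` equal differences starts at two positions
// that are more than `len` apart (so the two note ranges are disjoint).
// height[i] is the LCP of the suffixes sa[i-1] and sa[i]; height[0] is unused.
bool feasible(const std::vector<int>& sa, const std::vector<int>& height, int len) {
    const int n = static_cast<int>(sa.size());
    if (n == 0) return false;
    int low = sa[0], high = sa[0];
    for (int i = 1; i < n; ++i) {
        if (height[i] < len) {
            low = high = sa[i];              // a new group starts here
        } else {
            low = std::min(low, sa[i]);
            high = std::max(high, sa[i]);
            if (high - low > len) return true;
        }
    }
    return false;
}

// Longest theme length in notes, or 0 if no theme of at least 5 notes exists.
int longestTheme(const std::vector<int>& notes) {
    std::vector<int> d;
    for (std::size_t i = 1; i < notes.size(); ++i) d.push_back(notes[i] - notes[i - 1]);
    const int n = static_cast<int>(d.size());
    std::vector<int> sa(n), height(n, 0);
    std::iota(sa.begin(), sa.end(), 0);
    // naive suffix sort, for illustration only
    std::sort(sa.begin(), sa.end(), [&](int a, int b) {
        return std::lexicographical_compare(d.begin() + a, d.end(), d.begin() + b, d.end());
    });
    for (int i = 1; i < n; ++i) {
        int a = sa[i - 1], b = sa[i], k = 0;
        while (a + k < n && b + k < n && d[a + k] == d[b + k]) ++k;
        height[i] = k;
    }
    int lo = 0, hi = n / 2;                  // lo is always a feasible length
    while (lo < hi) {
        int mid = (lo + hi + 1) / 2;
        if (feasible(sa, height, mid)) lo = mid; else hi = mid - 1;
    }
    return lo >= 4 ? lo + 1 : 0;             // lo differences = lo + 1 notes
}
```

The binary search is valid because feasibility is monotone: if two occurrences of length mid are more than mid apart, their prefixes of length mid - 1 are more than mid - 1 apart. On the melody 1 2 ... 11 this returns 5 rather than 6, since two six-note themes would have to share the sixth note.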



Code:

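#include <cstdio>
#include <algorithm>   // min, max
using namespace std;
const int maxn = 20007;

// rnk (rather than "rank") avoids a clash with std::rank under "using namespace std".
int sa[maxn],rnk[maxn],height[maxn];
int wa[maxn],wb[maxn],wv[maxn];
int Ws[maxn];
int num[maxn],s[maxn];

// Two suffixes have the same 2l-length key if both halves have equal ranks.
int cmp (int *r, int a,int b, int l){
    return r[a] == r[b] && r[a + l] == r[b + l];
}

// Doubling algorithm. r[0..n-1] must end with a unique smallest sentinel r[n-1] = 0,
// and every value must be less than m.
void get_sa (int *r,int n,int m){
    int p, *x = wa,*y = wb,*t, i, j;
    // counting sort on the first character
    for ( i = 0; i < m; i++) Ws[i] = 0;
    for ( i = 0; i < n; i++) Ws[x[i] = r[i]]++;
    for ( i = 1; i < m; i++) Ws[i] += Ws[i - 1];
    for ( i = n - 1; i>= 0; i--) sa[--Ws[x[i]]] = i;
    for ( j = 1, p = 1; p < n; j *= 2, m = p){
        // y: suffixes ordered by their second key (the rank of suffix i + j)
        for (p = 0, i = n - j; i < n; i++) y[p++] = i;
        for ( i = 0; i < n; i++) if (sa[i] >= j) y[p++] = sa[i] - j;
        // stable counting sort by the first key
        for ( i = 0; i < n; i++) wv[i] = x[y[i]];
        for ( i = 0; i < m; i++) Ws[i] = 0;
        for ( i = 0 ;i < n; i++) Ws[wv[i]]++;
        for ( i = 1 ;i < m; i++) Ws[i] += Ws[i - 1];   // start at 1: Ws[-1] is out of bounds
        for ( i = n - 1; i >= 0; i--) sa[--Ws[wv[i]]] = y[i];
        // recompute ranks for prefixes of length 2j
        for (t = x, x = y, y = t, p = 1, x[sa[0]] = 0,i = 1; i < n; i++){
            x[sa[i]] = cmp(y,sa[i - 1],sa[i],j) ? p - 1 : p++;
        }
    }
}

// n is the length without the sentinel. height[i] is the longest common
// prefix of the suffixes sa[i - 1] and sa[i], for i = 1..n.
void get_height (int *r,int n){
    int k = 0 ,j;
    for (int i = 1 ; i<= n; i++) rnk[sa[i]] = i;
    for (int i = 0; i< n; height[rnk[i++]] = k){
        for (k ? k-- : 0, j = sa[rnk[i] - 1]; r[i + k] == r[j + k]; k++);
    }
}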

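// n is the number of differences. Returns the longest theme length in notes.
int solve (int n){
    int Max = n / 2, Min = 0,mid,flag ;
    while (Min <= Max){
        mid = (Max + Min) / 2;
        int low = sa[1], high = sa[1];
        flag = 0;
        // sa[0] is the sentinel suffix, so the real suffixes are sa[1..n]
        for (int i = 2; i <= n; i++){
            if (height[i] < mid){
                // a new group of suffixes starts here
                low = sa[i];
                high = sa[i];
            }
            else {
                low = min (low,sa[i]);
                high = max (high,sa[i]);
                // mid equal differences cover mid + 1 notes, so the two
                // start positions must be more than mid apart
                if (high - low > mid) {
                    flag = 1;
                    break;
                }
            }
        }
        if (flag) Min = mid + 1;
        else Max = mid - 1;
    }
    // Max differences correspond to Max + 1 notes; a theme needs at least 5
    return Max >= 4 ? Max + 1 : 0;
}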

int main (){
    int n;
    while (~scanf("%d",&n) && n){
        for (int i = 0; i < n; i++){
            scanf("%d",&s[i]);
        }
        if (n < 10) {
            printf("0\n");
            continue ;
        }
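        for (int i = 0; i < n -1; i++){
            // shift the differences (-87..87) into 3..177 so that 0 stays free
            num[i] = (s[i+1] - s[i]) + 90;
        }
        // unique smallest sentinel; reset it because num is reused across test cases
        num[n - 1] = 0;
        int m = 180;
        get_sa(num,n,m);
        get_height(num,n - 1);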
        printf("%d\n",solve(n - 1));
    }
    return 0;
}
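One way to sanity-check a suffix array solution like the one above on small inputs is to compare it against a brute force. The following is a sketch under that assumption: an O(n^2) dynamic program over pairs of positions in the difference sequence, which is far too slow for N = 20000 within one second but easy to trust. The name `longestThemeBruteForce` is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Brute-force reference: longest theme length in notes, or 0 if none.
// cur[j] holds the LCP of the difference suffixes starting at i and j (i < j),
// computed from next[j + 1], the LCP of the suffixes at i + 1 and j + 1.
int longestThemeBruteForce(const std::vector<int>& notes) {
    std::vector<int> d;
    for (std::size_t i = 1; i < notes.size(); ++i) d.push_back(notes[i] - notes[i - 1]);
    const int n = static_cast<int>(d.size());
    std::vector<int> next(n + 1, 0), cur(n + 1, 0);
    int best = 0;
    for (int i = n - 1; i >= 0; --i) {
        for (int j = n - 1; j > i; --j) {
            cur[j] = (d[i] == d[j]) ? next[j + 1] + 1 : 0;
            // the note ranges are disjoint only if the matched length
            // is at most j - i - 1 differences
            best = std::max(best, std::min(cur[j], j - i - 1));
        }
        std::swap(cur, next);
    }
    return best >= 4 ? best + 1 : 0;
}
```

Running both versions on random short melodies and comparing the outputs is a cheap way to catch off-by-one mistakes in the overlap condition.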

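My own understanding of this is not very deep yet and I am still learning; I am writing it up as a blog post so that I can come back and think it through again later.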
