[Data Structures] Strings (sequential storage and block-chain storage)


Preface

A string (character string) is a special linear list whose data elements are single characters. In practice, most of the non-numeric data that programs process is string data. For example, in transaction processing, a customer's name and address and the origin of goods are generally treated as strings. Operations usually act on the string as a whole rather than on its individual elements.


One, the definition of a string

Definition of string:

A string is a sequence of characters composed of zero or more arbitrary characters.

Usually written as: s = "a1a2a3...an" (n ≥ 0)

  • s is the name of the string, and the value enclosed in double quotes is the string value.
  • The number n of characters in the string is called the length of the string.
  • A string with zero characters is called the empty (null) string. Its length is 0, and it is written ∅.

Blank string: a string consisting of one or more space characters. Its length is the number of spaces, so it is not the same as the empty string.

Substring and main string: any run of consecutive characters in a string is called a substring of that string. The string that contains the substring is called the main string.

Position of a substring: the index, within the main string, of the substring's first character.

Equal strings: two strings are equal when they have the same length and the characters at corresponding positions are the same.
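
The definitions above can be checked with a few lines of code. The following is only a sketch that uses `std::string` for convenience; the helper names `Length`, `Position`, `Equal`, and `CheckDefinitions` are made up for this illustration, and positions are counted from 1 to match the text.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Length of a string: the number of characters it contains.
std::size_t Length(const std::string &s) { return s.size(); }

// 1-based position of the first occurrence of sub in main_str,
// or 0 if sub is not a substring of main_str.
std::size_t Position(const std::string &main_str, const std::string &sub)
{
    std::size_t i = main_str.find(sub);          // 0-based index or npos
    return i == std::string::npos ? 0 : i + 1;
}

// Equal strings: same length and the same character at every position.
bool Equal(const std::string &a, const std::string &b)
{
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i)
        if (a[i] != b[i]) return false;
    return true;
}

// Walks through the definitions on the main string "data structure".
void CheckDefinitions()
{
    const std::string s = "data structure";
    assert(Length(s) == 14);
    assert(Length("") == 0);               // empty string
    assert(Length("   ") == 3);            // blank string: three spaces, not empty
    assert(Position(s, "struct") == 6);    // substring position
    assert(Position(s, "tree") == 0);      // not a substring
    assert(Equal("abc", "abc") && !Equal("abc", "abd") && !Equal("abc", "ab"));
}
```

Note that the blank string of three spaces has length 3, while the empty string has length 0.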


Two, the storage structure of strings

1. The sequential storage structure of the string

The sequential storage structure stores the characters of a string in a run of memory cells with consecutive addresses. Each string variable is allocated a fixed-length storage area of a predefined size, usually declared as a fixed-length array.

The actual length of a string may be anything up to the predefined maximum. Characters beyond that maximum are discarded, which is called "truncation".
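
To make truncation concrete, here is a minimal sketch of assigning a C string to a fixed-length string. The capacity `kCap = 8`, the type `FixedStr`, and the functions `Assign` and `DemoAssign` are assumptions chosen for the illustration; cell 0 is assumed to hold the length.

```cpp
#include <cassert>
#include <cstring>

const int kCap = 8;                        // hypothetical small capacity so truncation is easy to see
typedef unsigned char FixedStr[kCap + 1];  // cell 0 = length, cells 1..kCap = characters

// Copies the C string src into dst.
// Returns 1 if everything fit, 0 if src had to be truncated.
int Assign(FixedStr &dst, const char *src)
{
    assert(src != nullptr);
    int n = (int)std::strlen(src);
    int keep = (n <= kCap) ? n : kCap;     // characters beyond kCap are discarded
    for (int i = 0; i < keep; ++i)
        dst[i + 1] = (unsigned char)src[i];
    dst[0] = (unsigned char)keep;
    return n <= kCap;
}

// Shows one assignment that fits and one that is truncated.
void DemoAssign()
{
    FixedStr s;
    assert(Assign(s, "abc") == 1 && s[0] == 3);
    assert(Assign(s, "abcdefghij") == 0);  // 10 characters, capacity 8
    assert(s[0] == 8 && s[8] == 'h');      // "ij" was discarded
}
```

The return value lets the caller find out that characters were lost, which the stored string alone cannot show.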

2. The fixed-length sequential storage and operation of strings

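/* Definition 1 */
#define MAXSIZE 255
typedef unsigned char SString[MAXSIZE+1]; // cell 0 stores the length; unsigned so that lengths up to 255 fit

// Definition 2
/*
typedef struct
{
    char data[MAXSIZE];
    int curlen;   // current length of the string
}SeqString;
SeqString s;
*/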

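/* String concatenation */
/*
Assume S1, S2, and T are all variables of type SString, and T is obtained by concatenating S2 onto S1.
Depending on the lengths of S1 and S2, there are three cases for T:
(1) S1[0]+S2[0] <= MAXSIZE: T holds all of S1 followed by all of S2
(2) S1[0] < MAXSIZE and S1[0]+S2[0] > MAXSIZE: S2 is partially truncated
(3) S1[0] >= MAXSIZE: T contains only S1
*/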
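int Concat(SString &T,SString S1,SString S2)
{
    int flag=1; // truncation flag: 1 = not truncated, 0 = truncated
    int j,k=1;
    // An SString keeps its length in cell 0 and has no '\0' terminator, so copy by length.
    // The cast keeps lengths above 127 valid even where plain char is signed.
    int len1=(unsigned char)S1[0];
    int len2=(unsigned char)S2[0];
    if(len1+len2<=MAXSIZE){ // case 1: both strings fit
        for(j=1;j<=len1;j++){
            T[k++]=S1[j];
        }
        for(j=1;j<=len2;j++){
            T[k++]=S2[j];
        }
        T[0]=len1+len2;
    }
    else if(len1<MAXSIZE){ // case 2: S2 is partially truncated
        for(j=1;j<=len1;j++){
            T[k++]=S1[j];
        }
        for(j=1;k<=MAXSIZE;j++){ // fill cells up to and including MAXSIZE
            T[k++]=S2[j];
        }
        T[0]=MAXSIZE;
        flag=0;
    }
    else{ // case 3: S1 already fills T, so S2 is dropped entirely
        for(j=1;j<=MAXSIZE;j++){
            T[k++]=S1[j];
        }
        T[0]=MAXSIZE;
        flag=0;
    }
    return flag;
}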
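
The three cases can also be expressed by clamping how many characters of S2 are copied. The sketch below is one possible variant, not the routine above: it assumes a tiny capacity `kMax = 8` in place of MAXSIZE so that truncation shows up with short inputs, and the names `TinyStr`, `Load`, `ConcatClamped`, and `DemoConcat` are hypothetical.

```cpp
#include <cassert>
#include <cstring>

const int kMax = 8;                        // hypothetical capacity, stands in for MAXSIZE
typedef unsigned char TinyStr[kMax + 1];   // cell 0 = length, cells 1..kMax = characters

// Demo helper: loads a C string that is known to fit.
void Load(TinyStr &s, const char *src)
{
    int n = (int)std::strlen(src);
    assert(n <= kMax);
    s[0] = (unsigned char)n;
    for (int i = 0; i < n; ++i) s[i + 1] = (unsigned char)src[i];
}

// Returns 1 if T holds all of S1 followed by all of S2, 0 if S2 was cut.
// T must not be the same object as S2.
int ConcatClamped(TinyStr &T, const TinyStr &S1, const TinyStr &S2)
{
    int n1 = S1[0], n2 = S2[0];
    int take2 = (n1 + n2 <= kMax) ? n2 : kMax - n1;   // how much of S2 fits
    for (int i = 1; i <= n1; ++i) T[i] = S1[i];
    for (int i = 1; i <= take2; ++i) T[n1 + i] = S2[i];
    T[0] = (unsigned char)(n1 + take2);
    return take2 == n2;
}

// Exercises the three cases described in the text.
void DemoConcat()
{
    TinyStr a, b, t;
    Load(a, "abc");      Load(b, "de");
    assert(ConcatClamped(t, a, b) == 1 && t[0] == 5 && t[5] == 'e');  // case 1
    Load(a, "abcdef");   Load(b, "ghij");
    assert(ConcatClamped(t, a, b) == 0 && t[0] == 8 && t[8] == 'h');  // case 2
    Load(a, "abcdefgh"); Load(b, "x");
    assert(ConcatClamped(t, a, b) == 0 && t[0] == 8 && t[8] == 'h');  // case 3
}
```

Because a stored length can never exceed the capacity, `kMax - n1` is never negative, and case 3 is simply the situation where it equals zero.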

3. Block-chain storage structure of strings

A string can also be stored in a linked structure, much like a linear list. Because every data element is a single character, storing one character per node wastes a great deal of space: the pointer in each node takes more room than the character itself. A node can therefore hold either one character or a block of several characters. When the last node is not full, its unused positions are padded with "#" or some other character that cannot appear in the string value.

#define CHUNKSIZE 80 // block size: number of characters per node
typedef struct chunk{
    char ch[CHUNKSIZE];
    struct chunk *next;
}chunk;

typedef struct{
    chunk *head,*tail; // head and tail pointers of the string
    int curlen;        // current length of the string
}LString;
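
As an illustration of how such a chain might be filled, the sketch below builds a block chain from a C string and pads the last node with '#'. It assumes a small block size `kChunk = 4` instead of 80 so that the padding is visible, and the names `Node`, `BlockStr`, `Build`, `NodeCount`, and `Destroy` are made up for this example.

```cpp
#include <cassert>
#include <cstring>

const int kChunk = 4;            // hypothetical small block size (the text uses 80)

struct Node {
    char ch[kChunk];
    Node *next;
};

struct BlockStr {
    Node *head, *tail;           // first and last node
    int curlen;                  // number of real characters, padding excluded
};

// Builds a block chain holding src; unused cells of the last node are set to '#'.
void Build(BlockStr &s, const char *src)
{
    assert(src != nullptr);
    s.head = s.tail = nullptr;
    s.curlen = (int)std::strlen(src);
    for (int i = 0; i < s.curlen; i += kChunk) {
        Node *p = new Node;
        p->next = nullptr;
        for (int k = 0; k < kChunk; ++k)
            p->ch[k] = (i + k < s.curlen) ? src[i + k] : '#';
        if (s.tail) s.tail->next = p; else s.head = p;
        s.tail = p;
    }
}

// Number of nodes in the chain.
int NodeCount(const BlockStr &s)
{
    int n = 0;
    for (const Node *p = s.head; p; p = p->next) ++n;
    return n;
}

// Releases every node and resets the string to empty.
void Destroy(BlockStr &s)
{
    while (s.head) { Node *p = s.head; s.head = p->next; delete p; }
    s.tail = nullptr;
    s.curlen = 0;
}
```

With these assumptions, building "abcdef" yields two nodes: the first holds "abcd" and the second holds "ef##".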

In most string operations a single scan from beginning to end is enough, so there is no need to build a doubly linked list for the string value.

The tail pointer is kept to make concatenation convenient. Note, however, that the padding characters at the end of the first string must be dealt with when concatenating.
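
One way to deal with the padding, sketched here under assumptions, is to append the second string character by character through the tail pointer, so that any '#' in the first string's last node is overwritten before a new node is allocated. The block size `kBlk = 4` and the names `Blk`, `ChainStr`, `Append`, `ConcatInto`, `At`, and `Clear` are hypothetical. Linking the two chains directly would be faster but would leave padding in the middle of the result.

```cpp
#include <cassert>

const int kBlk = 4;              // hypothetical small block size

struct Blk { char ch[kBlk]; Blk *next; };
struct ChainStr { Blk *head = nullptr; Blk *tail = nullptr; int curlen = 0; };

// Appends one character. curlen % kBlk is the first unused cell of the tail
// node, so a padding '#' there is overwritten; a new node is only allocated
// when the chain is empty or the tail node is full.
void Append(ChainStr &s, char c)
{
    int off = s.curlen % kBlk;
    if (off == 0) {
        Blk *p = new Blk;
        for (int k = 0; k < kBlk; ++k) p->ch[k] = '#';   // pad the fresh node
        p->next = nullptr;
        if (s.tail) s.tail->next = p; else s.head = p;
        s.tail = p;
    }
    s.tail->ch[off] = c;
    ++s.curlen;
}

// Appends a copy of src to dst; only the real characters of src are copied.
void ConcatInto(ChainStr &dst, const ChainStr &src)
{
    assert(&dst != &src);        // self-append is not handled in this sketch
    int left = src.curlen;
    for (const Blk *p = src.head; p && left > 0; p = p->next)
        for (int k = 0; k < kBlk && left > 0; ++k, --left)
            Append(dst, p->ch[k]);
}

// Character at 0-based index i.
char At(const ChainStr &s, int i)
{
    assert(i >= 0 && i < s.curlen);
    const Blk *p = s.head;
    for (; i >= kBlk; i -= kBlk) p = p->next;
    return p->ch[i];
}

// Releases every node and resets the string to empty.
void Clear(ChainStr &s)
{
    while (s.head) { Blk *p = s.head; s.head = p->next; delete p; }
    s.tail = nullptr;
    s.curlen = 0;
}
```

For example, appending "xyz" to "abcde" fills the three padded cells of the second node, so the result still occupies two nodes and contains no '#' between the two strings.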

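Storage density:

storage density = storage occupied by the string value / storage actually allocated

With one character per node the density is low, because most of each node is taken up by the pointer. Larger blocks raise the density.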
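
The effect of block size on storage density can be worked out numerically. The sketch below takes storage density to be the bytes occupied by the string value divided by the bytes actually allocated, which is the usual textbook definition. It assumes one byte per character, ignores any alignment padding inside a node, and takes the pointer size as a parameter; the function names `AllocatedBytes` and `Density` are made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Bytes allocated for a block chain that stores len characters with
// chunk characters per node and ptr_bytes bytes for each next pointer.
std::size_t AllocatedBytes(std::size_t len, std::size_t chunk, std::size_t ptr_bytes)
{
    assert(chunk > 0);
    std::size_t nodes = (len + chunk - 1) / chunk;    // ceil(len / chunk)
    return nodes * (chunk + ptr_bytes);
}

// Storage density = bytes of string value / bytes actually allocated.
// An empty string allocates nothing, so its density is reported as 0.
double Density(std::size_t len, std::size_t chunk, std::size_t ptr_bytes)
{
    std::size_t total = AllocatedBytes(len, chunk, ptr_bytes);
    return total == 0 ? 0.0 : (double)len / (double)total;
}
```

For an 80-character string and an assumed 4-byte pointer, one character per node allocates 400 bytes (density 0.2), four characters per node allocates 160 bytes (density 0.5), and a single 80-character block allocates 84 bytes (density about 0.95).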

 

Origin blog.csdn.net/Jacky_Feng/article/details/108601074