【Data Structure Learning Record 10】——String

One. Brief introduction

1. Introduction

The book says: "Processing string data is much more complicated than processing integers or floating-point numbers, and the strings used in different kinds of applications have different characteristics. To process string data effectively, you must therefore choose a storage structure that suits the specific situation." I think this is the point of this chapter.

2. Definition and properties

A string is a finite sequence of zero or more characters, generally written as:
s = 'a1 a2 ... an' (n >= 0)
Here s is the name of the string, and the character sequence enclosed in single quotes is the value of the string. Each ai can be a letter or any other character. The number of characters n is called the length of the string. A string with zero characters is called the empty string.
Within a string, a sequence made up of any number of consecutive characters is called a substring of that string, and the string that contains the substring is correspondingly called the main string. The sequence number of a character within the string is called its position in the string, and the position of a substring is the position of its first character.
For example:
s = 'kanna'
b = 'anna'
c = 'kan'
Then b and c are both substrings of s, and the lengths of s, b, and c are 5, 4, and 3 respectively. The position of b in s is 2, and the position of c in s is 1.
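
The example can be checked with a short C sketch. It relies on the standard strlen and strstr functions; the helper name position_of is hypothetical and not part of the article's code. It returns the 1-based position used in this section, or 0 when there is no match.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 1-based position of sub in s, or 0 if sub does not occur. */
size_t position_of(const char *s, const char *sub)
{
    assert(s != NULL && sub != NULL);
    const char *hit = strstr(s, sub);        /* first occurrence, or NULL */
    return hit ? (size_t)(hit - s) + 1 : 0;  /* convert the 0-based offset to a 1-based position */
}
```

Calling position_of("kanna", "anna") gives the position of b in s, and strlen gives the lengths. Note that the code later in this article counts positions from 0 instead.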

Two. Implementation

1. Overview of the string operations

Besides the basic create, delete, update, and query operations, we need to design a few more functions: get the length of a string, return the substring at a specified position with a specified length, search for a string...

2. The structure of the string

The book also describes a structure that uses an array to represent a string, but it is similar to our heap-allocated (malloc) version, so we focus on the latter.
Our string structure has two parts: a base address area that stores the text, and the length.
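
As a minimal sketch of this layout, the two parts map onto a struct with a pointer and an integer. The field names follow the listing in part Three; the helpers string_new_empty and string_is_empty are hypothetical and only illustrate that the length is stored explicitly instead of relying on a '\0' terminator.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct sstring {
    char *ch;   /* base address of the heap block holding the characters */
    int   len;  /* number of characters stored; no '\0' terminator is kept */
} sstring;

/* Hypothetical helper: allocate an empty string (no text buffer, length 0).
   Returns NULL if the allocation fails. */
sstring *string_new_empty(void)
{
    sstring *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->ch = NULL;
    s->len = 0;
    return s;
}

/* Hypothetical helper: a string is empty exactly when its length is 0. */
int string_is_empty(const sstring *s)
{
    assert(s != NULL);
    return s->len == 0;
}
```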

3. Initialization of the string

We can initialize a string from a C-style character array or a string literal. A C-style string has one notable feature: a '\0' at the end marks where the string ends. So we traverse the input to get its length, and then copy the characters one by one to initialize our new string.
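
The same idea can be sketched with the standard library doing the traversal and the copy. This is an alternative to the hand-written loops, not the article's own implementation: the name string_from_cstr is hypothetical, and unlike the listing in part Three it returns NULL on allocation failure instead of exiting.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

typedef struct sstring { char *ch; int len; } sstring;

/* Hypothetical variant of initialization built on strlen and memcpy. */
sstring *string_from_cstr(const char *src)
{
    sstring *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->ch = NULL;
    s->len = 0;
    if (src == NULL || src[0] == '\0') return s;   /* empty string */

    size_t n = strlen(src);                /* characters before the '\0' */
    assert(n <= (size_t)INT_MAX);          /* len is an int in this layout */
    s->ch = malloc(n);
    if (s->ch == NULL) { free(s); return NULL; }
    memcpy(s->ch, src, n);                 /* copy the text, not the terminator */
    s->len = (int)n;
    return s;
}
```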

4. "Addition" of strings

Concatenating strings means keeping the content of string 1, appending the content of string 2 after it, and setting the new length to the length of string 1 plus the length of string 2.
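
One possible way to sketch this is to grow the buffer of string 1 with realloc and copy string 2 into the new tail. This is a swapped-in technique, assuming the text buffer was obtained from malloc (or is NULL); the listing in part Three allocates a fresh buffer instead. The name string_append is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct sstring { char *ch; int len; } sstring;

/* Hypothetical helper: append src to dst in place.
   Returns 1 on success, 0 on allocation failure (dst is left unchanged). */
int string_append(sstring *dst, const sstring *src)
{
    assert(dst != NULL && src != NULL);
    int n = src->len;
    if (n == 0) return 1;                      /* nothing to append */

    char *grown = realloc(dst->ch, (size_t)dst->len + (size_t)n);
    if (grown == NULL) return 0;
    dst->ch = grown;                           /* if src == dst, src->ch now refers to grown too */
    memcpy(grown + dst->len, src->ch, (size_t)n);
    dst->len += n;                             /* new length = len1 + len2 */
    return 1;
}
```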

5. String comparison

We first compare the lengths of the two strings. If the lengths differ, the two strings must be different. If the lengths are the same, we compare the first characters, then the second characters, and so on. At the first position where the characters differ, the strings are not equal; if we reach the end without finding a difference, the two strings are the same. In the code in part Three, StringCompare returns 0 for equal strings and 1 or -1 otherwise.

6. Destroying a string

Because our strings are dynamically allocated, we can simply free them: first the text buffer, then the structure itself.
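
The listing in part Three does not include a destroy function, so the following is only a sketch of what one might look like. The name string_destroy and the choice to take a pointer to the caller's pointer (so it can be reset to NULL) are assumptions, not the article's design.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct sstring { char *ch; int len; } sstring;

/* Hypothetical helper: release a heap-allocated string and clear the caller's pointer. */
void string_destroy(sstring **sp)
{
    assert(sp != NULL);
    if (*sp == NULL) return;   /* already destroyed or never created */
    free((*sp)->ch);           /* text buffer first; free(NULL) is a no-op */
    free(*sp);                 /* then the structure itself */
    *sp = NULL;                /* the caller no longer holds a dangling pointer */
}
```

Calling it twice on the same variable is harmless, because the second call sees NULL and returns.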

7. Return a substring

We again move our pointer over the source string, and return a new string whose characters are assigned one by one.
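
A minimal sketch of substring extraction, assuming a 0-based index as in the listing in part Three (the definition in part One counts positions from 1). The name string_sub is hypothetical, and the range check is stricter than the original because it also rejects negative arguments.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct sstring { char *ch; int len; } sstring;

/* Hypothetical helper: copy len characters of s starting at 0-based index.
   Returns NULL if the range is invalid or allocation fails. */
sstring *string_sub(const sstring *s, int index, int len)
{
    assert(s != NULL);
    if (index < 0 || len < 0 || index > s->len - len) return NULL;  /* out of range */

    sstring *r = malloc(sizeof *r);
    if (r == NULL) return NULL;
    r->ch = NULL;
    r->len = len;
    if (len > 0) {
        r->ch = malloc((size_t)len);           /* only len bytes are needed */
        if (r->ch == NULL) { free(r); return NULL; }
        memcpy(r->ch, s->ch + index, (size_t)len);
    }
    return r;
}
```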

8. Find a string

There are many ways to find a string. Here we use a search algorithm whose time complexity is on the order of n*m. Optimizing it is the topic of the next section. The basic algorithm is as follows:
There are two pointers: the selection pointer a and the matching pointer b.
The selection pointer a picks a starting position first, and then the matching pointer b starts moving from there. If b advances across the whole length of the search string and every character matches, then the position of a is where the string starts. If b hits a mismatch before covering that length, the match at this position fails: we do a++, reset b = a, and move b again.

Three. Code

#include <stdio.h>
#include <stdlib.h>

#define     OK          1
#define     ERROR       0

typedef struct sstring{
    char *ch;   // base address of the stored text (no trailing '\0')
    int len;    // number of characters stored
}sstring;

sstring* StringInit(char* str);
int StringShow(sstring *str);
int StringConcat(sstring* str1, sstring* str2);
int StringCompare(sstring *str1, sstring *str2);
sstring *StringGet(sstring *str, int index, int len);
int StringFind(sstring *bstr, sstring *mstr);

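int main()
{
    // Change the main function freely to try the other operations
    sstring *test1 = StringInit("bbcd");
    sstring *test2 = StringInit("bcd");
    //StringConcat(test1, test2);
    //printf("%d", StringCompare(test1, test2));
    //StringShow(StringGet(test2,0,3));
    printf("%d", StringFind(test1,test2));
    // Release both strings: the text buffer first, then the structure
    free(test1->ch);
    free(test1);
    free(test2->ch);
    free(test2);
    return 0;
}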

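sstring* StringInit(char* ss)
{
    int lenth = 0;
    // First create the string to return
    sstring *str = (sstring*)malloc(sizeof(sstring));

    // Allocation failed, exit directly
    if (str == NULL) exit(1);
    // If a NULL pointer is passed in, return an empty string
    if (ss == NULL)
    {
        str->ch = NULL;
        str->len = 0;
        return str;
    }
    // Traverse the input to get the length of the part before '\0'
    while(*(ss + lenth) != '\0')
    {
        ++lenth;
    }
    // Set the length of our string and dynamically allocate its storage
    str->len = lenth;
    str->ch = (char*)malloc(sizeof(char)*lenth);
    // Allocation failed (a zero-size request may legitimately return NULL)
    if (str->ch == NULL && lenth > 0) exit(1);
    --lenth;
    // Copy the content of the C string into our new string, from back to front
    while(lenth >= 0)
    {
        *(str->ch+lenth) = *(ss+lenth);
        --lenth;
    }

    return str;
}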

int StringShow(sstring *str)
{
    int ptr = 0;
    // Print the length, then the characters one by one (there is no '\0' to rely on)
    printf("the string len is %d content is: ", str->len);
    while(ptr < str->len)
    {
        printf("%c", *(str->ch + ptr));
        ++ptr;
    }
    printf("\n");
    return OK;
}

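int StringConcat(sstring* str1, sstring* str2)
{
    sstring* stringNew = NULL;
    int ptr = 0;
    // If both strings have length 0, just return
    if (str1->len + str2->len == 0)
    {
        return OK;
    }
    // Otherwise create the new string first and set its length and content
    stringNew = (sstring*)malloc(sizeof(sstring));
    if (stringNew == NULL) return ERROR;
    stringNew->ch = (char*)malloc(sizeof(char)*(str1->len+str2->len));
    if (stringNew->ch == NULL)
    {
        free(stringNew);
        return ERROR;
    }
    stringNew->len = str1->len+str2->len;
    // Write the content of str1 into the new string
    for(;ptr < str1->len; ++ptr)
    {
        *(stringNew->ch+ptr) = *(str1->ch+ptr);
    }
    // After the content of str1, write the content of str2
    for(ptr = 0;ptr < str2->len; ++ptr)
    {
        *(stringNew->ch+ptr+str1->len) = *(str2->ch+ptr);
    }

    // This part is a bit of a trap: the pointer passed in is a by-value parameter, not a reference,
    // so we can only assign the value of the new string to the original string.
    // Afterwards the address of the string passed in is unchanged, but len and the address in ch have changed.
    // Free the old text buffer first, and the temporary structure afterwards, so nothing leaks.
    free(str1->ch);
    *str1 = *stringNew;
    free(stringNew);
    return OK;
}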

int StringCompare(sstring *str1, sstring *str2)
{
    int i = 0;

    // The lengths differ, so the lengths decide the relationship
    if (str1->len > str2->len)
    {
        return 1;
    }
    else if (str1->len < str2->len)
    {
        return -1;
    }
    else
    {
        // The lengths are equal, so compare character by character
        for (; i < str1->len; ++i)
        {
            // As soon as one character differs, return the ordering given by the character codes
            if (*(str1->ch+i) < *(str2->ch+i))
            {
                return -1;
            }
            else if (*(str1->ch+i) > *(str2->ch+i))
            {
                return 1;
            }
        }
        // The loop finished without finding a difference, so the two strings are the same
        return 0;
    }
}

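sstring *StringGet(sstring *str, int index, int len)
{
    sstring *rstr = NULL;
    int i = 0;

    // If the range is invalid or the target string is shorter than what we ask for, return NULL
    if (index < 0 || len < 0 || str->len < index+len)
    {
        return NULL;
    }
    else
    {
        // Dynamically create the string to return; only len characters are needed
        rstr = (sstring *)malloc(sizeof(sstring));
        if (rstr == NULL) return NULL;
        rstr->ch = (char *)malloc(sizeof(char)*len);
        if (rstr->ch == NULL && len > 0)
        {
            free(rstr);
            return NULL;
        }
        rstr->len = len;
        // Then copy the characters from the target string into the returned string
        for (i = 0; i < len; ++i)
        {
            *(rstr->ch+i) = *(str->ch+index+i);
        }
        return rstr;
    }
}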

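int StringFind(sstring *bstr, sstring *mstr)
{
    int fptr = 0, lptr = 0;
    int mark = 0;

    // If the string to find is longer than the target string, it cannot be found, so return -1
    if (bstr->len < mstr->len)
    {
        return -1;
    }
    // lptr is the candidate start position in the target string
    // It only needs to go from 0 to (target length - search length)
    for (;lptr <= (bstr->len-mstr->len); ++lptr)
    {
        // mark is a flag: 1 if a difference was found, otherwise it stays 0
        mark = 0;
        // fptr is the matching pointer; the character to compare is at lptr+fptr
        // Its range is 0 to (search length - 1)
        for (fptr = 0; fptr < mstr->len; ++fptr)
        {
            // Compare the character at lptr+fptr with the character at fptr
            if (*(bstr->ch+lptr+fptr) != *(mstr->ch+fptr))
            {
                // A difference: set the flag and leave this round of fptr
                mark = 1;
                break;
            }
        }
        // fptr finished without any difference, so we found a match
        if (mark == 0)
        {
            // Return lptr, the 0-based start position of the match
            return lptr;
        }
    }
    // Searched the whole string without a match, so return -1
    return -1;
}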


Origin blog.csdn.net/u011017694/article/details/109531835