Data Structure Notes - String

With withered eyes I gaze afar, across mountains and waters;
among all who come and go, how many have known my heart?
The pot is empty, and I dread pouring even one cup of wine;
under my brush, no harmonious verse will come.
The road keeps us apart, and we have long been separated;
with no wild geese to carry it, news comes back late.
By a lonely lamp I keep watch through the long, desolate night,
a husband longing for his wife, a father longing for his child.

This poem was written by Li Yu of the Song Dynasty to express how much he missed his wife and son. Remarkably, the poem can also be read backwards. This kind of poetry is called a palindrome poem. English words hold a similar kind of magic:

"Even a lover has an over, even a friend has an end, and even a believe has a lie." You will find that two words that seem unrelated, or even opposite, turn out to have a curious connection.

Today we talk about strings, which are what words and sentences are made of.

String definition

String: a finite sequence of zero or more characters, also called a character string.

Generally written as s = "a_1 a_2 a_3 ... a_n" (n >= 0), where s is the name of the string and the character sequence enclosed in double quotes is the value of the string; note that the quotes are not part of the string's content. Each a_i can be a letter, digit, or other character, and i is the position of that character in the string. The number of characters n is called the length of the string, and "finite" in the definition means that the length n is a finite value. A string of zero characters is called an empty string; its length is 0, and it can be written directly as two double quotation marks, "".

There are a few more concepts to be aware of:

  • A space string is a string that contains only spaces. Note the difference from an empty string: a space string has content and a length, and it may contain more than one space.
  • Substring and main string: a subsequence made up of any number of consecutive characters in a string is called a substring of that string, and the string containing the substring is correspondingly called the main string.
  • The position of a substring in the main string is the position of the first character of the substring in the main string.
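
These distinctions can be checked directly. The following is a minimal Python sketch; the variable names are illustrative only. Note that Python's `str.find` returns a 0-based index, so 1 is added to match the 1-based positions used in these notes.

```python
empty = ""        # empty string: zero characters
spaces = "   "    # space string: three space characters

print(len(empty))   # 0
print(len(spaces))  # 3

main = "data structure"
sub = "struct"

# A substring is a run of consecutive characters of the main string.
print(sub in main)  # True

# str.find is 0-based; add 1 to get the 1-based position used in the text.
position = main.find(sub) + 1
print(position)  # 6
```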

String comparison

Comparing two numbers is straightforward: 2 is greater than 1. But how are two strings compared?

In fact, strings are compared through the encodings of the characters that make them up, where the encoding of a character is its ordinal number in the corresponding character set.

Characters commonly used in computers are encoded in ASCII. Standard ASCII represents a character with a 7-bit binary number, for a total of 128 characters; extended ASCII uses 8 bits and can represent 256 characters. This is enough only for English-based languages and some special symbols, but the world has thousands upon thousands of characters, so this is obviously insufficient. Unicode was therefore proposed. Its original design used a 16-bit binary number per character, which can represent more than 65,000 characters (later versions extended the range further), and its first 128 code points are exactly the same as standard ASCII.
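
As a small illustration of "the ordinal number of a character", Python's built-in `ord` and `chr` convert between a character and its Unicode code point. This is only a sketch for inspecting encodings, not part of the original notes.

```python
# ord() gives the code point (ordinal number) of a character; chr() is the inverse.
print(ord("A"))   # 65
print(ord("a"))   # 97
print(chr(66))    # B

# Characters with code points below 128 are plain ASCII and encode to one byte.
print("A".encode("ascii"))  # b'A'

# A character outside ASCII has a larger code point.
print(ord("中"))  # 20013

# Because 'Z' (90) comes before 'a' (97), uppercase letters sort first.
print("Z" < "a")  # True
```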

In C, two strings are considered equal only when their lengths are equal and the characters at every corresponding position are equal.

So how do we decide which of two strings is larger when they are not equal?

Given two strings s = "a_1 a_2 a_3 ... a_n" and t = "b_1 b_2 b_3 ... b_m", s < t when either of the following conditions is met:

  1. n < m, and a_i = b_i for every i (1 <= i <= n); that is, s is a proper prefix of t;
  2. There exists some k <= min(m, n) such that a_i = b_i for i from 1 to k-1, and a_k < b_k.

Conversely, two strings are equal only when their lengths are equal and the characters at all corresponding positions are equal.
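
The ordering above is ordinary lexicographic comparison, and it can be sketched in a few lines of Python. The function name `less_than` is made up for this illustration. Python's built-in `<` on strings follows the same rule, so it serves as a cross-check.

```python
def less_than(s: str, t: str) -> bool:
    """Return True if s < t under lexicographic (character-code) order."""
    n, m = len(s), len(t)
    # Look for the first position where the characters differ.
    for k in range(min(n, m)):
        if s[k] != t[k]:
            # Condition 2: everything before k is equal, so compare a_k with b_k.
            return ord(s[k]) < ord(t[k])
    # No difference in the common part.
    # Condition 1: s < t only if s is a proper prefix of t (n < m).
    return n < m

print(less_than("hap", "happy"))     # True  (condition 1)
print(less_than("happen", "happy"))  # True  (condition 2: 'e' < 'y')
print(less_than("happy", "happy"))   # False (the strings are equal)
```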

String storage structure

1. Sequential storage structure

The sequential storage structure of a string uses a group of storage units with consecutive addresses to store the character sequence of the string. Each string variable is allocated a fixed-length storage area of a predefined size, generally implemented as a fixed-length array.

Since it is a fixed-length array, there is a predefined maximum string length. The actual length of the string is generally stored at subscript 0 of the array. Alternatively, it can be kept at the end of the array, or, as in C, a terminator character such as '\0' that is not counted in the length can be placed after the string value.


The sequential storage described above has a problem: string operations such as concatenating two strings or inserting a new string may make the string longer than the array.

For this reason, sequential storage can be improved by allocating the storage space for the string value dynamically during program execution.
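
The fixed-length layout and its overflow problem can be sketched as follows. This is an illustration only: `MAX_SIZE`, `make_string`, and `concat` are hypothetical names, a Python list stands in for the fixed-length array, and silently dropping the overflowing characters is just one possible policy.

```python
MAX_SIZE = 8  # hypothetical maximum string length for this sketch

def make_string(text: str) -> list:
    """Store text in a fixed-length array; index 0 holds the actual length."""
    chars = list(text)[:MAX_SIZE]               # truncate if text is too long
    array = [len(chars)] + chars
    array += [None] * (MAX_SIZE - len(chars))   # unused slots
    return array

def concat(a: list, b: list) -> list:
    """Concatenate two stored strings; characters beyond MAX_SIZE are dropped."""
    text = "".join(a[1:a[0] + 1]) + "".join(b[1:b[0] + 1])
    return make_string(text)

s = make_string("data")
t = make_string("struct")
u = concat(s, t)
print(s[0], u[0])              # 4 8
print("".join(u[1:u[0] + 1]))  # datastru
```

Here "data" + "struct" needs 10 characters, but only 8 fit, so the result is cut off. Dynamically allocated storage avoids this loss.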

2. Chain storage structure

The chain storage structure of a string is similar to that of a linear list. However, because every data element of a string is a single character, simply storing one character per node wastes a great deal of memory. Therefore, a node may hold one character or, more economically, several characters. If the last node is not full, the remaining slots can be filled with '#' or some other value that does not occur in the string.
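
A node that holds several characters might look like the sketch below. The names `CHUNK`, `PAD`, `Node`, `build`, and `chunks` are assumptions made for this example, and it assumes the stored text never contains the '#' filler itself.

```python
from typing import Optional

CHUNK = 4    # hypothetical number of characters per node
PAD = "#"    # filler for the unused slots of the last node

class Node:
    def __init__(self, chars: str):
        self.chars = chars                  # exactly CHUNK characters
        self.next: Optional["Node"] = None

def build(text: str) -> Optional[Node]:
    """Split text into CHUNK-sized nodes, padding the last node with PAD."""
    head = tail = None
    for i in range(0, len(text), CHUNK):
        node = Node(text[i:i + CHUNK].ljust(CHUNK, PAD))
        if head is None:
            head = node
        else:
            tail.next = node
        tail = node
    return head

def chunks(head: Optional[Node]) -> list:
    """Collect the contents of each node, in order."""
    out = []
    while head is not None:
        out.append(head.chars)
        head = head.next
    return out

print(chunks(build("ABCDEFGHI")))  # ['ABCD', 'EFGH', 'I###']
```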


In general, chain storage of strings is not as flexible as sequential storage, and its performance is not as good either.

Naive pattern matching algorithm

Locating a substring within a main string is usually called string pattern matching.

The naive approach takes each character of the main string in turn as the start of a candidate match and compares it with the substring to be found, character by character. On a mismatch, the substring is shifted one position to the right and the comparison starts over, until a complete match is found or the main string is exhausted.

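In the worst case the time complexity is O((n-m+1)*m), where n is the length of the main string and m is the length of the substring. When mismatches are usually detected after only a few characters, the running time is closer to O(n+m).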
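
The matching procedure described above can be sketched as follows. The function name `naive_index` is made up for this example. It returns the 1-based position used in these notes, or 0 when the substring does not occur.

```python
def naive_index(main: str, sub: str) -> int:
    """Return the 1-based position of sub in main, or 0 if it does not occur."""
    n, m = len(main), len(sub)
    for start in range(n - m + 1):        # each possible starting position
        j = 0
        while j < m and main[start + j] == sub[j]:
            j += 1                        # characters match, keep going
        if j == m:                        # the whole substring matched
            return start + 1
    return 0

print(naive_index("goodgoogle", "google"))  # 5
print(naive_index("goodgoogle", "gooe"))    # 0
```

An input such as a main string of many '0' characters followed by a single '1', searched for a shorter substring of the same shape, forces nearly m comparisons at every starting position. That is where the worst-case cost comes from.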

