【Data structure】Storage structure and basic operations of strings (fixed-length sequential strings, heap strings, block chain strings) (C language)

1. Basic concepts of strings

A string is a finite sequence of zero or more characters, written as S = 'a1a2...an' (n >= 0),
where S is the name of the string, the character sequence enclosed in single quotes is the value of the string, and each ai (1 <= i <= n) can be a letter, a digit, or any other character. n is the number of characters in the string and is called the length of the string; the string with n = 0 is called the empty string.

Substring: a subsequence made up of any run of consecutive characters in a string is called a substring of that string.
Main string: a string that contains a substring is called the main string of that substring.
Position of a substring in the main string: the serial number of a character within a string is called the position of that character. The position of a substring in the main string is the position of the substring's first character in the main string.
String equality: two strings are equal if and only if their values are equal, that is, their lengths are equal and the characters at every corresponding position are equal.

A string is a special kind of linear list. Its logical structure is very similar to that of a linear list; the only difference is that the data elements of a string are restricted to a character set. Common implementations include fixed-length sequential strings, heap strings, and block chain strings.

For string pattern matching (the simple pattern matching algorithm and the KMP algorithm), see: https://blog.csdn.net/weixin_51450101/article/details/122684649

2. Fixed-length sequential string

A fixed-length sequential string is designed as a static structure type: a group of storage units with contiguous addresses holds the character sequence of the string, so its capacity (MAXLEN in the code below) is fixed at compile time.

2.1 Code + Notes

#include <stdio.h>
#define MAXLEN 40
#define TRUE 1
#define FALSE 0

/* Fixed-length sequential string */
/* Storage structure of the fixed-length sequential string */
typedef struct {
	char ch[MAXLEN];
	int len;			// length of the string
}SString;

/* Initialize a fixed-length sequential string */
void StrInit(SString* S) {
	S->len = 0;
}

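/* Create a fixed-length sequential string from user input */
void StrCreate(SString* S) {
	int n, i;
	printf("String length: ");
	if (scanf("%d", &n) != 1 || n < 0)		// unreadable or negative length: treat as empty
		n = 0;
	if (n > MAXLEN)							// never write past the end of ch[]
		n = MAXLEN;
	printf("Enter the string: ");
	for (i = 0; i < n; i++) {
		scanf(" %c", &(S->ch[i]));			// the leading space skips whitespace
	}
	S->len = n;
}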

/* Insert */
int StrInsert(SString* S, int pos, SString *t) {
// Insert string t before the character at index pos in string S
	int i;
	if (pos < 0 || pos > S->len)				// invalid insertion position
		return FALSE;
	if (S->len + t->len <= MAXLEN) {			// length after insertion <= MAXLEN
		for (i = S->len + t->len - 1; i >= t->len + pos; i--)
			S->ch[i] = S->ch[i - t->len];		// shift the characters from pos onward to the right
		for (i = 0; i < t->len; i++)
			S->ch[i + pos] = t->ch[i];			// copy t into the gap
		S->len = S->len + t->len;
	}
	else if (pos + t->len <= MAXLEN) {			// length after insertion > MAXLEN, but all of t still fits
		for (i = MAXLEN - 1; i > t->len + pos - 1; i--)
			S->ch[i] = S->ch[i - t->len];		// the tail of S that no longer fits is dropped
		for (i = 0; i < t->len; i++)
			S->ch[i + pos] = t->ch[i];
		S->len = MAXLEN;
	}
	else {										// length after insertion > MAXLEN, and part of t is dropped too
		for (i = 0; i < MAXLEN - pos; i++)
			S->ch[i + pos] = t->ch[i];
		S->len = MAXLEN;
	}
	return TRUE;
}

/* Delete from a sequential string */
int StrDelete(SString* S, int pos, int n) {
// Delete n characters from string S, starting at index pos
	int i;
	if (pos < 0 || n < 0 || pos > (S->len - n))	// invalid position or count
		return FALSE;
	for (i = pos + n; i < S->len; i++)
		S->ch[i - n] = S->ch[i];				// move everything from pos+n to the end forward by n
	S->len = S->len - n;
	return TRUE;
}

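/* String comparison */
int StrCompare(SString* S, SString* t) {
// Returns 0 if S equals t, a positive number if S > t, a negative number if S < t
	int i;
	for (i = 0; i < S->len && i < t->len; i++) {
		if (S->ch[i] != t->ch[i])
			return (S->ch[i] - t->ch[i]);		// first differing character decides
	}
	return (S->len - t->len);					// one is a prefix of the other: the shorter string is smaller
}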

/* Print a fixed-length sequential string */
void Display(SString* S) {
	int i;
	for (i = 0; i < S->len; i++)
		printf("%c", S->ch[i]);
	printf("\n");
}

int main() {
	SString S, t;
	int pos, n, cmp;
	printf("--------Create string--------\n");	// create
	StrInit(&S);
	StrCreate(&S);
	printf("Created string: ");
	Display(&S);

	printf("\n--------Insert string--------\n");	// insert
	StrInit(&t);
	StrCreate(&t);
	printf("Insert before index (0-based): ");
	scanf("%d", &pos);
	if (!StrInsert(&S, pos, &t))
		printf("Invalid insertion position\n");
	printf("String after insertion: ");
	Display(&S);

	printf("\n--------Delete from string--------\n");	// delete
	printf("Start position (1-based) and number of characters to delete: ");
	scanf("%d%d", &pos, &n);
	if (!StrDelete(&S, pos - 1, n))				// convert the 1-based position to an index
		printf("Invalid deletion range\n");
	printf("String after deletion: ");
	Display(&S);

	printf("\n--------Compare strings--------\n");	// compare
	cmp = StrCompare(&S, &t);
	if (cmp == 0)
		printf("S = t\n");
	else if (cmp > 0)
		printf("S > t\n");
	else
		printf("S < t\n");
	return 0;
}

2.2 Running results

[Figure: running result of the fixed-length sequential string program]

3. Heap string

A string consists of two parts, a string name and a string value. The string value is stored using the heap string storage method, and the string name is stored in a symbol table.
Heap string storage method: the characters of a string are still stored sequentially in a group of storage units with contiguous addresses, but that storage is allocated dynamically while the program runs. The system sets aside one large, contiguous region as the available space for strings. Whenever a new string is created, the system allocates from this region a block exactly as large as the string's length and stores the new string's value there.
String name symbol table: a record that maps a string name to its string value is called the storage image of that string name. The storage images of all string names together make up the symbol table.

3.1 Code + Notes

#include <stdio.h>
#include <stdlib.h>								// malloc, free
#define TRUE 1
#define FALSE 0

/* Heap string */
/* Storage structure of the heap string */
typedef struct {
	char* ch;									// ch points to the start of the string's storage
	int len;
}HString;

/* Initialize */
void StrInit(HString* s) {
	s->ch = NULL;
	s->len = 0;
}

/* Heap string assignment */
int StrAssign(HString* s, const char* tval) {
// Assign the value of the string constant tval to heap string s
	int len = 0, i = 0;
	if (s->ch != NULL)
		free(s->ch);
	s->ch = NULL;								// keep s valid (empty) if malloc fails below
	s->len = 0;
	while (tval[i] != '\0')
		i++;
	len = i;
	if (len) {
		s->ch = (char*)malloc(len);				// allocate space
		if (s->ch == NULL)
			return FALSE;
		for (i = 0; i < len; i++)				// copy the characters of tval into s
			s->ch[i] = tval[i];
	}
	s->len = len;
	return TRUE;
}

/* Heap string insertion */
int StrInsert(HString* s, int pos, HString* t) {
// Insert string t before the character at index pos in string s
	int i;
	char* temp;
	if (pos < 0 || pos > s->len || s->len == 0)	// invalid position (this version also rejects an empty s)
		return FALSE;
	temp = (char*)malloc(s->len + t->len);
	if (temp == NULL)
		return FALSE;
	for (i = 0; i < pos; i++)
		temp[i] = s->ch[i];						// copy the first part of s into temp
	for (i = 0; i < t->len; i++)
		temp[i + pos] = t->ch[i];				// copy t into temp
	for (i = pos; i < s->len; i++)
		temp[i + t->len] = s->ch[i];			// copy the rest of s into temp
	s->len += t->len;
	free(s->ch);
	s->ch = temp;								// s now owns temp
	return TRUE;
}

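/* Heap string deletion */
int StrDelete(HString* s, int pos, int n) {
// Delete n characters from string s, starting at index pos
	int i;
	char* temp = NULL;
	if (pos < 0 || n < 0 || pos > s->len - n)	// invalid position or count
		return FALSE;
	if (s->len - n > 0) {						// skip malloc(0); an empty result keeps ch == NULL
		temp = (char*)malloc(s->len - n);
		if (temp == NULL)
			return FALSE;
	}
	for (i = 0; i < pos; i++)
		temp[i] = s->ch[i];						// keep the part before pos
	for (i = pos + n; i < s->len; i++)
		temp[i - n] = s->ch[i];					// keep the part after the deleted range
	s->len -= n;
	free(s->ch);
	s->ch = temp;
	return TRUE;
}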

/* Print a heap string */
void Display(HString* s) {
	int i;
	if (s->len == 0)
		printf("Empty string!\n");
	else {
		for (i = 0; i < s->len; i++) {
			printf("%c", s->ch[i]);
		}
		printf("\n");
	}
}

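int main() {
	int pos, n;
	HString s, t;
	// The unlisted trailing elements are zero-initialized, so both arrays are null-terminated
	char tval_s[6] = { 'a','b','c','d','e' };
	char tval_t[4] = { 'q','w','r' };
	printf("------Heap string assignment------\n");	// assignment
	StrInit(&s);
	StrAssign(&s, tval_s);						// assign to string s
	printf("String s: ");
	Display(&s);
	StrInit(&t);
	StrAssign(&t, tval_t);						// assign to string t
	printf("String t: ");
	Display(&t);

	printf("\n------Heap string insertion------\n");	// insertion
	printf("Insert position (1-based): ");
	scanf("%d", &pos);
	StrInsert(&s, pos - 1, &t);					// convert the 1-based position to an index
	printf("After insertion: ");
	Display(&s);

	printf("\n------Heap string deletion------\n");	// deletion
	printf("Delete position (1-based) and count: ");
	scanf("%d%d", &pos, &n);
	StrDelete(&s, pos - 1, n);
	printf("After deletion: ");
	Display(&s);

	free(s.ch);									// release the heap storage
	free(t.ch);
	return 0;
}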

3.2 Running results

[Figure: running result of the heap string program]

4. Block chain string

A string value can also be stored in a linked list, in which each node holds either one character or several characters. Each node is called a block, and the whole linked list is called a block chain structure.
Storage structure of a block chain string:

/* Storage structure of the block chain string */
#define BLOCK_SIZE 4	// each node stores 4 characters

typedef struct Block {
	char ch[BLOCK_SIZE];
	struct Block* next;
}Block;

typedef struct {
	Block* head;
	Block* tail;
	int len;
}BLString;

Each node in the linked list has two fields, a data field and a link field. The node size is the number of characters stored in the data field, and the link field size is the number of characters' worth of space the link field occupies.
Storage density = storage occupied by the string value / storage actually allocated for the string. The lower the storage density of a string (for example, one character per node), the more convenient it is to operate on, but the more storage space it occupies.

Reference: Geng Guohua "Data Structure - Described in C Language (Second Edition)"

For more data structure content, follow my "Data Structure" column : https://blog.csdn.net/weixin_51450101/category_11514538.html?spm=1001.2014.3001.5482

Origin blog.csdn.net/weixin_51450101/article/details/122668121