Overview of String Functions and Memory Functions

My knowledge is limited, so this overview will inevitably contain errors or gaps. If you spot a mistake or something that could be explained better, please let me know. Thank you!

Table of contents

Find the length of a string: strlen

Unlimited-length string functions: strcpy, strcat, strcmp

Length-restricted string functions: strncpy, strncat, strncmp

String lookup: strstr, strtok

Error message reporting: strerror

Character classification functions

Character conversion

Memory manipulation functions: memcpy, memmove, memset, memcmp

Simulated implementations of string functions and memory functions

Simulated implementation of strlen

Simulated implementation of strcpy

Simulated implementation of strcat

Simulated implementation of strstr

Simulated implementation of strcmp

Simulated implementation of memcpy

Simulated implementation of memmove


Working with strings is unavoidable in C, but the language has no built-in string type: a string is simply a char array terminated by '\0'. Knowing the string functions in the standard library therefore makes many problems much easier to solve.

Find the length of a string:

 strlen

size_t strlen ( const char * str );

A string uses '\0' as its end marker. The strlen function returns the number of characters that appear before the '\0' (the '\0' itself is not counted).

The string pointed to by the parameter must end with '\0'.
Note that the return value has type size_t, which is unsigned (an easy source of mistakes).


So what exactly is size_t?

Looking up its definition in the standard headers shows that it is a typedef for an unsigned integer type. The exact underlying type depends on the platform.

The code below shows the usage, including a demonstration of an error: when the array contains no '\0', strlen keeps reading past its end. This is undefined behavior and typically shows up as a seemingly random value.

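#include <stdio.h>
#include <string.h>

int main()
{
    char arr[] = "abcdef";   // stored in memory as: abcdef\0
    char arr1[] = { 'W', 'T', 'F' };
    // [][][][][][W][T][F][][][][][][][]  there is no '\0' after 'F'
    size_t len = strlen(arr1);   // undefined behavior: the value is unpredictable

    printf("%zu\n", len);

    return 0;
}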

Please note! Because strlen returns an unsigned value, subtracting two lengths can never produce a negative number: if the first string is shorter, the result wraps around to a very large positive value. Keep this in mind whenever you use strlen to compare the lengths of two strings.
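The pitfall can be sketched as follows. The helper names is_longer_buggy and is_longer are made up for this illustration; only strlen comes from the library.

```c
#include <stddef.h>
#include <string.h>

/* Buggy: the subtraction is carried out in size_t, so instead of going
   negative it wraps around to a huge positive value. */
int is_longer_buggy(const char *a, const char *b)
{
    return strlen(a) - strlen(b) > 0;   /* true whenever the lengths differ */
}

/* Safer: compare the two lengths directly, with no subtraction. */
int is_longer(const char *a, const char *b)
{
    return strlen(a) > strlen(b);
}
```

With "abc" and "abcdef" as arguments, the buggy version reports that "abc" is longer, while the direct comparison gives the expected answer.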

String functions of unlimited length:

 strcpy

char* strcpy(char * destination, const char * source );

The strcpy function copies the contents of the source string, including its terminating '\0', into the destination.

It is used as follows:

#include <string.h>

int main ()
{
    char name[20] = {0};
    strcpy(name, "Helen");
    return 0;
}

Note that copying stops at the first '\0'. If a '\0' is inserted into "Helen" to give "Hel\0en", only "Hel" is copied.

Conversely, if the source contains no '\0', strcpy cannot know where to stop and reads past the end of the source, which is an error. The following conditions must therefore hold when using strcpy:

The source string must be terminated with '\0'.
The '\0' of the source string is copied to the destination as well.
The destination must be large enough to hold the source string.
The destination must be modifiable.

 

strcat

char * strcat ( char * destination, const char * source );

Appends a copy of the source string to the end of the destination string.

It is used as follows:

#include <stdio.h>
#include <string.h>

int main()
{
	char arr[20] = { "APEX " };
	char arr1[20] = { "Legend" };

	printf("%s\n", strcat(arr, arr1));   // prints "APEX Legend"
	return 0;
}

The strcat function returns the starting address of the destination.

Returning the destination pointer makes chained calls possible.

The source string must be terminated with '\0'. The destination must also contain a '\0', because appending starts at the position of that '\0'.

The destination must be large enough to hold its own contents plus the contents of the source string.

The destination must be modifiable.

strcmp

int strcmp ( const char * str1, const char * str2 );

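Compares two strings character by character and returns an integer. The result depends on the character codes at the first position where the strings differ, not on the lengths of the strings.

If the first string is greater than the second string, a number greater than 0 is returned.
If the first string is equal to the second string, 0 is returned.
If the first string is less than the second string, a number less than 0 is returned.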
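Only the sign of the result is specified, so code should not test for particular values such as 1 or -1. A minimal sketch, using a hypothetical helper compare_sign that reduces the result to -1, 0 or 1:

```c
#include <string.h>

/* Reduce the result of strcmp to -1, 0 or 1.
   compare_sign is an illustrative name, not a library function. */
int compare_sign(const char *a, const char *b)
{
    int r = strcmp(a, b);
    return (r > 0) - (r < 0);
}
```

Note that "abc" compares greater than "ab" because the comparison reaches the '\0' of the shorter string first, and that "Zebra" compares less than "apple" in ASCII because uppercase letters have smaller codes than lowercase ones.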

Length-restricted string functions:

strcpy, introduced above, copies a string, but it carries a risk: when the source string is longer than the destination, strcpy writes past the end of the destination anyway. To avoid this, the C library also provides length-restricted counterparts of the functions above. They work in a similar way, but each call takes the maximum number of characters to copy, append or compare, which makes them easier to use safely.

strncpy

char * strncpy ( char * destination, const char * source, size_t num );

Copies at most num characters from the source string to the destination.
If the source string is shorter than num, the rest of the destination is padded with '\0' until num characters have been written.
If the source string is num characters or longer, no '\0' is appended, so the destination may be left without a terminator.

Usage and effect:

#include <stdio.h>
#include <string.h>

int main()
{
	char arr[20] = { "APEX " };
	char arr1[20] = { "Legend" };

	printf("%s\n", strncpy(arr, arr1, 1));   // copies one byte; prints "LPEX "

	return 0;
}


When the source string is shorter than the requested length, the function fills the remaining positions with '\0' after copying.
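Both behaviors of strncpy, the padding and the possibly missing terminator, can be seen in a small wrapper. This is only a sketch; copy_bounded is a hypothetical helper name and not part of the C library.

```c
#include <stddef.h>
#include <string.h>

/* Copy src into a buffer of dst_size bytes and always terminate it.
   strncpy alone writes no '\0' when strlen(src) >= num. */
char *copy_bounded(char *dst, size_t dst_size, const char *src)
{
    if (dst_size == 0)
        return dst;
    strncpy(dst, src, dst_size - 1);   /* pads with '\0' if src is shorter */
    dst[dst_size - 1] = '\0';          /* terminates if src was too long */
    return dst;
}
```

Copying "abc" into an 8-byte buffer leaves every byte after "abc" set to '\0'; copying "abcdef" into a 4-byte buffer yields the truncated but terminated string "abc".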

 strncat

char * strncat ( char * destination, const char * source, size_t num );

Appends at most num characters from the source string to the end of the destination, then writes a terminating '\0'.

The usage method and effect are as follows:

#include <stdio.h>
#include <string.h>

int main ()
{
    char str1[20];
    char str2[20];
    strcpy(str1, "To be ");
    strcpy(str2, "or not to be");
    strncat(str1, str2, 6);   // appends "or not"
    puts(str1);               // prints "To be or not"
    return 0;
}

 strncmp

int strncmp ( const char * str1, const char * str2, size_t num );

Compares the two strings character by character until a pair of characters differs, one of the strings ends, or num characters have been compared.
For example, we can use this function to find the strings that share the same first two characters, which makes the matching condition easy to control.

#include <stdio.h>
#include <string.h>

int main ()
{
    char str[][5] = { "R2D2" , "C3PO" , "R2A6" };
    int n;
    puts("Looking for R2 astromech droids...");
    for (n = 0; n < 3; n++)
    {
        if (strncmp(str[n], "R2xx", 2) == 0)
        {
            printf("found %s\n", str[n]);
        }
    }
    return 0;
}

 String lookup:

 strstr

char * strstr ( const char *str1, const char * str2);

Checks whether str2 is a substring of str1. If str2 appears in str1, a pointer to its first occurrence in str1 is returned.

If it is not found, a null pointer is returned.

An example:

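#include <stdio.h>
#include <string.h>

int main()
{
	char str[] = "This is a simple string";
	char* pch;
	pch = strstr(str, "simple");   // find the substring in the sentence and get its address
	if (pch == NULL)               // a null pointer is returned when nothing is found, so guard against it
	{
		printf("not found\n");
		return 1;
	}
	strncpy(pch, "sample", 6);     // overwrite the match in place with strncpy (no '\0' is written)

	puts(str);
	return 0;
}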

 strtok

char * strtok ( char * str, const char * sep );

Splits the target string into tokens at the separator characters and returns the tokens one at a time.

The sep parameter is a string that defines the set of characters used as separators.
The first parameter specifies a string that contains zero or more tokens separated by one or more of the separators in sep.
The strtok function finds the next token in str, terminates it with '\0', and returns a pointer to that token. (Note: strtok modifies the string it works on, so the string passed to it is usually a temporary, modifiable copy.)

When the first parameter is not NULL, the function finds the first token in str and saves its position in the string.
When the first parameter is NULL, the function starts from the saved position in the same string and searches for the next token.
A NULL pointer is returned when there are no more tokens in the string.

int main()
{
	char* p = "[email protected] tonight";
	const char* sep = ".@";
	char arr[30];
	char* str = NULL;
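	strcpy(arr, p);   // make a copy of the data and work on the contents of arr
	for (str = strtok(arr, sep); str != NULL; str = strtok(NULL, sep))
		// The for loop fits this pattern well: the initializer finds the first token,
		// the condition checks that the returned pointer is not NULL, and the update
		// step finds the next token. The order of the characters in sep does not
		// affect the order in which strtok finds the tokens.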
	{
		printf("%s\n", str);
	}
}
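The same loop can be wrapped in a small function that counts tokens. This is a sketch under stated assumptions: count_tokens and its 128-byte working buffer are invented for the illustration, and longer inputs are simply truncated.

```c
#include <stddef.h>
#include <string.h>

/* Count the tokens in text. strtok modifies its input, so the text is
   first copied into a local buffer (inputs over 127 bytes are truncated). */
int count_tokens(const char *text, const char *sep)
{
    char buf[128];
    int count = 0;

    strncpy(buf, text, sizeof buf - 1);
    buf[sizeof buf - 1] = '\0';

    for (char *tok = strtok(buf, sep); tok != NULL; tok = strtok(NULL, sep))
        count++;
    return count;
}
```

Consecutive separators do not produce empty tokens: "a,,b" split on "," gives two tokens, not three. Also keep in mind that strtok stores its position in hidden static state, so two tokenizing loops cannot be interleaved.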

 

Error message reporting

 strerror

#include <string.h>   // declares strerror
#include <errno.h>    // declares errno
char * strerror ( int errnum );

Returns a pointer to the error message string that corresponds to the error code errnum.

When a C library function fails, it usually sets an error code.

Each error code corresponds to an error message.

We don't need to memorize these codes. What we do need to know is how to get hold of the code after a failure.

For this purpose the C library provides errno, declared in <errno.h>, which is dedicated to storing error codes.

When a library function that reports errors this way fails, it stores the error code in errno.

So to find out what went wrong, we only need to run the following code right after the failure:

printf("%s",strerror(errno));
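A minimal sketch of this pattern around fopen follows. The function name try_open is hypothetical, and the code assumes a POSIX-like system where fopen sets errno on failure (ISO C itself does not require that), which is why there is a fallback return value.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Try to open a file for reading. On failure, print the message for
   errno and return a nonzero error code; on success return 0. */
int try_open(const char *path)
{
    errno = 0;
    FILE *fp = fopen(path, "r");
    if (fp == NULL) {
        int code = errno;   /* save it: later library calls may overwrite errno */
        fprintf(stderr, "%s: %s\n", path, strerror(code));
        return code != 0 ? code : -1;   /* -1 if the library left errno untouched */
    }
    fclose(fp);
    return 0;
}
```

The important detail is to read errno immediately after the failing call, before any other library function has a chance to change it.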

Character classification function:

Function   Returns true if its argument meets the following condition
iscntrl    Any control character
isspace    Whitespace characters: space ' ', form feed '\f', line feed '\n', carriage return '\r', tab '\t' or vertical tab '\v'
isdigit    Decimal digits 0~9
isxdigit   Hexadecimal digits: all decimal digits, lowercase letters a~f, uppercase letters A~F
islower    Lowercase letters a~z
isupper    Uppercase letters A~Z
isalpha    Letters a~z or A~Z
isalnum    Letters or digits: a~z, A~Z, 0~9
ispunct    Punctuation: any graphic (printable) character that is not a letter or digit
isgraph    Any graphic character
isprint    Any printable character: the graphic characters and the space character
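These functions take an int whose value must be representable as unsigned char (or be EOF), so a plain char should be cast before it is passed in. A short sketch, with count_digits as a made-up helper name:

```c
#include <ctype.h>

/* Count the decimal digits in s. */
int count_digits(const char *s)
{
    int n = 0;
    for (; *s != '\0'; s++) {
        /* the cast avoids undefined behavior for negative char values */
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}
```

For the string "a1b22c333" this counts 6 digits.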

Character conversion:

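int tolower ( int c );   // uppercase -> lowercase
int toupper ( int c );   // lowercase -> uppercase

For example, the following program prints a string in lowercase:

#include <stdio.h>
#include <ctype.h>

int main ()
{
    int i = 0;
    char str[] = "Test String.\n";
    char c;
    while (str[i])
    {
        c = str[i];
        if (isupper((unsigned char)c))   // the cast avoids undefined behavior for negative char values
        {
            c = (char)tolower((unsigned char)c);
        }
        putchar(c);
        i++;
    }
    return 0;
}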

Memory manipulation functions

C also provides some useful functions that operate directly on data in memory, as conveniently as the string functions operate on strings.

memcpy

void * memcpy ( void * destination, const void * source, size_t num );

The function memcpy copies num bytes of data, starting at the location source points to, to the memory location destination points to.
The function does not stop when it encounters '\0'.

If source and destination overlap in any way, the result of the copy is undefined. In plain terms, memcpy cannot be used to copy a block onto part of itself. Because memcpy works on raw memory, it can copy structures just as well as strings.

#include <stdio.h>
#include <string.h>

struct
{
    char name[40];
    int age;
} person, person_copy;

int main ()
{
    char myname[] = "Pierre de Fermat";
    /* using memcpy to copy string: */
    memcpy ( person.name, myname, strlen(myname)+1 );
    person.age = 46;
    /* using memcpy to copy structure: */
    memcpy ( &person_copy, &person, sizeof(person) );
    printf ("person_copy: %s, %d \n", person_copy.name, person_copy.age );
    return 0;
}

 memmove

void * memmove ( void * destination, const void * source, size_t num );

The difference from memcpy is that the source and destination memory blocks handled by memmove may overlap.
If the source and the destination overlap, memmove must be used.

In other words, memmove can copy a block of memory onto part of itself.
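A typical overlapping case is shifting the elements of an array by one position. The sketch below uses a hypothetical helper shift_right and assumes the caller provides room for count + 1 elements.

```c
#include <stddef.h>
#include <string.h>

/* Shift the first count ints of arr one slot to the right.
   Source and destination overlap, so memmove is required here.
   arr must have room for at least count + 1 elements. */
void shift_right(int *arr, size_t count)
{
    memmove(arr + 1, arr, count * sizeof arr[0]);
}
```

Starting from {1, 2, 3, 4, 5, 0} and shifting five elements gives {1, 1, 2, 3, 4, 5}; the first slot keeps its old value and is free to be overwritten.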

memset

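Sets memory byte by byte

void *memset( void *dest, int c, size_t count );   // destination, the value to set each byte to, the number of bytes to set

Note that count is a number of bytes, not a number of elements, and that c is written into every single byte. When the elements are wider than one byte, think carefully about how many bytes are modified and what value each element ends up with.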
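The effect on multi-byte elements is easy to get wrong. In the sketch below (both function names are invented for the illustration), zeroing an int array works as expected, but filling it with 1 does not produce ints equal to 1.

```c
#include <stddef.h>
#include <string.h>

/* Zero an int array: all-zero bytes represent the int value 0. */
void clear_ints(int *arr, size_t count)
{
    memset(arr, 0, count * sizeof arr[0]);
}

/* Pitfall: every BYTE becomes 0x01, so with a 4-byte int each element
   becomes 0x01010101 (16843009), not 1. */
void fill_bytes_with_one(int *arr, size_t count)
{
    memset(arr, 1, count * sizeof arr[0]);
}
```

To set every element of an int array to a value other than 0, an ordinary loop over the elements is the straightforward choice.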

 memcmp

Compares the first num bytes of the memory blocks pointed to by ptr1 and ptr2

int memcmp ( const void * ptr1,const void * ptr2,size_t num );

The return value is as follows:

Less than 0: at the first byte where the two blocks differ, the byte in block 1 is smaller than the byte in block 2.

Equal to 0: all num bytes of the two blocks are equal.

Greater than 0: at the first byte where the two blocks differ, the byte in block 1 is greater than the byte in block 2.

For example, comparing a block that starts with a lowercase letter against a block that starts with the same letter in uppercase returns a value greater than 0, because the ASCII code of a lowercase letter is greater than that of its uppercase counterpart.
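The lowercase versus uppercase case can be checked with a few lines. As with strcmp, only the sign of the result is specified, so the hypothetical helper memcmp_sign below reduces it to -1, 0 or 1. The example assumes an ASCII-compatible character set.

```c
#include <stddef.h>
#include <string.h>

/* Reduce the result of memcmp to -1, 0 or 1. */
int memcmp_sign(const void *p1, const void *p2, size_t num)
{
    int r = memcmp(p1, p2, num);
    return (r > 0) - (r < 0);
}
```

Comparing "abc" with "ABC" over 3 bytes gives 1, since 'a' (97) is greater than 'A' (65). Comparing "abc" with "abd" over only 2 bytes gives 0, because the differing third byte is never examined.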

Simulate the implementation of string functions and memory functions

After getting to know these functions, it is still worth understanding how they work; otherwise we know what they do without knowing why. So let's try to implement some of the more commonly used ones ourselves.

Simulated implementation of strlen

strlen is used all the time, so let's start with it. The straightforward approach is a counter: count characters until '\0' is reached, then return the count. This is the first method:

#include <stdio.h>

// non-recursive strlen
int main()
{
    int n = 0;
    int count = 0;
    char arr[] = "OMG";
    while (arr[n] != '\0')
    {
        count++;
        n += 1;
    }
    printf("%d ", count);

    return 0;
}

The second method is recursive:

#include <stdio.h>

// recursive strlen
int Fun(const char* n)
{
    if (*n != '\0')
    {
        return Fun(n + 1) + 1;   // one for this character plus the length of the rest
    }
    else
        return 0;                // the empty string has length 0
}
int main()
{
    char arr[] = "OMG";
    int ret = Fun(arr);
    printf("%d ", ret);

    return 0;
}

The third method is simple and easy to understand, using pointers to count:

#include <stdio.h>

int mystrlen(const char* arr)
{
	int count = 0;
	while (*arr++)
	{
		count++;
	}
	return count;
}

int main()
{
	char arr[50] = "XXXXX";

	scanf("%49s", arr);   // limit the input to the size of the buffer

	printf("%d ", mystrlen(arr));

	return 0;
}

There is also an optimized version of the pointer method, as follows:

// pointer - pointer
int my_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (int)(p - s);
}

Subtracting two pointers into the same array gives the number of elements between them. So we advance a pointer until it reaches '\0' and then subtract the pointer to the first character to get the length of the string.

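Simulated implementation of strcpy

The explanation is written as comments in the code:

#include <assert.h>
#include <stdio.h>

// res is declared const so that accidentally swapping the two operands is caught:
// a const-qualified object cannot be modified, so writing *res++ = *des++
// by mistake becomes a compile error instead of a silent bug.
char* my_strcpy(char* des, const char* res)
{
	assert(res != NULL);   // guard against null pointers being passed in
	assert(des != NULL);

	char* ret = des;
	while ((*des++ = *res++) != '\0')   // copy up to and including the '\0'
	{
		;
	}
	// Why return a pointer? For chained calls: an array name behaves like a pointer
	// to its first element, so returning char* gives access to the whole array.
	return ret;
}

int main()
{
	char des[] = "XXXXXXXXXXX";
	char res[] = "copy down";

	//my_strcpy(des, res);

	printf("%s\n", my_strcpy(des, res));

	return 0;
}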

Simulated implementation of strcat

Implementing strcat is not complicated: we only need to find the end of the destination string and then copy the source there.

#include <assert.h>

char* mystrcat(char* dest, const char* sour)   // append sour to the end of dest
{
	char* ret = dest;
	assert(dest && sour);

	while (*dest)   // find the '\0' at the end of dest
	{
		dest++;
	}
	while ((*dest++ = *sour++) != '\0')   // copy sour, including its '\0'
	{
		;
	}

	return ret;
}

Simulated implementation of strstr

strstr has one tricky case: what if only the first part of the string we are looking for matches at the current position? When the two pointers have been advancing together and a mismatch occurs, we have to go back to where the attempt started and continue from the next character.

To handle this we need a few extra pointers. For each starting position in arr1, the arr2 pointer is reset to the beginning of arr2, and the two pointers move forward together, one character at a time, for as long as the characters match. If the arr2 pointer reaches '\0', the whole of arr2 has been matched and the starting position is returned.

If the characters differ before the arr2 pointer reaches '\0', the arr1 pointer goes back to the starting position plus one, and the arr2 pointer goes back to the beginning of arr2.

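#include <assert.h>
#include <stddef.h>

char* mystrstr(char* arr1, const char* arr2)
{
	assert(arr1 && arr2);
	char* mark1;                 // start of the current attempt in arr1
	const char* mark2 = arr2;    // start of the string being searched for

	if (*arr2 == '\0')           // an empty string is found at the start of arr1
		return arr1;

	while (*arr1)
	{
		mark1 = arr1;
		arr2 = mark2;
		while (*arr1 != '\0' && *arr2 != '\0' && *arr1 == *arr2)
		{
			arr1++;
			arr2++;
		}
		if (*arr2 == '\0')
		{
			return mark1;
		}
		arr1 = mark1 + 1;        // go back to the start of this attempt, plus one
	}
	return NULL;
}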

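Simulated implementation of strcmp

#include <assert.h>

int mystrcmp(const char* arr1, const char* arr2)
{
	assert(arr1 && arr2);

	while (*arr1 == *arr2)
	{
		if (*arr1 == '\0')   // both strings ended together: they are equal
			return 0;
		arr1++;
		arr2++;
	}

	// compare as unsigned char, as the library strcmp does
	return (unsigned char)*arr1 - (unsigned char)*arr2;
}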

Simulated implementation of memcpy

Next come the memory functions, whose implementations are somewhat different. Here we process raw data in memory, and we cannot know what type of data is stored there, so the parameters are declared as void* to accept a pointer to any type. That raises another question: if we don't know the data type, how do we move the data around?

We know that adding 1 to an int* pointer advances it by sizeof(int) bytes (typically 4), while adding 1 to a char* pointer advances it by only 1 byte. So we can treat the data as a sequence of single bytes and process them one at a time. Copying byte by byte works for any data type, which solves our problem and makes the implementation easy.

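#include <assert.h>
#include <stddef.h>

// named my_memcpy so that it does not clash with the library memcpy
void * my_memcpy ( void * dst, const void * src, size_t count)
{
    void * ret = dst;
    assert(dst);
    assert(src);

    while (count--)
    {
        *(char *)dst = *(const char *)src;   // cast to char* so that only one byte is accessed at a time
        dst = (char *)dst + 1;
        src = (const char *)src + 1;
    }
    return ret;
}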

Simulated implementation of memmove

memmove needs some care. It can copy a block onto part of itself, but a plain front-to-back copy goes wrong in some cases. If the destination starts inside the source block, at a higher address, the first bytes written overwrite source bytes that have not been read yet, as illustrated in the following picture:

The picture comes from the blogger: @北方注册气

So copying front to back does not always work, and copying back to front solves the problem. The question is: when do we copy front to back, and when back to front?

Summary: when dest > src, copy from back to front; when dest < src, copy from front to back.

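#include <assert.h>
#include <stddef.h>

void* mymemmove(void * dst, const void *src, size_t length)
{
	void* ret = dst;
	assert(dst && src);

	if ((char*)dst > (const char*)src)   // copy from back to front
	{
		while (length--)
		{
			// adding length reaches the last byte that has not been copied yet
			*((char*)dst + length) = *((const char*)src + length);
		}
	}
	else   // copy from front to back
	{
		while (length--)
		{
			*(char*)dst = *(const char*)src;
			dst = (char*)dst + 1;
			src = (const char*)src + 1;
		}
	}
	return ret;
}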
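To see why the copy direction matters, the two directions can be tried separately on an overlapping buffer. The functions copy_forward and copy_backward below are illustrative helpers written for this comparison, not library functions.

```c
#include <stddef.h>

/* Front-to-back byte copy: wrong when dst overlaps the tail of src. */
void copy_forward(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
}

/* Back-to-front byte copy: safe when dst > src and the blocks overlap. */
void copy_backward(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        d[n] = s[n];   /* n has already been decremented, so it indexes the last uncopied byte */
}
```

Copying the first 4 bytes of "abcdefgh" to offset 2 with copy_forward gives "abababgh": the bytes 'c' and 'd' are overwritten before they are read. With copy_backward the result is the intended "ababcdgh".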

That concludes this overview. I hope it is of some help to you!


Origin blog.csdn.net/m0_53607711/article/details/125944290