C language basics - how the scanf function works

1. Introduction, basic concepts and terminology

In C, input is mainly done with the standard input function scanf. To call scanf correctly, you need to understand how it works. To explain the principle clearly, let me first lay the groundwork and introduce a few concepts.

(1) Input stream: the sequence of characters that arrives from the input device and waits in the input buffer. On a personal computer the input device is the keyboard, and whatever you type forms the input stream. Online judge sites such as PTA use files to simulate the input stream, so the sample input in a problem and the input of each test point are input streams too.

(2) Matching string: the first argument passed to scanf, that is, the characters inside the double quotes. The characters that can appear in the matching string fall into three categories: 1. format conversion specifiers, referred to as format characters; 2. whitespace characters, including spaces, newlines (\n), and tabs (\t); 3. other characters, that is, everything that is neither a format character nor a whitespace character.

(3) Address list: the second and subsequent arguments passed to scanf. It is called a list because we treat it as a whole; it contains one or more addresses separated by commas.

We use the following examples to familiarize ourselves with these concepts.

Input:

1,2,3

Code:

#include <stdio.h>
int main () {
	int a,b,c;
	scanf("%d,%d,%d", &a, &b, &c);
	return 0;
}

All the characters in the input box of the example make up the input stream: '1', ',', '2', ',', '3', a total of 5 characters.

The "%d,%d,%d" inside the parentheses of scanf in the code box is the matching string. Its elements are, in order: '%d', ',', '%d', ',', '%d'. Here '%d' is a format character and ',' is an other character.

The &a, &b, &c inside the parentheses of scanf in the code box are the address list.

2. How the scanf function works

Conceptually, scanf completes input with two scanning probes and a working buffer. The two probes work on the matching string and the input stream respectively, and are called the matching probe and the input probe. The working steps are as follows:

0. Initially, the matching probe and input probe are located at the first character position of the matching string and input stream respectively. The working buffer is empty.

1. The matching probe reads the matching character at its current position. What happens next depends on the type of that character:

2-1. If the matching character is a format character, such as %d, the input probe and the working buffer start to work. The input probe scans the character at its current position and appends it to the working buffer. The working buffer then checks whether its contents still match the current format character, that is, whether they can still be converted into a legal value of the type the format character stands for. If they match, the input probe moves to the next character and continues scanning. If they do not, the character just stored is illegal: it is deleted from the working buffer, and the input probe stays on it so that the next step sees it again. The legal characters that remain are converted into a value of the corresponding type, stored at the corresponding address in the address list, and the working buffer is cleared. Finally, the matching probe moves forward one position (if there is one) and step 1 is repeated. If the working buffer is empty after the illegal character is removed, that is, the very first character scanned was illegal, scanf ends immediately and returns.

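2-2. If the matching character is a whitespace character, the input probe skips over any consecutive whitespace characters at its current position in the input stream (possibly none). Then the matching probe moves forward one position and step 1 is repeated.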

2-3. If the matching character is an other character, the input probe starts to work. The input probe scans the character at its current position and compares it with the current matching character. Matching here means being equal. If they match, the input probe moves forward one position, the matching probe moves forward one position, and step 1 is repeated. If they do not match, scanning stops, scanf ends, and returns.

3. If the matching probe cannot move forward any further, scanf ends and returns.

One thing needs to be emphasized: if the format character scanned by the matching probe stands for a number, such as %d, %f, or %lf, the input probe and the working buffer automatically skip all whitespace characters that come before the first legal character. We will illustrate this with an example later.

As the explanation above shows, the working process of scanf is somewhat complicated. In particular, when the matching probe is on a format character, deciding whether the contents of the working buffer match needs careful thought. Let's first go through some examples to understand the overall steps of scanf, and then give more examples to deepen the understanding of how format characters are handled.

Example 1

Input:

3.1416

Code:

#include <stdio.h>
int main () {
	int a=0;
	float b=0;
	scanf("%d%f", &a, &b);
	printf("%d %f\n", a, b);
	return 0;
}

Output:

3 0.141600

For the above example, the working steps of scanf are as follows:

(0) During initialization, the matching probe position is at '%d'; the input probe is at '3' of "3.1416", and the working buffer is empty.

(1) The matching probe scans to %d, which is a format character.

(2) The input probe scans "3", and the working buffer stores "3". The working buffer determines that "3" can be legally converted into the value corresponding to %d, that is, a decimal integer, so the input probe continues scanning. The input probe scans ".", and the working buffer stores "3.". "3." cannot be legally converted into an integer, so the working buffer deletes the ".", converts "3" into a decimal integer, and stores it in variable a. The matching probe moves forward one position and step 1 continues.

(1') The matching probe scans to %f, which is a format character.

(2') The input probe scans ".", and the working buffer stores ".". The working buffer determines that "." can still begin a legal value for %f, that is, a decimal floating-point number such as ".1" (a lone "." is not a complete number by itself, but it is a legal start of one). The input probe moves to "1" and continues scanning; the working buffer stores ".1", which can be legally converted into a decimal floating-point number. This continues until the input probe scans the "\n" after the 6 of "3.1416" and the working buffer stores ".1416\n". ".1416\n" cannot be legally converted into a decimal floating-point number, so the "\n" is deleted, and ".1416" is converted into the decimal floating-point number 0.1416 and stored in variable b.

(3) The matching probe cannot move forward any further, so scanf ends and returns.

So, the output we see is the integer 3 and the floating point number 0.141600.

Example 2

Input:

3.1416

Code:

#include <stdio.h>
int main () {
	int a=0,b=2;
	scanf("%d%d", &a, &b);
	printf("%d %d\n", a, b);
	return 0;
}

Output:

3 2

For the above example, the working steps of scanf are as follows:

(0) During initialization, the matching probe position is at '%d'; the input probe is at '3' of "3.1416", and the working buffer is empty.

(1) The matching probe scans to %d, which is a format character.

(2) The input probe scans "3", and the working buffer stores "3". The working buffer determines that "3" can be legally converted into the value corresponding to %d, that is, a decimal integer, so the input probe continues scanning. The input probe scans ".", and the working buffer stores "3.". "3." cannot be legally converted into an integer, so the working buffer deletes the ".", converts "3" into a decimal integer, and stores it in variable a. The matching probe moves forward one position and step 1 continues.

(1') The matching probe scanned to %d, which is a format character.

(2') The input probe scans ".", and the working buffer stores ".". The working buffer determines that "." cannot be legally converted into the value corresponding to %d, that is, a decimal integer. Since the first character stored for the current format character %d is illegal, scanf ends immediately and returns.

So, the output we see is the integer 3 and the integer 2 (the value given to b during initialization).

Example 3

Input:

3.1416

Code:

#include <stdio.h>
int main () {
	int a=0,b=2;
	scanf("%d.%d", &a, &b);
	printf("%d %d\n", a, b);
	return 0;
}

Output:

3 1416

For the above example, the working steps of scanf are as follows:

(0) During initialization, the matching probe position is at '%d'; the input probe is at '3' of "3.1416", and the working buffer is empty.

(1) The matching probe scans to '%d', which is a format character.

(2) The input probe scans "3", and the working buffer stores "3". The working buffer determines that "3" can be legally converted into the value corresponding to %d, that is, a decimal integer, so the input probe continues scanning. The input probe scans ".", and the working buffer stores "3.". "3." cannot be legally converted into an integer, so the working buffer deletes the ".", converts "3" into a decimal integer, and stores it in variable a. The matching probe moves forward one position and step 1 continues.

(1') The matching probe scans to '.', which is an other character.

(2') The input probe scans '.', and the characters scanned by the input probe and the matching probe are the same. The input probe moves forward one position, the matching probe moves forward one position, and step 1 continues.

(1'') The matching probe scans to '%d', which is a format character.

(2'') The input probe scans '1', and the working buffer stores "1". The working buffer determines that "1" can be legally converted into the value corresponding to %d, that is, a decimal integer, so the input probe continues scanning. This continues until the input probe scans the "\n" after "3.1416" and the working buffer stores "1416\n". "1416\n" cannot be legally converted into an integer, so the working buffer deletes the "\n", converts "1416" into a decimal integer, and stores it in variable b.

(3) The matching probe cannot move forward any further, so scanf ends and returns.

So, the output we see is the integer 3 and the integer 1416 (obviously scanf changed the value of b).

Example 4

Input:

3
1416

Code:

#include <stdio.h>
int main () {
	int a=0,b=2;
	scanf("%d%d", &a, &b);
	printf("%d %d\n", a, b);
	return 0;
}

Output:

3 1416

For the above example, the working steps of scanf are as follows:

(0) During initialization, the matching probe position is at '%d'; the input probe is at '3' of "3\n1416", and the working buffer is empty.

(1) The matching probe scans to '%d', which is a format character.

(2) The input probe scans "3", and the working buffer stores "3". The working buffer determines that "3" can be legally converted into the value corresponding to %d, that is, a decimal integer, so the input probe continues scanning. The input probe scans "\n", and the working buffer stores "3\n". "3\n" cannot be legally converted into an integer, so the working buffer deletes the "\n", converts "3" into a decimal integer, and stores it in variable a. The matching probe moves forward one position and step 1 continues.

(1') The matching probe scans to '%d', which is a format character.

(2') The input probe scans '\n'. Since this is a whitespace character that comes before the first legal character for %d, it is skipped, and the input probe moves forward one position to continue scanning. This continues until the input probe scans the "\n" after "1416" and the working buffer stores "1416\n". "1416\n" cannot be legally converted into an integer, so the working buffer deletes the "\n", converts "1416" into a decimal integer, and stores it in variable b.

(3) The matching probe cannot move forward any further, so scanf ends and returns.

So, the output we see is the integer 3 and the integer 1416 (obviously scanf changed the value of b).

Compare step (2') in Example 4 with step (2') in Example 2. When the current matching character is a format character that stands for a number, whitespace characters in the input that come before the first legal character are skipped, while any other illegal character makes scanf stop reading and return immediately.

Example 5

Input:

A
B

Code:

#include <stdio.h>
int main () {
	char a='a',b='b';
	scanf("%c%c", &a, &b);
	printf("%c %c\n", a, b);
	return 0;
}

Output:

A
 

For the above example, the working steps of scanf are as follows:

(0) During initialization, the matching probe position is at "%c"; the input probe is at "A" of "A\nB", and the working buffer is empty.

(1) The matching probe scans to '%c', which is a format character.

(2) The input probe scans "A", and the working buffer stores "A". The working buffer determines that "A" can be legally converted into the value corresponding to %c, that is, a single character, so the input probe continues scanning. The input probe scans "\n", and the working buffer stores "A\n". "A\n" is two characters and cannot be legally converted into a single character, so the working buffer deletes the "\n", converts "A" into a character, and stores it in variable a. The matching probe moves forward one position and step 1 continues.

(1') The matching probe scans to '%c', which is a format character.

(2') The input probe scans '\n', and the working buffer stores "\n". The working buffer determines that "\n" can be legally converted into a character, so the input probe continues scanning. The input probe scans "B", and the working buffer stores "\nB". "\nB" is two characters and cannot be legally converted into a single character, so the working buffer deletes the "B", converts "\n" into a character, and stores it in variable b.

(3) The matching probe cannot move forward any further, so scanf ends and returns.

So, the output we see is the character A and a newline character '\n'.

This example also shows that for the format character %c, the input probe and the working buffer do not skip whitespace characters in the input stream, because whitespace characters are legal characters for %c. (%s is different: like the numeric format characters, it skips leading whitespace and then stops at the next whitespace character.)

Furthermore, if in Example 5 we want to read 'A' into variable a and 'B' into variable b, there are two methods. The first is to add "\n" at the appropriate place in the matching string of scanf. The second is to use getchar() to consume the "\n" at the input probe's current position, which moves the input probe forward one position. The code for each is as follows:

Code 1:

#include <stdio.h>
int main () {
	char a='a',b='b';
	scanf("%c\n%c", &a, &b);
	printf("%c %c\n", a, b);
	return 0;
}

Code 2:

#include <stdio.h>
int main () {
	char a='a',b='b';
	scanf("%c", &a);
	getchar();
	scanf("%c", &b);
	printf("%c %c\n", a, b);
	return 0;
}

(More examples to be continued...)

Origin blog.csdn.net/morn_l/article/details/134125407