C-style file input/output - formatted input/output (std::scanf, std::fscanf, std::sscanf)

The C I/O subset of the C++ standard library implements C-style stream input/output operations. The <cstdio> header provides generic file operation support and supplies functions with narrow and multibyte character input/output capabilities, and the <cwchar> header provides functions with wide character input/output capabilities.

Formatted input/output

Read formatted input from stdin, file stream or buffer

std::scanf, 
std::fscanf, 
std::sscanf

int scanf( const char* format, ... );                              (1)

int fscanf( std::FILE* stream, const char* format, ... );          (2)

int sscanf( const char* buffer, const char* format, ... );         (3)

Reads data from a variety of sources, interprets it according to format, and stores the results into the given locations.

1) Reads the data from stdin.

2) Reads the data from the file stream stream.

3) Reads the data from the null-terminated character string buffer.
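The three forms differ only in where the characters come from. A minimal sketch, assuming two hypothetical helpers (parse_size and read_size are illustrative names, not part of the library) that parse the same "WIDTHxHEIGHT" text from a string and from a stream:

```cpp
#include <cstdio>

// Hypothetical helper: parse "WIDTHxHEIGHT" from a null-terminated string,
// as in form (3). Returns true only if both fields were converted.
bool parse_size(const char* text, int& width, int& height)
{
    return std::sscanf(text, "%dx%d", &width, &height) == 2;
}

// Hypothetical helper: the same parse from any open input stream, as in
// form (2). Passing stdin reads from the same source as std::scanf, form (1).
bool read_size(std::FILE* stream, int& width, int& height)
{
    return std::fscanf(stream, "%dx%d", &width, &height) == 2;
}
```

Calling read_size(stdin, w, h) would read from standard input, while parse_size("640x480", w, h) works on text that is already in memory.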

Parameters

stream - input file stream to read from
buffer - pointer to a null-terminated character string to read from
format - pointer to a null-terminated character string specifying how to read the input

The format string consists of the following:

  • Non-whitespace multibyte characters, except %: each such character in the format string consumes exactly one identical character from the input stream, or causes the function to fail if the next character on the stream does not compare equal.
  • Whitespace characters: any single whitespace character in the format string consumes all available consecutive whitespace characters from the input (determined as if by calling isspace in a loop). Note that there is no difference between "\n", " ", "\t\t", or other whitespace in the format string.
  • Conversion specifications: each conversion specification has the following format, in this order:
      • The introductory % character.
      • (Optional) The assignment-suppressing character *. If this option is present, the function does not assign the result of the conversion to any receiving argument.
      • (Optional) An integer number (greater than zero) that specifies the maximum field width, that is, the maximum number of characters that the function is allowed to consume when doing the conversion specified by the current conversion specification. Note that %s and %[ may lead to buffer overflow if the width is not provided.
      • (Optional) A length modifier that specifies the size of the receiving argument, that is, the actual destination type. This affects the conversion accuracy and overflow rules. The default destination type is different for each conversion type (see the table below).
      • The conversion format specifier.
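The pieces of a format string listed above can be sketched with a few small parsers. This is only an illustration: the function names and the input layouts ("YYYY-MM-DD", "HHMM", "id : N") are assumptions made for the example.

```cpp
#include <cstdio>

// Literal characters: each '-' in the format must match a '-' in the input.
bool parse_date(const char* text, int& year, int& month, int& day)
{
    return std::sscanf(text, "%d-%d-%d", &year, &month, &day) == 3;
}

// Maximum field width: "%2d%2d" splits four adjacent digits into two fields.
bool parse_hhmm(const char* text, int& hours, int& minutes)
{
    return std::sscanf(text, "%2d%2d", &hours, &minutes) == 2;
}

// Assignment suppression: "%*d" reads and discards the first integer, so only
// the second one is stored and counted in the return value.
bool parse_second_int(const char* text, int& value)
{
    return std::sscanf(text, "%*d %d", &value) == 1;
}

// Whitespace: each space in the format consumes any run of whitespace in the
// input, including an empty one, so "id:7" and "id \t:\n7" both match.
bool parse_id(const char* text, int& value)
{
    return std::sscanf(text, "id : %d", &value) == 1;
}
```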

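The following format specifiers are available. The length modifiers hh, ll, j, z, and t and the conversion specifiers a and A are available since C++11. For each group of specifiers, the argument type that corresponds to each length modifier (hh, h, none, l, ll, j, z, t, L) is listed after the explanations.

%
    Matches a literal %.

    Argument type: N/A for every length modifier.

c
    Matches a character or a sequence of characters.

    If a width specifier is used, matches exactly width characters (the argument must be a pointer to an array with sufficient room). Unlike %s and %[, does not append a null character to the array.

s
    Matches a sequence of non-whitespace characters (a string).

    If a width specifier is used, matches up to width characters or until the first whitespace character, whichever appears first. Always stores a null character in addition to the characters matched (so the argument array must have room for at least width + 1 characters).

[set]
    Matches a non-empty sequence of characters from set.

    If the first character of the set is ^, then all characters not in the set are matched. If the set begins with ] or ^], then the ] character is also included in the set. It is implementation-defined whether the character - in a non-initial position in the scanset may indicate a range, as in [0-9]. If a width specifier is used, matches only up to width characters. Always stores a null character in addition to the characters matched (so the argument array must have room for at least width + 1 characters).

    Argument types for c, s, and [set]:
        (none)                   char*
        l                        wchar_t*
        hh, h, ll, j, z, t, L    N/A

d
    Matches a decimal integer.

    The format of the number is the same as expected by strtol() with the value 10 for the base argument.

i
    Matches an integer.

    The format of the number is the same as expected by strtol() with the value 0 for the base argument (the base is determined by the first characters parsed).

u
    Matches an unsigned decimal integer.

    The format of the number is the same as expected by strtoul() with the value 10 for the base argument.

o
    Matches an unsigned octal integer.

    The format of the number is the same as expected by strtoul() with the value 8 for the base argument.

x, X
    Matches an unsigned hexadecimal integer.

    The format of the number is the same as expected by strtoul() with the value 16 for the base argument.

n
    Returns the number of characters read so far.

    No input is consumed. Does not increment the assignment count. If the specifier has the assignment-suppressing operator defined, the behavior is undefined.

    Argument types for d, i, u, o, x, X, and n:
        hh        signed char* or unsigned char*
        h         signed short* or unsigned short*
        (none)    signed int* or unsigned int*
        l         signed long* or unsigned long*
        ll        signed long long* or unsigned long long*
        j         intmax_t* or uintmax_t*
        z         size_t*
        t         ptrdiff_t*
        L         N/A

a, A (C++11)
e, E
f, F
g, G
    Matches a floating-point number.

    The format of the number is the same as expected by strtof().

    Argument types:
        (none)                float*
        l                     double*
        L                     long double*
        hh, h, ll, j, z, t    N/A

p
    Matches an implementation-defined character sequence defining a pointer.

    The printf family of functions should produce the same sequence using the %p format specifier.

    Argument types:
        (none)                       void**
        hh, h, l, ll, j, z, t, L     N/A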
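The pairing of length modifiers with argument types can be sketched as follows. Sample and parse_sample are hypothetical names introduced only for this illustration; each member is read with the modifier that matches its type.

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical aggregate used only for this illustration.
struct Sample
{
    signed char small;    // read with %hhd
    short       medium;   // read with %hd
    long        large;    // read with %ld
    std::size_t count;    // read with %zu
    double      ratio;    // read with %lf (plain %f would require float*)
    long double precise;  // read with %Lf
};

// Returns true only if all six fields were converted.
bool parse_sample(const char* text, Sample& out)
{
    return std::sscanf(text, "%hhd %hd %ld %zu %lf %Lf",
                       &out.small, &out.medium, &out.large,
                       &out.count, &out.ratio, &out.precise) == 6;
}
```

Using a modifier that does not match the pointed-to type, for example %f with a double*, is undefined behavior, which is why the mapping is spelled out member by member here.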

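For every conversion specifier other than n, the input consumed from the stream is the longest sequence of characters that does not exceed any specified field width and that either is exactly what the conversion specifier expects or is a prefix of a sequence it would expect. The first character after this consumed sequence, if any, remains unread. If the consumed sequence has length zero, or if it cannot be converted as specified above, a matching failure occurs, unless end-of-file, an encoding error, or a read error prevented input from the stream, in which case it is an input failure.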
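One way to observe how much input a conversion consumed is the %n specifier. A minimal sketch, assuming a hypothetical helper named leading_int_length that reports how many characters a leading integer occupied:

```cpp
#include <cstdio>

// Hypothetical helper: converts a leading decimal integer and returns the
// number of characters read so far (skipped whitespace included), or -1 if
// no integer could be converted.
int leading_int_length(const char* text, int& value)
{
    int consumed = 0;
    // %n stores the count of characters read so far; it is not counted in
    // the return value, so success is still reported as 1.
    if (std::sscanf(text, "%d%n", &value, &consumed) != 1)
        return -1;
    return consumed;
}
```

For the input "123abc" the conversion consumes "123" and leaves "abc" unread, so text + 3 points at the first unread character.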

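All conversion specifiers other than [, c, and n consume and discard all leading whitespace characters (determined as if by calling isspace) before attempting to parse the input. These consumed characters do not count towards the specified maximum field width.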

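The conversion specifiers lc, ls, and l[ perform multibyte-to-wide character conversion as if by calling mbrtowc() with an mbstate_t object initialized to zero before the first character is converted.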

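The conversion specifiers s and [ always store the null terminator in addition to the matched characters. The size of the destination array must be at least one greater than the specified field width.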

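The correct conversion specifications for the fixed-width integer types (int8_t, etc.) are defined in the header <cinttypes> (although SCNdMAX, SCNuMAX, etc. are synonymous with %jd, %ju, etc.).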
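A minimal sketch of the <cinttypes> macros in use, assuming a hypothetical helper named parse_fixed. The macros expand to string literals, so they are concatenated with the "%" that introduces each conversion.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Hypothetical helper: reads a signed 32-bit and an unsigned 64-bit integer.
// SCNd32 and SCNu64 expand to the right length modifier and specifier for
// std::int32_t and std::uint64_t on the current platform.
bool parse_fixed(const char* text, std::int32_t& a, std::uint64_t& b)
{
    return std::sscanf(text, "%" SCNd32 " %" SCNu64, &a, &b) == 2;
}
```

The spaces between the string literals and the macro names matter since C++11: without them, "%"SCNd32 would be parsed as a user-defined literal suffix.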

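There is a sequence point after the action of each conversion specifier; this permits storing multiple fields in the same "sink" variable.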

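When parsing an incomplete floating-point value that ends in an exponent with no digits, such as parsing "100er" with the conversion specifier %f, the sequence "100e" (the longest prefix of a possibly valid floating-point number) is consumed, resulting in a matching error (the consumed sequence cannot be converted to a floating-point number), with "r" remaining. Some existing implementations do not follow this rule and roll back to consume only "100", leaving "er", e.g. glibc bug 1765.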

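... - receiving arguments

Return value

The number of receiving arguments successfully assigned (which may be zero if a matching failure occurred before the first receiving argument was assigned), or EOF if an input failure occurs before the first receiving argument was assigned.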
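The different return values can be sketched with a helper that tries to read three integers. count_ints is a hypothetical name used only for this illustration.

```cpp
#include <cstdio>

// Hypothetical helper: returns what std::sscanf reports when asked to read
// three integers from text: the number of assignments made, or EOF if the
// input ran out before the first conversion.
int count_ints(const char* text)
{
    int a = 0, b = 0, c = 0;
    return std::sscanf(text, "%d%d%d", &a, &b, &c);
}
```

Under these assumptions, "1 2 3" yields 3, "1 2 x" yields 2 (matching failure at the third field), "x" yields 0 (matching failure before any assignment), and an empty string yields EOF (input failure). Comparing the result with the expected field count is therefore the usual way to detect a partial read.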

Notes

Because most conversion specifiers first consume all consecutive whitespace, code such as

std::scanf("%d", &a);
std::scanf("%d", &b);

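will read two integers that are entered on different lines (the second %d will consume the newline left over by the first) or on the same line, separated by spaces or tabs (the second %d will consume the spaces or tabs).

The conversion specifiers that do not consume leading whitespace, such as %c, can be made to do so by using a whitespace character in the format string: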

std::scanf("%d", &a);
std::scanf(" %c", &c); // ignore the newline after %d, then read a char
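The effect of that extra space can be checked without typing at a terminal by running both formats over the same string. The two helper names below are hypothetical and exist only for this sketch.

```cpp
#include <cstdio>

// "%d%c": %c does not skip whitespace, so it stores whatever character
// immediately follows the digits. Returns '\0' if nothing was stored.
char char_after_int_raw(const char* text)
{
    int number = 0;
    char next = '\0';
    std::sscanf(text, "%d%c", &number, &next);
    return next;
}

// "%d %c": the space in the format discards any whitespace first, so %c
// stores the next non-whitespace character.
char char_after_int_skipping(const char* text)
{
    int number = 0;
    char next = '\0';
    std::sscanf(text, "%d %c", &number, &next);
    return next;
}
```

For the input "42\nY", the first helper returns the newline and the second returns 'Y'.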

Example

#include <iostream>
#include <clocale>
#include <cstdio>

int main()
{
    int i, j;
    float x, y;
    char str1[10], str2[4];
    wchar_t warr[2];
    std::setlocale(LC_ALL, "en_US.utf8");

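    // Plain literal rather than u8"...": since C++20 a u8 literal has type
    // const char8_t[] and cannot initialize a char array. This assumes the
    // source file is saved as UTF-8.
    const char input[] = "25 54.32E-1 Thompson 56789 0123 56ß水";
    // parsed as follows:
    // %d: an integer
    // %f: a floating-point value
    // %9s: a string of at most 9 non-whitespace characters
    // %2d: a two-digit integer (digits 5 and 6)
    // %f: a floating-point value (digits 7, 8, 9)
    // %*d: an integer which isn't stored anywhere
    // ' ': all consecutive whitespace
    // %3[0-9]: a string of at most 3 decimal digits (digits 5 and 6)
    // %2lc: two wide characters, using multibyte to wide conversion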
    int ret = std::sscanf(input, "%d%f%9s%2d%f%*d %3[0-9]%2lc",
                          &i, &x, str1, &j, &y, str2, warr);

    std::cout << "Converted " << ret << " fields:\n"
              << "i = " << i << "\nx = " << x << '\n'
              << "str1 = " << str1 << "\nj = " << j << '\n'
              << "y = " << y << "\nstr2 = " << str2 << '\n'
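              // cast to an integer: streaming a wchar_t to a narrow stream
              // is a deleted overload since C++20
              << std::hex << "warr[0] = U+" << static_cast<unsigned long>(warr[0])
              << " warr[1] = U+" << static_cast<unsigned long>(warr[1]) << '\n';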
    return 0;
}

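Possible output (on a system where the en_US.utf8 locale is available and wchar_t holds Unicode code points):

Converted 7 fields:
i = 25
x = 5.432
str1 = Thompson
j = 56
y = 789
str2 = 56
warr[0] = U+df warr[1] = U+6c34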

Origin blog.csdn.net/qq_40788199/article/details/132795424