C style file input/output---formatted input/output---(std::vscanf, std::vfscanf, std::vsscanf)

The C I/O subset of the C++ standard library implements C-style stream input/output operations. The <cstdio> header provides generic file operation support and supplies functions with narrow and multibyte character input/output capabilities, while the <cwchar> header supplies functions with wide character input/output capabilities.

Formatted input/output

Read formatted input from stdin, file stream, or buffer using a variable argument list

std::vscanf, 
std::vfscanf, 
std::vsscanf

int vscanf( const char* format, va_list vlist );

(1) (since C++11)

int vfscanf( std::FILE* stream, const char* format, va_list vlist );

(2) (since C++11)

int vsscanf( const char* buffer, const char* format, va_list vlist );

(3) (since C++11)

Reads data from a variety of sources, interprets it according to format, and stores the results into the locations defined by vlist.

1) Reads the data from stdin.

2) Reads the data from the file stream stream.

3) Reads the data from the null-terminated character string buffer.

Parameters

stream - Input file stream to read
buffer - Pointer to the null-terminated string to read
format - Pointer to a null-terminated string that specifies how to read the input.

The format string consists of:

  • Non-whitespace multibyte characters except %: each such character in the format string consumes exactly one identical character from the input stream, or causes the function to fail if it compares unequal to the next character on the stream.
  • Whitespace characters: any single whitespace character in the format string consumes all available consecutive whitespace characters from the input (determined as if by calling isspace in a loop). Note that there is no difference between "\n", " ", "\t\t", or other whitespace in the format string.
  • Conversion specifications. Each conversion specification has the following format:
      • The introductory % character.
      • (Optional) The assignment-suppressing character *. If this option is present, the function does not assign the result of the conversion to any receiving argument.
      • (Optional) An integer number (greater than zero) that specifies the maximum field width, that is, the maximum number of characters that the function is allowed to consume when performing the conversion specified by the current conversion specification. Note that %s and %[ may lead to a buffer overflow if no width is provided.
      • (Optional) A length modifier that specifies the size of the receiving argument, that is, the actual destination type. This affects the conversion accuracy and overflow rules. The default destination type is different for each conversion type (see below).
      • The conversion format specifier.
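
The pieces of a format string described above can be sketched in a few lines. This is a minimal illustration, not part of the original page: scan_string is a hypothetical helper that only forwards its arguments to std::vsscanf, and demo_format_pieces is an assumed name for the demonstration.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not a standard function): forwards to std::vsscanf.
int scan_string(const char* input, const char* format, ...)
{
    std::va_list args;
    va_start(args, format);
    const int assigned = std::vsscanf(input, format, args);
    va_end(args);
    return assigned;
}

void demo_format_pieces()
{
    int year = 0, day = 0;
    char month[4] = {}; // field width 3 plus the terminating null character

    // '-' must match literally, %3s reads at most three characters,
    // and %*d parses an integer without assigning it anywhere.
    int assigned = scan_string("2024-Sep-07 99", "%d-%3s-%d %*d",
                               &year, month, &day);
    assert(assigned == 3);
    assert(year == 2024 && day == 7);
    assert(std::strcmp(month, "Sep") == 0);

    // One whitespace character in the format consumes any run of whitespace.
    int a = 0, b = 0;
    assigned = scan_string("1 \n\t 2", "%d %d", &a, &b);
    assert(assigned == 2 && a == 1 && b == 2);

    // A literal that does not match stops the scan after the first field.
    assigned = scan_string("1;2", "%d,%d", &a, &b);
    assert(assigned == 1);
}
```

Calling demo_format_pieces() from main runs the checks. The suppressed %*d field is parsed but does not count towards the returned value.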

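The following format specifiers are available. For each specifier, the argument type depends on the length modifier; combinations that are not listed are not available (N/A). The length modifiers hh, ll, j, z, and t are available since C++11.

% - Matches a literal %. Takes no argument; no length modifier applies.

c - Matches a character or a sequence of characters. If a width specifier is used, matches exactly width characters (the argument must be a pointer to an array with sufficient room). Unlike %s and %[, it does not append a null character to the array.

s - Matches a sequence of non-whitespace characters (a string). If a width specifier is used, matches up to width characters or until the first whitespace character, whichever appears first. Always stores a null character after the matched characters (so the argument array must have room for at least width + 1 characters).

[set] - Matches a non-empty sequence of characters from set. If the first character of the set is ^, then all characters not in the set are matched. If the set begins with ] or ^], then the ] character is also included in the set. It is implementation-defined whether the character - in a non-initial position in the scanset may indicate a range, as in [0-9]. If a width specifier is used, matches only up to width. Always stores a null character after the matched characters (so the argument array must have room for at least width + 1 characters).

  Argument type for c, s, and [set]:
    (none) - char*
    l - wchar_t*

d - Matches a decimal integer. The format of the number is the same as expected by strtol() with the value 10 for the base argument.

i - Matches an integer. The format of the number is the same as expected by strtol() with the value 0 for the base argument (the base is determined by the first characters parsed).

u - Matches an unsigned decimal integer. The format of the number is the same as expected by strtoul() with the value 10 for the base argument.

o - Matches an unsigned octal integer. The format of the number is the same as expected by strtoul() with the value 8 for the base argument.

x, X - Matches an unsigned hexadecimal integer. The format of the number is the same as expected by strtoul() with the value 16 for the base argument.

n - Returns the number of characters read so far. No input is consumed. Does not increment the assignment count. If the specifier has the assignment-suppressing operator defined, the behavior is undefined.

  Argument type for d, i, u, o, x, X, and n:
    hh - signed char* or unsigned char*
    h - signed short* or unsigned short*
    (none) - signed int* or unsigned int*
    l - signed long* or unsigned long*
    ll - signed long long* or unsigned long long*
    j - intmax_t* or uintmax_t*
    z - size_t*
    t - ptrdiff_t*

a, A (since C++11), e, E, f, F, g, G - Matches a floating-point number. The format of the number is the same as expected by strtof().

  Argument type:
    (none) - float*
    l - double*
    L - long double*

p - Matches an implementation-defined character sequence defining a pointer. The printf family of functions should produce the same sequence using the %p format specifier.

  Argument type:
    (none) - void**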
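
To make the pairing of length modifiers and argument types more concrete, here is a small sketch. It is an illustration under stated assumptions, not part of the original page: scan_string is a hypothetical forwarding helper around std::vsscanf, and the scanset lists its characters explicitly because range notation such as [a-z] is implementation-defined.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>
#include <cstring>

// Hypothetical helper (not a standard function): forwards to std::vsscanf.
int scan_string(const char* input, const char* format, ...)
{
    std::va_list args;
    va_start(args, format);
    const int assigned = std::vsscanf(input, format, args);
    va_end(args);
    return assigned;
}

void demo_length_modifiers()
{
    signed char tiny = 0; // %hhd expects signed char*
    long big = 0;         // %ld  expects long*
    double ratio = 0.0;   // %lf  expects double*
    char word[8] = {};    // %7[...] needs room for 7 characters plus null
    int consumed = 0;     // %n   expects int*

    const int assigned = scan_string("42 123456 2.5 abc123",
                                     "%hhd %ld %lf %7[abc]%n",
                                     &tiny, &big, &ratio, word, &consumed);

    // %n stores a count but does not increment the assignment count.
    assert(assigned == 4);
    assert(tiny == 42 && big == 123456 && ratio == 2.5);
    assert(std::strcmp(word, "abc") == 0); // the scanset stops at '1'
    assert(consumed == 17);                // length of "42 123456 2.5 abc"
}
```

The space before %7[abc] matters: the [ specifier does not skip leading whitespace on its own, so the whitespace has to be consumed by the format string.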

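For every conversion specifier other than n, the longest sequence of input characters that does not exceed any specified field width and that either is exactly what the conversion specifier expects or is a prefix of a sequence it would expect is what is consumed from the stream. The first character, if any, after this consumed sequence remains unread. If the consumed sequence has length zero, or if the consumed sequence cannot be converted as specified above, a matching failure occurs, unless end-of-file, an encoding error, or a read error prevented input from the stream, in which case it is an input failure.

All conversion specifiers other than [, c, and n consume and discard all leading whitespace characters (determined as if by calling isspace) before attempting to parse the input. These consumed characters do not count towards the specified maximum field width.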
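
The whitespace rule is easy to trip over with %c, so a short sketch may help. The helper scan_string is hypothetical (a thin forwarding wrapper around std::vsscanf), as is the function name demo_leading_whitespace.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Hypothetical helper (not a standard function): forwards to std::vsscanf.
int scan_string(const char* input, const char* format, ...)
{
    std::va_list args;
    va_start(args, format);
    const int assigned = std::vsscanf(input, format, args);
    va_end(args);
    return assigned;
}

void demo_leading_whitespace()
{
    int value = 0;
    char next = '\0';

    // %c does not skip whitespace, so it stores the space that follows "7".
    int assigned = scan_string("7 x", "%d%c", &value, &next);
    assert(assigned == 2 && value == 7 && next == ' ');

    // A space in the format consumes the whitespace first, so %c sees 'x'.
    assigned = scan_string("7 x", "%d %c", &value, &next);
    assert(assigned == 2 && next == 'x');

    // %d discards leading whitespace on its own.
    assigned = scan_string(" \t\n 8", "%d", &value);
    assert(assigned == 1 && value == 8);
}
```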

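The conversion specifiers lc, ls, and l[ perform multibyte-to-wide character conversion as if by calling mbrtowc() with an mbstate_t object initialized to zero before the first character is converted.

The conversion specifiers s and [ always store a null character after the matched characters. The size of the destination array must be at least one greater than the specified field width.

The correct conversion specifications for the fixed-width integer types (int8_t, etc.) are defined in the header <cinttypes> (although SCNdMAX, SCNuMAX, etc. are synonymous with %jd, %ju, etc.).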
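
As a rough illustration of the <cinttypes> macros, the following sketch reads two fixed-width integers. The SCNd32 and SCNu64 macros come from <cinttypes>; scan_string and demo_fixed_width are hypothetical names introduced only for this example.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdarg>
#include <cstdint>
#include <cstdio>

// Hypothetical helper (not a standard function): forwards to std::vsscanf.
int scan_string(const char* input, const char* format, ...)
{
    std::va_list args;
    va_start(args, format);
    const int assigned = std::vsscanf(input, format, args);
    va_end(args);
    return assigned;
}

void demo_fixed_width()
{
    std::int32_t small = 0;
    std::uint64_t large = 0;

    // The macros expand to string literals that are concatenated with "%".
    // The spaces around each macro name are required in C++11 and later.
    const int assigned = scan_string("-5 18446744073709551615",
                                     "%" SCNd32 " %" SCNu64,
                                     &small, &large);
    assert(assigned == 2);
    assert(small == -5);
    assert(large == UINT64_MAX);
}
```

Using the macros avoids guessing which of h, l, or ll matches a given fixed-width type on the current platform.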

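There is a sequence point after the action of each conversion specifier; this permits storing multiple fields in the same "sink" variable.

When parsing an incomplete floating-point value that ends in an exponent with no digits, such as parsing "100er" with the conversion specifier %f, the sequence "100e" (the longest prefix of a possibly valid floating-point number) is consumed, resulting in a matching error (the consumed sequence cannot be converted to a floating-point number), with "r" remaining. Some existing implementations do not follow this rule and roll back to consume only "100", leaving "er"; see, for example, glibc bug 1765.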

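vlist - Variable argument list containing the receiving arguments.

Return value

Number of arguments successfully read, or EOF if a failure occurs before the first receiving argument is assigned.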
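
The different outcomes a caller may see in the return value can be sketched as follows. This is an illustration rather than an exhaustive list; scan_string is a hypothetical forwarding helper around std::vsscanf and demo_return_values is an assumed name.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Hypothetical helper (not a standard function): forwards to std::vsscanf.
int scan_string(const char* input, const char* format, ...)
{
    std::va_list args;
    va_start(args, format);
    const int assigned = std::vsscanf(input, format, args);
    va_end(args);
    return assigned;
}

void demo_return_values()
{
    int a = 0, b = 0;

    // Every conversion succeeds: the count equals the number of arguments.
    int result = scan_string("1 2", "%d %d", &a, &b);
    assert(result == 2);

    // Matching failure on the second field: only one argument was assigned.
    result = scan_string("1 a", "%d %d", &a, &b);
    assert(result == 1);

    // Matching failure on the very first field: nothing was assigned.
    result = scan_string("a", "%d", &a);
    assert(result == 0);

    // The input ends before the first conversion: input failure, EOF.
    result = scan_string("", "%d", &a);
    assert(result == EOF);
}
```

Comparing the result against the expected count, as the example at the end of this page does, covers all of the failure cases at once.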

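Notes

All these functions invoke va_arg at least once, so the value of vlist is indeterminate after the return. These functions do not invoke va_end; that must be done by the caller.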
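
Because the argument list is indeterminate after the call, a caller that wants to scan twice with the same receiving arguments needs its own copy. The sketch below uses va_copy for that; scan_either and demo_va_copy are hypothetical names, and trying a fallback format is only one possible reason to scan twice.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstdio>

// Hypothetical helper: tries `primary`, then `fallback`, with the same
// receiving arguments. `expected` is the number of fields both formats assign.
int scan_either(int expected, const char* input,
                const char* primary, const char* fallback, ...)
{
    std::va_list args;
    va_start(args, fallback);

    // Copy before the first scan: `args` must not be reused afterwards.
    std::va_list retry;
    va_copy(retry, args);

    int assigned = std::vsscanf(input, primary, args);
    va_end(args);

    if (assigned != expected)
    {
        assigned = std::vsscanf(input, fallback, retry);
    }
    va_end(retry); // every va_start and va_copy needs a matching va_end
    return assigned;
}

void demo_va_copy()
{
    int width = 0, height = 0;

    // The primary format fails at ',', so the fallback format is used.
    int assigned = scan_either(2, "3x4", "%d,%d", "%dx%d", &width, &height);
    assert(assigned == 2 && width == 3 && height == 4);

    // The primary format matches, so the fallback is never tried.
    assigned = scan_either(2, "5,6", "%d,%d", "%dx%d", &width, &height);
    assert(assigned == 2 && width == 5 && height == 6);
}
```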

Example

#include <iostream>
#include <cstdio>
#include <cstdarg>
#include <stdexcept>

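void checked_sscanf(int count, const char* buf, const char* fmt, ...)
{
    std::va_list ap;
    va_start(ap, fmt);
    // Capture the result first so that va_end always runs, even on failure.
    const int assigned = std::vsscanf(buf, fmt, ap);
    va_end(ap);
    if (assigned != count)
    {
        throw std::runtime_error("parsing error");
    }
}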

int main()
{
    try
    {
        int n, m;
        std::cout << "Parsing '1 2'...";
        checked_sscanf(2, "1 2", "%d %d", &n, &m);
        std::cout << "success\n";
        std::cout << "Parsing '1 a'...";
        checked_sscanf(2, "1 a", "%d %d", &n, &m);
        std::cout << "success\n";
    }
    catch (const std::exception& e)
    {
        std::cout << e.what() << '\n';
    }
    return 0;
}

Possible output:

Parsing '1 2'...success
Parsing '1 a'...parsing error

Origin blog.csdn.net/qq_40788199/article/details/132795536