Detailed explanation of sscanf function usage

I used to think that sscanf could only extract strings separated by spaces. After running into some string-processing problems and studying the function in detail, I found that it is quite powerful and offers several features reminiscent of regular expressions. First, let's look at the function definition:
Function definition: int sscanf(const char *str, const char *format, ...);
Function description:
sscanf() reads the string str and converts its contents according to the format string. See scanf() for the conversion specifications. The converted results are stored through the corresponding pointer arguments.
Return value: the number of input items successfully assigned. This can be fewer than the number requested, and it is 0 if the very first conversion fails to match. If the input ends before the first conversion completes, EOF (usually -1) is returned. For example, given int i, i2; char s[32]; the call sscanf(str, "%d%d%s", &i, &i2, s); returns 3 if all three items are read successfully. It returns 1 if only the first integer was read into i, which shows that the second integer could not be read from str.
The definition is too abstract, let's take a look at the common usage first:
(1) sscanf("zhoue3456 ", "%4s", str); // Take a string of at most the specified length
          printf("str=%s\n", str); // str="zhou"
(2) sscanf("zhou456 hedf", "%[^ ]", str); // Take the string up to the specified character (here a space)
          printf("str=%s\n", str); // str="zhou456"
(3) sscanf("654321abcdedfABCDEF", "%[1-9a-z]", str); // Take a string containing only the specified character set
          printf("str=%s\n", str); // str="654321abcdedf", only the digits 1-9 and lowercase letters are taken
(4) sscanf("BCDEF123456abcdedf", "%[^a-z]", str); // Take the string up to the specified character set
          printf("str=%s\n", str); // str="BCDEF123456", everything up to the first lowercase letter
(5) int a, b, c;
          sscanf("2015.04.05", "%d.%d.%d", &a, &b, &c); // Extract the numeric fields
          printf("a=%d,b=%d,c=%d", a, b, c); // a=2015,b=4,c=5
Through the above examples, I believe that everyone will have an intuitive understanding of the usage of sscanf. Let's take a look at a more complex example:
(6) Given the string "abcd&hello$why", how do we take out the string between & and $?
        sscanf("abcd&hello$why", "%*[^&]&%[^$]", str);
        printf("str=%s\n", str); // str="hello"
       Here %[] is similar to a regular-expression character class: [a-z] reads the lowercase letters a through z, and [^a-z] reads everything that is not a lowercase letter. So %*[^&] first skips abcd, the literal & then consumes the separator, leaving hello$why, and %[^$] stores the characters before $ into str.
(7) Given the string "what, time", how do we keep only time? (The comma is followed by a space.)
         sscanf("what, time", "%*s%s", str);
         printf("str=%s\n", str); // str="time"
         Here %*s matches the first word, "what,", and discards it. In effect "what, time" is split by the space into two strings, "what," and "time". If there were no space, %*s would consume the whole input, nothing would be stored, and str would be left unchanged.
Some readers may ask what the difference and connection between scanf and sscanf in C is. The two are very similar, and both are used for input. The only difference is that scanf takes standard input (stdin) as its source, while sscanf takes a string.
Function prototype: int scanf(const char *format [, argument]...);
Here format consists of one or more of {%[*] [width] [{h | l | I64 | L}]type | ' ' | '\t' | '\n' | any character other than %}. Note: {a|b|c} means one of a, b, c, and [d] means d is optional.
width: the maximum field width, which is optional. For example: const char sourceStr[] = "hello, world"; char buf[10] = {0}; sscanf(sourceStr, "%5s", buf); // %5s takes at most 5 characters
cout << buf << endl;
The result is: hello
{h | l | I64 | L}: the size of the destination argument. With integer conversions, h means short and l means long; with floating-point conversions, l means double and L means long double. I64 is a Microsoft-specific modifier for 64-bit integers.
type: the conversion specifier, such as %s, %d, and so on.
Special case: %*[width] [{h | l | I64 | L}]type matches input as usual but discards it, so no value is written to any destination argument.
For example: const char sourceStr[] = "hello, world"; char buf[10] = {0};
             sscanf(sourceStr, "%*s%s", buf); // %*s discards the first word that %s would match, that is, "hello," is skipped
          cout << buf << endl;
The result is: world
Set operations (scansets) are supported: %[a-z] matches any character from a to z, greedily (it matches as many characters as possible)
                            %[aB'] matches any of a, B, or ', greedily
                               %[^a] matches any character other than a, greedily


C++ code (reference link: http://kmplayer.iteye.com/blog/556293):

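1. sscanf(): reads data matching the specified format from a string.
2. sscanf is similar to scanf. Both are used for input, but the latter takes standard input (stdin) as its source, while the former takes a fixed string.
3. About the regex-like scansets:
    (1) %[..]: reading continues while each character belongs to the set described in the brackets, and stops otherwise. The bracket contents resemble a regular-expression character class; ^ means "exclude".
    (2) %*[..]: skips the characters that belong to the set in the brackets and continues reading.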
#include <cstdio>
#include <iostream>
using namespace std;

int main()
{
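    char str[11] = {0}; // 10 visible characters plus a terminating null so that printing str is safe
    for (int i = 0; i < 10; i++) str[i] = '!';
    cout<<str<<endl;
    sscanf("123456","%s",str); // str now holds "123456\0!!!"
    // A simple experiment: the source string "123456" is copied into the first 6 characters of str, and the 7th character is set to the null character, i.e. \0.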
    cout<<str<<endl;

    for (int i = 0; i < 10; i++) str[i] = '!';
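    sscanf("123456","%3s",str); // str now holds "123\0!!!!!!"
    // Note the 3 after the percent sign in the format string: it tells sscanf to copy at most 3 characters to str and then set the 4th character to the null character.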
    cout<<str<<endl;

    for (int i = 0; i < 10; i++) str[i] = '!';
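    sscanf("aaaAAA","%[a-z]",str); // str now holds "aaa\0!!!!!!"
    // Starting with this experiment we use a scanset, which resembles a regular-expression character class: a-z inside the brackets stands for any character from a to z.
    // Before going further, consider what the percent sign means: % introduces a conversion, and what follows it is the condition.
    // In experiment 1, "%s" accepts any run of non-whitespace characters and copies it to str. In experiment 2, "%3s" adds another condition: copy at most 3 characters.
    // In experiment 3, "%[a-z]" is stricter: each character must be a lowercase letter, so only "aaa" is copied to str, followed by the null character.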
    cout<<str<<endl;

    for (int i = 0; i < 10; i++) str[i] = '!';
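    sscanf("AAAaaaBBB","%[^a-z]",str); // str now holds "AAA\0!!!!!!"
    // Every character that is not a lowercase letter satisfies the scanset "^a-z"; the ^ symbol means logical NOT. The first 3 characters are not lowercase, so they are copied to str.
    // The last 3 characters are not lowercase either, so why are they not copied? Because sscanf stops at the first character that fails the condition and does not scan anything after it.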
    cout<<str<<endl;

    /*
    for (int i = 0; i < 10; i++) str[i] = '!';
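    sscanf("AAAaaaBBB","%[A-Z]%[a-z]",str); // undefined behavior, typically a segmentation fault
    // The intent was to copy the uppercase letters to str and then the lowercase letters to str. The program crashes because the format contains two storing conversions but only one destination argument is passed:
    // %[A-Z] writes "AAA" to str, and %[a-z] then writes through a pointer that was never supplied. Every conversion without * needs its own argument.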
    cout<<str<<endl;
    */

    for (int i = 0; i < 10; i++) str[i] = '!';
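    sscanf("AAAaaaBBB","%*[A-Z]%[a-z]",str); // str now holds "aaa\0!!!!!!"
    // This experiment introduces a new symbol: %*. Unlike %, %* discards the characters that satisfy the condition instead of storing them. Here %*[A-Z] skips all the uppercase letters, and %[a-z] then copies the lowercase letters that follow to str.
    // With only %* conversions and no plain %, sscanf copies nothing to str and merely skips over the input.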
    cout<<str<<endl;

    for (int i = 0; i < 10; i++) str[i] = '!';
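    sscanf("AAAaaaBBB","%[a-z]",str); // str still holds "!!!!!!!!!!"
    // From the previous experiments we know that sscanf appends a null character after the copied text. But if no character satisfies the condition, sscanf writes nothing at all, not even a null character, so str still holds 10 exclamation marks.
    // This also shows that unless the unwanted leading characters are consumed first (for example with %*), the characters in the middle cannot be reached.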
    cout<<str<<endl;

    for (int i = 0; i < 10; i++) str[i] = '!';
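    sscanf("AAAaaaBC=","%*[A-Z]%*[a-z]%[^a-z=]",str); // str now holds "BC\0!!!!!!!"
    // This is a combined experiment. Its purpose is not to review the earlier material but to show two points worth noting:
    // Note 1: %* can be used any number of times because it needs no destination argument, whereas each plain % needs its own argument. Here %*[A-Z] skips the uppercase letters and %*[a-z] then skips the lowercase letters.
    // Note 2: ^ can be followed by several conditions, and all of them are negated. For example, ^a-z= means neither a lowercase letter nor an equals sign.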
    cout<<str<<endl;

    for (int i = 0; i < 10; i++) str[i] = '!';
    int k;
    sscanf("AAA123BBB456", "%*[A-Z]%i", &k); // k is now 123
    // First %*[A-Z] skips the leading uppercase letters, then %i converts the digit characters to an int and stores it in k. Note that the address of k must be passed.
    cout<<k<<endl;
    return 0;
}

Reference links:

    http://blog.csdn.net/jackyvan/article/details/5349724

    http://kmplayer.iteye.com/blog/556293


 
 
