[Side skills] Validating IPv4 addresses in C (using regular expressions)

Readers who have followed me for a while may remember that I wrote about this topic on my blog several years ago. I had run into the problem at work and meant to publish the code once that work was finished, but I kept putting it off for years, which I am rather ashamed of. The code is now public, and everyone is welcome to download and test it.

1 Foreword

An earlier blog post showed how to validate IPv4 addresses in C using only native C functions. This post shows how to meet the same requirement with regular expressions in a C programming environment.

2 What is a regular expression

A regular expression is a tool for matching text patterns. It can describe a text pattern through some specific characters and grammatical rules, and find strings that match the pattern in the text. Regular expressions can be used in various application scenarios such as text search, replacement, and verification, and are one of the necessary tools for programmers and text processing workers.

Many high-level programming languages ship mature regular expression libraries, while the C standard library has none, so examples in C are less common. That does not mean regular expressions cannot be used in C: POSIX systems provide them through regex.h.

There are a few caveats to be aware of when using regular expressions in C:

  1. First, include the regex.h header file.
  2. Before a regular expression can be used, it must be compiled into a pattern with the regcomp() function.
  3. When compiling with regcomp(), pass the REG_EXTENDED flag to enable extended regular expressions, or the REG_ICASE flag to ignore case. The match itself is then performed with regexec().
  4. If regexec() returns 0, the match succeeded; if it returns REG_NOMATCH, there was no match; any other return value means an error occurred.
  5. When calling regerror() to get error information, you need to provide a buffer and the buffer size.
  6. When you are done with the regular expression, call regfree() to release the compiled pattern.
  7. Special characters in regular expressions need to be escaped; for example, . must be written as \., otherwise it matches any character.
  8. Parentheses in regular expressions can be used for grouping; for example, ([0-9]{1,3}\.){3}[0-9]{1,3} matches the shape of an IP address.
  9. Pay attention to performance, because regular expression matching can consume a lot of CPU. When a fixed substring search is enough, consider a simpler string function such as strstr().
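The points above can be sketched as one small helper. This is only a minimal sketch, assuming a POSIX system that provides regex.h; the name regex_match_once is made up for this illustration and is not part of any library.

```c
#include <stdio.h>
#include <regex.h>

// Hypothetical helper (name invented for this sketch).
// Returns 1 on match, 0 on no match, -1 on a compile or run-time error.
static int regex_match_once(const char *pattern, const char *text)
{
    regex_t re;
    char msg[128];
    int rc;

    // Compile the pattern; REG_EXTENDED is a regcomp() flag.
    // REG_NOSUB says we only want a yes/no answer, no sub-match offsets.
    rc = regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB);
    if (rc != 0) {
        regerror(rc, &re, msg, sizeof(msg));
        fprintf(stderr, "regcomp failed: %s\n", msg);
        return -1; // compilation failed, so there is nothing to free
    }

    // Run the match: 0 means match, REG_NOMATCH means no match.
    rc = regexec(&re, text, 0, NULL, 0);
    if (rc != 0 && rc != REG_NOMATCH) {
        regerror(rc, &re, msg, sizeof(msg));
        fprintf(stderr, "regexec failed: %s\n", msg);
    }

    // Always release a successfully compiled pattern.
    regfree(&re);

    if (rc == 0) {
        return 1;
    }
    return (rc == REG_NOMATCH) ? 0 : -1;
}
```

Point 7 is easy to see with this helper: regex_match_once("^a.c$", "abc") returns 1 because the unescaped dot matches any character, while regex_match_once("^a\\.c$", "abc") returns 0 because the escaped dot only matches a literal ".".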

For more information about regular expressions, please refer to: Regular Expression Language - Quick Reference | Microsoft Learn

3 Demand Analysis

The requirement here is very simple: take a string as input and decide whether it is a legal IPv4 address. Functionally it looks trivial, but getting it completely right takes some effort. If you don't believe me, take a look at the breakdown below.

4 C language version (regular expression)

Let's start with a simple version and look directly at the code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

#include <regex.h>

static int is_valid_ipv4_regex(const char *ip_address)
{
    regex_t regex;
    int reti;

    if (ip_address == NULL) {
        return 0;
    }

    // Compile regular expression
    reti = regcomp(&regex, "^([0-9]{1,3}\\.){3}[0-9]{1,3}$", REG_EXTENDED);
    if (reti) {
        fprintf(stderr, "Could not compile regex\n");
        // Nothing was compiled, so do not call regexec() or regfree()
        return 0;
    }

    // Execute regular expression
    printf("%s\n", ip_address);
    reti = regexec(&regex, ip_address, 0, NULL, 0);
    if (!reti) {
        printf("Valid IP address %d\n", reti);
        reti = 1;
    } else if (reti == REG_NOMATCH) {
        printf("Invalid IP address xxx %d\n", reti);
        reti = 0;
    } else {
        char error_message[100];
        regerror(reti, &regex, error_message, sizeof(error_message));
        fprintf(stderr, "Regex match failed: %s\n", error_message);
        reti = 0;
    }

    // Free compiled regular expression
    regfree(&regex);

    return reti;
}


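int is_valid_ipv4(const char *ip_address)
{
    const char *p = ip_address;
    int first_octet = 0;
    int octets;

    if (ip_address == NULL) {
        return 0;
    }

    // Parse by hand instead of strtok(): strtok() would modify the
    // const input and silently skip empty fields such as "1..2.3.4"
    for (octets = 0; octets < 4; octets++) {
        int num = 0;
        int digits = 0;

        // Each octet is 1 to 3 decimal digits
        while (isdigit((unsigned char)*p)) {
            if (++digits > 3) {
                return 0;
            }
            num = num * 10 + (*p - '0');
            p++;
        }
        if (digits == 0 || num > 255) {
            return 0;
        }

        // Reject a leading '0' such as "01"; a lone "0" is allowed
        if (digits > 1 && p[-digits] == '0') {
            return 0;
        }

        if (octets == 0) {
            first_octet = num;
        }

        // The first three octets must be followed by a dot
        if (octets < 3) {
            if (*p != '.') {
                return 0;
            }
            p++;
        }
    }

    // Nothing may follow the fourth octet
    if (*p != '\0') {
        return 0;
    }

    // The class is reported for information only; it does not affect validity
    if (first_octet >= 1 && first_octet <= 126) {
        printf("This is a Class A IP address.\n");
    } else if (first_octet >= 128 && first_octet <= 191) {
        printf("This is a Class B IP address.\n");
    } else if (first_octet >= 192 && first_octet <= 223) {
        printf("This is a Class C IP address.\n");
    } else {
        printf("This is not a Class A, B, or C IP address.\n");
    }

    return 1;
}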

int check_is_valid_ipv4(const char *ip)
{
    int ret = 0;

    //ret = is_valid_ipv4(ip);
    ret = is_valid_ipv4_regex(ip);

    return ret;
}

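int main(int argc, char *argv[])
{
    const char *ip;

    // Require exactly the address to check on the command line
    if (argc < 2) {
        fprintf(stderr, "Usage: %s <ipv4-address>\n", argv[0]);
        return 1;
    }
    ip = argv[1];

    printf("check %s\n", ip);
    printf("ret %d\n", check_is_valid_ipv4(ip));

    return 0;
}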

Compile and run it. A common IPv4 address such as "192.168.0.1" is accepted without problems, and an address containing an illegal character is rejected.

recan@ubuntu:~$ 
recan@ubuntu:~$ 
recan@ubuntu:~$ ./test 192.168.1.3
check 192.168.1.3
192.168.1.3
Valid IP address 0
ret 1
recan@ubuntu:~$ ./test 192.168.1.t
check 192.168.1.t
192.168.1.t
Invalid IP address xxx 1
ret 0
recan@ubuntu:~$ 
recan@ubuntu:~$ ./test 192.168.1.oo
check 192.168.1.oo
192.168.1.oo
Invalid IP address xxx 1
ret 0
recan@ubuntu:~$ 
recan@ubuntu:~$ ./test 192.168.01.8
check 192.168.01.8
192.168.01.8
Valid IP address 0
ret 1
recan@ubuntu:~$ 
recan@ubuntu:~$

Careful readers may notice, however, that an address with a leading 0 in one of its segments (192.168.01.8 above) is still judged to be valid, even though we generally do not write addresses that way. The pattern also checks only the shape of the address, so an out-of-range value such as 256.0.0.1 passes as well.

So how can we avoid this situation?

Is it possible to do it with a regular expression alone?

This question is left for readers to explore on their own; it is a very interesting exercise in regular expressions.

Once you have completed it, I believe you will have a much deeper grasp of regular expressions.
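For readers who want to compare notes after trying the exercise themselves, here is one possible answer. It is only a sketch, assuming POSIX extended regular expressions; the OCTET macro and the function name is_strict_ipv4 are made up for this illustration. Each octet is spelled out as the alternatives 250-255, 200-249, 100-199, and 0-99 without a leading zero.

```c
#include <stddef.h>
#include <regex.h>

// One octet in the range 0-255 with no leading zero:
//   25[0-5]      -> 250-255
//   2[0-4][0-9]  -> 200-249
//   1[0-9]{2}    -> 100-199
//   [1-9]?[0-9]  -> 0-99
#define OCTET "(25[0-5]|2[0-4][0-9]|1[0-9]{2}|[1-9]?[0-9])"

// Hypothetical stricter validator: returns 1 if s is a dotted-quad
// IPv4 address with in-range octets and no leading zeros, else 0.
static int is_strict_ipv4(const char *s)
{
    regex_t re;
    int rc;

    if (s == NULL) {
        return 0;
    }

    // Three "octet." groups followed by a final octet, anchored at both ends
    if (regcomp(&re, "^(" OCTET "\\.){3}" OCTET "$",
                REG_EXTENDED | REG_NOSUB) != 0) {
        return 0;
    }

    rc = regexec(&re, s, 0, NULL, 0);
    regfree(&re);

    return rc == 0;
}
```

With this pattern, 192.168.1.3 and 255.255.255.255 are accepted, while 192.168.01.8 and 256.0.0.1 are rejected. Compiling the pattern on every call keeps the sketch short; code that validates many addresses would more likely compile it once and reuse it.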

5 Complete test cases

This section lists a range of test cases to help you test the code:

Valid test inputs
192.168.0.1
10.0.0.1
172.16.0.1
255.255.255.255

Invalid test inputs
256.0.0.1
192.168.0.0.1
192.168.0
192.168.0.1.2
300.300.300.300
1.2.3
1.2.3.4.5
1.2.3.4.
.1.2.3.4
1..2.3.4
The list of test cases will keep growing, and additions from readers are welcome.
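One way to run such a list is a small table-driven check. The sketch below is an illustration only: the struct ipv4_case and the function count_mismatches are made-up names, and the pattern is the simple one from section 4. It compares the pattern's verdict with the expected verdict for each input and counts the disagreements.

```c
#include <stdio.h>
#include <regex.h>

// Hypothetical test table entry: an input and whether it should be valid
struct ipv4_case {
    const char *input;
    int expected;
};

static const struct ipv4_case cases[] = {
    {"192.168.0.1", 1}, {"10.0.0.1", 1}, {"172.16.0.1", 1},
    {"255.255.255.255", 1},
    {"256.0.0.1", 0}, {"192.168.0.0.1", 0}, {"192.168.0", 0},
    {"192.168.0.1.2", 0}, {"300.300.300.300", 0}, {"1.2.3", 0},
    {"1.2.3.4.5", 0}, {"1.2.3.4.", 0}, {".1.2.3.4", 0}, {"1..2.3.4", 0},
};

// Runs the simple pattern from section 4 over the table.
// Returns the number of cases where the result differs from the
// expectation, or -1 if the pattern could not be compiled.
static int count_mismatches(void)
{
    regex_t re;
    size_t i;
    int mismatches = 0;

    if (regcomp(&re, "^([0-9]{1,3}\\.){3}[0-9]{1,3}$",
                REG_EXTENDED | REG_NOSUB) != 0) {
        return -1;
    }

    for (i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
        int got = (regexec(&re, cases[i].input, 0, NULL, 0) == 0);

        if (got != cases[i].expected) {
            printf("MISMATCH %s: expected %d, got %d\n",
                   cases[i].input, cases[i].expected, got);
            mismatches++;
        }
    }

    regfree(&re);
    return mismatches;
}
```

Against the simple pattern, the two out-of-range inputs 256.0.0.1 and 300.300.300.300 are reported as mismatches, because that pattern checks only the shape of the address and not the value of each segment. Swapping in a stricter validator and rerunning the table is a quick way to confirm an improvement.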

Origin blog.csdn.net/szullc/article/details/130836188