Understand JavaScript regular expressions in one article

There are several ways to test whether a regular expression is correct. For example, you can use it to search for matching strings:

In the VS Code editor, click the third button in the search box (Use Regular Expression):

 

You can also test it in the browser console:
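As a small sketch of what you might type into the console (the sample strings and variable names here are made up for illustration), `test()` reports whether a pattern matches and `match()` returns the matched text:

```javascript
// A regular expression literal is written between two slashes.
const catPattern = /cat/;

// test() returns true if the pattern matches anywhere in the string.
const found = catPattern.test("The cat sat down");
console.log(found); // true

// match() with the g (global) flag returns every matching substring.
const allCats = "cat, concat, dog".match(/cat/g);
console.log(allCats); // ["cat", "cat"]
```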

 

We can also use the following online tool to test whether the regular expression you wrote is correct (recommended):

https://regex101.com/

(Screenshot: how to use the website.)

Now let's learn regular expressions properly:

1. Quantifiers

1. Quantifier: question mark ?

used?

The question mark means that the character d before it may appear 0 or 1 times; in other words, d is optional.

So used? matches both use and used.
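A quick way to check this is the sketch below. The ^ and $ anchors (covered later in the article) are an addition here, so that the whole string has to match and not just part of it:

```javascript
// ^ and $ are added so the whole string must match, not just a part of it.
const optionalD = /^used?$/;

const questionMarkResults = ["use", "used", "usedd"].map((word) => optionalD.test(word));
console.log(questionMarkResults); // [true, true, false]
```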

2. Quantifier: asterisk *

ab*c

The asterisk matches the preceding character 0 or more times, so b may be absent or appear any number of times.

3. Quantifier: plus sign +

ab+c

The plus sign matches the preceding character 1 or more times, so b must appear at least once.
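The difference between * and + only shows up when b is missing. A minimal comparison, with sample strings chosen for illustration:

```javascript
const zeroOrMore = /^ab*c$/; // b may appear 0 or more times
const oneOrMore = /^ab+c$/;  // b must appear at least once

const samples = ["ac", "abc", "abbbc"];
const starResults = samples.map((s) => zeroOrMore.test(s));
const plusResults = samples.map((s) => oneOrMore.test(s));

console.log(starResults); // [true, true, true]
console.log(plusResults); // [false, true, true]
```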

4. Quantifier: exact counts with {}

ab{6}c

To require that b appears exactly 6 times, write 6 in the curly braces.

ab{2,6}c

Curly braces also accept a range. Here b must appear between 2 and 6 times, inclusive.

ab{2,}c

Omitting the upper bound means that b must appear 2 or more times.
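The three curly-brace forms can be checked side by side. In this sketch, `withBs` is a hypothetical helper that just builds test strings with a given number of b characters:

```javascript
// Hypothetical helper: builds "a", then n copies of "b", then "c".
const withBs = (n) => "a" + "b".repeat(n) + "c";

const exactlySix = /^ab{6}c$/;
const twoToSix = /^ab{2,6}c$/;
const atLeastTwo = /^ab{2,}c$/;

const exactResults = [5, 6, 7].map((n) => exactlySix.test(withBs(n)));
const rangeResults = [1, 2, 6, 7].map((n) => twoToSix.test(withBs(n)));
const openResults = [1, 2, 10].map((n) => atLeastTwo.test(withBs(n)));

console.log(exactResults); // [false, true, false]
console.log(rangeResults); // [false, true, true, false]
console.log(openResults);  // [false, true, true]
```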

Note: everything above applies a quantifier to a single character. What if we want to repeat a sequence of several characters?

Answer: enclose the characters in parentheses and put the quantifier after the group, for example (ab)+.
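To see what the parentheses change, compare a quantifier on a group with the same quantifier on a single character. The patterns and strings below are illustrative, not from the original article:

```javascript
const repeatGroup = /^(ab)+$/; // the pair "ab" repeats
const repeatChar = /^ab+$/;    // only "b" repeats

const groupResults = ["ababab", "abbb"].map((s) => repeatGroup.test(s));
const charResults = ["ababab", "abbb"].map((s) => repeatChar.test(s));

console.log(groupResults); // [true, false]
console.log(charResults);  // [false, true]
```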

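2. "Or" operation

a (cat|dog)

Note: the parentheses are required. Without them, a cat|dog means "a cat" or "dog". Spaces inside a pattern are matched literally, so do not put spaces around the | symbol.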
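A short sketch of why the parentheses matter, using a made-up sentence: with the group, the alternatives are only cat and dog; without it, the | splits the whole pattern in two.

```javascript
const grouped = /a (cat|dog)/; // "a " followed by "cat" or "dog"
const ungrouped = /a cat|dog/; // either "a cat" or just "dog"

const groupedMatch = "I have a dog".match(grouped)[0];
const ungroupedMatch = "I have a dog".match(ungrouped)[0];

console.log(groupedMatch);   // "a dog"
console.log(ungroupedMatch); // "dog"
```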

3. Character class

[abc]+

The square brackets list the characters allowed at that position, so any one of a, b, or c matches. With the + quantifier, the pattern matches a run of one or more of these characters.

Additionally, we can specify a range of characters in square brackets

[a-z]

Represents all lowercase English letters

[a-zA-Z0-9]+

Matches one or more characters that are lowercase a-z, uppercase A-Z, or digits 0-9

Caret ^

If we write a caret as the first character inside the square brackets, the class matches any character other than the ones listed after the caret

[^0-9]

Represents all non-numeric characters, including newlines
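The character classes above can be tried out as follows. This is a minimal sketch; the sample strings are invented for the demonstration:

```javascript
// One or more characters taken only from a, b, c.
const abcRun = "cab!".match(/[abc]+/)[0];
console.log(abcRun); // "cab"

// Letters and digits; the underscore is not in the class, so it splits the runs.
const alnumRuns = "Hello_World 42".match(/[a-zA-Z0-9]+/g);
console.log(alnumRuns); // ["Hello", "World", "42"]

// Negated class: every character that is not a digit, newline included.
const nonDigits = "a1\nb2".match(/[^0-9]/g);
console.log(nonDigits); // ["a", "\n", "b"]
```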

 4. Metacharacters

Most metacharacters in regular expressions start with a backslash

\d  

Numeric characters, equivalent to [0-9]

\w

Represents word characters (letters, digits, and underscores)

\s

Whitespace characters (including tab and newline)

\D

The opposite of lowercase \d: represents a non-digit character

\W

Represents a non-word character

\S

Represents a non-whitespace character

\b 

Represents the beginning or end of a word, that is, the boundary between words

Dot .

Represents any character except a newline
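Here is a small sketch that exercises several of these metacharacters on sample text (the strings are illustrative only):

```javascript
const text = "Order 66: pay_now\tASAP";

const firstNumber = text.match(/\d+/)[0]; // "66"
const words = text.match(/\w+/g);         // ["Order", "66", "pay_now", "ASAP"]
const spaces = text.match(/\s/g);         // [" ", " ", "\t"]

// \b only lets "cat" match when it stands alone as a word.
const wholeWord = "cat concat cats".match(/\bcat\b/g); // ["cat"]

// The dot matches any character except a newline.
const dotSameLine = /a.c/.test("abc"); // true
const dotNewline = /a.c/.test("a\nc"); // false

console.log(firstNumber, words, spaces, wholeWord, dotSameLine, dotNewline);
```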

Special character: caret ^

Anchors the match to the beginning of the line

Special character: dollar sign $

Anchors the match to the end of the line
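A brief sketch of both anchors, using the single character a as an illustrative pattern. Note that in JavaScript, ^ and $ refer to the start and end of the whole string unless the m (multiline) flag is set:

```javascript
const startResults = ["apple", "banana"].map((s) => /^a/.test(s)); // [true, false]
const endResults = ["banana", "apple"].map((s) => /a$/.test(s));   // [true, false]

// Without the m flag, ^ only matches at the start of the whole string.
const startOfString = /^a/.test("b\na"); // false
// With the m (multiline) flag, ^ also matches right after a newline.
const startOfLine = /^a/m.test("b\na");  // true

console.log(startResults, endResults, startOfString, startOfLine);
```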

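5. Greedy and lazy matching

For example, suppose we want to match the opening and closing tags around a span element

A first attempt might put a dot with a quantifier between the angle brackets

At this point we can see that the pattern matches the entire span element, from the first tag through the last

The reason is that the quantifier after the dot is greedy: it matches as many characters as possible. Adding a question mark directly after the quantifier switches it to lazy matching, so it matches as few characters as possible. In this position the question mark does not mean 0 or 1 occurrences.

In this way, each tag is matched separately.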
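The original patterns are not preserved in this text, so the sketch below assumes the common forms <.+> (greedy) and <.+?> (lazy) and a made-up HTML snippet:

```javascript
const html = "<span><b>text</b></span>";

// Greedy: .+ runs to the last ">" in the string, swallowing everything between.
const greedyMatches = html.match(/<.+>/g);
console.log(greedyMatches); // ["<span><b>text</b></span>"]

// Lazy: .+? stops at the first ">" it can, so each tag is matched on its own.
const lazyMatches = html.match(/<.+?>/g);
console.log(lazyMatches); // ["<span>", "<b>", "</b>", "</span>"]
```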

Next, let's look at some simple examples:

1. Color value matching

Match all RGB color values written in hexadecimal

/#[a-fA-F0-9]{6}\b/

Analysis: each hexadecimal digit is 0-9 or a-f (uppercase or lowercase), and there are 6 of them, so we use a character class followed by {6}. The trailing \b marks the end of the word, so a longer run of hex digits is not matched.
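To try the pattern, here is a sketch that runs it with the g flag over an invented CSS string containing two valid colors and several near misses:

```javascript
const css = "a { color: #1a2B3c; background: #fff; outline: #00FF7f } /* #abcdef12 #12345G */";

// #fff is too short, #12345G contains a non-hex character,
// and #abcdef12 fails the \b check because more word characters follow.
const hexColors = css.match(/#[a-fA-F0-9]{6}\b/g);
console.log(hexColors); // ["#1a2B3c", "#00FF7f"]
```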

2. IP address matching

\d+ matches one or more digits

\. represents a literal period. Because the period is a special character in regular expressions, it must be escaped with a backslash

At this point we see that an invalid IP is also matched, because each part of an IP address must be a number from 0 to 255
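The first-attempt pattern itself is missing from this text. Assuming it is four \d+ groups joined by \. (an assumption based on the two pieces described above), the problem can be reproduced like this:

```javascript
// Assumed first attempt: four runs of digits separated by literal periods.
const naiveIp = /^\d+\.\d+\.\d+\.\d+$/;

const naiveResults = ["192.168.1.1", "256.1.1.0", "999.999.999.999"].map((ip) => naiveIp.test(ip));

// All three pass, although only the first is a valid IPv4 address.
console.log(naiveResults); // [true, true, true]
```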

So the above scheme is not feasible. In the following scheme, we match digit by digit

25[0-5] The first case starts with 25, and the third digit must be between 0 and 5

2[0-4]\d The second case starts with 2, the second digit is 0-4, and the third digit can be any digit 0-9, written as \d

[01]\d\d The third case starts with 0 or 1, and the second and third digits can be any digit 0-9, written as \d

Each part of an IP address can also have just one or two digits, so in the third case we make the first and last positions optional with question marks: [01]?\d\d?

The number part has been matched, and the period is matched next

((25[0-5]|2[0-4]\d|[01]?\d\d?)\.)

Because there are only three periods, we apply {3} to the group that ends with a period to cover the first three parts, then repeat the number pattern once more for the last part

\b((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)\b

 

 In this way, the screening is completed, and the IP 256.1.1.0 that does not meet the requirements is successfully filtered out.
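As a final check, the finished pattern can be used to filter a list of candidates. The candidate strings below are made up for the test:

```javascript
const ipPattern = /\b((25[0-5]|2[0-4]\d|[01]?\d\d?)\.){3}(25[0-5]|2[0-4]\d|[01]?\d\d?)\b/;

const candidates = ["192.168.1.1", "255.255.255.255", "0.0.0.0", "256.1.1.0", "1.2.3"];

// Keep only the strings in which the pattern finds a valid address.
const validIps = candidates.filter((ip) => ipPattern.test(ip));
console.log(validIps); // ["192.168.1.1", "255.255.255.255", "0.0.0.0"]
```

Because the pattern uses \b instead of ^ and $, it finds an address inside longer text as well; add the anchors if the whole string must be an IP address.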


 Have you learned it~


Origin blog.csdn.net/qq_41579104/article/details/129753567