We often need to write code that validates user input, for example checking whether the input is a number or an email address. A simple and effective way to write this kind of code is to use regular expressions.
A regular expression is a string that describes a pattern matching a set of strings. We can use regular expressions to match, replace, and split strings.
Match string
Let's start with the matches method of the String class. At first glance, matches looks very similar to equals.
"Test".matches("Test"); // true
"Test".equals("Test"); // true
Both calls return true, but matches is more powerful: besides a fixed string like the one above, it can also match any string that fits a pattern. Note that matches succeeds only when the entire string fits the pattern.
"Test True or False".matches("Test.*"); // true
"Test False or True".matches("Test.*"); // true
The "Test.*" in the code above is a regular expression. It describes a string pattern: Test followed by any string. The . matches any single character (by default, except line terminators) and * means zero or more repetitions, so .* matches any sequence of zero or more characters.
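The whole-string behavior is easy to miss, so here is a minimal sketch of it. The class name MatchesDemo and the helper startsWithTest are made up for illustration; only String.matches comes from the text above.

```java
class MatchesDemo {
    // Hypothetical helper: true only if the ENTIRE input fits "Test.*".
    static boolean startsWithTest(String input) {
        return input.matches("Test.*");
    }

    public static void main(String[] args) {
        System.out.println(startsWithTest("Test True or False")); // true
        System.out.println(startsWithTest("Test"));               // true: .* may match zero characters
        System.out.println(startsWithTest("A Test"));             // false: the string does not start with Test
        System.out.println("Test True".matches("Test"));          // false: the pattern must cover the whole string
    }
}
```

The last line shows why a pattern without .* behaves like an exact comparison here.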
Regular expression syntax
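Regular expressions consist of literal characters and special symbols. The backslash is also the escape character in Java string literals, so the regular expression \d must be written as "\\d" in Java source code (double escape). The examples in the table below are written this way.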
Regular expression | Matches | Example |
---|---|---|
x | The specified character x | test->test |
. | Any single character | test->t..t |
(ab\|cd) | ab or cd | asd->a(sd\|qeicbn) |
[abc] | a, b, or c | test->te[asdf]t |
[^abc] | Any character except a, b, or c | test->te[^qwe]t |
[a-z] | Any character from a to z | test->t[a-z]st |
[^a-z] | Any character except a to z | test->tes[^a-s] |
[a-e[m-p]] | a to e or m to p (union) | Test->[F-U[I-M]]es[f-u] |
[a-e&&[c-p]] | Both a to e and c to p (intersection, i.e. c to e) | Test->[A-Z&&[R-Z]]es[a-z] |
\d | A digit, same as [0-9] | test1->test[\\d] |
\D | A non-digit | test->[\\D]est |
\w | A word character | test->[\\w]es[\\w] |
\W | A non-word character | test!->test[\\W] |
\s | A whitespace character | tes t->tes\\st |
\S | A non-whitespace character | tes t->[\\S]es t |
p* | Zero or more occurrences of p | tttest->t*est ~ testtesttest->(test)* |
p+ | One or more occurrences of p | e->e+t* ~ test->(te)+.* |
p? | Zero or one occurrence of p | test->t?est |
p{n} | Exactly n occurrences of p | test->t{1}e.* |
p{n,} | At least n occurrences of p | tttt->t{1,} |
p{n,m} | Between n and m occurrences of p (inclusive) | tttt->t{1,5} |
- Word characters are letters, digits, and the underscore. \w is equivalent to [a-zA-Z0-9_], and \W is equivalent to [^a-zA-Z0-9_].
- No spaces are allowed inside p{n,} and p{n,m}. For example, A{3, 6} (with a space between the comma and the 6) is not valid.
- You can use parentheses () to group patterns: (ab){3} matches ababab, but ab{3} matches abbb.
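A few of the rules above can be checked directly with String.matches. The following is only a sketch; the class name RegexSyntaxDemo and its helper methods are hypothetical names chosen for this example.

```java
class RegexSyntaxDemo {
    // The regex \d{3} is written "\\d{3}" in a Java string literal.
    static boolean isThreeDigits(String s) {
        return s.matches("\\d{3}");
    }

    // [a-e&&[c-p]] is an intersection, so it accepts only c, d, or e.
    static boolean isCToE(String s) {
        return s.matches("[a-e&&[c-p]]");
    }

    // (ab){3} repeats the whole group; ab{3} would repeat only the b.
    static boolean isAbRepeatedThreeTimes(String s) {
        return s.matches("(ab){3}");
    }

    public static void main(String[] args) {
        System.out.println(isThreeDigits("123"));             // true
        System.out.println(isThreeDigits("12a"));             // false
        System.out.println(isCToE("d"));                      // true
        System.out.println(isCToE("a"));                      // false
        System.out.println(isAbRepeatedThreeTimes("ababab")); // true
        System.out.println(isAbRepeatedThreeTimes("abbb"));   // false
        System.out.println("abbb".matches("ab{3}"));          // true
    }
}
```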
For example, a pattern for QQ numbers is [1-9][0-9]{4,}. A QQ number cannot start with 0, so [1-9] requires the first digit to be between 1 and 9. [0-9]{4,} requires the remaining characters to be digits from 0 to 9, with at least 4 of them, because a QQ number has at least 5 digits (the first digit plus at least 4 more).
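As a sketch, the QQ number pattern can be wrapped in a small validation method. The class name QqNumberCheck and the method isQqNumber are hypothetical; the pattern itself is the one given above.

```java
class QqNumberCheck {
    // First digit 1-9, followed by at least four more digits.
    static boolean isQqNumber(String input) {
        return input.matches("[1-9][0-9]{4,}");
    }

    public static void main(String[] args) {
        System.out.println(isQqNumber("10000"));   // true: five digits, no leading zero
        System.out.println(isQqNumber("01234"));   // false: starts with 0
        System.out.println(isQqNumber("1234"));    // false: only four digits
        System.out.println(isQqNumber("12345a"));  // false: contains a non-digit
    }
}
```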
In addition, there are websites for testing regular expressions online. You can check whether your regular expression matches, and they often list common regular expression examples.
Replace and split strings
The matches method of the String class tests a string against a regular expression: it returns true if the whole string matches and false otherwise. The String class also provides the replaceAll, replaceFirst, and split methods, which use regular expressions to replace and split strings.
System.out.println("Test Test Test".replaceAll("s\\w", "ok")); //Teok Teok Teok
System.out.println("Test Test Test".replaceFirst("s\\w", "ok")); //Teok Test Test
String[] test = "Test1test2TEST".split("\\d");
for (String t : test) {
    System.out.println(t);
} // prints Test, test, and TEST on separate lines
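The split method also accepts a second parameter, split(regex, limit), which controls how many times the pattern is applied: with a positive limit, the pattern is applied at most limit - 1 times, so the resulting array has at most limit elements.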
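The effect of the second parameter is easiest to see side by side. This is a minimal sketch; the class name SplitLimitDemo and the helper splitOnDigits are hypothetical, while String.split(regex, limit) is the standard method.

```java
import java.util.Arrays;

class SplitLimitDemo {
    // Splits on single digits; limit is passed straight to String.split.
    static String[] splitOnDigits(String input, int limit) {
        return input.split("\\d", limit);
    }

    public static void main(String[] args) {
        // limit 2: the pattern is applied once, the rest is left intact
        System.out.println(Arrays.toString(splitOnDigits("Test1test2TEST", 2))); // [Test, test2TEST]
        // limit 0 (same as split(regex)): trailing empty strings are dropped
        System.out.println(Arrays.toString(splitOnDigits("a1b1", 0)));           // [a, b]
        // negative limit: trailing empty strings are kept
        System.out.println(Arrays.toString(splitOnDigits("a1b1", -1)));          // [a, b, ]
    }
}
```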
In addition, quantifiers are greedy by default: when the number of repetitions is not fixed, they match as many times as possible. You can add a ? after the quantifier to make it match as few times as possible, for example:
System.out.println("Teeestt".replaceFirst("e+", "o")); //Tostt
System.out.println("Teeestt".replaceFirst("e+?", "o")); //Toeestt
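The same difference shows up with .* when more than one match end is possible. The sketch below uses a made-up input string and hypothetical names (GreedyDemo, replaceGreedy, replaceReluctant) to compare the two.

```java
class GreedyDemo {
    // Greedy: .* grabs as much as possible, so the match runs to the LAST digit.
    static String replaceGreedy(String s) {
        return s.replaceFirst(".*\\d", "X");
    }

    // Reluctant: .*? grabs as little as possible, so the match stops at the FIRST digit.
    static String replaceReluctant(String s) {
        return s.replaceFirst(".*?\\d", "X");
    }

    public static void main(String[] args) {
        System.out.println(replaceGreedy("a1b2c3d"));    // Xd
        System.out.println(replaceReluctant("a1b2c3d")); // Xb2c3d
    }
}
```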
Finally, regular expressions are a very useful tool, and mastering them takes practice.