Summary: Regular expressions (Java demo)

I recently did some pattern matching for a web crawler and took the opportunity to brush up on regular expressions. Here is a demo written in Java.

First, matching a single character

1. Matching any character

A dot (.) matches any single character.

System.out.println("abc".matches("a.c"));
System.out.println("a$c".matches("a.c"));
System.out.println("acc".matches("a.c"));

All three lines print true.

2. Matching digits

Use \d to match a digit. In a Java string literal it is written \\d, because the first backslash escapes the second. The first line below prints true and the second prints false.

System.out.println("007".matches("00\\d"));
System.out.println("00a".matches("00\\d"));

3. Matching word characters

Word characters here means letters, digits, or the underscore, matched with \w. All of the following print true.

System.out.println("javac".matches("java\\w"));
System.out.println("java_".matches("java\\w"));
System.out.println("java1".matches("java\\w"));

4. Matching whitespace characters

\s matches a whitespace character. Note that whitespace includes not only the space but also the tab character (written \t in Java).

System.out.println("java ".matches("java\\s"));

5. Matching non-digits

\d matches a digit, and \D matches any character that is not a digit. For example, 00\D matches a string such as "00A".
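The text gives no code for \D, so here is a minimal sketch in the same style as the earlier examples. The class name NonDigitDemo, the helper method, and the sample strings are made up for illustration.

```java
class NonDigitDemo {
    // True when the whole string is "00" followed by exactly one non-digit character.
    static boolean endsWithNonDigit(String s) {
        return s.matches("00\\D");
    }

    public static void main(String[] args) {
        System.out.println(endsWithNonDigit("00A")); // true
        System.out.println(endsWithNonDigit("00#")); // true
        System.out.println(endsWithNonDigit("007")); // false: 7 is a digit
    }
}
```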

Matching rules

Regular expression   Rule                                                    Can match
A                    The specified character                                 A
\u548c               The specified Unicode character                         和
.                    Any character                                           a, b, &, 0
\d                   Digits 0 to 9                                           0~9
\w                   Uppercase and lowercase letters, digits and underscore  a~z, A~Z, 0~9, _
\s                   Space, tab                                              space, tab
\D                   Non-digit                                               a, A, &, _, ...
\W                   Non-\w                                                  &, @, 中, ...
\S                   Non-\s                                                  a, A, &, _, ...
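A few rows of this table have no code of their own, so the sketch below checks them with String.matches(). The class name CharClassDemo and the helper are hypothetical. It assumes Java's default regex behaviour, in which \w covers only ASCII letters, digits and the underscore, so a Chinese character counts as \W. The non-ASCII characters are written as Java escapes to keep the source pure ASCII.

```java
class CharClassDemo {
    // Thin wrapper so each rule from the table can be checked in one call.
    static boolean matches(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        String he = "\u548c";    // the single character U+548C
        String zhong = "\u4e2d"; // the single character U+4E2D
        System.out.println(matches(he, "\\u548c")); // true: regex-level Unicode escape
        System.out.println(matches(zhong, "\\W"));  // true: not an ASCII word character
        System.out.println(matches("_", "\\W"));    // false: underscore is a word character
        System.out.println(matches("a", "\\S"));    // true
        System.out.println(matches("\t", "\\S"));   // false: tab is whitespace
    }
}
```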

Second, repeated matching

The * modifier matches the preceding element any number of times, including zero.

System.out.println("java".matches("java\\d*"));
System.out.println("java1".matches("java\\d*"));
System.out.println("java11".matches("java\\d*"));
System.out.println("java123".matches("java\\d*"));

All of the above print true.

The + modifier matches the preceding element at least once.

The ? modifier matches the preceding element zero or one time.
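The + and ? modifiers can be tried in the same way as the * example. This is only a small sketch; the class and method names are invented for the demo.

```java
class PlusQuestionDemo {
    // "java" followed by one or more digits.
    static boolean oneOrMoreDigits(String s) {
        return s.matches("java\\d+");
    }

    // "java" followed by zero or one digit.
    static boolean optionalDigit(String s) {
        return s.matches("java\\d?");
    }

    public static void main(String[] args) {
        System.out.println(oneOrMoreDigits("java"));    // false: + needs at least one digit
        System.out.println(oneOrMoreDigits("java123")); // true
        System.out.println(optionalDigit("java"));      // true
        System.out.println(optionalDigit("java1"));     // true
        System.out.println(optionalDigit("java12"));    // false: ? allows at most one digit
    }
}
```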

Use {n} to match exactly n repetitions:

System.out.println("java123".matches("java\\d{3}")); // true
System.out.println("java1".matches("java\\d{3}"));   // false

Matching rules

Regular expression   Rule                                  Can match
A*                   Any number of characters              empty, A, AA, AAA, ...
A+                   At least one character                A, AA, AAA, ...
A?                   0 or 1 characters                     empty, A
A{3}                 Exactly the specified number          AAA
A{2,3}               A number in the specified range       AA, AAA
A{2,}                At least n characters                 AA, AAA, AAAA, ...
A{0,3}               Up to n characters                    empty, A, AA, AAA
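The range forms in the last three rows can be checked with a short sketch like the following. The class name RepeatRangeDemo and the helper are illustrative only.

```java
class RepeatRangeDemo {
    // True when the whole input matches the given regex.
    static boolean matches(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(matches("AA", "A{2,3}"));   // true
        System.out.println(matches("AAAA", "A{2,3}")); // false: more than three
        System.out.println(matches("AAAA", "A{2,}"));  // true: no upper bound
        System.out.println(matches("A", "A{2,}"));     // false: fewer than two
        System.out.println(matches("", "A{0,3}"));     // true: zero repetitions allowed
        System.out.println(matches("AAAA", "A{0,3}")); // false: more than three
    }
}
```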

Third, complex matching

(1) Matching the beginning and end

^ represents the beginning of the string and $ represents the end. For example, ^A\d{3}$ matches "A001" and "A380".
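String.matches() always tests the whole string, so the effect of ^ and $ is easier to see with Matcher.find(), which looks for a match anywhere in the input. The sketch below assumes a hypothetical helper named containsMatch; the sample inputs are made up.

```java
import java.util.regex.Pattern;

class AnchorDemo {
    // True when the regex matches somewhere in the input (not necessarily all of it).
    static boolean containsMatch(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(containsMatch("A\\d{3}", "xA001y"));   // true: found in the middle
        System.out.println(containsMatch("^A\\d{3}$", "xA001y")); // false: anchors require the whole string
        System.out.println(containsMatch("^A\\d{3}$", "A380"));   // true
    }
}
```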

(2) Matching a specified range

Use [...] to match any one character within a set or range, for example [123456789]\d{6,7}, [1-9], [0-9a-fA-F], and [^1-9]{3}.
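As a rough illustration of the range expressions listed above, the sketch below runs a few of them through String.matches(). The class name, the helper, and the sample inputs are assumptions for the demo; the + after the hex class is added only so that a multi-character string can be tested.

```java
class RangeDemo {
    // True when the whole input matches the given regex.
    static boolean matches(String input, String regex) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        // A non-zero first digit followed by 6 or 7 more digits.
        System.out.println(matches("1234567", "[123456789]\\d{6,7}")); // true
        System.out.println(matches("0234567", "[123456789]\\d{6,7}")); // false: starts with 0
        // One or more hexadecimal digits.
        System.out.println(matches("1A2b", "[0-9a-fA-F]+")); // true
        System.out.println(matches("1G", "[0-9a-fA-F]+"));   // false: G is not a hex digit
        // Exactly three characters, none of them in 1-9.
        System.out.println(matches("A00", "[^1-9]{3}")); // true
        System.out.println(matches("A01", "[^1-9]{3}")); // false: contains 1
    }
}
```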

(3) The OR rule

Use | to join two regex rules with OR. For example, AB|CD matches AB or CD.
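A quick check of the OR rule might look like this. The names are made up; the point is that | separates the whole alternatives AB and CD, not the single letters next to it.

```java
class AlternationDemo {
    // True when the whole string is exactly "AB" or exactly "CD".
    static boolean isAbOrCd(String s) {
        return s.matches("AB|CD");
    }

    public static void main(String[] args) {
        System.out.println(isAbOrCd("AB"));   // true
        System.out.println(isAbOrCd("CD"));   // true
        System.out.println(isAbOrCd("AD"));   // false: neither alternative
        System.out.println(isAbOrCd("ABCD")); // false: matches() needs one alternative to cover the whole string
    }
}
```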

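(4) Using parentheses

Parentheses group the alternatives so that the shared prefix is written only once. Matching is case-sensitive, so "Java" and "Go" do not match.

String re = "learn\\s(java|php|go)";
System.out.println("learn java".matches(re)); // true
System.out.println("learn Java".matches(re)); // false
System.out.println("learn php".matches(re));  // true
System.out.println("learn Go".matches(re));   // false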

 

 

Regular expression   Rule                                          Can match
^                    Beginning                                     Beginning of the string
$                    End                                           End of the string
[ABC]                Any one character inside [...]                A, B, C
[A-F0-9xy]           Any one character in the specified ranges     A, ..., F, 0, ..., 9, x, y
[^A-F]               Any character outside the specified range     Non-A~F
AB|CD|EF             AB or CD or EF                                AB, CD, EF

 

Fourth, group matching

An important use of group matching is extracting substrings.

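import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Group 1 is the area code (3-4 digits), group 2 is the number (7-8 digits).
Pattern p = Pattern.compile("(\\d{3,4})\\-(\\d{7,8})");
Matcher m = p.matcher("010-12345678");
if (m.matches()) {
    String g1 = m.group(1);
    String g2 = m.group(2);
    System.out.println(g1);
    System.out.println(g2);
} else {
    System.out.println("Match failed!");
}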

Note that the index passed to Matcher.group(index) is 1 for the first substring and 2 for the second. What do we get if we pass 0? The answer is "010-12345678", the entire string matched by the regular expression.

The earlier examples used the String.matches() method, while the group-extraction code uses the Pattern and Matcher classes from the java.util.regex package. The two approaches are essentially the same, because String.matches() uses Pattern and Matcher internally.

However, calling String.matches() many times with the same regular expression is inefficient, because every call creates the same Pattern object again. Instead, create the Pattern object once and reuse it, so that the expression is compiled once and matched many times:

Pattern pattern = Pattern.compile("(\\d{3,4})\\-(\\d{7,8})");
pattern.matcher("010-12345678").matches(); // true
pattern.matcher("021-123456").matches(); // false: only 6 digits after the hyphen
pattern.matcher("022#1234567").matches(); // false
// Get a Matcher object:
Matcher matcher = pattern.matcher("010-12345678");
if (matcher.matches()) {
    String whole = matcher.group(0); // "010-12345678", 0 is the entire matched string
    String area = matcher.group(1); // "010", 1 is the first matched substring
    String tel = matcher.group(2); // "12345678", 2 is the second matched substring
    System.out.println(area);
    System.out.println(tel);
}

Non-greedy matching:

For example, given the string 1230000, I want to extract two substrings: the leading 123 and the trailing 0000.

If we match with "(\\d+)(0*)", \d+ greedily consumes the whole of 1230000, so the second substring is the empty string "". This is greedy matching. Use +? to make the match non-greedy.

Pattern pattern = Pattern.compile("(\\d+?)(0*)");
Matcher matcher = pattern.matcher("1230000");
if (matcher.matches()) {
    System.out.println("group1=" + matcher.group(1)); // "123"
    System.out.println("group2=" + matcher.group(2)); // "0000"
}
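For comparison, the sketch below runs both the greedy and the non-greedy pattern on the same input and prints the two groups side by side. The class name GreedyDemo and the split helper are made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyDemo {
    // Applies a two-group regex to the whole input; returns null when it does not match.
    static String[] split(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2) };
    }

    public static void main(String[] args) {
        String[] greedy = split("(\\d+)(0*)", "1230000");
        System.out.println("[" + greedy[0] + "] [" + greedy[1] + "]"); // [1230000] []
        String[] lazy = split("(\\d+?)(0*)", "1230000");
        System.out.println("[" + lazy[0] + "] [" + lazy[1] + "]");     // [123] [0000]
    }
}
```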

Fifth, search and replace

Regular expressions can also be used to split, search, and replace strings.

(1) Splitting a string

"a b c".split("\\s"); // { "a", "b", "c" }
"a b  c".split("\\s"); // { "a", "b", "", "c" }
"a, b ;; c".split("[\\,\\;\\s]+"); // { "a", "b", "c" }

(2) Searching a string

String s = "the quick brown fox jumps over the lazy dog.";
Pattern p = Pattern.compile("\\wo\\w");
Matcher m = p.matcher(s);
while (m.find()) {
    String sub = s.substring(m.start(), m.end());
    System.out.println(sub);
}
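The same loop can be wrapped in a small helper that collects every match into a list. This is just a sketch: findAll is a hypothetical name, and Matcher.group() with no argument is used in place of the substring call, since it returns the same matched text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindAllDemo {
    // Collects every non-overlapping match of the regex in the input, in order.
    static List<String> findAll(String regex, String input) {
        List<String> result = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(input);
        while (m.find()) {
            result.add(m.group()); // same text as input.substring(m.start(), m.end())
        }
        return result;
    }

    public static void main(String[] args) {
        String s = "the quick brown fox jumps over the lazy dog.";
        System.out.println(findAll("\\wo\\w", s)); // [row, fox, dog]
    }
}
```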

(3) Replacing a string

String s = "The     quick\t\t brown   fox  jumps   over the  lazy dog.";
String r = s.replaceAll("\\s+", " ");
System.out.println(r); // "The quick brown fox jumps over the lazy dog."

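(4) Back references

In the replacement string, $1 refers to the text captured by the first group, so each lowercase four-letter word surrounded by whitespace is wrapped in <b> tags.

String s = "the quick brown fox jumps over the lazy dog.";
String r = s.replaceAll("\\s([a-z]{4})\\s", " <b>$1</b> ");
System.out.println(r); // "the quick brown fox jumps <b>over</b> the <b>lazy</b> dog."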


Origin blog.csdn.net/weixin_44588495/article/details/104081047