Regular Expressions (Summary)

1. Concept

A regular expression (often abbreviated as regex, regexp, or RE in code) is a concept from computer science. Regular expressions are usually used to search for and replace text that matches a certain pattern (rule). Many languages support them, including Perl, PHP, Java, Python, and Ruby. In Java you can of course search and replace text with ordinary string handling, but regular expressions make the code more concise: two or three lines are usually enough. That conciseness does, of course, depend on being familiar with regular expressions.

2. Basic grammar

1. What is a regular expression?
A regular expression is a formula that uses a pattern to match a class of strings. Learning regular expressions means learning the grammar for defining such patterns, that is, the various matching rules: how to match digits, how to match letters, and so on.
2. Imported package
In Java, the classes used to work with regular expressions are java.util.regex.Matcher and java.util.regex.Pattern. The java.util.regex package has been available since JDK 1.4. There are several ways to use a regular expression.
Matching only
Method 1:

import java.util.regex.Pattern;

public class Demo7 {

	public static void main(String[] args) {
		// The string to match
		String string = "8";
		// The regular expression (matching pattern): [0-9] matches any single digit from 0 to 9
		String regex = "[0-9]";
		// Pattern.matches returns true if the whole string matches the pattern, otherwise false
		boolean flag = Pattern.matches(regex, string);
		System.out.println(flag); // true
	}

}

Key code: boolean flag=Pattern.matches(regex, string);

Method 2:

public class Demo7 {

	public static void main(String[] args) {
		// The string to match
		String string = "A";
		// The regular expression (matching pattern): [0-9] matches any single digit from 0 to 9
		String regex = "[0-9]";
		// String.matches returns true if the whole string matches the pattern, otherwise false.
		// "A" is not a digit, so this call returns false.
		System.out.println(string.matches(regex)); // false
	}
}

Key code: string.matches(regex)

Method 2 is the more commonly used one, but the two give the same result; choose whichever you prefer.
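Both methods above test whether the whole string matches. As a minimal sketch of how the Pattern and Matcher classes mentioned earlier can be used directly, the example below compiles the pattern once and contrasts matches() (whole string) with find() (anywhere in the string). The class name MatcherDemo and its helper methods are made up for this illustration.

```java
import java.util.regex.Pattern;

public class MatcherDemo {

	// Compile once and reuse; DIGIT is a name chosen for this sketch.
	private static final Pattern DIGIT = Pattern.compile("[0-9]");

	// True only if the whole input is a single digit (same result as String.matches).
	static boolean isSingleDigit(String input) {
		return DIGIT.matcher(input).matches();
	}

	// True if a digit occurs anywhere in the input.
	static boolean containsDigit(String input) {
		return DIGIT.matcher(input).find();
	}

	public static void main(String[] args) {
		System.out.println(isSingleDigit("8"));   // true
		System.out.println(isSingleDigit("A8"));  // false
		System.out.println(containsDigit("A8"));  // true
	}
}
```

Compiling the pattern once is mainly useful when the same expression is applied to many strings.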

3. Regular expression syntax

1. Commonly used metacharacters
The commonly used metacharacters in regular expressions are ., \d, \D, \s, \S, \w, and \W; each is described in the summary at the end of this article.
2. Connecting character
Listing every digit or English letter in a pattern is very inconvenient, so regular expressions provide the connecting character "-" to define a range of characters inside square brackets, for example [0-9], [a-z], and [A-Z].
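As a small sketch of character ranges written with "-", the example below checks single characters against two ranges. The class name RangeDemo and its helper methods are hypothetical names used only here.

```java
public class RangeDemo {

	// True if the input is exactly one lowercase English letter.
	static boolean isLowercaseLetter(String s) {
		return s.matches("[a-z]");
	}

	// True if the input is exactly one English letter or digit.
	static boolean isLetterOrDigit(String s) {
		return s.matches("[a-zA-Z0-9]");
	}

	public static void main(String[] args) {
		System.out.println(isLowercaseLetter("k")); // true
		System.out.println(isLowercaseLetter("K")); // false
		System.out.println(isLetterOrDigit("K"));   // true
		System.out.println(isLetterOrDigit("-"));   // false
	}
}
```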
3. Quantifiers (qualifiers)
A quantifier limits the number of times a character or a class of characters may occur.
The commonly used quantifiers are ?, +, *, {n}, {n,m}, and {n,}; each is described in the summary at the end of this article.
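To make the occurrence counts concrete, here is a short sketch that tries several quantifiers against sample strings. The class QuantifierDemo and the helper matches(regex, input) are names invented for this example.

```java
public class QuantifierDemo {

	// Hypothetical helper: true if the whole input matches the regex.
	static boolean matches(String regex, String input) {
		return input.matches(regex);
	}

	public static void main(String[] args) {
		System.out.println(matches("colou?r", "color"));   // true: u occurs 0 times
		System.out.println(matches("colou?r", "colour"));  // true: u occurs 1 time
		System.out.println(matches("go+", "g"));           // false: + needs at least one o
		System.out.println(matches("go*", "g"));           // true: * allows zero o
		System.out.println(matches("\\d{3}", "123"));      // true: exactly 3 digits
		System.out.println(matches("\\d{2,4}", "12345"));  // false: 5 digits is more than 4
		System.out.println(matches("\\d{2,}", "12345"));   // true: 2 or more digits
	}
}
```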
4. Anchors (locators)
An anchor restricts the position at which a match may occur.
The commonly used anchors are ^ (start of the string), $ (end of the string), \b (word boundary), and \B (non-word boundary).
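The following sketch shows how position restrictions affect a search. Because String.matches always tests the whole string, the example uses Matcher.find() so that the position markers ^, $, and \b make a visible difference. AnchorDemo and found(...) are hypothetical names.

```java
import java.util.regex.Pattern;

public class AnchorDemo {

	// find() searches anywhere in the input, so the anchors decide where a match may sit.
	static boolean found(String regex, String input) {
		return Pattern.compile(regex).matcher(input).find();
	}

	public static void main(String[] args) {
		System.out.println(found("^7[0-9]", "75abc"));        // true: starts with 7 and a digit
		System.out.println(found("^7[0-9]", "a75"));          // false: 75 is not at the start
		System.out.println(found("[0-9]$", "abc5"));          // true: ends with a digit
		System.out.println(found("[0-9]$", "5abc"));          // false: the digit is not at the end
		System.out.println(found("\\bcat\\b", "a cat sat"));  // true: cat is a separate word
		System.out.println(found("\\bcat\\b", "concat"));     // false: cat is part of a longer word
	}
}
```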
5. Escape characters
1) Escape characters in JavaScript
Regular expressions have their own set of escape characters. They are not the same as the string escape characters in JavaScript; the two only share the same concept, so you have to keep them apart. The same applies to Java: because "\" is also the escape character in Java string literals, the regex \d is written as "\\d" in Java source code.
2) Escape characters in regular expressions
A regular expression contains two kinds of characters: (1) ordinary characters and (2) special characters. To match a special character literally, put a backslash "\" in front of it to escape it.
Example:
go\+
Analysis:
Because + is a special symbol in regular expressions, matching the literal text "go+" requires a "\" in front of the +.

Likewise, to match a literal "\", use "\\".

The characters that need to be escaped are: $ , ( , ) , * , + , . , [ , ] , ? , \ , / , ^ , { , } , | . (The "/" only needs escaping where it delimits a regex literal, as in JavaScript; in Java it is an ordinary character.)
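As a sketch of escaping in Java source code, the example below matches a literal +, a literal dot, and a literal backslash. It also uses Pattern.quote, a standard java.util.regex method that escapes an entire string. The class EscapeDemo and the helper matchesLiterally are names made up for this illustration.

```java
import java.util.regex.Pattern;

public class EscapeDemo {

	// Hypothetical helper: treats every character of 'literal' as ordinary text.
	static boolean matchesLiterally(String literal, String input) {
		return input.matches(Pattern.quote(literal));
	}

	public static void main(String[] args) {
		// In Java source, each regex backslash is written as "\\".
		System.out.println("go+".matches("go\\+"));  // true: \+ matches a literal +
		System.out.println("goo".matches("go\\+"));  // false: + is no longer a quantifier
		System.out.println("abc".matches("a.c"));    // true: . matches any character
		System.out.println("abc".matches("a\\.c"));  // false: \. matches only a literal dot
		System.out.println("\\".matches("\\\\"));    // true: regex \\ matches one backslash
		System.out.println(matchesLiterally("1+1=2", "1+1=2")); // true
	}
}
```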

6. Grouping
A group, also called a sub-expression, divides all or part of a regular expression into one or more groups. Groups are written with the left and right parentheses "(" and ")". The expression enclosed in the parentheses is then treated as a single unit.
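The sketch below illustrates two effects of parentheses: a quantifier applies to the whole group, and the text matched by each group can be read back with Matcher.group(n). The class GroupDemo, the method splitNumber, and the sample input "010-12345" are assumptions made for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {

	// Returns the two number parts of a string such as "010-12345",
	// or null if the input does not have that shape.
	static String[] splitNumber(String input) {
		Matcher m = Pattern.compile("(\\d+)-(\\d+)").matcher(input);
		if (!m.matches()) {
			return null;
		}
		return new String[] { m.group(1), m.group(2) };
	}

	public static void main(String[] args) {
		System.out.println("ababab".matches("(ab)+")); // true: the group ab repeats
		System.out.println("ababab".matches("ab+"));   // false: + applies only to b
		String[] parts = splitNumber("010-12345");
		System.out.println(parts[0]); // 010
		System.out.println(parts[1]); // 12345
	}
}
```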

7. Selector
Selection is very simple. In regular expressions the selection symbol is "|", which matches either of two alternatives, similar to the "or" operation in JavaScript.

For example, "abc|def1" matches "abc" or "def1", but not "abc1". To match "abc1" or "def1", use the grouping symbols: "(abc|def)1".
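The difference between an ungrouped and a grouped selection can be checked with a few lines of Java. This is only a sketch; the class name AlternationDemo and its two helper methods are hypothetical.

```java
public class AlternationDemo {

	// Without grouping: either "abc" or "def1".
	static boolean ungrouped(String s) {
		return s.matches("abc|def1");
	}

	// With grouping: "abc" or "def", followed by "1".
	static boolean grouped(String s) {
		return s.matches("(abc|def)1");
	}

	public static void main(String[] args) {
		System.out.println(ungrouped("abc"));  // true
		System.out.println(ungrouped("def1")); // true
		System.out.println(ungrouped("abc1")); // false
		System.out.println(grouped("abc1"));   // true
		System.out.println(grouped("def1"));   // true
		System.out.println(grouped("abc"));    // false
	}
}
```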

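8. Priority order
From highest to lowest priority:
1) \ (escape)
2) (), (?:), (?=), [] (parentheses and square brackets)
3) *, +, ?, {n}, {n,}, {n,m} (quantifiers)
4) ^, $, \metacharacter, any character (anchors and sequences)
5) | (selection)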
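Two consequences of the priority order are easy to get wrong: a quantifier binds only to the item directly before it, and "|" binds more loosely than anchors and character sequences. The sketch below demonstrates both; PrecedenceDemo and found(...) are names invented for this illustration.

```java
import java.util.regex.Pattern;

public class PrecedenceDemo {

	// True if the regex matches anywhere in the input.
	static boolean found(String regex, String input) {
		return Pattern.compile(regex).matcher(input).find();
	}

	public static void main(String[] args) {
		// ^a|b$ is read as (^a)|(b$): "starts with a" or "ends with b".
		System.out.println(found("^a|b$", "axx"));   // true: starts with a
		// Grouping changes the meaning: the whole input must be a or b.
		System.out.println(found("^(a|b)$", "axx")); // false
		System.out.println(found("^(a|b)$", "b"));   // true
		// ab+ is read as a(b+): the + applies only to b.
		System.out.println("abbb".matches("ab+"));   // true
		System.out.println("abab".matches("ab+"));   // false
	}
}
```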

4. Methods

(1) Matching: matches()
The matching methods were already used above, so no new example is given here.

Key code: boolean flag=Pattern.matches(regex, string);
Key code: string.matches(regex)

(2) Replace replaceAll()

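public class Demo7 {

	public static void main(String[] args) {
		// The string to process: a mix of digits and uppercase and lowercase letters
		String string = "45cf45dG892Df";
		// Regular expression for one or more English letters
		String regex = "[a-zA-Z]+";
		// Regular expression for one or more digits
		String regex2 = "\\d+";
		// Replace each run of English letters with &; output: 45&45&892&
		System.out.println(string.replaceAll(regex, "&"));
		// Replace each single digit or run of digits with 0; output: 0cf0dG0Df
		System.out.println(string.replaceAll(regex2, "0"));
	}
}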
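As a further sketch, the replacement string passed to replaceAll may refer to captured groups with $1, $2, and so on, and replaceFirst replaces only the first match. The class ReplaceDemo, the method reorderDate, and the sample date are assumptions for this example, not part of the code above.

```java
public class ReplaceDemo {

	// Reorders a yyyy-MM-dd date into dd/MM/yyyy using group references.
	static String reorderDate(String date) {
		return date.replaceAll("(\\d{4})-(\\d{2})-(\\d{2})", "$3/$2/$1");
	}

	public static void main(String[] args) {
		System.out.println(reorderDate("2020-12-08")); // 08/12/2020
		// replaceFirst changes only the first run of digits.
		System.out.println("45cf45dG892Df".replaceFirst("\\d+", "0")); // 0cf45dG892Df
	}
}
```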

(3) Splitting: split(), here splitting the string at uppercase letters

public class Demo7 {

	public static void main(String[] args) {
		// The string to process: a mix of digits and uppercase and lowercase letters
		String string = "45cf45dG892Df";
		// Regular expression for a single uppercase letter
		String regex = "[A-Z]";
		// Split the string at each uppercase letter
		String[] arr = string.split(regex);
		for (String s : arr) {
			System.out.println(s);
		}
		// Output:
		// 45cf45d
		// 892
		// f
	}
}
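A common variation, sketched here under assumed inputs, is to split on a separator together with any surrounding whitespace, or to cap the number of pieces with the two-argument split(regex, limit). The class SplitDemo and the method splitOnCommas are hypothetical names.

```java
import java.util.Arrays;

public class SplitDemo {

	// Splits on commas and drops the whitespace around each comma.
	static String[] splitOnCommas(String input) {
		return input.split("\\s*,\\s*");
	}

	public static void main(String[] args) {
		System.out.println(Arrays.toString(splitOnCommas("a , b,c ,  d"))); // [a, b, c, d]
		// A limit of 2 splits at the first uppercase letter only.
		System.out.println(Arrays.toString("45cf45dG892Df".split("[A-Z]", 2))); // [45cf45d, 892Df]
	}
}
```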

5. Summary

  1. Any ordinary character matches itself: a matches a, 7 matches 7, and - matches -.

  2. [] matches any one of the characters inside the brackets; for example, [abc] matches a, b, or c.

  3. - has different meanings inside and outside square brackets. Outside the brackets it matches a literal -. Inside the brackets it defines a range: [a-z] matches any one of the 26 lowercase letters; [a-zA-Z] matches any one of the 52 uppercase and lowercase letters; [0-9] matches any one of the ten digits.

  4. ^ has different meanings inside and outside square brackets. Outside the brackets it marks the start of the string: ^7[0-9] matches a string that starts with 7 followed by any digit. As the first character inside the brackets it negates the set: [^abc] matches any single character other than a, b, or c (including digits and special characters).

  5. . matches any single character (by default, any character except a line terminator).

  6. \d matches a digit.

  7. \D matches any character that is not a digit.

  8. \s matches a whitespace character, [ \t\n\x0B\f\r].

  9. \S matches a non-whitespace character, [^\s].

  10. \w represents letters, numbers, and underscores, [a-zA-Z0-9_].

  11. \W means it is not composed of letters, numbers, and underscores.

  12. ? means 0 or 1 occurrence.

  13. + means one or more occurrences.

  14. * means 0, 1, or more occurrences.

  15. {n} means n occurrences.

  16. {n,m} means n~m occurrences.

  17. {n,} means that it appears n times or more.

  18. XY means X is followed by Y, where X and Y are part of the regular expression.

  19. X|Y means X or Y. For example, "food|f" matches food or f, while "foo(d|f)" matches foo followed by d or f, that is, food or foof.

  20. (X) Sub-expression, treat X as a whole.
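Several of the rules in this summary can be checked directly. The following is a minimal sketch with made-up sample strings; the class SummaryDemo and the method isWord are hypothetical names.

```java
public class SummaryDemo {

	// True if the input consists only of letters, digits, and underscores.
	static boolean isWord(String s) {
		return s.matches("\\w+");
	}

	public static void main(String[] args) {
		System.out.println("d".matches("[^abc]"));      // true: d is not a, b, or c
		System.out.println("a".matches("[^abc]"));      // false
		System.out.println(isWord("user_01"));          // true
		System.out.println(isWord("user-01"));          // false: - is not matched by \w
		System.out.println("\t".matches("\\s"));        // true: a tab is whitespace
		System.out.println("x".matches("\\D"));         // true: x is not a digit
		System.out.println("foof".matches("foo(d|f)")); // true
		System.out.println("f".matches("food|f"));      // true
	}
}
```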

6. Concluding remarks

In fact, that is all there is to regular expressions. They are not as difficult as they may seem; once you have studied them systematically, they turn out to be quite simple.
If you forget something, remember to come back and take another look~
Bye bye~~

Origin blog.csdn.net/zhangzhanbin/article/details/110880117