regular expression
concept
- Definition: an expression, written according to a fixed set of rules, that describes a pattern of characters
- Role: used specifically to manipulate strings
- Features: uses special symbols to stand for operations, which shortens code that would otherwise be long
- Benefit: simplifies complex operations on strings
- Disadvantage: the more symbols a pattern uses, the longer the regular expression gets and the worse its readability
key classes
- String
- Pattern: a compiled representation of a regular expression
- Matcher: an engine that performs match operations on a character sequence by interpreting a Pattern
Specific operations
Match: returns a boolean indicating whether the entire string matches the rule
1 String:
boolean string.matches(String regex)
2 Pattern + Matcher:
Pattern pattern = Pattern.compile(String regex)
Matcher matcher = pattern.matcher(CharSequence input)
boolean b = matcher.matches()
3 Pattern:
boolean b = Pattern.matches(String regex, CharSequence input)
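The three ways of matching listed above can be sketched side by side as follows. The class name MatchDemo, the method names, and the sample rule (a 5- to 11-digit number that does not start with 0) are assumptions made up for illustration; only the java.util.regex and String calls come from the notes.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchDemo {
    // Hypothetical rule for illustration: 5 to 11 digits, first digit not 0.
    static final String REGEX = "[1-9]\\d{4,10}";

    // Way 2 compiles the rule once so the Pattern can be reused for many inputs.
    static final Pattern PATTERN = Pattern.compile(REGEX);

    // Way 1: String.matches compiles the regex again on every call.
    static boolean viaString(String input) {
        return input.matches(REGEX);
    }

    // Way 2: Pattern + Matcher.
    static boolean viaMatcher(String input) {
        Matcher matcher = PATTERN.matcher(input);
        return matcher.matches(); // true only if the entire input matches
    }

    // Way 3: static convenience method on Pattern, no reuse of the compiled rule.
    static boolean viaPattern(String input) {
        return Pattern.matches(REGEX, input);
    }

    public static void main(String[] args) {
        System.out.println(viaString("123456"));  // true
        System.out.println(viaMatcher("012345")); // false: starts with 0
        System.out.println(viaPattern("12345a")); // false: the whole input must match
    }
}
```

All three give the same answer for the same input; the Pattern + Matcher form is probably the one to prefer when the same rule is applied many times, since the rule is compiled only once.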
Split: returns the pieces of the string that lie between matches of the rule (the matched parts themselves are dropped)
1 String: String[] string.split(String regex)
2 String: String[] string.split(String regex, int limit)
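> To reuse the result of part of a rule, wrap that part in a group using (). Every group gets a number, starting from 1.
To refer to an existing group, use the format: \ + group number
Eg: split a string using runs of a repeated character as the delimiter: (.)\1+
Eg: (X(Y(Z)))(M)
Group numbers: 1 2 3 4 (counted by opening parenthesis, from left to right)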
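A minimal sketch of the two examples above: splitting on runs of a repeated character with (.)\1+, and reading the four numbered groups of (X(Y(Z)))(M). The class name SplitDemo, the method names, and the sample inputs are hypothetical.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SplitDemo {
    // (.) captures any one character as group 1; \1+ requires that same character to repeat.
    static String[] splitOnRepeats(String input) {
        return input.split("(.)\\1+");
    }

    // Same rule with a limit: at most `limit` pieces, the last one holding the unsplit remainder.
    static String[] splitOnRepeats(String input, int limit) {
        return input.split("(.)\\1+", limit);
    }

    // Returns what groups 1..4 of (X(Y(Z)))(M) capture when matched against "XYZM".
    static String[] nestedGroups() {
        Matcher m = Pattern.compile("(X(Y(Z)))(M)").matcher("XYZM");
        if (!m.matches()) {
            throw new IllegalStateException("unexpected: no match");
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitOnRepeats("abkkcdqqqef")));    // [ab, cd, ef]
        System.out.println(Arrays.toString(splitOnRepeats("abkkcdqqqef", 2))); // [ab, cdqqqef]
        System.out.println(Arrays.toString(nestedGroups()));                   // [XYZ, YZ, Z, M]
    }
}
```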
Replace: returns the string with the parts that match the rule replaced
1 String: String string.replaceAll(String regex, String replacement) (within regex, refer to group n as \n)
2 String: String string.replaceFirst(String regex, String replacement) (within replacement, refer to group n as $n)
3 Pattern + Matcher:
Pattern pattern = Pattern.compile(String regex)
Matcher matcher = pattern.matcher(CharSequence input)
-------------------------------------
String Matcher.quoteReplacement(String s) (static; escapes \ and $ so the replacement is taken literally)
String matcher.replaceAll(String replacement)
String matcher.replaceFirst(String replacement)
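The replace operations above could be exercised roughly like this. The class name ReplaceDemo, the method names, and the sample rules are assumptions chosen for illustration; the point is the difference between \1 in the rule, $1 in the replacement, and a replacement passed through Matcher.quoteReplacement.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ReplaceDemo {
    // \1 inside the regex refers back to group 1;
    // $1 inside the replacement inserts whatever group 1 captured.
    static String collapseRepeats(String input) {
        return input.replaceAll("(.)\\1+", "$1");
    }

    // replaceFirst only touches the first match.
    static String maskFirstNumber(String input) {
        return input.replaceFirst("\\d+", "#");
    }

    // Pattern + Matcher form. quoteReplacement makes $ and \ in the
    // replacement literal, so "$1" is inserted as text rather than read as a group reference.
    static String replaceNumbersLiterally(String input, String literalReplacement) {
        Matcher matcher = Pattern.compile("\\d+").matcher(input);
        return matcher.replaceAll(Matcher.quoteReplacement(literalReplacement));
    }

    public static void main(String[] args) {
        System.out.println(collapseRepeats("aabbbcd"));                 // abcd
        System.out.println(maskFirstNumber("tel 123 and 456"));         // tel # and 456
        System.out.println(replaceNumbersLiterally("cost 15", "$1"));   // cost $1
    }
}
```

Without quoteReplacement, the last call would fail, because the rule \d+ has no group 1 for $1 to refer to.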
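Get: extracts the substrings that match the rule from the string
Wrap the regular expression in an object; associate it with the string to operate on; once associated, obtain the regex matching engine; use the engine to operate on the substrings that match the rule, e.g. extract them
1 // step 1: wrap the rule in a Pattern object
Pattern pattern = Pattern.compile(regex);
// step 2: associate the Pattern with the string to search and obtain the matcher (engine) object
Matcher matcher = pattern.matcher(string);
while (matcher.find()) { // keep searching for the next substring that matches the rule: true if found, false if not
    // print the substring just found, with its start() index and its end() index (exclusive)
    System.out.println("start:" + matcher.start() + " word:" + matcher.group() + " end:" + matcher.end());
}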
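For reference, the find loop above can be packaged as a self-contained class. This is only a sketch: the class name FindDemo, the helper findAll, and the sample rule (words of exactly three lowercase letters) are hypothetical and not part of the notes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindDemo {
    // Collects every match as "start:text:end", where end is exclusive.
    static List<String> findAll(String regex, String input) {
        List<String> hits = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) { // each call advances to the next match
            hits.add(matcher.start() + ":" + matcher.group() + ":" + matcher.end());
        }
        return hits;
    }

    public static void main(String[] args) {
        // \b marks a word boundary, so only whole three-letter words are reported.
        System.out.println(findAll("\\b[a-z]{3}\\b", "we ran the test")); // [3:ran:6, 7:the:10]
    }
}
```

Note that group(), start(), and end() are only valid after find() (or matches()) has returned true; calling them earlier throws IllegalStateException.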