Java Android regular expressions

regular expression

A regular expression is a pattern that describes a set of strings. It is used to search, match, and extract text.

Regular expression matching process

    private void RegTheory() {
        // Sample text to search
        String content = "1998年12月8日,第二代Java平台的企业版J2EE发布。1999年6月,Sun公司发布了第二代Java平台(简称为Java2) " +
                "的3个版本: J2ME (Java2 Micro Edition,Java2平台的微型版),应用于移动、无线及有限资源的环境: J2SE (Java  " +
                "Standard Edition,Java 2平台的标准版),应用于桌面环境:2EE (Java 2Enterprise Edition,Java " +
                "2平台的企业版) ,应用3443于基于Java的应用服务器。Java 2平台的发布,是Java发展过程中最重要的一个里程碑,标志着Java的应用开始普及9889";
        // Goal: find every run of four consecutive digits
        // 1. \\d matches any single digit
        String regStr = "\\d\\d\\d\\d";
        // 2. Compile the regular expression into a Pattern object
        Pattern pattern = Pattern.compile(regStr);
        // 3. Create a Matcher, which applies the pattern to the content
        Matcher matcher = pattern.matcher(content);

        // find() moves to the next match; group(0) returns the whole matched substring
        while (matcher.find()) {
            // "找到" means "found"
            Log.i(TAG, "找到:" + matcher.group(0));
        }
    }
2023-08-08 16:09:28.168 29172-29172/cn.jj.reg I/JJWorld.MainActivity: 找到:1998
2023-08-08 16:09:28.168 29172-29172/cn.jj.reg I/JJWorld.MainActivity: 找到:1999
2023-08-08 16:09:28.168 29172-29172/cn.jj.reg I/JJWorld.MainActivity: 找到:3443
2023-08-08 16:09:28.168 29172-29172/cn.jj.reg I/JJWorld.MainActivity: 找到:9889
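
The same example with two capture groups. The comment traces what find() records internally:

    private void RegTheory() {
        // Sample text to search
        String content = "1998年12月8日,第二代Java平台的企业版J2EE发布。1999年6月,Sun公司发布了第二代Java平台(简称为Java2) " +
                "的3个版本: J2ME (Java2 Micro Edition,Java2平台的微型版),应用于移动、无线及有限资源的环境: J2SE (Java  " +
                "Standard Edition,Java 2平台的标准版),应用于桌面环境:2EE (Java 2Enterprise Edition,Java " +
                "2平台的企业版) ,应用3443于基于Java的应用服务器。Java 2平台的发布,是Java发展过程中最重要的一个里程碑,标志着Java的应用开始普及9889";
        // Goal: find every run of four consecutive digits
        // 1. \\d matches any single digit; each pair of parentheses is a group
        String regStr = "(\\d\\d)(\\d\\d)";
        // 2. Compile the regular expression into a Pattern object
        Pattern pattern = Pattern.compile(regStr);
        // 3. Create a Matcher, which applies the pattern to the content
        Matcher matcher = pattern.matcher(content);

        /*
         * matcher.find() without groups
         * 1. Locates the next substring that satisfies the pattern (for example 1998).
         * 2. Records the start index of that substring in the Matcher's internal
         *    array int[] groups (groups[0] = 0) and the index of its last
         *    character + 1 in groups[1] = 4.
         * 3. Every further call to find() repeats these steps for the next match.
         *
         * matcher.find() with groups
         * Parentheses in a regular expression define groups. The first pair is
         * group 1 and the second pair is group 2: (\d\d)(\d\d)
         * 1. Locates the next substring that satisfies the pattern (for example 1998).
         * 2. Records the indexes in int[] groups:
         *      2.1 whole match: groups[0] = 0, end index + 1 in groups[1] = 4
         *      2.2 group 1 (19): groups[2] = 0, groups[3] = 2
         *      2.3 group 2 (98): groups[4] = 2, groups[5] = 4
         *      2.4 further groups follow the same scheme
         * 3. Every further call to find() repeats these steps for the next match.
         */
        while (matcher.find()) {
            // "找到" means "found"
            Log.i(TAG, "找到:" + matcher.group(0));
        }
    }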
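
The index bookkeeping described in the comment above can be checked through the public Matcher API: start(int) and end(int) return the same start and end positions for each group. The following is a minimal sketch in plain Java, with no Android Log; the class name GroupIndexDemo and the helper firstMatchOffsets are hypothetical names chosen for this illustration.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupIndexDemo {
    // Returns {start, end} pairs for group 0, group 1 and group 2 of the first match,
    // or an empty array when nothing matches.
    static int[] firstMatchOffsets(String content) {
        Matcher matcher = Pattern.compile("(\\d\\d)(\\d\\d)").matcher(content);
        if (!matcher.find()) {
            return new int[0];
        }
        int[] offsets = new int[(matcher.groupCount() + 1) * 2];
        for (int g = 0; g <= matcher.groupCount(); g++) {
            offsets[2 * g] = matcher.start(g);    // index of the group's first character
            offsets[2 * g + 1] = matcher.end(g);  // index just past the group's last character
        }
        return offsets;
    }

    public static void main(String[] args) {
        // prints [0, 4, 0, 2, 2, 4]
        System.out.println(Arrays.toString(firstMatchOffsets("1998 and 1999")));
    }
}
```

The printed pairs line up with groups[0] through groups[5] in the walkthrough: the whole match spans 0 to 4, group 1 spans 0 to 2, and group 2 spans 2 to 4.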

regular expression syntax

To use regular expressions flexibly, you must understand what the various metacharacters do. By function, metacharacters fall roughly into these categories:

1. Quantifiers (qualifiers)
2. Alternation (selection matcher)
3. Grouping and back references
4. Special characters
5. Character matchers
6. Anchors (locators)
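
The selection matcher from the list above is the | metacharacter, which matches any one of several alternatives. A minimal sketch, assuming plain Java and the hypothetical names AlternationDemo and findAll:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AlternationDemo {
    // Collects every substring of content that the regex matches.
    static List<String> findAll(String regex, String content) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(content);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // "hAn" is skipped because none of the three alternatives matches it exactly.
        // prints [han, Han, HAN]
        System.out.println(findAll("han|Han|HAN", "han Han hAn HAN"));
    }
}
```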

Description of escape characters

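Metacharacter: the escape symbol \\
Symbol description:
Some characters have a special meaning in a regular expression. To match one of them literally, put the escape symbol in front of it; otherwise the search returns the wrong result, or the pattern does not compile at all. In a Java string literal the regex backslash is itself written as \\.
Example: what happens when you use $ to match "abc$("?
What happens when you use ( to match "abc$("?
Unescaped, $ is the end-of-input anchor, so it finds only an empty match at the end of the string instead of the literal $. An unescaped ( opens a group that is never closed, so Pattern.compile throws a PatternSyntaxException. Escaped as \\$ and \\(, both match the literal characters.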

    private void RegTest2() {
        String content = "abc$(abc(1.23(";
        // To match a literal ( use:
        // String regStr = "\\(";
        // Here we match a literal . instead
        String regStr = "\\.";
        Pattern pattern = Pattern.compile(regStr);
        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            Log.i(TAG, "match:" + matcher.group(0));
        }
    }
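
To make the effect of escaping concrete, here is a small sketch in plain Java. The class EscapeDemo and its helpers countMatches and failsToCompile are hypothetical names for this illustration. Pattern.quote is shown as an alternative to escaping by hand when the whole search text should be taken literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class EscapeDemo {
    // Counts how many times the regex matches inside content.
    static int countMatches(String regex, String content) {
        Matcher matcher = Pattern.compile(regex).matcher(content);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Returns true if the regex is rejected by Pattern.compile.
    static boolean failsToCompile(String regex) {
        try {
            Pattern.compile(regex);
            return false;
        } catch (PatternSyntaxException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String content = "abc$(abc(1.23(";
        System.out.println(failsToCompile("("));                        // true: the group is never closed
        System.out.println(countMatches("\\(", content));               // 3 literal parentheses
        System.out.println(countMatches("\\$", content));               // 1 literal dollar sign
        System.out.println(countMatches(Pattern.quote("$("), content)); // 1 occurrence of "$("
    }
}
```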

metacharacter-character matcher

Correction: \\w matches a single digit, uppercase or lowercase letter, or underscore; it is equivalent to [0-9a-zA-Z_].
? matches the preceding element 0 or 1 times.
{3} matches the preceding element exactly three times; for example, \\d{3} matches three digits.
+ matches the preceding element one or more times.
\\s matches any whitespace character (space, tab, and so on).
\\S matches any non-whitespace character.
. matches any character except line terminators such as \n. To match a literal ., use \\.
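
The rules above can be tried out with a small helper that collects all matches. This is a sketch in plain Java; CharClassDemo and findAll are hypothetical names, and the sample strings are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CharClassDemo {
    // Collects every substring of content that the regex matches.
    static List<String> findAll(String regex, String content) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(content);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String content = "a_1 B.2";
        System.out.println(findAll("\\w", content));        // [a, _, 1, B, 2]
        System.out.println(findAll("\\S", content));        // [a, _, 1, B, ., 2]
        System.out.println(findAll("\\.", content));        // [.]
        System.out.println(findAll(".", "a\nb"));           // [a, b]  (the newline is skipped)
        System.out.println(findAll("\\d{3}", "12345678"));  // [123, 456]
        System.out.println(findAll("\\d+", "a1b22c333"));   // [1, 22, 333]
        System.out.println(findAll("ab?", "a ab abb"));     // [a, ab, ab]
    }
}
```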

qualifier

    private void regSMatching() {
        String content = "1111222 z_*@ hello jj 曙光";
        // Match a run of 3 to 4 consecutive digits
        String regStr = "\\d{3,4}";
        // CASE_INSENSITIVE has no effect on digits; it only matters for letters
        Pattern pattern = Pattern.compile(regStr, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            Log.i(TAG, "matcher:" + matcher.group(0));
        }
    }

Java quantifiers are greedy by default: they match as many characters as possible. Here \\d{3,4} first takes the four digits 1111 and then the remaining 222.

2023-08-16 20:45:38.948 9275-10989/cn.jj.reg I/JJWorld.MainActivity: matcher:1111
2023-08-16 20:45:38.948 9275-10989/cn.jj.reg I/JJWorld.MainActivity: matcher:222
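
For comparison, appending ? to a quantifier makes it reluctant, so it takes as few characters as possible. The sketch below contrasts the two on the digits from the example; GreedyDemo and findAll are hypothetical names, and the code is plain Java without Android Log.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GreedyDemo {
    // Collects every substring of content that the regex matches.
    static List<String> findAll(String regex, String content) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(content);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // Greedy: takes 4 digits when it can. prints [1111, 222]
        System.out.println(findAll("\\d{3,4}", "1111222"));
        // Reluctant: stops at 3 digits; the final lone 2 cannot match. prints [111, 122]
        System.out.println(findAll("\\d{3,4}?", "1111222"));
    }
}
```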

not case sensitive

Option 1: modify the pattern string

(?i) makes the part of the pattern that follows it case insensitive. Example: String regStr = "(?i)abc";

Option 2: pass a flag when compiling the Pattern

Pattern pattern = Pattern.compile(regStr, Pattern.CASE_INSENSITIVE);
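
Both approaches can be compared side by side. The sketch below is plain Java; CaseDemo and fullMatch are hypothetical names. It also shows that an inline (?i) placed in the middle of a pattern only affects what comes after it.

```java
import java.util.regex.Pattern;

final class CaseDemo {
    // Returns true if the whole input matches the regex compiled with the given flags.
    static boolean fullMatch(String regex, int flags, String input) {
        return Pattern.compile(regex, flags).matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(fullMatch("abc", 0, "ABC"));                        // false: case sensitive
        System.out.println(fullMatch("(?i)abc", 0, "ABC"));                    // true: inline flag
        System.out.println(fullMatch("abc", Pattern.CASE_INSENSITIVE, "aBc")); // true: compile flag
        System.out.println(fullMatch("a(?i)bc", 0, "aBC"));                    // true: only bc ignores case
        System.out.println(fullMatch("a(?i)bc", 0, "ABC"));                    // false: a is still case sensitive
    }
}
```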

locator

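\\- matches a literal hyphen (-). Outside a character class such as [a-z] the hyphen has no special meaning, so the escape is optional there.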

    public static void regT() {
        String content = "abc123-abc123";
        // String regStr = "[0-9]+[a-z]*";
        // One or more digits, a literal hyphen, then one or more lowercase letters
        String regStr = "[0-9]+\\-[a-z]+";
        Pattern pattern = Pattern.compile(regStr);
        Matcher matcher = pattern.matcher(content);
        if (matcher.find()) {
            Log.i(TAG, "reg:" + matcher.group(0));
        }
    }
2023-08-17 20:01:29.666 26962-27135/cn.jj.reg I/JJWorld.MainActivity: reg:123-abc

    public static void regT() {
        String content = "abc123aabcdabcd 123 abcd assa";
        // String regStr = "[0-9]+[a-z]*";
        // String regStr = "[0-9]+\\-[a-z]+";
        // \\B matches a position that is NOT a word boundary, so this finds
        // every abc that is directly followed by another word character
        String regStr = "abc\\B";
        Pattern pattern = Pattern.compile(regStr);
        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            Log.i(TAG, "reg:" + matcher.group(0));
        }
    }
2023-08-17 20:14:10.771 29568-29740/cn.jj.reg I/JJWorld.MainActivity: reg:abc
2023-08-17 20:14:10.771 29568-29740/cn.jj.reg I/JJWorld.MainActivity: reg:abc
2023-08-17 20:14:10.771 29568-29740/cn.jj.reg I/JJWorld.MainActivity: reg:abc
2023-08-17 20:14:10.771 29568-29740/cn.jj.reg I/JJWorld.MainActivity: reg:abc
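
To see how the locators differ, the following sketch reports the start index of each match for \\b, \\B, ^ and $ on one short string. It is plain Java; AnchorDemo, matchStarts and the sample string are hypothetical and not taken from the example above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class AnchorDemo {
    // Returns the start index of every match of regex in content.
    static List<Integer> matchStarts(String regex, String content) {
        List<Integer> starts = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(content);
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // Indexes: "abc" at 0, "abcd" at 4, "xabc" at 9 (its abc starts at 10)
        String content = "abc abcd xabc";
        System.out.println(matchStarts("abc\\b", content)); // [0, 10]: abc ends a word
        System.out.println(matchStarts("abc\\B", content)); // [4]: abc is followed by d
        System.out.println(matchStarts("\\babc", content)); // [0, 4]: abc starts a word
        System.out.println(matchStarts("^abc", content));   // [0]: abc at the start of the input
        System.out.println(matchStarts("abc$", content));   // [10]: abc at the end of the input
    }
}
```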

\s

\s is a predefined character class that matches any whitespace character, including spaces, tabs, and newlines. In a Java string literal it is written as \\s.
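
A common use of \s is splitting or normalizing text on runs of whitespace. The sketch below uses the standard String.split and String.replaceAll methods; WhitespaceDemo, tokens and collapse are hypothetical names for this illustration.

```java
import java.util.Arrays;

final class WhitespaceDemo {
    // Splits the text on runs of one or more whitespace characters.
    static String[] tokens(String text) {
        return text.split("\\s+");
    }

    // Replaces every run of whitespace with a single space.
    static String collapse(String text) {
        return text.replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String text = "hello \t jj\nworld";
        System.out.println(Arrays.toString(tokens(text))); // [hello, jj, world]
        System.out.println(collapse(text));                // hello jj world
    }
}
```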

Grouping is important

    public void regGroup() {
        String content = "hanshunping s7789 nn1189han";
        // (?<name>...) defines a named group; g1 is also group 1 and g2 is also group 2
        String regStr = "(?<g1>\\d\\d)(?<g2>\\d\\d)";
        Pattern pattern = Pattern.compile(regStr);
        Matcher matcher = pattern.matcher(content);
        while (matcher.find()) {
            // group(0) is the whole match; a group can be read by number or by name
            Log.i(TAG, "reg:" + matcher.group(0) + "-" + matcher.group(1) + "-" + matcher.group(2) + " g1 " + matcher.group("g1"));
        }
    }
2023-08-17 20:44:46.911 31878-32267/cn.jj.reg I/JJWorld.MainActivity: reg:7789-77-89 g1 77
2023-08-17 20:44:46.911 31878-32267/cn.jj.reg I/JJWorld.MainActivity: reg:1189-11-89 g1 11
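
Groups can also be referred to again after they have been captured. The sketch below, in plain Java with the hypothetical names BackReferenceDemo, findAll and swapPairs, shows a back reference inside the pattern (\\1 means "the same text group 1 matched") and group references in a replacement string ($1, $2).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BackReferenceDemo {
    // Collects every substring of content that the regex matches.
    static List<String> findAll(String regex, String content) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(content);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    // Swaps the two digit pairs of every four-digit run: 7789 becomes 8977.
    static String swapPairs(String content) {
        return content.replaceAll("(\\d\\d)(\\d\\d)", "$2$1");
    }

    public static void main(String[] args) {
        String content = "hanshunping s7789 nn1189han";
        // (\\d)\\1 matches a digit followed by the same digit. prints [77, 11]
        System.out.println(findAll("(\\d)\\1", content));
        // prints hanshunping s8977 nn8911han
        System.out.println(swapPairs(content));
    }
}
```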

Origin blog.csdn.net/qq_42015021/article/details/132169039