Learning Regular Expressions on Android

Learning Java regular expressions:
Regular expressions form a very complex system, so only a few introductory examples are given here. For more, consult the relevant books and explore on your own.

\\  backslash
\t  tab ('\u0009')
\n  line feed ('\u000A')
\r  carriage return ('\u000D')
\d  digit, equivalent to [0-9]
\D  non-digit, equivalent to [^0-9]
\s  whitespace character [ \t\n\x0B\f\r]
\S  non-whitespace character [^ \t\n\x0B\f\r]
\w  word character [a-zA-Z_0-9]
\W  non-word character [^a-zA-Z_0-9]
\f  form feed
\e  escape
\b  word boundary
\B  non-word boundary
\G  end of the previous match
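
As a rough illustration of the escapes listed above, the sketch below uses \d, \s, and \b from Java code. The class name EscapeDemo and its method names are made up for this example. Note that inside a Java string literal every regex backslash has to be written twice.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class EscapeDemo {
    // True when the whole input consists of one or more digits.
    static boolean isDigits(String s) {
        return Pattern.matches("\\d+", s);
    }

    // Collapses every run of whitespace (space, tab, newline, ...) into one space.
    static String collapseWhitespace(String s) {
        return s.replaceAll("\\s+", " ");
    }

    // True when the word occurs with a word boundary on both sides.
    static boolean containsWord(String text, String word) {
        Pattern p = Pattern.compile("\\b" + Pattern.quote(word) + "\\b");
        return p.matcher(text).find();
    }

    public static void main(String[] args) {
        System.out.println(isDigits("2024"));
        System.out.println(collapseWhitespace("a \t\n b"));
        System.out.println(containsWord("learn java today", "java"));
        System.out.println(containsWord("javascript", "java"));
    }
}
```

Pattern.quote is used so that the word being searched for is treated as literal text rather than as a pattern.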

^  anchors the match at the beginning
^java  matches input that begins with "java"
$  anchors the match at the end
java$  matches input that ends with "java"
.  matches any single character except \n
java..  matches "java" followed by any two characters other than line terminators
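
The anchors and the dot can be tried out with a small sketch like the following. AnchorDemo and its methods are illustrative names, not part of any library. find() is used for the anchored patterns because matches() already requires the whole input to match.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class AnchorDemo {
    // "^java": the input must begin with "java".
    static boolean startsWithJava(String s) {
        return Pattern.compile("^java").matcher(s).find();
    }

    // "java$": the input must end with "java".
    static boolean endsWithJava(String s) {
        return Pattern.compile("java$").matcher(s).find();
    }

    // "java..": "java" followed by exactly two characters that are not line terminators.
    static boolean javaPlusTwo(String s) {
        return Pattern.matches("java..", s);
    }

    public static void main(String[] args) {
        System.out.println(startsWithJava("javac"));
        System.out.println(endsWithJava("my java"));
        System.out.println(javaPlusTwo("java11"));
        System.out.println(javaPlusTwo("java1\n"));
    }
}
```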


To restrict a match to a specific set of characters, use "[]"
[a-z]  matches one lowercase character in the range a to z
[A-Z]  matches one uppercase character in the range A to Z
[a-zA-Z]  matches one character in the range a to z or A to Z
[0-9]  matches one digit in the range 0 to 9
[0-9a-z]  matches one character in the range 0 to 9 or a to z
[0-9[a-z]]  matches one character in the range 0 to 9 or a to z (a union; an intersection is written with &&, as in [a-z&&[^aeiou]])

Adding ^ right after the opening [ negates the class: "[^]"
[^a-z]  matches one character outside the lowercase range a to z
[^A-Z]  matches one character outside the uppercase range A to Z
[^a-zA-Z]  matches one character outside both a to z and A to Z
[^0-9]  matches one character outside the range 0 to 9
[^0-9a-z]  matches one character outside both 0 to 9 and a to z
[^0-9[a-z]]  is meant to match one character outside both 0 to 9 and a to z (a negated union); how negation combines with a nested class has varied between Java versions, so the flat form [^0-9a-z] is the safer spelling
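
A short sketch of character classes in use, assuming nothing beyond java.util.regex. CharClassDemo and its method names are invented for the example. It shows a plain class, a nested union, a negated class, and an intersection written with &&.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class CharClassDemo {
    // One or more characters drawn from 0-9 or a-z.
    static boolean isLowerAlnum(String s) {
        return Pattern.matches("[0-9a-z]+", s);
    }

    // Same set written as a nested union.
    static boolean isLowerAlnumNested(String s) {
        return Pattern.matches("[0-9[a-z]]+", s);
    }

    // Negated class: delete everything that is not an ASCII letter.
    static String stripNonLetters(String s) {
        return s.replaceAll("[^a-zA-Z]", "");
    }

    // Intersection: lowercase letters that are not vowels.
    static boolean isLowerConsonants(String s) {
        return Pattern.matches("[a-z&&[^aeiou]]+", s);
    }

    public static void main(String[] args) {
        System.out.println(isLowerAlnum("abc123"));
        System.out.println(isLowerAlnumNested("abc123"));
        System.out.println(stripNonLetters("J4v4 R0cks!"));
        System.out.println(isLowerConsonants("xyz"));
    }
}
```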

To allow a character to appear zero or more times, use "*"
J*  zero or more J
.*  zero or more arbitrary characters
J.*D  zero or more arbitrary characters between J and D

 
To require a character to appear one or more times, use "+"
J+  one or more J
.+  one or more arbitrary characters
J.+D  one or more arbitrary characters between J and D

To allow a character to appear zero times or once, use "?"
JA?  matches J or JA

To require an exact number of consecutive occurrences, use "{a}"
J{2}  JJ
J{3}  JJJ
To require at least a occurrences, use "{a,}"
J{3,}  JJJ, JJJJ, JJJJJ, ... (three or more J in a row)
To require at least a and at most b occurrences, use "{a,b}"
J{3,5}  JJJ, JJJJ, or JJJJJ
To accept either of two alternatives, use "|"
J|A  J or A
Java|Hello  Java or Hello
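
The quantifiers and alternation above can be checked with a tiny harness such as this one. QuantifierDemo.fullMatch is a hypothetical wrapper around Pattern.matches, which succeeds only when the entire input matches the pattern.

```java
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class QuantifierDemo {
    // True when the whole input matches the regex.
    static boolean fullMatch(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        String[][] cases = {
            {"J*", ""}, {"J+", ""}, {"JA?", "JA"}, {"J{3,5}", "JJJJ"},
            {"J{3,5}", "JJ"}, {"J.*D", "JD"}, {"J.+D", "JD"}, {"Java|Hello", "Hello"}
        };
        for (String[] c : cases) {
            // Print each pattern, the input in quotes, and whether it matched.
            System.out.println(c[0] + " vs \"" + c[1] + "\": " + fullMatch(c[0], c[1]));
        }
    }
}
```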

"()" groups part of a pattern and captures the text it matches.
For example, to extract the text between <a href...> and </a> in <a href="index.html"> index </a>, the pattern can be written as <a.*href=".*">(.+?)</a> (in a Java string literal the double quotes are escaped as \")
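
To make the role of capturing groups concrete, here is a minimal sketch that pulls both the href value and the link text out of an anchor tag. GroupDemo and its method names are invented for this example, and the pattern is a slightly tighter variation of the one above ([^>]* and [^"]* instead of .*), not the original author's pattern. Regular expressions are only suitable for simple, well-formed snippets like this one, not for HTML in general.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class GroupDemo {
    // Group 1 captures the href value, group 2 the text between the tags.
    private static final Pattern LINK =
            Pattern.compile("<a\\s[^>]*href=\"([^\"]*)\"[^>]*>(.+?)</a>");

    // Returns the href value of the first link, or null when there is none.
    static String linkTarget(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // Returns the trimmed text of the first link, or null when there is none.
    static String linkText(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? m.group(2).trim() : null;
    }

    public static void main(String[] args) {
        String html = "<a href=\"index.html\"> index </a>";
        System.out.println(linkTarget(html));
        System.out.println(linkText(html));
    }
}
```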

When calling Pattern.compile, you can pass flags that control how the regular expression is matched:
Pattern Pattern.compile(String regex, int flags)

flags can take the following values:
Pattern.CANON_EQ  Two characters match if and only if their full canonical decompositions are identical. With this flag set, for example, the expression "a\u030A" matches "\u00E5". Canonical equivalence is not considered by default.
Pattern.CASE_INSENSITIVE (?i)  Enables case-insensitive matching. By default, case-insensitive matching applies only to the US-ASCII character set. To match Unicode characters case-insensitively, combine this flag with UNICODE_CASE.
Pattern.COMMENTS (?x)  In this mode, whitespace in the regular expression is ignored (Translator's note: this means literal spaces, tabs, carriage returns and the like written in the pattern, not the "\s" escape). A comment starts with # and runs to the end of the line. Comments mode can also be enabled via the embedded flag (?x).
Pattern.DOTALL (?s)  In this mode, the expression '.' matches any character, including a line terminator. By default, '.' does not match line terminators.
Pattern.MULTILINE (?m)  In this mode, '^' and '$' match at the beginning and end of each line, in addition to the beginning and end of the whole input. By default, they match only at the beginning and end of the whole input.
Pattern.UNICODE_CASE (?u)  In this mode, when the CASE_INSENSITIVE flag is also enabled, case-insensitive matching is applied to Unicode characters. By default, case-insensitive matching applies only to the US-ASCII character set.
Pattern.UNIX_LINES (?d)  In this mode, only '\n' is recognized as a line terminator for '.', '^', and '$'.
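
The effect of a few of these flags can be seen in a sketch like the one below. FlagDemo and its method names are hypothetical. Flags are combined with the bitwise OR operator, and the embedded form such as (?i) is equivalent to passing the corresponding constant.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class FlagDemo {
    // CASE_INSENSITIVE: "java" matches regardless of letter case.
    static boolean isJavaIgnoringCase(String s) {
        return Pattern.compile("java", Pattern.CASE_INSENSITIVE).matcher(s).matches();
    }

    // Same check using the embedded flag instead of the constant.
    static boolean isJavaEmbeddedFlag(String s) {
        return Pattern.matches("(?i)java", s);
    }

    // Without DOTALL the dot refuses to match a line terminator.
    static boolean dotMatches(String s, boolean dotAll) {
        Pattern p = dotAll ? Pattern.compile("a.b", Pattern.DOTALL) : Pattern.compile("a.b");
        return p.matcher(s).matches();
    }

    // MULTILINE: count the lines that begin with "java", ignoring case.
    static int countLinesStartingWithJava(String text) {
        Matcher m = Pattern.compile("^java", Pattern.MULTILINE | Pattern.CASE_INSENSITIVE)
                .matcher(text);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(isJavaIgnoringCase("JaVa"));
        System.out.println(dotMatches("a\nb", true));
        System.out.println(dotMatches("a\nb", false));
        System.out.println(countLinesStartingWithJava("Java one\nnot java\njava two"));
    }
}
```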


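Setting the abstract concepts aside, here are a few simple Java regex use cases:
◆ For example, checking the contents of a string
// Match a string that starts with "Java" and ends with anything
Pattern pattern = Pattern.compile("^Java.*");
Matcher matcher = pattern.matcher("Java is not a person");
boolean b = matcher.matches();
// true when the pattern matches the whole input, false otherwise
System.out.println(b);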

◆ Splitting a string on several delimiters
// Split on any run of commas, spaces, or vertical bars
Pattern pattern = Pattern.compile("[, |]+");
String[] strs = pattern.split("Java Hello World Java,Hello,,World|SUN");
for (int i = 0; i < strs.length; i++) {
    System.out.println(strs[i]);
}
◆ Text replacement (first occurrence only)
Pattern pattern = Pattern.compile("regular expression");
Matcher matcher = pattern.matcher("regular expression Hello World,regular expression Hello World");
// Replace only the first match
System.out.println(matcher.replaceFirst("Java"));
◆ Text replacement (all occurrences)
Pattern pattern = Pattern.compile("regular expression");
Matcher matcher = pattern.matcher("regular expression Hello World,regular expression Hello World");
// Replace every match
System.out.println(matcher.replaceAll("Java"));

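◆ Text replacement (match by match)
Pattern pattern = Pattern.compile("regular expression");
Matcher matcher = pattern.matcher("regular expression Hello World,regular expression Hello World ");
StringBuffer sbr = new StringBuffer();
// Append the text before each match, followed by the replacement
while (matcher.find()) {
    matcher.appendReplacement(sbr, "Java");
}
// Append whatever follows the last match
matcher.appendTail(sbr);
System.out.println(sbr.toString());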
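
The find/appendReplacement loop becomes useful when the replacement depends on the matched text, which replaceAll with a fixed string cannot express. The following is a minimal sketch under that assumption. AppendReplacementDemo and upperCaseMatches are names made up for the example. Matcher.quoteReplacement guards against $ and \ in the computed replacement being read as group references.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class AppendReplacementDemo {
    // Replaces every match of regex in text with its upper-case form.
    static String upperCaseMatches(String text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // The replacement is computed from the current match.
            String replacement = m.group().toUpperCase();
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        // Copy the text that follows the last match.
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(upperCaseMatches("java and regex", "java|regex"));
    }
}
```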
◆ Checking whether a string is an email address
// Placeholder address; the value in the published page was obscured
String str = "user@example.com";
Pattern pattern = Pattern.compile("[\\w\\.\\-]+@([\\w\\-]+\\.)+[\\w\\-]+", Pattern.CASE_INSENSITIVE);
Matcher matcher = pattern.matcher(str);
System.out.println(matcher.matches());
◆ Removing HTML tags
Pattern pattern = Pattern.compile("<.+?>", Pattern.DOTALL);
Matcher matcher = pattern.matcher("<a href=\"index.html\">Home</a>");
String string = matcher.replaceAll("");
System.out.println(string);
◆ Finding a matching substring in HTML
Pattern pattern = Pattern.compile("href=\"(.+?)\"");
Matcher matcher = pattern.matcher("<a href=\"index.html\">Home</a>");
if (matcher.find()) {
    // group(1) is the text captured by the parentheses
    System.out.println(matcher.group(1));
}
◆ Extracting http:// addresses
// Extract URLs
Pattern pattern = Pattern.compile("(http://|https://){1}[\\w\\.\\-/:]+");
Matcher matcher = pattern.matcher("dsdsds<http://dsds//gfgffdfd>fdf");
StringBuffer buffer = new StringBuffer();
while (matcher.find()) {
    buffer.append(matcher.group());
    buffer.append("\r\n");
}
System.out.println(buffer.toString());
        
◆ Replacing the text of {} placeholders
String str = "The history of Java so far runs from {0} to {1}";
String[][] object = {new String[]{"\\{0\\}", "1995"}, new String[]{"\\{1\\}", "2007"}};
System.out.println(replace(str, object));

public static String replace(final String sourceString, Object[] object) {
    String temp = sourceString;
    for (int i = 0; i < object.length; i++) {
        // Each entry is a {regex, replacement} pair
        String[] result = (String[]) object[i];
        Pattern pattern = Pattern.compile(result[0]);
        Matcher matcher = pattern.matcher(temp);
        temp = matcher.replaceAll(result[1]);
    }
    return temp;
}
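
Escaping the braces by hand works, but it is easy to get wrong. One alternative, sketched here with invented names (PlaceholderDemo, fill), is to let Pattern.quote turn each placeholder into a literal pattern and Matcher.quoteReplacement protect the inserted value. This is a variation on the approach above, not the original author's code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class, for illustration only.
final class PlaceholderDemo {
    // Replaces {0}, {1}, ... in template with the corresponding values.
    static String fill(String template, String... values) {
        String result = template;
        for (int i = 0; i < values.length; i++) {
            // Pattern.quote makes "{0}" match literally, with no manual escaping.
            String placeholder = Pattern.quote("{" + i + "}");
            // quoteReplacement keeps $ and \ in the value from being interpreted.
            result = result.replaceAll(placeholder, Matcher.quoteReplacement(values[i]));
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(fill("from {0} to {1}", "1995", "2007"));
    }
}
```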

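◆ Listing the files in a directory whose names match a regex
import java.io.File;
import java.io.FileFilter;
import java.io.PrintStream;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FilesAnalyze {
    // Caches the list of matching files
    private List<File> files = new ArrayList<File>();
    // Holds the directory path
    private String _path;
    // Holds the regex source, not yet compiled
    private String _regexp;

    class MyFileFilter implements FileFilter {
        /**
         * Matches the file name against the regex
         */
        public boolean accept(File file) {
            try {
                Pattern pattern = Pattern.compile(_regexp);
                Matcher match = pattern.matcher(file.getName());
                return match.matches();
            } catch (Exception e) {
                // Note: a regex that fails to compile accepts every file
                return true;
            }
        }
    }

    /**
     * Collects the files under path whose names match regexp
     * @param path directory to scan
     * @param regexp regex applied to each file name
     */
    FilesAnalyze(String path, String regexp) {
        getFileName(path, regexp);
    }

    /**
     * Filters the directory listing and adds the matches to files
     * @param path directory to scan
     * @param regexp regex applied to each file name
     */
    private void getFileName(String path, String regexp) {
        // Directory
        _path = path;
        _regexp = regexp;
        File directory = new File(_path);
        File[] filesFile = directory.listFiles(new MyFileFilter());
        // listFiles returns null when the path is not a readable directory
        if (filesFile == null) return;
        for (int j = 0; j < filesFile.length; j++) {
            files.add(filesFile[j]);
        }
    }

    /**
     * Prints the collected file paths
     * @param out stream to print to
     */
    public void print(PrintStream out) {
        Iterator<File> elements = files.iterator();
        while (elements.hasNext()) {
            File file = elements.next();
            out.println(file.getPath());
        }
    }

    public static void output(String path, String regexp) {
        FilesAnalyze fileGroup1 = new FilesAnalyze(path, regexp);
        fileGroup1.print(System.out);
    }

    public static void main(String[] args) {
        // Names made up only of letters and dots (a range written A-z would also admit [ \ ] ^ _ `)
        output("C:\\", "[A-Za-z.]*");
    }
}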

 

 

Reproduced from: https://my.oschina.net/lendylongli/blog/226792


Origin blog.csdn.net/weixin_34050427/article/details/92576500