Character |
Explanation |
---|---|
\ |
Marks the next character as a special character, a literal, a backreference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a newline character. The sequence "\\" matches "\", and "\(" matches "(". |
^ |
Matches the position at the start of the input string. If the RegExp object's Multiline property is set, ^ also matches the position after "\n" or "\r". |
$ |
Matches the position at the end of the input string. If the RegExp object's Multiline property is set, $ also matches the position before "\n" or "\r". |
* |
Matches the preceding character or subexpression zero or more times. For example, "zo*" matches "z" and "zoo". * is equivalent to {0,}. |
+ |
Matches the preceding character or subexpression one or more times. For example, "zo+" matches "zo" and "zoo", but not "z". + is equivalent to {1,}. |
? |
Matches the preceding character or subexpression zero or one time. For example, "do(es)?" matches "do" or "does". ? is equivalent to {0,1}. |
{n} |
n is a nonnegative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob", but matches the two "o"s in "food". |
{n,} |
n is a nonnegative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob", but matches all the "o"s in "foooood". "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*". |
{n,m} |
m and n are nonnegative integers, where n <= m. Matches at least n and at most m times. For example, "o{1,3}" matches the first three "o"s in "fooooood". "o{0,1}" is equivalent to "o?". Note: spaces are not allowed between the comma and the numbers. |
? |
When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the matching pattern is "non-greedy". A non-greedy pattern matches the shortest possible string, whereas the default greedy pattern matches the longest possible string. For example, in the string "oooo", "o+?" matches only a single "o", while "o+" matches all the "o"s. |
. |
Matches any single character except "\r" and "\n". To match any character including "\r" and "\n", use a pattern such as "[\s\S]". |
(pattern) |
Matches pattern and captures the matching subexpression. The captured matches can be retrieved from the resulting match collection using the $0...$9 properties. To match the parenthesis characters ( ), use "\(" or "\)". |
(?:pattern) |
Matches pattern but does not capture the match; that is, it is a non-capturing match that is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, "industr(?:y|ies)" is a more economical expression than "industry|industries". |
(?=pattern) |
Performs a positive lookahead search: matches the string at the point where a string matching pattern begins. It is a non-capturing match, that is, the match is not captured for later use. For example, "Windows (?=95|98|NT|2000)" matches the "Windows" in "Windows 2000" but not the "Windows" in "Windows 3.1". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that make up the lookahead. |
(?!pattern) |
Performs a negative lookahead search: matches the string at the point where a string not matching pattern begins. It is a non-capturing match, that is, the match is not captured for later use. For example, "Windows (?!95|98|NT|2000)" matches the "Windows" in "Windows 3.1" but not the "Windows" in "Windows 2000". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that make up the lookahead. |
x|y |
Matches x or y. For example, "z|food" matches "z" or "food". "(z|f)ood" matches "zood" or "food". |
[xyz] |
Character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain". |
[^xyz] |
Negated character set. Matches any character not enclosed. For example, "[^abc]" matches the "p", "l", "i", and "n" in "plain". |
[a-z] |
Character range. Matches any character in the specified range. For example, "[a-z]" matches any lowercase letter from "a" to "z". |
[^a-z] |
Negated character range. Matches any character not in the specified range. For example, "[^a-z]" matches any character not in the range "a" to "z". |
\b |
Matches a word boundary, that is, the position between a word and a space. For example, "er\b" matches the "er" in "never" but not the "er" in "verb". |
\B |
Matches a non-word boundary. "er\B" matches the "er" in "verb" but not the "er" in "never". |
\cx |
Matches the control character indicated by x. For example, \cM matches Control-M, or a carriage return. The value of x must be in the range A-Z or a-z. Otherwise, c is assumed to be a literal "c" character. |
\d |
Matches a digit character. Equivalent to [0-9]. |
\D |
Matches a non-digit character. Equivalent to [^0-9]. |
\f |
Matches a form feed. Equivalent to \x0c and \cL. |
\n |
Matches a newline. Equivalent to \x0a and \cJ. |
\r |
Matches a carriage return. Equivalent to \x0d and \cM. |
\s |
Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v]. |
\S |
Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v]. |
\t |
Matches a tab. Equivalent to \x09 and \cI. |
\v |
Matches a vertical tab. Equivalent to \x0b and \cK. |
\w |
Matches any word character, including underscore. Equivalent to "[A-Za-z0-9_]". |
\W |
Matches any non-word character. Equivalent to "[^A-Za-z0-9_]". |
\xn |
Matches n, where n is a hexadecimal escape code. Hexadecimal escape codes must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". This allows ASCII codes to be used in regular expressions. |
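\num |
Matches num, where num is a positive integer. A backreference to a captured match. For example, "(.)\1" matches two consecutive identical characters. |
\n |
Identifies an octal escape code or a backreference. If \n is preceded by at least n captured subexpressions, n is a backreference. Otherwise, if n is an octal digit (0-7), n is an octal escape code. |
\nm |
Identifies an octal escape code or a backreference. If \nm is preceded by at least nm captured subexpressions, nm is a backreference. If \nm is preceded by at least n captures, n is a backreference followed by the literal character m. If neither of the preceding conditions holds, \nm matches the octal value nm, where n and m are octal digits (0-7). |
\nml |
When n is an octal digit (0-3) and m and l are octal digits (0-7), matches the octal escape code nml. |
\un |
Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©). |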
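
Several rows of the table are easier to grasp when run. The following is a minimal sketch, assuming Java's `java.util.regex` engine; the class name `RegexTableDemo` and the helper `firstMatch` are made up for illustration and are not part of the original code. It exercises greedy versus non-greedy quantifiers, lookahead, a backreference, and word boundaries with the same sample strings used in the table.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexTableDemo {
    // Hypothetical helper: returns the first substring of input matched by regex,
    // or null if nothing matches.
    static String firstMatch(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy "o+" takes every "o"; non-greedy "o+?" stops after the first one.
        System.out.println(firstMatch("o+", "oooo"));    // oooo
        System.out.println(firstMatch("o+?", "oooo"));   // o

        // {n,m}: at most three "o"s are taken from "fooooood".
        System.out.println(firstMatch("o{1,3}", "fooooood")); // ooo

        // Lookahead checks the version without consuming it, so the match is "Windows " only.
        System.out.println("[" + firstMatch("Windows (?=95|98|NT|2000)", "Windows 2000") + "]"); // [Windows ]
        System.out.println("[" + firstMatch("Windows (?!95|98|NT|2000)", "Windows 3.1") + "]");  // [Windows ]
        System.out.println(firstMatch("Windows (?!95|98|NT|2000)", "Windows 2000"));             // null

        // The backreference \1 repeats whatever group 1 captured.
        System.out.println(firstMatch("(.)\\1", "hello")); // ll

        // \b requires a word boundary after "er"; \B requires that there is none.
        System.out.println(firstMatch("er\\b", "never")); // er
        System.out.println(firstMatch("er\\b", "verb"));  // null
        System.out.println(firstMatch("er\\B", "verb"));  // er
    }
}
```

Note that every backslash from the table is doubled inside a Java string literal, so the regex `\b` is written `"\\b"`.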
package niuke;

import java.util.regex.Pattern;

public class MeituanTest1 {
    public static void main(String[] args) {
        // test1();
        // test2();
        // test3();
        // test4();
        // test5();
        // test6();
        // test7();
        // test8();
        test9();
    }
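    // In a Java string literal, \ is the escape character, so the literal "\\" is a single backslash.
    // A regex needs \\ to match one literal backslash, which is written "\\\\" in Java source.
    public static void test1() {
        String str = "\\";
        // String patternStr = "^x\\w*@tal\\w*\\.\\w*";
        String patternStr = "\\\\";
        boolean result = Pattern.matches(patternStr, str);
        if (result) {
            System.out.println("String " + str + " matches pattern " + patternStr);
        } else {
            System.out.println("String " + str + " does not match pattern " + patternStr);
        }
    }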
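    // The simplest regular expression is a pattern that matches a given String exactly:
    // the pattern is identical to the text to be matched. The static Pattern.matches method
    // checks whether a String matches a given pattern. Example:
    public static void test2() {
        String str = "java";
        String patternStr = "java";
        boolean result = Pattern.matches(patternStr, str);
        if (result) {
            System.out.println("String " + str + " matches pattern " + patternStr);
        } else {
            System.out.println("String " + str + " does not match pattern " + patternStr);
        }
    }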
    // Match a run of repeated characters.
    public static void test3() {
        String str = "jaaav";
        String patternStr = "j(a*)v";
        boolean result = Pattern.matches(patternStr, str);
        if (result) {
            System.out.println("String " + str + " matches pattern " + patternStr);
        } else {
            System.out.println("String " + str + " does not match pattern " + patternStr);
        }
    }
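    // Square brackets match exactly one character. The pattern "b[aeiou]n" matches only
    // strings that start with b, end with n, and have one of a, e, i, o, u in between,
    // so the first five array elements match and the last two do not.
    // Square brackets [] mean that only the characters listed inside can match.
    public static void test4() {
        String[] dataArr = { "ban", "ben", "bin", "bon", "bun", "byn", "baen" };
        for (String str : dataArr) {
            String patternStr = "b[aeiou]n";
            boolean result = Pattern.matches(patternStr, str);
            if (result) {
                System.out.println("String " + str + " matches pattern " + patternStr);
            } else {
                System.out.println("String " + str + " does not match pattern " + patternStr);
            }
        }
    }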
    // To match alternatives that are longer than one character, [] no longer works.
    // Use () together with | instead: () forms a group and | means "or",
    // so the pattern b(ee|ea|oo)n matches been, bean, and boon.
    // The first three elements therefore match and the last two do not.
    public static void test5() {
        String[] dataArr = { "been", "bean", "boon", "buin", "bynn" };
        for (String str : dataArr) {
            String patternStr = "b(ee|ea|oo)n";
            boolean result = Pattern.matches(patternStr, str);
            if (result) {
                System.out.println("String " + str + " matches pattern " + patternStr);
            } else {
                System.out.println("String " + str + " does not match pattern " + patternStr);
            }
        }
    }
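    // String.split accepts a regular expression. The pattern below matches any one of ",",
    // a single whitespace character, or ";", so split treats any of them as a delimiter
    // and splits the string into a String array.
    public static void test6() {
        String str = "salary,position name;age gender";
        String[] dataArr = str.split("[,\\s;]");
        for (String strTmp : dataArr) {
            System.out.println(strTmp);
        }
    }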
    // {2,5} matches between two and five repetitions of "o".
    public static void test7() {
        String[] dataArr = { "google", "gooogle", "gooooogle", "goooooogle", "ggle" };
        for (String str : dataArr) {
            String patternStr = "g(o{2,5})gle";
            boolean result = Pattern.matches(patternStr, str);
            if (result) {
                System.out.println("String " + str + " matches pattern " + patternStr);
            } else {
                System.out.println("String " + str + " does not match pattern " + patternStr);
            }
        }
    }
    // [a-c] matches a single character in the range a to c.
    public static void test8() {
        String[] dataArr = { "Tan", "Tbn", "Tcn", "Ton", "Twn" };
        for (String str : dataArr) {
            String regex = "T[a-c]n";
            boolean result = Pattern.matches(regex, str);
            if (result) {
                System.out.println("String " + str + " matches pattern " + regex);
            } else {
                System.out.println("String " + str + " does not match pattern " + regex);
            }
        }
    }
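    // Match strings that start with x and contain @tal and a dot.
    public static void test9() {
        // Hypothetical address: the original value was lost to e-mail obfuscation ("[email protected]").
        String str = "xuser@tal.example";
        String patternStr = "^x\\w*(@tal)\\w*\\.\\w*";
        boolean result = Pattern.matches(patternStr, str);
        if (result) {
            System.out.println("String " + str + " matches pattern " + patternStr);
        } else {
            System.out.println("String " + str + " does not match pattern " + patternStr);
        }
    }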
}
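
All the tests above use `Pattern.matches`, which succeeds only when the entire string matches the pattern (which also makes a leading `^` redundant). A minimal sketch of the difference from substring search follows; the class and method names are invented for illustration, and it assumes the `b[aeiou]n` pattern from `test4`. It also compiles the pattern once, since `Pattern.matches(regex, input)` compiles the regex again on every call.

```java
import java.util.regex.Pattern;

class MatchVersusFind {
    // Compiled once and reused across calls.
    private static final Pattern B_VOWEL_N = Pattern.compile("b[aeiou]n");

    // True only if the entire input matches the pattern (same semantics as Pattern.matches).
    static boolean matchesWhole(String input) {
        return B_VOWEL_N.matcher(input).matches();
    }

    // True if any substring of the input matches the pattern.
    static boolean containsMatch(String input) {
        return B_VOWEL_N.matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("ban"));    // true
        System.out.println(matchesWhole("urban"));  // false: "ur" is left over
        System.out.println(containsMatch("urban")); // true: "ban" is found inside
        System.out.println(containsMatch("byn"));   // false
    }
}
```

Use `matches()` to validate a whole value and `find()` to search within longer text.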
Practical example:
A single log line:
121.56.62.86 - z.m.tv.sohu.com [29/May/2019:14:52:59 0800] "jsonlog={"atype":"apps","channelid":"315","cv":"7.2.0","enterid":"0","imei":"865939038074368","mfo":"vivo","mfov":"vivo Y67A","mos":"android","mosv":"6.0","msg":"imp","mtype":"6","passport":"","pro":"1","sim":"1","startid":"1559111507899","tkey":"4ff95d6e341645c9ce550d3908c130289a70a3e8","uid":"bcfefc656480478864ed429caf0fddaa","vids":[{"catecode":"101154;101147","datatype":2,"idx":"0002","mdu":"0004","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9096123,"scn":"02","site":1,"time":1559111502980,"vid":2849803},{"catecode":"101154;101147","datatype":2,"idx":"0003","mdu":"0004","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9096123,"scn":"02","site":1,"time":1559111502980,"vid":2849806},{"catecode":"101154;101147","datatype":2,"idx":"0004","mdu":"0004","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9096123,"scn":"02","site":1,"time":1559111502980,"vid":2851842},{"catecode":"101154;101147","datatype":2,"idx":"0005","mdu":"0004","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9096123,"scn":"02","site":1,"time":1559111502981,"vid":2851846},{"catecode":"101154;101147","datatype":2,"idx":"0006","mdu":"0004","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9096123,"scn":"02","site":1,"time":1559111502981,"vid":2853761},{"catecode":"101154;101147","datatype":2,"idx":"0007","mdu":"0004","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9096123,"scn":"02","site":1,"time":1559111502981,"vid":2853764},{"catecode":"101154;101147","datatype":2,"idx":"0001","mdu":"0002","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9096124,"scn":"02","site":1,"time":1559111502982,"vid":2843185},{"catecode":"101154;101147","datatype":2,"idx":"0002","mdu":"0002","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9096124,"scn":"02","site":1,"time":1559111502982,"vid":2843182},{"catecode":"101109","datatype":2,"idx":"0003","mdu":"0002","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9098247,"scn":"02","site":1,"time":1559111502982,"vid":2887825},{"catecode":"101154;101147","datatype":2,"idx":"0001","mdu":"0004","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9096123,"scn":"02","site":1,"time":1559111507927,"vid":2847533},{"catecode":"101154;101147","datatype":2,"idx":"0008","mdu":"0004","memo":"{\"from_page\":\"1\",\"abmod\":\"\"}","pg":"61000","playlistid":9096123,"scn":"02","site":1,"time":1559112778890,"vid":2854747}],"webtype":"WiFi"}" 204 0 "okhttp/3.12.2"
Regex:
private static Pattern pattern = Pattern.compile("^([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}.*) - .* \\[(.*)\\] \"jsonlog=(.*)\" [0-9]{3} [0-9]{1,5} \"(.*)\"$");
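
To show how the capture groups of such a pattern are read back, here is a minimal sketch. It applies a pattern of this shape (with the dots of the IP address escaped so they match only literal dots) to a shortened log line. The shortened line, the class name `LogLineParser`, and the `parse` helper are assumptions made for illustration; only the overall line layout comes from the example above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogLineParser {
    // Group 1: client IP(s), group 2: timestamp, group 3: JSON payload, group 4: user agent.
    private static final Pattern LOG_LINE = Pattern.compile(
            "^([0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}.*) - .* \\[(.*)\\] "
            + "\"jsonlog=(.*)\" [0-9]{3} [0-9]{1,5} \"(.*)\"$");

    // Hypothetical helper: returns {ip, timestamp, json, userAgent},
    // or null if the line does not have the expected layout.
    static String[] parse(String line) {
        Matcher m = LOG_LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3), m.group(4) };
    }

    public static void main(String[] args) {
        // Shortened stand-in for the full log line; the JSON body is abbreviated.
        String line = "121.56.62.86 - z.m.tv.sohu.com [29/May/2019:14:52:59 0800] "
                + "\"jsonlog={\"msg\":\"imp\",\"vids\":[{\"vid\":2849803}]}\" 204 0 \"okhttp/3.12.2\"";
        String[] fields = parse(line);
        System.out.println(fields[0]); // 121.56.62.86
        System.out.println(fields[1]); // 29/May/2019:14:52:59 0800
        System.out.println(fields[2]); // {"msg":"imp","vids":[{"vid":2849803}]}
        System.out.println(fields[3]); // okhttp/3.12.2
    }
}
```

The greedy `.*` groups rely on backtracking to settle on the right boundaries, which works because the literal text around each group (`] "jsonlog=`, the status code, and the final quoted field) occurs only once per line. The extracted JSON string would normally be handed to a JSON parser rather than picked apart with further regular expressions.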