Today a project required pulling one specific string value out of a specific JSON string. I looked online for a way to do it, but nothing I found really fit, so I wrote a method based on regular expressions and am sharing it here.
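import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringJosn {
    public static void main(String[] args) {
        String josn = "{\"access_token\": \"IGQVJXekVYR1ZAiNU5EdHdmWDZATVm1yeldlTUxLZA2tqQ1B5c01wSjBWQy1rSWhjckZAvTzZAHcHh5S29ieTF0VmJ4b1lfQVlablFuMXJ4SGdTOHlETVNtbDE3TmJpNnZAEME1URTBZAOS1UNmw4dnliMjI3UkpQc0I3aWFPUWpN\", \"user_id\": 17841401135016050}";
        System.out.println(findJosnValue("access_token", josn));
    }

    /*
     * Extracts the value for the given key from a JSON string.
     * Only flat string values (without escaped quotes) and unsigned
     * integer values are handled; returns null if the key is not found.
     */
    public static String findJosnValue(String key, String josn) {
        // Pattern.quote keeps regex metacharacters in the key from being interpreted.
        // Group 1 captures a quoted string value, group 2 an unquoted run of digits.
        String regex = "\"" + Pattern.quote(key) + "\":\\s*(?:\"(.*?)\"|(\\d+))";
        Matcher matcher = Pattern.compile(regex).matcher(josn);
        String value = null;
        if (matcher.find()) {
            // Read the capture groups rather than splitting on ':', which would
            // truncate any value that itself contains a colon (a URL, for example).
            value = matcher.group(1) != null ? matcher.group(1) : matcher.group(2);
        }
        return value;
    }
}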
Descriptions of the classes and methods used:
Pattern
Declaration: public final class Pattern implements java.io.Serializable
The Pattern class is declared final, so it cannot be subclassed.
Meaning: a compiled representation of a regular expression.
Note: Instances of this class are immutable and are safe for use by multiple concurrent threads.
public static Pattern compile(String regex)
Compiles the given regular expression into a pattern.
Parameters:
regex - The expression to be compiled
Returns: the given regular expression compiled into a pattern
Throws:
PatternSyntaxException - If the expression's syntax is invalid
public Matcher matcher(CharSequence input)
Creates a matcher that will match the given input against this pattern.
Parameters:
input - The character sequence to be matched
Returns: a new matcher for this pattern
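To make the two Pattern methods above concrete, here is a small sketch. The class name PatternCompileDemo and the helpers firstNumber and isValidRegex are made up for illustration and are not part of the original code. It assumes you want to compile an expression once and reuse it, which the immutability note above allows, and it shows one way to handle PatternSyntaxException.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class PatternCompileDemo {
    // Compiled once; Pattern instances are immutable, so sharing one is safe.
    private static final Pattern DIGITS = Pattern.compile("\\d+");

    // Returns the first run of digits in the input, or null if there is none.
    public static String firstNumber(String input) {
        Matcher matcher = DIGITS.matcher(input); // a fresh Matcher for each call
        return matcher.find() ? matcher.group() : null;
    }

    // Returns true if the expression compiles, false if its syntax is invalid.
    public static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstNumber("user_id: 17841401135016050")); // 17841401135016050
        System.out.println(firstNumber("no digits here"));              // null
        System.out.println(isValidRegex("(unclosed"));                  // false
    }
}
```

Each call to matcher() hands back a new Matcher, so the shared Pattern never carries state from one input to the next.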
Matcher
Declaration: public final class Matcher extends Object implements MatchResult
The Matcher class is declared final, so it cannot be subclassed.
Meaning: an engine that performs match operations on a character sequence by interpreting a Pattern.
Note: Instances of this class are not safe for use by multiple concurrent threads.
public boolean find()
Attempts to find the next subsequence of the input sequence that matches the pattern.
This method starts at the beginning of the matcher's region, or, if a previous invocation of the method was successful and the matcher has not since been reset, at the first character not matched by the previous match.
If the match succeeds, more information can be obtained via the start, end, and group methods.
Returns: true if, and only if, a subsequence of the input sequence matches this matcher's pattern
public String group()
Returns the input subsequence matched by the previous match.
For a matcher m with input sequence s, the expressions m.group() and s.substring(m.start(), m.end()) are equivalent.
Note that some patterns, for example a*, match the empty string. This method returns the empty string when the pattern successfully matches the empty string in the input.
Specified by:
group in interface MatchResult
Returns: the (possibly empty) subsequence matched by the previous match, in string form
Throws:
IllegalStateException - If no match has yet been attempted, or if the previous match operation failed
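The behavior of find() and group() described above can be tried out with a short sketch. FindGroupDemo, findAll and groupBeforeFindThrows are hypothetical names chosen for this example. It only assumes the standard java.util.regex classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class FindGroupDemo {
    // Collects every match by calling find() until it returns false.
    public static List<String> findAll(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        List<String> matches = new ArrayList<>();
        while (m.find()) {
            // group() is the same text as input.substring(m.start(), m.end()).
            matches.add(m.group());
        }
        return matches;
    }

    // Returns true if group() throws IllegalStateException before any match attempt.
    public static boolean groupBeforeFindThrows() {
        Matcher m = Pattern.compile("a*").matcher("aab");
        try {
            m.group();
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(findAll("\\d+", "a1 b22 c333")); // [1, 22, 333]
        System.out.println(groupBeforeFindThrows());          // true
    }
}
```

Each successful find() resumes where the previous match ended, which is why the loop walks through the input without any manual index bookkeeping.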
Java regular expression syntax
In other languages, \\ means: I want to insert a plain (literal) backslash into the regular expression, so do not give it any special meaning.
In Java, \\ means: I want to insert a regular-expression backslash, so the character that follows it has a special meaning.
So in other languages (such as Perl) a single backslash \ is enough to act as an escape, whereas in a Java string literal you need two backslashes to get the escaping effect that one backslash has elsewhere. Put simply, in a Java regular expression two backslashes \\ stand for one \ in other languages. That is why the regular expression for a digit is written \\d, and a single ordinary backslash is written \\\\.
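A quick sketch of the doubling rule just described. BackslashDemo and its two helper methods are hypothetical names used only for this illustration.

```java
import java.util.regex.Pattern;

public class BackslashDemo {
    // The Java literal "\\d+" holds the three characters \d+, i.e. the regex \d+.
    public static boolean isDigits(String s) {
        return Pattern.matches("\\d+", s);
    }

    // The Java literal "\\\\" holds two backslashes, i.e. the regex \\,
    // which matches exactly one literal backslash character.
    public static boolean isSingleBackslash(String s) {
        return Pattern.matches("\\\\", s);
    }

    public static void main(String[] args) {
        System.out.println("\\d".length());           // 2 (a backslash and a 'd')
        System.out.println(isDigits("2024"));          // true
        System.out.println(isSingleBackslash("\\"));   // true ("\\" is one backslash)
        System.out.println(isSingleBackslash("\\\\")); // false (two backslashes)
    }
}
```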
| Character | Explanation |
|---|---|
| \ | Marks the next character as a special character, a literal, a back-reference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a newline. The sequence "\\" matches "\", and "\(" matches "(". |
| ^ | Matches the position at the start of the input string. If multiline mode is set (the Pattern.MULTILINE flag in Java), ^ also matches the position after "\n" or "\r". |
| $ | Matches the position at the end of the input string. If multiline mode is set (the Pattern.MULTILINE flag in Java), $ also matches the position before "\n" or "\r". |
| * | Matches the preceding character or subexpression zero or more times. For example, "zo*" matches "z" and "zoo". * is equivalent to {0,}. |
| + | Matches the preceding character or subexpression one or more times. For example, "zo+" matches "zo" and "zoo" but not "z". + is equivalent to {1,}. |
| ? | Matches the preceding character or subexpression zero or one time. For example, "do(es)?" matches "do" or "does". ? is equivalent to {0,1}. |
| {n} | n is a nonnegative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob", but matches the two "o"s in "food". |
| {n,} | n is a nonnegative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob", but matches all the "o"s in "foooood". "o{1,}" is equivalent to "o+". "o{0,}" is equivalent to "o*". |
| {n,m} | m and n are nonnegative integers, where n <= m. Matches at least n and at most m times. For example, "o{1,3}" matches the first three "o"s in "fooooood". "o{0,1}" is equivalent to "o?". Note: you cannot put spaces between the comma and the numbers. |
| ? | When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is "non-greedy". A non-greedy pattern matches as short a string as possible, whereas the default greedy pattern matches as long a string as possible. For example, in the string "oooo", "o+?" matches only a single "o", while "o+" matches all the "o"s. |
| . | Matches any single character except line terminators such as "\r" and "\n". To match any character including these, use a pattern such as "[\s\S]". |
| (pattern) | Matches pattern and captures the matching subexpression. The captured matches can be retrieved from the match result (described in the source as the $0...$9 properties; in Java, Matcher.group(int) serves this purpose). To match the parenthesis characters ( ), use "\(" or "\)". |
| (?:pattern) | Matches pattern but does not capture the match; that is, it is a non-capturing match and is not stored for later use. This is useful when combining parts of a pattern with the "or" character (\|). For example, 'industr(?:y\|ies)' is a more economical expression than 'industry\|industries'. |
| (?=pattern) | Positive lookahead: matches the search string at any point where a string matching pattern begins. It is a non-capturing match, so the match is not stored for later use. For example, 'Windows (?=95\|98\|NT\|2000)' matches "Windows" in "Windows 2000" but not "Windows" in "Windows 3.1". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead. |
| (?!pattern) | Negative lookahead: matches the search string at any point where a string not matching pattern begins. It is a non-capturing match, so the match is not stored for later use. For example, 'Windows (?!95\|98\|NT\|2000)' matches "Windows" in "Windows 3.1" but not "Windows" in "Windows 2000". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead. |
| x\|y | Matches x or y. For example, 'z\|food' matches "z" or "food". '(z\|f)ood' matches "zood" or "food". |
| [xyz] | Character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain". |
| [^xyz] | Negated character set. Matches any character not enclosed. For example, "[^abc]" matches the "p", "l", "i", and "n" in "plain". |
| [a-z] | Character range. Matches any character in the specified range. For example, "[a-z]" matches any lowercase letter in the range "a" through "z". |
| [^a-z] | Negated character range. Matches any character not in the specified range. For example, "[^a-z]" matches any character that is not in the range "a" through "z". |
| \b | Matches a word boundary, that is, the position between a word and a space. For example, "er\b" matches the "er" in "never" but not the "er" in "verb". |
| \B | Matches a non-word boundary. "er\B" matches the "er" in "verb" but not the "er" in "never". |
| \cx | Matches the control character indicated by x. For example, \cM matches Control-M, or carriage return. The value of x must be in the range A-Z or a-z. If it is not, c is assumed to be a literal "c" character. |
| \d | Matches a digit character. Equivalent to [0-9]. |
| \D | Matches a non-digit character. Equivalent to [^0-9]. |
| \f | Matches a form feed. Equivalent to \x0c and \cL. |
| \n | Matches a newline. Equivalent to \x0a and \cJ. |
| \r | Matches a carriage return. Equivalent to \x0d and \cM. |
| \s | Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\v]. |
| \S | Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\v]. |
| \t | Matches a tab. Equivalent to \x09 and \cI. |
| \v | Matches a vertical tab. Equivalent to \x0b and \cK. |
| \w | Matches any word character, including underscore. Equivalent to "[A-Za-z0-9_]". |
| \W | Matches any non-word character. Equivalent to "[^A-Za-z0-9_]". |
| \xn | Matches n, where n is a hexadecimal escape code. A hexadecimal escape code must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". This allows ASCII codes to be used in regular expressions. |
| \num | Matches num, where num is a positive integer. A back-reference to a captured match. For example, "(.)\1" matches two consecutive identical characters. |
| \n | Identifies either an octal escape code or a back-reference. If \n is preceded by at least n captured subexpressions, n is a back-reference. Otherwise, if n is an octal digit (0-7), n is an octal escape code. |
| \nm | Identifies either an octal escape code or a back-reference. If \nm is preceded by at least nm captured subexpressions, nm is a back-reference. If \nm is preceded by at least n captures, n is a back-reference followed by the literal character m. If neither of these conditions holds, \nm matches the octal value nm, where n and m are octal digits (0-7). |
| \nml | Matches the octal escape code nml when n is an octal digit (0-3) and m and l are octal digits (0-7). |
| \un | Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©). |
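A few rows of the table are easier to trust once you run them, so here is a rough sketch that exercises the greedy and non-greedy quantifiers, a non-capturing group, both lookaheads, and a back-reference. SyntaxTableDemo and its helper first are hypothetical names. The lookahead patterns are written with a space after "Windows" so that they line up with inputs such as "Windows 2000".

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SyntaxTableDemo {
    // Returns the text of the first match of regex in input, or null if none.
    public static String first(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Greedy versus non-greedy quantifier.
        System.out.println(first("o+", "oooo"));   // oooo
        System.out.println(first("o+?", "oooo"));  // o

        // Non-capturing group used only to scope the alternation.
        System.out.println(first("industr(?:y|ies)", "industries")); // industries

        // Lookaheads test what follows without consuming it,
        // so only "Windows " (with its trailing space) is part of the match.
        System.out.println(first("Windows (?=95|98|NT|2000)", "Windows 2000")); // Windows
        System.out.println(first("Windows (?=95|98|NT|2000)", "Windows 3.1"));  // null
        System.out.println(first("Windows (?!95|98|NT|2000)", "Windows 3.1"));  // Windows

        // Back-reference: \1 must repeat whatever group 1 captured.
        System.out.println(first("(.)\\1", "hello")); // ll
    }
}
```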
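As required by the Java Language Specification, backslashes in string literals in Java source code are interpreted as Unicode escapes or other character escapes. Backslashes therefore have to be doubled in string literals that represent regular expressions, to protect them from being interpreted by the Java bytecode compiler. For example, when interpreted as a regular expression, the string literal "\b" matches a single backspace character, while "\\b" matches a word boundary. The string literal "\(hello\)" is illegal and causes a compile-time error; to match the string (hello), the string literal "\\(hello\\)" must be used.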
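The difference between "\b" and "\\b" is easy to miss, so here is a minimal sketch of the point made above. LiteralEscapeDemo and found are hypothetical names used only for this example.

```java
import java.util.regex.Pattern;

public class LiteralEscapeDemo {
    // Returns true if regex matches somewhere inside input.
    public static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // "\\b" reaches the regex engine as \b, a word boundary.
        System.out.println(found("er\\b", "never")); // true
        System.out.println(found("er\\b", "verb"));  // false

        // "\b" is turned into a backspace character (U+0008) by the compiler,
        // so the regex looks for "er" followed by a literal backspace.
        System.out.println(found("er\b", "never"));  // false
        System.out.println(found("er\b", "er\b"));   // true

        // Parentheses must be escaped with doubled backslashes to be literal.
        System.out.println(found("\\(hello\\)", "say (hello)")); // true
    }
}
```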