Extracting the value of a specific field from a JSON string in Java with a regular expression

Disclaimer: This is an original article by the blogger, released under the CC 4.0 BY-SA license. If you reproduce it, please attach the original source link and this statement.
Original link: https://blog.csdn.net/qq_23853743/article/details/102605045

While working on a project today I needed to pull the value of one specific key out of a JSON string. I searched online for a while and found nothing that did quite this, so I wrote a method based on a regular expression and am sharing it here.


import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StringJosn {
  public static void main(String[] args) {
    String josn = "{\"access_token\": \"IGQVJXekVYR1ZAiNU5EdHdmWDZATVm1yeldlTUxLZA2tqQ1B5c01wSjBWQy1rSWhjckZAvTzZAHcHh5S29ieTF0VmJ4b1lfQVlablFuMXJ4SGdTOHlETVNtbDE3TmJpNnZAEME1URTBZAOS1UNmw4dnliMjI3UkpQc0I3aWFPUWpN\", \"user_id\": 17841401135016050}";
    System.out.println(findJosnValue("access_token", josn));
  }

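  /*
   * Extracts the value for the given key from a JSON string.
   * Handles only string values and unsigned integer values;
   * returns null if the key is not found.
   */
  public static String findJosnValue(String key, String josn) {
    // Pattern.quote treats the key literally; \s* tolerates any spacing around the colon.
    String regex = "\"" + Pattern.quote(key) + "\"\\s*:\\s*(?:\"(.*?)\"|(\\d+))";
    Matcher matcher = Pattern.compile(regex).matcher(josn);
    if (matcher.find()) {
      // Group 1 holds a quoted string value, group 2 an unquoted number.
      return matcher.group(1) != null ? matcher.group(1) : matcher.group(2);
    }
    return null;
  }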
}

Descriptions of the classes and methods used:

Pattern

Declaration: public final class Pattern implements java.io.Serializable

The Pattern class is declared final, so it cannot be subclassed.

Meaning: the pattern class, a compiled representation of a regular expression.

Note: Instances of this class are immutable and are safe for use by multiple concurrent threads.

public static Pattern compile(String regex) compiles the given regular expression into a pattern.

Parameters: regex - the expression to be compiled

Returns: the given regular expression compiled into a pattern

Throws: PatternSyntaxException - if the expression's syntax is invalid
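As a small illustration of compile and its exception, the sketch below checks whether a string is a syntactically valid regular expression. The class name CompileDemo and the helper isValidRegex are made up for this example; only Pattern.compile and PatternSyntaxException come from the JDK.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class CompileDemo {
  // Hypothetical helper: true if the regex compiles, false if its syntax is invalid.
  static boolean isValidRegex(String regex) {
    try {
      Pattern.compile(regex);
      return true;
    } catch (PatternSyntaxException e) {
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println(isValidRegex("\\d+")); // a well-formed expression
    System.out.println(isValidRegex("(abc")); // unclosed group, rejected by compile
  }
}
```

PatternSyntaxException is unchecked, so catching it is optional; it is usually only worth doing when the expression comes from user input.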

public Matcher matcher(CharSequence input) creates a matcher that will match the given input against this pattern.

Parameters: input - the character sequence to be matched

Returns: a new matcher for this pattern

Matcher

Declaration: public final class Matcher extends Object implements MatchResult

The Matcher class is declared final, so it cannot be subclassed.

Meaning: the matcher class, an engine that performs match operations on a character sequence by interpreting a Pattern.

Note: Instances of this class are not safe for use by multiple concurrent threads.
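One common way to respect both thread-safety notes is to share a single compiled Pattern and create a fresh Matcher for every call. The following is only a sketch of that idea; SharedPatternDemo, DIGITS and countNumbers are names invented for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SharedPatternDemo {
  // Compiled once and shared: Pattern instances are immutable and thread-safe.
  private static final Pattern DIGITS = Pattern.compile("\\d+");

  // Each call creates its own Matcher, so no Matcher is ever shared between threads.
  static int countNumbers(String input) {
    Matcher m = DIGITS.matcher(input);
    int count = 0;
    while (m.find()) {
      count++;
    }
    return count;
  }

  public static void main(String[] args) throws InterruptedException {
    Thread t1 = new Thread(() -> System.out.println(countNumbers("a1b22")));
    Thread t2 = new Thread(() -> System.out.println(countNumbers("7 8 9")));
    t1.start();
    t2.start();
    t1.join();
    t2.join();
  }
}
```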

public boolean find() attempts to find the next subsequence of the input sequence that matches the pattern.

This method starts at the beginning of the matcher's region or, if a previous invocation of the method was successful and the matcher has not since been reset, at the first character not matched by the previous match.

If the match succeeds, more information can be obtained through the start, end, and group methods.

Returns: true if, and only if, a subsequence of the input sequence matches this matcher's pattern
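Because each successful call to find resumes where the previous match ended, it is usually called in a loop. Here is a minimal sketch that collects every run of digits together with its start offset; FindDemo and findNumbers are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class FindDemo {
  // Collects each run of digits in the input as "startIndex:text".
  static List<String> findNumbers(String input) {
    List<String> found = new ArrayList<>();
    Matcher m = Pattern.compile("\\d+").matcher(input);
    while (m.find()) { // each call continues after the previous match
      found.add(m.start() + ":" + m.group());
    }
    return found;
  }

  public static void main(String[] args) {
    System.out.println(findNumbers("a1b22c333")); // prints [1:1, 3:22, 6:333]
  }
}
```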

public String group() returns the input subsequence matched by the previous match.

For a matcher m with input sequence s, the expressions m.group() and s.substring(m.start(), m.end()) are equivalent.

Note that some patterns, for example a*, match the empty string. When the pattern successfully matches the empty string in the input, this method returns the empty string.

Specified by: group in interface MatchResult

Returns: the (possibly empty) subsequence matched by the previous match, in string form

Throws: IllegalStateException - if no match has yet been attempted, or if the previous match operation failed
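The three points above (the substring equivalence, the empty match, and the exception) can be checked with a short sketch. GroupDemo and its method names are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupDemo {
  // After a successful find, group() equals s.substring(m.start(), m.end()).
  static boolean groupEqualsSubstring(String regex, String s) {
    Matcher m = Pattern.compile(regex).matcher(s);
    return m.find() && m.group().equals(s.substring(m.start(), m.end()));
  }

  // a* matches the empty string at the start of "bbb", so group() returns "".
  static String emptyMatch() {
    Matcher m = Pattern.compile("a*").matcher("bbb");
    m.find();
    return m.group();
  }

  // Calling group() before any match attempt throws IllegalStateException.
  static boolean throwsBeforeMatch() {
    Matcher m = Pattern.compile("a*").matcher("bbb");
    try {
      m.group();
      return false;
    } catch (IllegalStateException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    System.out.println(groupEqualsSubstring("\\d+", "id 42")); // true
    System.out.println(emptyMatch().isEmpty());                // true
    System.out.println(throwsBeforeMatch());                   // true
  }
}
```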

Java regular expression syntax

In other languages, \\ means: I want to insert a plain (literal) backslash into the regular expression; do not give it any special meaning.

In Java, \\ means: I want to insert a regular-expression backslash, so the character that follows has special meaning.

So in other languages (such as Perl) a single backslash \ is enough to escape, whereas a regular expression written in a Java string literal needs two backslashes to get the escaping effect that one provides elsewhere. Put simply, in a Java regular expression \\ stands for one \ in other languages, which is why the expression for a digit is written \\d and an ordinary literal backslash is written \\\\.
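A quick way to see the doubling at work is to look at what the string literal actually contains at run time. The sketch below uses invented names (EscapeDemo, isDigits, containsBackslash) and only standard Pattern calls.

```java
import java.util.regex.Pattern;

class EscapeDemo {
  // "\\d+" in source code is the three-character regex \d+ at run time.
  static boolean isDigits(String s) {
    return Pattern.matches("\\d+", s);
  }

  // "\\\\" in source code is the two-character regex \\, which matches one literal backslash.
  static boolean containsBackslash(String s) {
    return Pattern.compile("\\\\").matcher(s).find();
  }

  public static void main(String[] args) {
    System.out.println("\\d+".length());               // 3
    System.out.println(isDigits("2024"));              // true
    System.out.println(containsBackslash("C:\\temp")); // true
  }
}
```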

Character

Description

\

Marks the next character as a special character, a literal, a back-reference, or an octal escape. For example, "n" matches the character "n", while "\n" matches a newline character. The sequence "\\" matches "\", and "\(" matches "(".

^

Matches the position at the start of the input string. In multiline mode (Pattern.MULTILINE in Java), ^ also matches the position after "\n" or "\r".

$

Matches the position at the end of the input string. In multiline mode (Pattern.MULTILINE in Java), $ also matches the position before "\n" or "\r".

*

Matches the preceding character or subexpression zero or more times. For example, "zo*" matches "z" and "zoo". * is equivalent to {0,}.

+

Matches the preceding character or subexpression one or more times. For example, "zo+" matches "zo" and "zoo" but does not match "z". + is equivalent to {1,}.

?

Matches the preceding character or subexpression zero or one time. For example, "do(es)?" matches "do" and "does". ? is equivalent to {0,1}.

{n}

n is a nonnegative integer. Matches exactly n times. For example, "o{2}" does not match the "o" in "Bob" but matches the two "o"s in "food".

{n,}

n is a nonnegative integer. Matches at least n times. For example, "o{2,}" does not match the "o" in "Bob" but matches all the "o"s in "foooood". "o{1,}" is equivalent to "o+", and "o{0,}" is equivalent to "o*".

{n,m}

m and n are nonnegative integers, where n <= m. Matches at least n and at most m times. For example, "o{1,3}" matches the first three "o"s in "fooooood". "o{0,1}" is equivalent to "o?". Note: no spaces may be inserted between the comma and the numbers.

?

When this character immediately follows any other quantifier (*, +, ?, {n}, {n,}, {n,m}), the match is "non-greedy". A non-greedy pattern matches the shortest possible string, whereas the default greedy pattern matches the longest possible string. For example, in the string "oooo", "o+?" matches only a single "o", while "o+" matches all the "o"s.

.

Matches any single character except a line terminator such as "\n" or "\r". To match any character including line terminators, use a pattern such as "[\s\S]" (or, in Java, the Pattern.DOTALL flag).

(pattern)

Matches pattern and captures the matched subexpression. The captured matches can be retrieved from the match result, for example with $0...$9 in a replacement string or with Matcher.group(int) in Java. To match the parenthesis characters ( ), use "\(" or "\)".

(?:pattern)

Matches pattern but does not capture the match; that is, it is a non-capturing match that is not stored for later use. This is useful for combining parts of a pattern with the "or" character (|). For example, "industr(?:y|ies)" is a more economical expression than "industry|industries".

(?=pattern)

Positive lookahead. Matches the search string at any position where a string matching pattern begins. It is a non-capturing match; the match is not captured for later use. For example, "Windows(?=95|98|NT|2000)" matches the "Windows" in "Windows2000" but not the "Windows" in "Windows3.1". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.

(?!pattern)

Negative lookahead. Matches the search string at any position where a string not matching pattern begins. It is a non-capturing match; the match is not captured for later use. For example, "Windows(?!95|98|NT|2000)" matches the "Windows" in "Windows3.1" but not the "Windows" in "Windows2000". Lookaheads do not consume characters: after a match occurs, the search for the next match begins immediately after the last match, not after the characters that made up the lookahead.

x|y

Matches x or y. For example, "z|food" matches "z" or "food". "(z|f)ood" matches "zood" or "food".

[xyz]

Character set. Matches any one of the enclosed characters. For example, "[abc]" matches the "a" in "plain".

[^xyz]

Negated character set. Matches any character that is not enclosed. For example, "[^abc]" matches the "p", "l", "i", and "n" in "plain".

[a-z]

Character range. Matches any character in the specified range. For example, "[a-z]" matches any lowercase letter from "a" to "z".

[^a-z]

Negated character range. Matches any character not in the specified range. For example, "[^a-z]" matches any character that is not in the range "a" to "z".

\b

Matches a word boundary, that is, the position between a word and a space. For example, "er\b" matches the "er" in "never" but not the "er" in "verb".

\B

Matches a non-word boundary. "er\B" matches the "er" in "verb" but not the "er" in "never".

\cx

Matches the control character indicated by x. For example, \cM matches Control-M, the carriage return character. The value of x must be in the range A-Z or a-z. If it is not, some engines treat c as a literal "c" character.

\d

Matches a digit character. Equivalent to [0-9].

\D

Matches a non-digit character. Equivalent to [^0-9].

\f

Matches a form feed. Equivalent to \x0c and \cL.

\n

Matches a newline. Equivalent to \x0a and \cJ.

\r

Matches a carriage return. Equivalent to \x0d and \cM.

\s

Matches any whitespace character, including space, tab, form feed, and so on. Equivalent to [ \f\n\r\t\x0B].

\S

Matches any non-whitespace character. Equivalent to [^ \f\n\r\t\x0B].

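\t

Matches a tab. Equivalent to \x09 and \cI.

\v

Matches a vertical tab. Equivalent to \x0b and \cK. Note that since Java 8, java.util.regex treats \v as any vertical whitespace character, [\n\x0B\f\r\x85\u2028\u2029].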

\w

Matches any word character, including the underscore. Equivalent to "[A-Za-z0-9_]".

\W

Matches any non-word character. Equivalent to "[^A-Za-z0-9_]".

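\xn

Matches n, where n is a hexadecimal escape code. Hexadecimal escape codes must be exactly two digits long. For example, "\x41" matches "A". "\x041" is equivalent to "\x04" followed by "1". This allows ASCII codes to be used in regular expressions.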

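\num

Matches num, where num is a positive integer. A back-reference to a captured match. For example, "(.)\1" matches two consecutive identical characters.

\n

Identifies an octal escape code or a back-reference. If \n is preceded by at least n captured subexpressions, n is a back-reference. Otherwise, if n is an octal digit (0-7), n is an octal escape code.

\nm

Identifies an octal escape code or a back-reference. If \nm is preceded by at least nm captured subexpressions, nm is a back-reference. If \nm is preceded by at least n captures, n is a back-reference followed by the character m. If neither of the preceding conditions holds, \nm matches the octal value nm, where n and m are octal digits (0-7).

\nml

Matches the octal escape code nml when n is an octal digit (0-3) and m and l are octal digits (0-7). Note that java.util.regex writes octal escapes with a leading zero (\0n, \0nn, \0mnn) and always interprets \1 through \9 as back-references.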

\un

Matches n, where n is a Unicode character expressed as four hexadecimal digits. For example, \u00A9 matches the copyright symbol (©).
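Several of the table entries are easier to follow when run. The sketch below tries the greedy and non-greedy quantifiers, capturing and non-capturing groups, lookahead, word boundaries, and a back-reference against the same example strings used in the table. SyntaxDemo and the helper first are names made up for this illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SyntaxDemo {
  // Hypothetical helper: the first substring of input matched by regex, or null if none.
  static String first(String regex, String input) {
    Matcher m = Pattern.compile(regex).matcher(input);
    return m.find() ? m.group() : null;
  }

  public static void main(String[] args) {
    // Greedy vs. non-greedy quantifiers.
    System.out.println(first("o+", "oooo"));  // oooo
    System.out.println(first("o+?", "oooo")); // o

    // Capturing vs. non-capturing groups: only the first stores a group.
    System.out.println(Pattern.compile("industr(y|ies)").matcher("").groupCount());   // 1
    System.out.println(Pattern.compile("industr(?:y|ies)").matcher("").groupCount()); // 0

    // Lookahead checks what follows without consuming it.
    System.out.println(first("Windows(?=95|98|NT|2000)", "Windows2000")); // Windows
    System.out.println(first("Windows(?!95|98|NT|2000)", "Windows2000")); // null
    System.out.println(first("Windows(?!95|98|NT|2000)", "Windows3.1"));  // Windows

    // Word boundary and non-word boundary.
    System.out.println(first("er\\b", "never")); // er
    System.out.println(first("er\\b", "verb"));  // null
    System.out.println(first("er\\B", "verb"));  // er

    // Back-reference: two identical consecutive characters.
    System.out.println(first("(.)\\1", "hello")); // ll
  }
}
```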

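As required by the Java Language Specification, backslashes in string literals in Java source code are interpreted as Unicode escapes or other character escapes. You must therefore double the backslashes in a string literal that represents a regular expression, to protect them from interpretation by the Java bytecode compiler. For example, when interpreted as a regular expression, the string literal "\b" matches a single backspace character, while "\\b" matches a word boundary. The string literal "\(hello\)" is illegal and causes a compile-time error; to match the string (hello), you must use the string literal "\\(hello\\)".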
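The difference between "\b" and "\\b" can be verified directly. This is a small sketch with invented names (LiteralDemo and its three methods); the backspace character is built with a cast so that the input does not depend on any escape sequence.

```java
import java.util.regex.Pattern;

class LiteralDemo {
  // "\b" is the backspace character (U+0008), so as a regex it matches one backspace.
  static boolean literalBMatchesBackspace() {
    return Pattern.matches("\b", String.valueOf((char) 8));
  }

  // "\\b" reaches the regex engine as \b, a word boundary.
  static boolean hasWholeWord(String word, String text) {
    return Pattern.compile("\\b" + Pattern.quote(word) + "\\b").matcher(text).find();
  }

  // "\\(hello\\)" reaches the regex engine as \(hello\) and matches the text (hello).
  static boolean matchesParenHello(String s) {
    return Pattern.matches("\\(hello\\)", s);
  }

  public static void main(String[] args) {
    System.out.println(literalBMatchesBackspace());         // true
    System.out.println(hasWholeWord("cat", "a cat sat"));   // true
    System.out.println(hasWholeWord("cat", "concatenate")); // false
    System.out.println(matchesParenHello("(hello)"));       // true
  }
}
```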
