Extracting a group from matched String in Java using regex

Adhyatmik :

I have an array of Strings containing values like this:

String [] arr = {"${US.IDX_CA}", "${UK.IDX_IO}", "${NZ.IDX_BO}", "${JP.IDX_TK}", "${US.IDX_MT}", "more-elements-with-completely-different-patterns-which-is-irrelevant"};

I'm trying to extract all the IDX_XX values from this array. So from the array above, I should get IDX_CA, IDX_IO, IDX_BO, etc., using regex in Java.

I wrote the following code:

Pattern pattern = Pattern.compile("(.*)IDX_(\\w{2})");
for (String s : arr) {
    Matcher m = pattern.matcher(s);
    if (m.matches()) {
        String extract = m.group(1);
        System.out.println(extract);
    }
}

But this does not print anything. Can someone please tell me what mistake I am making? Thanks.

Wiktor Stribiżew :

Use the following fix:

String[] arr = {"${US.IDX_CA}", "${UK.IDX_IO}", "${NZ.IDX_BO}", "${JP.IDX_TK}", "${US.IDX_MT}", "more-elements-with-completely-different-patterns-which-is-irrelevant"};
Pattern pattern = Pattern.compile("\\bIDX_(\\w{2})\\b");
for (String s : arr) {
    Matcher m = pattern.matcher(s);
    while (m.find()) {
        System.out.println(m.group(0)); // Get the whole match
        System.out.println(m.group(1)); // Get the 2 chars after IDX_
    }
}

Output:

IDX_CA
CA
IDX_IO
IO
IDX_BO
BO
IDX_TK
TK
IDX_MT
MT

NOTES:

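  • Use the \bIDX_(\w{2})\b pattern: it matches IDX_ followed by exactly 2 word chars, enclosed in word boundaries, and captures the 2 chars after IDX_ into Group 1
  • m.matches() requires the whole string to match, and the original pattern does not account for the trailing }, so it never matched; it is replaced with m.find(), which looks for a match anywhere in the string
  • if is replaced with while in case there is more than one match in a string
  • m.group(0) contains the whole match value
  • m.group(1) contains the Group 1 value (in the original pattern, Group 1 was the (.*) prefix, not the IDX_ part).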
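
To make the notes above concrete, here is a minimal self-contained sketch that collects the whole matches into a list and contrasts matches() with find(). The class name IdxExtractor and the method extractAll are hypothetical names chosen for illustration; they are not part of the original answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name is an assumption, not from the original post.
final class IdxExtractor {
    // Same pattern as in the answer: IDX_ plus exactly two word chars, between word boundaries.
    private static final Pattern IDX = Pattern.compile("\\bIDX_(\\w{2})\\b");

    // Collects every whole match (group 0) found in any of the input strings.
    static List<String> extractAll(String[] inputs) {
        List<String> result = new ArrayList<>();
        for (String s : inputs) {
            Matcher m = IDX.matcher(s);
            while (m.find()) {          // find() scans for the next match anywhere in s
                result.add(m.group());  // group() is the same as group(0)
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] arr = {"${US.IDX_CA}", "${UK.IDX_IO}", "no-index-here"};
        System.out.println(extractAll(arr)); // prints [IDX_CA, IDX_IO]

        // matches() must consume the entire input, so the surrounding "${US." and "}" make it fail.
        System.out.println(IDX.matcher("${US.IDX_CA}").matches()); // prints false
        System.out.println(IDX.matcher("IDX_CA").matches());       // prints true
    }
}
```

If only the two-letter suffix is needed, adding m.group(1) to the list instead of m.group() should give CA, IO, and so on.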
