When using Matcher's find() method, a partial match returns false, but the matcher's position moves anyway. A subsequent call to find() skips those partially matched characters.
Example of a partial match: the pattern "[0-9]+:[0-9]" against the input "a3;9". This pattern doesn't match any part of the input, so find() returns false, but the subpattern "[0-9]+" matches "3". If we change the pattern at this point and call find() again, the characters up to and including the partial match are not tested for a new match.
Note that the pattern "[0-9]:[0-9]" (without the quantifier) doesn't produce this effect.
Is this normal behaviour?
Example: in the first for loop, the third pattern [0-9] matches the character "9", and "3" is not reported as a match. In the second loop, the pattern [0-9] matches the character "3".
import java.util.regex.*;

public class Test {
    public static void main(String[] args) {
        final String INPUT = "a3;9";
        String[] patterns = {"a", "[0-9]+:[0-9]", "[0-9]"};
        Matcher matcher = Pattern.compile(".*").matcher(INPUT);
        System.out.printf("Input: %s%n", INPUT);
        matcher.reset();
        for (String s : patterns)
            testPattern(matcher, s);
        System.out.println("=======================================");
        patterns = new String[] {"a", "[0-9]:[0-9]", "[0-9]"};
        matcher.reset();
        for (String s : patterns)
            testPattern(matcher, s);
    }

    static void testPattern(Matcher m, String re) {
        m.usePattern(Pattern.compile(re));
        System.out.printf("Using regex: %s%n", m.pattern().toString());
        // Testing for pattern
        if (m.find())
            System.out.printf("Found %s, end-pos: %d%n", m.group(), m.end());
    }
}
Matcher offers three different kinds of match operations (see the Javadoc):
- matches, for a match against the entire input
- find, for a traversal that skips unmatched input
- lookingAt, for a partial match anchored at the start of the sequence
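As a rough sketch of how the three operations differ, the following applies each of them to the question's input "a3;9". The class name MatchOps and the helper describe() are made up for this illustration; only Pattern and Matcher come from java.util.regex.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original question or answer.
class MatchOps {
    // Reports the result of matches(), lookingAt() and find() for one regex.
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        boolean whole = m.matches();     // must consume the entire input
        m.reset();
        boolean prefix = m.lookingAt();  // must match at the start, may stop early
        m.reset();                       // so that find() starts from the beginning
        boolean anywhere = m.find();     // scans forward for the first match
        return String.format("matches=%b lookingAt=%b find=%b", whole, prefix, anywhere);
    }

    public static void main(String[] args) {
        String input = "a3;9";
        for (String re : new String[] {"a", "[0-9]", "a[0-9];[0-9]"}) {
            System.out.printf("%-14s %s%n", re, describe(re, input));
        }
    }
}
```

Here "a" is found by lookingAt and find but not by matches, "[0-9]" is found only by find, and "a[0-9];[0-9]" satisfies all three.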
When a pattern is found by lookingAt, invoking matcher.region(matcher.end(), matcher.regionEnd()) or similar moves the start of the region past the match, so the next pattern can be tried right after it.
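A minimal sketch of that approach, reusing the input and the first pattern list from the question. The class ConsecutiveMatch and the method consume() are hypothetical names chosen for this example. It assumes that usePattern() leaves the current region in place, which matches the Javadoc statement that the matcher's position in the input is maintained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names are illustrative only.
class ConsecutiveMatch {
    // Tries each regex in turn at the current region start and returns
    // the text consumed by every regex that matched there.
    static List<String> consume(String input, String... regexes) {
        List<String> tokens = new ArrayList<>();
        Matcher m = Pattern.compile(".*").matcher(input);
        for (String re : regexes) {
            m.usePattern(Pattern.compile(re));
            if (m.lookingAt()) {
                tokens.add(m.group());
                // region() also resets the matcher; its arguments are
                // evaluated first, so end() is still valid here.
                m.region(m.end(), m.regionEnd());
            }
            // On failure the region is left untouched, so the next regex
            // is tried at the same position.
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [a, 3]
        System.out.println(consume("a3;9", "a", "[0-9]+:[0-9]", "[0-9]"));
    }
}
```

Because the position only advances through the explicit region() call, the failed attempt with "[0-9]+:[0-9]" cannot skip the "3", and the third pattern reports it.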
(Most of the credit goes to the OP.)