A partial match changes the Matcher's position

Cutter :

When using Matcher's find() method, a partial match makes find() return false, but the matcher's position moves anyway. A subsequent call to find() skips the partially matched characters.

Example of a partial match: the pattern "[0-9]+:[0-9]" against the input "a3;9". This pattern doesn't match any part of the input, so find() returns false, but the subpattern "[0-9]+" matches "3". If we change the pattern at this point (with usePattern()) and call find() again, the characters up to and including the partial match are not tested for a new match.

Note that pattern "[0-9]:[0-9]" (without the quantifier) doesn't produce this effect.

Is this normal behaviour?

Example: in the first for loop, the third pattern, [0-9], matches the character "9", and "3" is not reported as a match. In the second loop, the pattern [0-9] matches the character "3".

import java.util.regex.*;

public class Test {
    public static void main(String[] args) {
        final String INPUT = "a3;9";
        String[] patterns = {"a", "[0-9]+:[0-9]", "[0-9]"};

        Matcher matcher = Pattern.compile(".*").matcher(INPUT);

        System.out.printf("Input: %s%n", INPUT);
        matcher.reset();
        for (String s: patterns)
            testPattern(matcher, s);

        System.out.println("=======================================");

        patterns = new String[] {"a", "[0-9]:[0-9]", "[0-9]"};
        matcher.reset();
        for (String s: patterns)
            testPattern(matcher, s);
    }

    static void testPattern(Matcher m, String re) {
        // Swap the pattern without resetting the matcher, so its position is kept
        m.usePattern(Pattern.compile(re));
        System.out.printf("Using regex: %s%n", m.pattern().toString());

        // Search onward from the matcher's current position
        if (m.find())
            System.out.printf("Found %s, end-pos: %d%n", m.group(), m.end());
    }
}

Joop Eggen :

Matcher offers three different kinds of match operation (see the Javadoc):

- matches, which requires the entire input (region) to match
- find, which scans forward, skipping characters that do not match
- lookingAt, which matches a prefix, starting at the beginning of the sequence
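
To make the difference between the three operations concrete, here is a minimal sketch that runs each of them on a fresh Matcher for the question's input "a3;9". The class name MatchKinds and the helper describe are hypothetical names chosen for illustration only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class MatchKinds {
    /**
     * Runs matches(), lookingAt() and find() for one regex.
     * Each call gets its own Matcher so no state leaks between them.
     * (Hypothetical helper, not part of java.util.regex.)
     */
    static String describe(String regex, String input) {
        Pattern p = Pattern.compile(regex);
        boolean whole = p.matcher(input).matches();    // whole input must match
        boolean prefix = p.matcher(input).lookingAt(); // must match at the start
        Matcher m = p.matcher(input);
        String found = m.find() ? m.group() : null;    // first match anywhere
        return String.format("matches=%b lookingAt=%b find=%s", whole, prefix, found);
    }

    public static void main(String[] args) {
        String input = "a3;9";
        for (String regex : new String[] {"a", "[0-9]", ".*"}) {
            System.out.printf("%-6s -> %s%n", regex, describe(regex, input));
        }
    }
}
```

For this input, [0-9] succeeds only with find(), the pattern a succeeds with lookingAt() and find() but not matches(), and .* succeeds with all three.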

Once a pattern has been found with lookingAt, calling matcher.region(matcher.end(), matcher.regionEnd()) or something similar restricts the matcher to the remaining input, so it can be used for the next pattern in a sequence.
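
A minimal sketch of that idea applied to the question's example follows. It assumes the scan should keep find()'s skipping behaviour, so it uses find() rather than lookingAt(), and it tracks the end of the last successful match in a local variable because region() resets the matcher. The names RegionScan and scan, and the "text@endIndex" result format, are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegionScan {
    /**
     * Tries each regex in turn, starting every search where the last
     * successful match ended. Returns entries of the form "text@endIndex".
     */
    static List<String> scan(String input, String... regexes) {
        List<String> hits = new ArrayList<>();
        Matcher m = Pattern.compile(".*").matcher(input); // placeholder pattern
        int pos = 0; // end index of the last successful match
        for (String regex : regexes) {
            m.usePattern(Pattern.compile(regex));
            // region() resets the matcher, so the next find() starts at pos
            // regardless of what a failed attempt did to internal state.
            m.region(pos, input.length());
            if (m.find()) {
                hits.add(m.group() + "@" + m.end());
                pos = m.end();
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Same input and patterns as the first loop in the question.
        System.out.println(scan("a3;9", "a", "[0-9]+:[0-9]", "[0-9]"));
        // prints [a@1, 3@2]
    }
}
```

Because the start position is set explicitly before every search, the failed attempt with "[0-9]+:[0-9]" cannot influence where the search for "[0-9]" begins, and "3" is reported.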

(Most of the credit goes to the OP.)
