Matcher.group() returns only part of the expected result: for the URL "www.google.com", my matcher returns "www."

sandeep dhami :

My requirement is to detect URLs in a string using a regex. I am using Pattern and Matcher to find out whether a string contains a URL:

    val pattern = Pattern.compile(HyperlinkParser.validRegex.toString())
    val matcher = pattern.matcher(htmlParsedMessage) // e.g. "abcd www.google.com def"
    while (matcher.find()) {
        val url = matcher.group() // expected "www.google.com", but this returns "www."
        val indicesPair = Pair(matcher.start(), matcher.end())
        hyperlinkStartEndIndicesList.add(indicesPair)
    }
    matcher.reset()

where HyperlinkParser.validRegex is:

private const val regularExpression = "(?:(?:https?|ftp|file):|www.|ftp.)(?:([-A-Z0-9+&@#/%=~_|\$?!:,.]*)|[-A-Z0-9+&@#/%=~_|\$?!:,.])*(?:([-A-Z0-9+&@#/%=~_|\$?!:,.]*)|[A-Z0-9+&@#/%=~_|\$])"
val validRegex = Regex(regularExpression,RegexOption.IGNORE_CASE)

I am expecting the URL "www.google.com", but it returns "www.".

Any ideas what the issue could be? Any help would be greatly appreciated.

Andreas :

The documentation of the toString() method of Regex:

Returns the string representation of this regular expression, namely the pattern of this regular expression.

Note that another regular expression constructed from the same pattern string may have different options and may match strings differently.

This means toString() gives you just the regularExpression string, without the IGNORE_CASE option.

So when you do val pattern = Pattern.compile(HyperlinkParser.validRegex.toString()), you lose the case-insensitive option. That's why google.com isn't matched: your character classes only list uppercase A-Z, so after "www." the rest of the pattern can only match the empty string.
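To see the lost option in isolation, here is a minimal sketch. Note that demoRegex is a simplified stand-in for the regex in the question and firstMatch is a hypothetical helper; neither comes from the original code.

```kotlin
import java.util.regex.Pattern

// Simplified stand-in for the question's regex (an assumption, not the original pattern).
val demoRegex = Regex("www\\.[A-Z.]*", RegexOption.IGNORE_CASE)

// Hypothetical helper: returns the first match of the pattern in the input, or null.
fun firstMatch(pattern: Pattern, input: String): String? {
    val matcher = pattern.matcher(input)
    return if (matcher.find()) matcher.group() else null
}

fun main() {
    val input = "abcd www.google.com def"

    // Recompiling from toString() keeps only the pattern text; the flags are gone.
    val recompiled = Pattern.compile(demoRegex.toString())
    println(recompiled.flags() and Pattern.CASE_INSENSITIVE) // 0
    println(firstMatch(recompiled, input))                   // www.

    // Passing the flag explicitly restores the intended behaviour.
    val withFlag = Pattern.compile(demoRegex.pattern, Pattern.CASE_INSENSITIVE)
    println(firstMatch(withFlag, input))                     // www.google.com
}
```

The case-sensitive pattern still "succeeds" because [A-Z.]* is allowed to match nothing, which is why you get "www." rather than no match at all.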

Change that line to:

val pattern = HyperlinkParser.validRegex.toPattern()

That will work, because the documentation of toPattern says:

Returns an instance of Pattern with the same pattern string and options as this instance of Regex has.

Provides the way to use Regex where Pattern is required.
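As a rough sketch of how the loop from the question might look with toPattern(): linkRegex below is again a simplified stand-in for HyperlinkParser.validRegex, and findLinkRanges is a hypothetical name, so treat this as an illustration rather than a drop-in replacement.

```kotlin
// Simplified stand-in for HyperlinkParser.validRegex (an assumption).
val linkRegex = Regex("www\\.[A-Z.]*", RegexOption.IGNORE_CASE)

// Hypothetical helper: collects (start, end) index pairs of every match.
// As with Matcher.end(), the end index is exclusive.
fun findLinkRanges(text: String): List<Pair<Int, Int>> {
    // toPattern() carries the IGNORE_CASE option over to the java.util.regex.Pattern.
    val matcher = linkRegex.toPattern().matcher(text)
    val ranges = mutableListOf<Pair<Int, Int>>()
    while (matcher.find()) {
        ranges.add(Pair(matcher.start(), matcher.end()))
    }
    return ranges
}

fun main() {
    val text = "abcd www.google.com def"
    for ((start, end) in findLinkRanges(text)) {
        println("[$start, $end) -> ${text.substring(start, end)}") // [5, 19) -> www.google.com
    }
}
```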
