How to write a regex capture group which matches a character 3 or 4 times before a delimiter?

Cuga :

I'm trying to write a regex that splits a string into elements according to a delimiter. Each match must also contain at least 3 colons (:), ideally 4.

Here's an example string:

"Checkers, etc:Blue::C, Backgammon, I say:Green::Pepsi:P, Chess, misc:White:Coke:Florida:A, :::U"

From this, there should be 4 matches:

  • Checkers, etc:Blue::C
  • Backgammon, I say:Green::Pepsi:P
  • Chess, misc:White:Coke:Florida:A
  • :::U

Here's what I've tried so far:

([^:]*:[^:]*){3,4}(?:, )

Regex 101 at: https://regex101.com/r/O8iacP/8

I set up a non-capturing group for the delimiter (a comma followed by a space).

Then I tried to match, 3 or 4 times, a group made of a run of non-colon characters, a colon, and another run of non-colon characters.

The code I'm using to iterate over these groups is:

String line = "Checkers, etc:Blue::C, Backgammon, I say::Pepsi:P, Chess:White:Coke:Florida:A, :::U";
String pattern = "([^:]*:[^:]*){3,4}(?:, )";

// Create a Pattern object
Pattern r = Pattern.compile(pattern);

// Now create matcher object.
Matcher matcher = r.matcher(line);
while (matcher.find()) {
    System.out.println(matcher.group(1));
}

Any help is appreciated!

Edit

Using @Casimir's regex, it's working. I had to change the above code to use group(0) like this:

String line = "Checkers, etc:Blue::C, Backgammon, I say::Pepsi:P, Chess:White:Coke:Florida:A, :::U";
String pattern = "(?![\\s,])(?:[^:]*:){3}\\S*(?![^,])";

// Create a Pattern object
Pattern r = Pattern.compile(pattern);

// Now create matcher object.
Matcher matcher = r.matcher(line);
while (matcher.find()) {
    System.out.println(matcher.group(0));
}

Now prints:

Checkers, etc:Blue::C
Backgammon, I say::Pepsi:P
Chess:White:Coke:Florida:A
:::U

Thanks again!

Casimir et Hippolyte :

I suggest this pattern (written as a Java string literal, hence the doubled backslashes):

(?![\\s,])(?:[^:]*:){3}\\S*(?![^,])

The negative lookaheads keep the match from starting or ending on a delimiter. The first one stops a match from beginning with whitespace or a comma. The second one forces the match to be followed by the delimiter or by the end of the string (that is, it must not be followed by a character that isn't a comma).
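To make the role of each part easier to see, here is a minimal sketch that splits the pattern into commented pieces and runs it on the example string from the question. The class name ElementSplitter and the method split are hypothetical names chosen for illustration; only the pattern itself comes from this answer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ElementSplitter {
    // Same pattern as above, split into pieces for readability.
    private static final Pattern ELEMENT = Pattern.compile(
        "(?![\\s,])"        // the match must not start on whitespace or a comma
        + "(?:[^:]*:){3}"   // three runs of non-colon characters, each closed by a colon
        + "\\S*"            // the rest of the element, up to the next whitespace
        + "(?![^,])");      // the next character must be a comma, or the input must end

    // Hypothetical helper: collects every whole match into a list.
    static List<String> split(String line) {
        List<String> elements = new ArrayList<>();
        Matcher matcher = ELEMENT.matcher(line);
        while (matcher.find()) {
            elements.add(matcher.group()); // group() is the same as group(0)
        }
        return elements;
    }

    public static void main(String[] args) {
        String line = "Checkers, etc:Blue::C, Backgammon, I say:Green::Pepsi:P, "
            + "Chess, misc:White:Coke:Florida:A, :::U";
        for (String element : split(line)) {
            System.out.println(element);
        }
    }
}
```

On that input this should print the four elements listed in the question, one per line. Note that \S* accepts any run of non-whitespace characters, so the pattern enforces the minimum of three colons but does not cap the count at four.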

Note that the pattern doesn't have capture groups, so the result is the whole match (or group 0).
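This is probably also why group(1) was not useful in the original attempt: when a capturing group is repeated with a quantifier, Java keeps only the text of its last repetition. The sketch below shows that behaviour on a toy pattern, (\w:){3}, which is an assumption made up for illustration and not taken from the question. It also checks that the suggested pattern has no capturing groups at all. The class and method names are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class GroupDemo {
    // Toy pattern (illustration only): one capturing group repeated three times.
    static String[] wholeAndGroupOne(String input) {
        Matcher m = Pattern.compile("(\\w:){3}").matcher(input);
        if (!m.find()) {
            return new String[0];
        }
        // group(0) is the whole match; group(1) holds only the last repetition.
        return new String[] { m.group(0), m.group(1) };
    }

    // Number of capturing groups in the suggested pattern.
    static int answerPatternGroupCount() {
        String pattern = "(?![\\s,])(?:[^:]*:){3}\\S*(?![^,])";
        return Pattern.compile(pattern).matcher("").groupCount();
    }

    public static void main(String[] args) {
        String[] parts = wholeAndGroupOne("a:b:c:");
        System.out.println(parts[0]);                  // a:b:c:
        System.out.println(parts[1]);                  // c:
        System.out.println(answerPatternGroupCount()); // 0
    }
}
```

Since groupCount() is 0 for the suggested pattern, calling group(1) on its matcher would throw an IndexOutOfBoundsException, so group(0) (or plain group()) is the one to use.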
