I've tried to follow the solution described here: https://stackoverflow.com/a/17973873/2149915 to match a string against the following requirement:
- Any character repeated more than 3 times in a row should be matched and returned.
Examples:
- hello how are you... -> VALID
- hello how are you............. -> INVALID
- hiii -> VALID
- hiiiiii -> INVALID
And so on; the idea is to detect nonsensical text.
So far, my attempt was to modify the regex from the link as follows:
ORIGINAL: ^(?!.*([A-Za-z0-9])\1{2})(?=.*[a-z])(?=.*\d)[A-Za-z0-9]+$
ADAPTED: ^(?!.*([A-Za-z0-9\.\,\/\|\\])\1{3})$
Essentially I removed the lookaheads requiring a lowercase letter and a digit, along with the trailing character class: (?=.*[a-z])(?=.*\d)[A-Za-z0-9]+
and tried to add detection of extra characters such as . / , \ etc., but it doesn't seem to match any input at all.
Any ideas on how I can achieve this?
thanks in advance :)
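A likely reason the ADAPTED pattern matches nothing: a negative lookahead consumes no characters, so `^(?!...)$` reduces to `^$` plus a condition, and that can effectively only match an empty input. The sketch below assumes Java's `java.util.regex`; the class name `AdaptedPatternCheck`, the method names, and the unanchored `REPEATED` pattern are hypothetical illustrations, not part of the linked answer.

```java
import java.util.regex.Pattern;

public class AdaptedPatternCheck {
    // The ADAPTED pattern from the question, unchanged.
    static final Pattern ADAPTED =
        Pattern.compile("^(?!.*([A-Za-z0-9\\.\\,\\/\\|\\\\])\\1{3})$");

    // Hypothetical alternative: no anchors, no lookahead.
    // One listed character followed by 3 more copies of itself (4 in a row).
    static final Pattern REPEATED =
        Pattern.compile("([A-Za-z0-9.,/|\\\\])\\1{3}");

    // True if the ADAPTED pattern finds a match anywhere in s.
    static boolean adaptedMatches(String s) {
        return ADAPTED.matcher(s).find();
    }

    // True if s contains a listed character 4 or more times in a row.
    static boolean hasRepeatedRun(String s) {
        return REPEATED.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {"", "hello", "aaaa", "hiii", "hiiiiii", "/////...."};
        for (String s : inputs) {
            System.out.printf("'%s' adapted=%b repeated=%b%n",
                s, adaptedMatches(s), hasRepeatedRun(s));
        }
    }
}
```

With `find()` and no anchors, the pattern reports the repeated run itself instead of trying to validate the whole string, which is closer to the "matched and returned" requirement.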
EDIT: I found this regex: ^.*(\S)(?: ?\1){9,}.*$
on this question https://stackoverflow.com/a/44659071/2149915 and adapted it to match on only 3 repeats, like so: ^.*(\S)(?: ?\1){3}.*$
Now it detects things like:
- aaaa -> INVALID
- hello....... -> INVALID
- /////.... -> INVALID
However, it does not take into account whitespace such as this:
. . . . .
Is there a modification that can be made to achieve this?
I think there's a much simpler solution if you're looking for any character repeated more than 3 times:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RepeatedCharacters {
    public static void main(String[] args) {
        String[] inputs = {
            "hello how are you...",           // -> VALID
            "hello how are you.............", // -> INVALID
            "hiii",                           // -> VALID
            "hiiiiii",                        // -> INVALID
            "hello    how are you"            // -> INVALID (4 consecutive spaces)
        };
        // (.)     group 1: any single character
        // \\1     back-reference to group 1
        // {3,}    3 or more further repeats, i.e. 4+ occurrences in a row
        // DOTALL  lets the dot match any character, including whitespace and line feeds
        Pattern p = Pattern.compile("(.)\\1{3,}", Pattern.DOTALL);
        // iterate over the test inputs
        for (String s : inputs) {
            Matcher m = p.matcher(s);
            // a character repeated 4+ times in a row was found
            if (m.find()) {
                System.out.printf(
                    "Input '%s' not valid, character '%s' repeated more than 3 times%n",
                    s,
                    m.group(1)
                );
            }
        }
    }
}
Output:
Input 'hello how are you.............' not valid, character '.' repeated more than 3 times
Input 'hiiiiii' not valid, character 'i' repeated more than 3 times
Input 'hello    how are you' not valid, character ' ' repeated more than 3 times
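The `(.)\1{3,}` pattern does not flag the spaced-out case from the question's edit (`. . . . .`), because the repeated character is interrupted by spaces. Here is a minimal sketch of one possible adjustment, assuming that any amount of whitespace between repeats should be ignored. The class name `SpacedRepeatCheck` and the method `hasSpacedRepeat` are hypothetical.

```java
import java.util.regex.Pattern;

public class SpacedRepeatCheck {
    // (\S)          group 1: any non-whitespace character
    // (?:\s*\1){3,} 3 or more further copies, each optionally preceded by whitespace
    static final Pattern SPACED_REPEAT = Pattern.compile("(\\S)(?:\\s*\\1){3,}");

    // True if some non-whitespace character occurs 4+ times,
    // separated by nothing or by whitespace only.
    static boolean hasSpacedRepeat(String s) {
        return SPACED_REPEAT.matcher(s).find();
    }

    public static void main(String[] args) {
        String[] inputs = {
            ". . . . .",
            "hello how are you...",
            "hiii",
            "hiiiiii",
            "/////...."
        };
        for (String s : inputs) {
            System.out.printf("'%s' -> %s%n",
                s, hasSpacedRepeat(s) ? "INVALID" : "VALID");
        }
    }
}
```

Because group 1 is `\S`, this variant does not flag a run of whitespace by itself. If that matters, it could be combined with the `(.)\1{3,}` pattern, for example by running both checks.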