Like many people, I am struggling with what seems to be a "trivial" regex issue. In a given text, whenever I encounter a word within {} brackets, I need to extract it. At first I used
"\\{-?(\\w{3,})\\}"
and it worked OK, as long as the word didn't contain any whitespace or a special character like '. For example, {Project} returns Project, but {Project Test} or {Project D'arce} don't return anything. I know that for whitespace characters I need to use \s, but it is not at all clear to me how to add it to the pattern above. I tried:
"%\\{-?(\\w(\\s{3,})\\)\\}"))
but it does not work. Also, what if I want to match words containing special characters like '? It's really frustrating.
You could use a character class such as [\w\s'] and add to it whatever else you want to allow to match:
\{-?([\w\s']{3,})}
In Java
String regex = "\\{-?([\\w\\s']{3,})}";
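As a rough sketch of how this pattern might be used, the snippet below collects group 1 from every match. The class name BraceExtractor and the method extract are hypothetical names chosen for illustration, not something from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and method are illustrative only.
class BraceExtractor {
    // Character-class pattern from the answer: word chars, whitespace and ' are allowed.
    private static final Pattern BRACED = Pattern.compile("\\{-?([\\w\\s']{3,})}");

    // Returns the contents of capture group 1 for every match found in the text.
    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = BRACED.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extract("{Project} {Project Test} {Project D'arce}"));
        // Three spaces on their own also satisfy this pattern.
        System.out.println(extract("{   }"));
    }
}
```

Note that because \s is inside the character class, braces containing only whitespace still count as a match.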
If you want to prevent a match that consists only of whitespace (for example, 3 spaces between the braces), you could use a repeating group:
\{-?\h*([\w']{3,}(?:\h+[\w']+)*)\h*}
About the pattern
\{ Match a { char
-? Optional hyphen
\h* Match 0+ times a horizontal whitespace char
( Start of capture group 1
[\w']{3,} Match 3 or more times either a word char or '
(?:\h+[\w']+)* Repeat 0+ times matching 1+ horizontal whitespace chars followed by 1+ chars listed in the character class
) Close group 1
\h* Match 0+ times a horizontal whitespace char
} Match a } char
In Java
String regex = "\\{-?\\h*([\\w']{3,}(?:\\h+[\\w']+)*)\\h*}";
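A minimal sketch of the stricter pattern in use, assuming you want the trimmed text between the braces. The class name StrictBraceExtractor and the method extract are hypothetical and only serve the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper class; the name and method are illustrative only.
class StrictBraceExtractor {
    // Repeating-group pattern from the answer: words separated by horizontal whitespace.
    private static final Pattern BRACED =
            Pattern.compile("\\{-?\\h*([\\w']{3,}(?:\\h+[\\w']+)*)\\h*}");

    // Returns group 1 (the text without surrounding whitespace) for every match.
    static List<String> extract(String text) {
        List<String> found = new ArrayList<>();
        Matcher m = BRACED.matcher(text);
        while (m.find()) {
            found.add(m.group(1));
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(extract("{ Project Test } and {-Project D'arce}"));
        // Whitespace-only braces no longer match.
        System.out.println(extract("{   }"));
        // The first word needs at least 3 chars, so this does not match either.
        System.out.println(extract("{My Project}"));
    }
}
```

One consequence of this design is that the {3,} quantifier applies only to the first word, so a value such as {My Project} is rejected. If that is too strict for your data, you may want to relax the quantifier.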