Can I combine Unicode categories in a regex?

Gudsaf :

I want to match the following set of characters:

  1. use the \P{L} Unicode category as the base
  2. add the characters хХxXтТTоОoO0 to the \P{L} category
  3. exclude the characters -_.

From that I arrived at this regex in Java:

[[\P{L}]&&[^-_.]&&[хХxXтТTоОoO0]]

But this is not working. What's wrong?

The fourth bird :

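According to the java.util.regex.Pattern documentation, && inside a character class means intersection, so &&[хХxXтТTоОoO0] keeps only the characters that are also in that set. Since \P{L} matches non-letters and every character in that set except 0 is a letter, your pattern can only ever match 0.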

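Instead, you could add the characters хХxXтТTоОoO0 to the first character class, which forms a union: [\P{L}хХxXтТTоОoO0]

Then subtract -_. from that character class by intersecting it with a negated class, &&[^-_.]

[[\P{L}хХxXтТTоОoO0]&&[^-_.]]

(In a Java string literal the backslash must be doubled: \\P{L}.)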
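To make the difference between the two patterns concrete, here is a minimal sketch that runs both against the same input. The class name IntersectionDemo, the helper matches, and the input string are hypothetical and chosen only for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IntersectionDemo {
    // Pattern from the question: three classes chained with && (intersection).
    static final String ORIGINAL = "[[\\P{L}]&&[^-_.]&&[хХxXтТTоОoO0]]";
    // Pattern from the answer: union first, then exclude -_. via intersection.
    static final String FIXED = "[[\\P{L}хХxXтТTоОoO0]&&[^-_.]]";

    // Collects every match of regex in input, in order of appearance.
    static List<String> matches(String regex, String input) {
        List<String> found = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        String input = "aTo0-_.#$"; // hypothetical test input
        System.out.println(matches(ORIGINAL, input)); // [0]
        System.out.println(matches(FIXED, input));    // [T, o, 0, #, $]
    }
}
```

The original pattern matches only 0 because that is the single character that is a non-letter, is not one of -_. and is also in the extra set. The fixed pattern matches the extra letters as well as every non-letter other than -_.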

Example

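import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        // Union of non-letters and the extra characters, minus -_.
        final String regex = "[[\\P{L}хХxXтТTоОoO0]&&[^-_.]]";
        final String string = "aTo-_.#$";

        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);

        // Print each matched character on its own line
        while (matcher.find()) {
            System.out.println(matcher.group(0));
        }
    }
}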

Output

T
o
#
$
