Why does Apache Commons consider '१२३' numeric?

Hannes :

According to Apache Commons Lang's documentation for StringUtils.isNumeric(), the String '१२३' is numeric.

Suspecting a mistake in the documentation, I ran a test to check. Apache Commons does indeed report this String as numeric.

Why is this String numeric? What do those characters represent?

Andy Turner :

Because that "CharSequence contains only Unicode digits" (quoting your linked documentation).

All of the characters return true for Character.isDigit. Its Javadoc lists some Unicode character ranges that contain digits:

  • '\u0030' through '\u0039', ISO-LATIN-1 digits ('0' through '9')
  • '\u0660' through '\u0669', Arabic-Indic digits
  • '\u06F0' through '\u06F9', Extended Arabic-Indic digits
  • '\u0966' through '\u096F', Devanagari digits
  • '\uFF10' through '\uFF19', Fullwidth digits

Many other character ranges contain digits as well.
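The Character.isDigit claim can be checked directly. The following is a minimal sketch, not the Commons Lang implementation: the class name DigitCheck and the helper allDigits are hypothetical names introduced here for illustration, and the string is written with \u escapes so the source file's encoding does not matter.

```java
public class DigitCheck {
    // Hypothetical helper: true when the string is non-empty and every
    // code point is a Unicode decimal digit according to Character.isDigit.
    static boolean allDigits(String s) {
        return !s.isEmpty() && s.codePoints().allMatch(Character::isDigit);
    }

    public static void main(String[] args) {
        // "\u0967\u0968\u0969" is the String '१२३' from the question.
        String s = "\u0967\u0968\u0969";

        // Print the code point and the isDigit result for each character.
        for (char c : s.toCharArray()) {
            System.out.printf("U+%04X isDigit=%b%n", (int) c, Character.isDigit(c));
        }

        System.out.println("all digits: " + allDigits(s));
    }
}
```

Each of the three characters falls in the '\u0966' through '\u096F' range listed above, so isDigit accepts all of them.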

१२३ are the Devanagari digits for 1, 2 and 3 (U+0967, U+0968 and U+0969).
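To see what the characters represent, the JDK can report each code point's Unicode name and decimal value. This is a small sketch using only standard java.lang calls (Character.getName, Character.digit, Integer.parseInt); the class name DevanagariDigits and the helper digitValues are hypothetical names chosen for this example.

```java
public class DevanagariDigits {
    // Hypothetical helper: maps each code point to its value in radix 10.
    // Character.digit returns -1 for code points that are not valid digits.
    static int[] digitValues(String s) {
        return s.codePoints().map(cp -> Character.digit(cp, 10)).toArray();
    }

    public static void main(String[] args) {
        // "\u0967\u0968\u0969" is the String '१२३' from the question.
        String s = "\u0967\u0968\u0969";

        // Print the Unicode name and decimal value of each code point.
        s.codePoints().forEach(cp ->
            System.out.printf("U+%04X %s = %d%n",
                cp, Character.getName(cp), Character.digit(cp, 10)));

        // Integer.parseInt is built on Character.digit, so it accepts
        // these digits as well as ASCII ones.
        System.out.println(Integer.parseInt(s));
    }
}
```

Because Integer.parseInt relies on Character.digit, the same String that isNumeric accepts can also be parsed as the integer 123.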
