String to character array conversion returns different results in Visual Studio and Android Studio

bluetoothfx :

The string that I want to convert into a character array is ষ্টোর. It is a Bengali word in Unicode.

The problem is that converting it in Visual Studio returns 6 characters, while converting it in Android Studio returns 5 characters.

In VS I am using char[] arrayOfChars = someString.ToCharArray(); and in Android Studio char[] arrayOfChars = someString.toCharArray();

Visual Studio Debugging info

Android Studio Debugging info

N.B.: My Android Studio IDE and project encoding are both UTF-8. I expect the same result in Android Studio as in Visual Studio.

Jeremy :

Those two arrays are canonically equivalent in Unicode, but they are represented in different normalization forms. What seems to be happening is that the string on the Java side is in one normalization form, while the string on the C# side is in another. Java's toCharArray and C#'s ToCharArray only copy the string's UTF-16 code units, so the difference most likely lies in how each string is represented before the call.

This page contains a chart of different normalization forms for Bengali text - the fourth row there describes exactly what you're seeing:

Bengali table

I am only learning about this now, but it seems to me that the motivation for this is to let Unicode implementations remain compatible with pre-existing encodings wherever possible and practical.

For example, one pre-existing encoding may have used a single character for a given sign, while another may have used two characters combined. The solution settled on by the Unicode folks is thus to support both, at the cost of not having a single representation for every string unless you normalize it, as you've encountered here.
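As a small illustration of the point above, here is a sketch in Java using the Bengali vowel sign that appears in the question's word. The class name CanonicalEquivalenceDemo and the helper canonicallyEqual are made up for this example; java.text.Normalizer is the standard JDK class. It assumes the precomposed sign is U+09CB and that its decomposed spelling is U+09C7 followed by U+09BE.

```java
import java.text.Normalizer;

// Hypothetical demo class, not part of the original question or answer.
class CanonicalEquivalenceDemo {
    // Precomposed BENGALI VOWEL SIGN O (U+09CB): one code unit.
    static final String COMPOSED = "\u09CB";
    // The same sign spelled as VOWEL SIGN E (U+09C7) + VOWEL SIGN AA (U+09BE): two code units.
    static final String DECOMPOSED = "\u09C7\u09BE";

    // Compares two strings after bringing both to the same normalization form (NFD).
    static boolean canonicallyEqual(String a, String b) {
        String na = Normalizer.normalize(a, Normalizer.Form.NFD);
        String nb = Normalizer.normalize(b, Normalizer.Form.NFD);
        return na.equals(nb);
    }

    public static void main(String[] args) {
        // A plain equals() compares code units, so the two spellings differ.
        System.out.println("raw equals:        " + COMPOSED.equals(DECOMPOSED));
        // After normalization the two spellings compare as the same text.
        System.out.println("normalized equals: " + canonicallyEqual(COMPOSED, DECOMPOSED));
    }
}
```

The two strings render identically, which is why the difference only becomes visible once the text is split into individual chars.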

If you wish for your Java array to be normalized under the "D" normalization form (NFD) that your C# array seems to be using, Java's java.text.Normalizer class appears to provide such a function. You may be looking for something like:

someString = Normalizer.normalize(someString, Normalizer.Form.NFD);
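To make that line easier to try out, here is a self-contained sketch. The class NormalizedCharArrayDemo and its helper methods are hypothetical names for this example, and the word is written with Unicode escapes on the assumption that it consists of U+09B7, U+09CD, U+099F, U+09CB and U+09B0 in composed form, so the result does not depend on the source file's encoding.

```java
import java.text.Normalizer;

// Hypothetical demo class; only Normalizer itself comes from the JDK.
class NormalizedCharArrayDemo {
    // The word from the question, assumed to be in composed form (5 code units).
    static final String WORD = "\u09B7\u09CD\u099F\u09CB\u09B0";

    // Normalizes to NFD (canonical decomposition), then splits into chars.
    static char[] toNfdChars(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).toCharArray();
    }

    // Normalizes to NFC (canonical composition), then splits into chars.
    static char[] toNfcChars(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC).toCharArray();
    }

    public static void main(String[] args) {
        System.out.println("as written: " + WORD.toCharArray().length);
        System.out.println("NFC:        " + toNfcChars(WORD).length);
        System.out.println("NFD:        " + toNfdChars(WORD).length);
        // Print each NFD code unit to see which sign was split apart.
        for (char c : toNfdChars(WORD)) {
            System.out.printf("U+%04X%n", (int) c);
        }
    }
}
```

Under those assumptions the NFD array has one more element than the NFC array, because the vowel sign U+09CB is split into U+09C7 and U+09BE. Whichever form you pick, applying the same one on both platforms should give matching arrays.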

Unicode Standard Annex #15 (Unicode Normalization Forms) is the official document that describes these normalization forms.
