Unity Arabic Adaptation (Ultimate Edition)

I have been working on Arabic adaptation recently and found that there is no complete solution on the Internet. In this article I share a complete solution and a set of extensible code, so that any of us who has to adapt a project to Arabic in the future can use it directly. As Unity evolves, more and more project teams are using TextMeshPro. The solution in this article supports both UGUI and TextMeshPro and works on Unity 2018 and above; I have not tested earlier versions.

The hardest part of Arabic adaptation is that you cannot tell whether your result is right or wrong. When I did it, I could only refer to Baidu Translate, and in the end I found someone who knew Arabic to proofread it. Later in the article I will show a simple way to check the Arabic display yourself.

Introduction to Arabic

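   1. First, we need a basic understanding of Arabic. Arabic is displayed from right to left and is right-aligned. Note that it is not only the word order that runs from right to left; the characters within each word are also displayed from right to left.
             Original: "ABCD EFG"   (italics denote the original order)
             Arabic:   "GFE DCBA"   (non-italics denote the correct order)
   2. Arabic belongs to the Semitic branch of the Afro-Asiatic (Hamito-Semitic) language family. It is an alphabetic script made up of 28 consonant letters and 12 diacritical marks (not counting the shadda gemination mark). It is written horizontally from right to left, and pages are also turned from right to left. Arabic letters have no upper or lower case, but there are printed, handwritten, and calligraphic styles. In writing, each letter has an isolated form and joined forms.
        The figure below shows some of the consonant letters. Simply put, each of the 28 consonant letters has up to four glyph forms, chosen according to whether the letter is in initial, medial, final, or isolated position. (In other words, one letter can be written in up to four ways, and its position decides which one is used. During adaptation you will see that a string looks completely different after it has been reversed.)
   3. Tashkeel (vowel marks)
        I think of vowel marks as being like pinyin for Chinese characters. Generally they do not need to be displayed. They are usually added only in the Quran, children's books, Arabic textbooks, or books written for foreign readers.
        A simple way to tell: if you see extra small marks above or below the letters in your Arabic text, vowel marks are being displayed.
        In most cases vowel marks do not need to be displayed.
   4. Digits
        The digits we use every day, 1234567890, are called Arabic numerals, but they were originally invented in ancient India, carried to Europe by the Arabs, and then modernized in Europe. People mistakenly believed the Arabs invented them, hence the name. Arabic actually has its own digit systems, and more than one. When adapting to Arabic, we convert all numeric values to the Arabic numerals we are familiar with (1234567890) rather than using the digits of the Arabic script.
        Numeric values in Arabic text are still displayed from left to right.
   5. When Arabic is mixed with English, digits, and so on, the English and the numeric values still keep their left-to-right order.
        Original: "ABCD  hello 123  EFG"   (italics denote the original order)
        Arabic:   "GFE 123  hello DCBA"   (non-italics denote the correct order)
     The five points above all concern Arabic grammar and display order. There is a free plug-in online,
     which is also the most widely used one: [https://github.com/Konash/arabic-support-unity](https://github.com/Konash/arabic-support-unity)
     The plug-in is simple to use; one line is enough (the two false arguments are showTashkeel and useHinduNumbers):
     string result = ArabicFixer.Fix(text, false, false);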
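Point 4 above says that numeric values are unified to the familiar digits 1234567890. As a rough illustration only, the sketch below shows one way this could be done in plain C#. The class and method names (DigitNormalizer, ToWesternDigits) are hypothetical and are not part of either plug-in; the sketch assumes the digits to convert are the Arabic-Indic block (U+0660 to U+0669) and the Extended Arabic-Indic block (U+06F0 to U+06F9).

```csharp
using System.Text;

// Hypothetical helper, not part of ArabicSupport or RTLTMPro.
public static class DigitNormalizer
{
    // Maps Arabic-Indic (U+0660..U+0669) and Extended Arabic-Indic
    // (U+06F0..U+06F9) digits to ASCII '0'..'9'. Other characters are kept.
    public static string ToWesternDigits(string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        var sb = new StringBuilder(input.Length);
        foreach (char c in input)
        {
            if (c >= '\u0660' && c <= '\u0669')
                sb.Append((char)('0' + (c - '\u0660')));
            else if (c >= '\u06F0' && c <= '\u06F9')
                sb.Append((char)('0' + (c - '\u06F0')));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

A helper like this would typically be applied to the localized string before it is handed to the text component, so that the shaping step only ever sees Western digits.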

The background above can also be found online, but using it in a real project is not that simple.
This plug-in is only responsible for converting Arabic characters. It has problems in the following three cases:
1. Line wrapping
2. Rich text tags
3. Mixed English and Arabic
The plug-in does not support these three cases, so it has to be modified. Each project team makes its own changes according to its own understanding, and there is no unified solution. Here is mine.

I did not use the ArabicSupport plug-in here. Instead I used RTL TextMeshPro (https://github.com/pnarimani/RTLTMPro), which already handles things such as rich text tags and saves us from duplicating that work. The approach is to use this plug-in for TextMeshPro and then port its adaptation to UGUI.

Preparation

  1. Look up characters by Unicode code point
    https://symbl.cc/cn/

  2. Unicode character library
    https://www.fuhaoku.net/blocks

  3. RTL TextMeshPro
    https://github.com/pnarimani/RTLTMPro

  4. Check the real, correct order of Arabic text.
    Install Microsoft Word. Do not use WPS. Word lets us verify what the Arabic should really look like.
    Open Word, copy an Arabic sentence, and choose "Keep Text Only" in the paste options. Then select "Paragraph -> Right-to-Left Text Direction".

    In this state, the Arabic you see is the truly correct Arabic.
    This is important enough to say three times: do not trust the strings you see while debugging, or in Notepad or an IDE. What is shown at run time is not necessarily in the order you see there!
    IDEs, debuggers, and some text editors reorder Arabic on their own, and the result may not be correct!

  5. Create a utility function that prints the Unicode code points of a string, then look up each code point on https://symbl.cc/cn/ to see the real character order.

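// Requires: using System.Text; using UnityEngine;
// Prints the decimal Unicode code point of every character, in storage order.
public void PrintArabicUnicode(string text)
{
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < text.Length; i++)
    {
        int unicode32CodePoint = char.ConvertToUtf32(text, i);
        // A surrogate pair occupies two chars; skip the low surrogate,
        // otherwise ConvertToUtf32 throws on the next iteration.
        if (char.IsHighSurrogate(text[i]))
            i++;
        sb.Append(unicode32CodePoint + " ");
    }
    Debug.Log(sb.ToString());
}
// Note: the Arabic below is displayed in a different order in this document, but the string I actually use is unchanged.
//string se = "name_new_group_hint=اسم المجموعة (مطلوب)";
//PrintArabicUnicode(se);
// Output:
//110 97 109 101 95 110 101 119 95 103 114 111 117 112 95 104 105 110 116 61 1575 1587 1605 32 1575 1604 1605 1580 1605 1608 1593 1577 32 40 1605 1591 1604 1608 1576 41
For example, 110 corresponds to the English letter n.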
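Unicode reference sites usually index characters by hexadecimal code point (for example U+0627), so it can be convenient to print that form as well. The following is a minimal sketch in plain C# with no Unity dependency; the names CodePointDump and ToCodePoints are made up for illustration, and the code assumes the input is well-formed UTF-16 (no unpaired surrogates).

```csharp
using System.Text;

// Hypothetical helper for inspecting the stored order of a string.
public static class CodePointDump
{
    // Returns the code points of the string in storage order,
    // formatted as "U+XXXX" and separated by single spaces.
    public static string ToCodePoints(string text)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < text.Length; i++)
        {
            int codePoint = char.ConvertToUtf32(text, i);
            // Skip the low half of a surrogate pair.
            if (char.IsHighSurrogate(text[i]))
                i++;
            if (sb.Length > 0)
                sb.Append(' ');
            sb.Append("U+").Append(codePoint.ToString("X4"));
        }
        return sb.ToString();
    }
}
```

Because the output follows storage order rather than display order, it is not affected by whatever reordering the IDE or debugger applies when rendering the string.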


How to adapt TextMeshPro

I use Unity 2018.4 with TextMeshPro (TMP) 1.4.1. This version is somewhat old; the approach also works with newer versions of Unity and TMP. After installing TMP, download RTL TextMeshPro.

  1. Open the download page https://github.com/pnarimani/RTLTMPro/releases , download v3.4.3.unitypackage , and import it into the project

  2. Modify the TMP source code.
    For TMP 2.1 and above this step is not required, and the source modification in steps 3 and 4 can be skipped. For versions below 2.1 the source code must be modified, as described below.
    The plug-in documents an official modification method.

    If that is unclear, follow the steps below.

  3. First close Unity, find the TMP package source in Library/PackageCache, and move (cut and paste) the entire folder into the Packages directory.

  4. Open the TMP_Text file and add the virtual keyword to the text property. Then open RTLTextMeshPro.cs and add the override keyword to its text property.
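To see why the virtual/override pair matters, here is a small self-contained sketch in plain C#. BaseLabel and RtlLabel are hypothetical stand-ins for TMP_Text and RTLTextMeshPro, and the string reversal is only a placeholder for the real RTL fix. The point is that once the property is virtual, an assignment made through a base-class reference still runs the derived setter.

```csharp
using System;

// Hypothetical stand-in for TMP_Text.
public class BaseLabel
{
    protected string m_text = "";

    // Without "virtual" here, the derived class could not override the setter.
    public virtual string text
    {
        get { return m_text; }
        set { m_text = value; }
    }
}

// Hypothetical stand-in for RTLTextMeshPro.
public class RtlLabel : BaseLabel
{
    public override string text
    {
        get { return base.text; }
        set { base.text = Transform(value ?? ""); }
    }

    // Placeholder transformation (plain character reversal), NOT the real RTL fix.
    private static string Transform(string s)
    {
        char[] chars = s.ToCharArray();
        Array.Reverse(chars);
        return new string(chars);
    }
}
```

With this in place, code that only knows about the base type (for example `BaseLabel label = new RtlLabel(); label.text = "abc";`) still goes through the derived setter, which is the behaviour the modification is meant to enable.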

  5. Use RTLTMPro.
    The folder structure of the plug-in contains many test demos in its scene directory that you can look at. In actual use, to save project resources, you can keep only the Scripts directory and delete the others. When you right-click and open the create menu, you will find that many Text - RTLTMP options are now available.
    On every text control that needs Arabic, replace the TextMeshProUGUI script with RTL TextMeshPro.

  6. A look at RTL TextMeshPro. Compared with the native TextMeshPro, it has four extra fields.

  • Farsi: when checked, Western digits are converted to Persian digits. Generally there is no need to check it.
  • ForceFix: this field is very important. When Arabic and English are mixed and English letters come at the start of the sentence, the sentence is often treated as English. In that case, check this option to force the text to be processed as Arabic.
  • Preserve Numbers: when checked, digits are displayed in the form 123456789. Checking it is recommended (as explained above, Arabic has its own digit forms; to display digits as 123456789 this option must be checked).
  • FixTags: when checked, TMP rich text tags are supported. Checking it is recommended.

7. Recommended usage in practice

Step 6 is the officially recommended usage. In practice we found it still has some inconveniences: the ForceFix check is not smart enough, and the four options above have to be set on every text control. To address this, I use a new script, FinalText, that inherits from TextMeshProUGUI, and all text controls use FinalText. FinalText checks whether the string contains Arabic; if it does, it automatically calls the fix routine to process it.

using RTLTMPro;
using TMPro;
using UnityEngine;

// Replacement for TextMeshProUGUI that fixes Arabic text automatically.
// Overriding text requires TMP_Text.text to be virtual (see step 2).
public class FinalText : TextMeshProUGUI
{
    public override string text
    {
        get { return base.text; }
        set
        {
            string val = value;
            if (HasArabic(val))
            {
                // Arabic found: render right to left and store the fixed string.
                isRightToLeftText = true;
                base.text = GetFixedText(val);
            }
            else
            {
                isRightToLeftText = false;
                base.text = val;
            }
        }
    }

    // Returns true if the string contains at least one Arabic character.
    public static bool HasArabic(string str)
    {
        if (string.IsNullOrEmpty(str))
            return false;
        int strLength = str.Length;
        for (var i = 0; i < strLength; i++)
        {
            char c = str[i];
            if (isArabic(c))
            {
                return true;
            }
        }
        return false;
    }

    // Checks the Unicode ranges treated as Arabic here.
    public static bool isArabic(char c)
    {
        if (c >= 0x600 && c <= 0x6ff) return true;   // Arabic
        if (c >= 0x750 && c <= 0x77f) return true;   // Arabic Supplement
        if (c >= 0xfb50 && c <= 0xfc3f) return true; // part of Arabic Presentation Forms-A
        if (c >= 0xfe70 && c <= 0xfefc) return true; // Arabic Presentation Forms-B
        return false;
    }

    // Reusable buffer, so no new builder is allocated on every assignment.
    protected readonly FastStringBuilder finalText = new FastStringBuilder(RTLSupport.DefaultBufferSize);

    private string GetFixedText(string input)
    {
        if (string.IsNullOrEmpty(input))
            return input;

        finalText.Clear();
        // Arguments after the buffer: farsi = false, fixTextTags = true, preserveNumbers = true.
        RTLSupport.FixRTL(input, finalText, false, true, true);
        finalText.Reverse();
        return finalText.ToString();
    }
}

At this point the Arabic adaptation for TextMeshPro is basically done. RTLTMPro itself handles line breaks, rich text tags, mixed English and Arabic, and so on.

How to adapt UGUI

The plug-in above only supports TMP, not UGUI. After reading the source code, we found that the main reason is that UGUI has no RTL support, that is, it cannot render from right to left.
So if UGUI had this capability, could it be used too? To try this out, I made the following modifications so that UGUI transposes the order of the vertices when rendering.

Origin blog.csdn.net/weixin_42562717/article/details/129417695