A Deep Dive into the Java Source: Character.Subset and Character.UnicodeBlock

HashMap's load factor

new HashMap<>(128);

    public HashMap(int initialCapacity) {
        this(initialCapacity, DEFAULT_LOAD_FACTOR);
    }

The load factor (loadFactor)

    /**
     * The default initial capacity - MUST be a power of two.
     */
    static final int DEFAULT_INITIAL_CAPACITY = 16;

    /**
     * The default load factor.
     */
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    /**
     * The threshold, equal to capacity times load factor.
     * Once the number of entries exceeds this value, the HashMap resizes.
     * The next size value at which to resize (capacity * load factor).
     * @serial
     */
    int threshold;

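A higher load factor improves space utilization, but lookups and insertions take longer.

The load factor describes how full the hash table is allowed to get. The larger the load factor, the more entries fill the table: space is used more efficiently, but collisions become more likely. Conversely, the smaller the load factor, the fewer entries fill the table: collisions become less likely, but more space is wasted.

The more likely collisions are, the more a lookup costs; the less likely they are, the cheaper and therefore faster a lookup is.

So a balance has to be struck between collision probability and space utilization. At heart this is the classic time-space trade-off of data structures.
The default capacity is 16, so the default threshold is 16 * 0.75 = 12.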
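
The arithmetic above can be checked with a minimal sketch. `ThresholdSketch` and its two helpers are names invented for this illustration, not JDK APIs; `roundUpToPowerOfTwo` only imitates the power-of-two requirement stated in the source comment and is not the JDK's own routine.

```java
// Illustrative only: these helpers are hypothetical, not JDK APIs.
final class ThresholdSketch {

    // threshold = capacity * load factor, truncated to an int.
    static int threshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    // Smallest power of two that is >= requested (for requested >= 1).
    static int roundUpToPowerOfTwo(int requested) {
        int n = 1;
        while (n < requested) {
            n <<= 1;
        }
        return n;
    }

    public static void main(String[] args) {
        System.out.println(threshold(16, 0.75f));      // default capacity
        System.out.println(threshold(128, 0.75f));     // new HashMap<>(128)
        System.out.println(roundUpToPowerOfTwo(100));  // a non-power-of-two request
    }
}
```

Run as-is, this should print 12, 96 and 128: a default map has room for 12 entries before its first resize, and a map created with `new HashMap<>(128)` has room for 96.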

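This is how HashMap stores entries.

It first uses the key's hashCode to find the slot where the entry should go.
The number of slots is finite: it is the space allocated to the HashMap, say 10.
When the first element arrives, hashCode % 10 decides which of the 10 slots it goes to. (The modulo is a simplification: the real HashMap keeps a power-of-two capacity and masks the hash with capacity - 1.)

This raises a question: what if the slot that hashCode % 10 points to already holds another object when you try to store yours?
In that case equals is called to compare the keys. If they are equal, it is the same key, and the old value is replaced.
If they are not equal, the slot becomes a bucket that holds both objects in a chain, ordered by arrival, with the later one behind.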

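A HashMap lookup is all about efficiency: in the ideal case the hashCode leads straight to the slot and the value is fetched in a single step.
The situation described above, however, adds operations.
Having found the slot for the hashCode, you see two objects in it and cannot tell which one you want, so each is compared with the key you passed in using equals, and the one that matches is returned.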

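As the discussion above shows, with only 10 slots it is very easy for the hashCodes of two objects to map to the same slot.
The two objects then share a bucket, which makes both lookups and insertions slower.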
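
The storage and lookup steps described above can be sketched as a tiny chained hash table. `TinyChainedMap` is a hypothetical teaching class, not the JDK's HashMap: it uses the plain `hashCode % bucketCount` rule from the text and never resizes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

// Hypothetical teaching class; a sketch of chaining, not the JDK implementation.
final class TinyChainedMap<K, V> {

    private static final class Entry<K, V> {
        final K key;
        V value;

        Entry(K key, V value) {
            this.key = key;
            this.value = value;
        }
    }

    private final List<List<Entry<K, V>>> buckets = new ArrayList<>();

    TinyChainedMap(int bucketCount) {
        for (int i = 0; i < bucketCount; i++) {
            buckets.add(new ArrayList<>());
        }
    }

    // hashCode % bucketCount, kept non-negative with floorMod.
    private List<Entry<K, V>> bucketFor(Object key) {
        return buckets.get(Math.floorMod(Objects.hashCode(key), buckets.size()));
    }

    // Replaces the value if an equal key is already in the chain, else appends.
    V put(K key, V value) {
        List<Entry<K, V>> bucket = bucketFor(key);
        for (Entry<K, V> entry : bucket) {
            if (Objects.equals(entry.key, key)) {
                V old = entry.value;
                entry.value = value;
                return old;
            }
        }
        bucket.add(new Entry<>(key, value));
        return null;
    }

    // Walks the chain and compares keys with equals.
    V get(Object key) {
        for (Entry<K, V> entry : bucketFor(key)) {
            if (Objects.equals(entry.key, key)) {
                return entry.value;
            }
        }
        return null;
    }

    // Number of entries sharing the bucket that this key maps to.
    int chainLength(Object key) {
        return bucketFor(key).size();
    }

    public static void main(String[] args) {
        TinyChainedMap<Integer, String> map = new TinyChainedMap<>(10);
        map.put(3, "three");
        map.put(13, "thirteen");                  // 13 % 10 == 3: same bucket as key 3
        System.out.println(map.chainLength(3));   // two entries share the bucket
        System.out.println(map.get(13));
        System.out.println(map.put(3, "THREE"));  // equal key: the old value comes back
    }
}
```

Integer keys are used because an Integer's hashCode is its own value, which makes the collision between 3 and 13 easy to see.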

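This is where the load factor comes in. With a load factor of 0.75 and 100 slots, the HashMap needs to grow once the number of entries passes 75; otherwise long bucket chains would form and slow down lookups and insertions, since entries have to be compared with equals one by one.
But the load factor must not be too small either. A value like 0.01 is unsuitable because it wastes a great deal of memory: the HashMap resizes almost as soon as you add an object.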
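
To get a feel for that trade-off, here is a rough model. It assumes a table that starts at 16 slots and doubles whenever the entry count exceeds capacity times load factor. `ResizeModel` and `simulate` are invented names, and the model ignores details of the real HashMap such as its integer threshold.

```java
// Simplified model for illustration; not the JDK's exact resize arithmetic.
final class ResizeModel {

    // Returns {finalCapacity, resizeCount} after the given number of inserts.
    static int[] simulate(int inserts, double loadFactor) {
        int capacity = 16;   // same starting point as DEFAULT_INITIAL_CAPACITY
        int resizes = 0;
        for (int size = 1; size <= inserts; size++) {
            // Double the table whenever the entry count passes the threshold.
            while (size > capacity * loadFactor) {
                capacity *= 2;
                resizes++;
            }
        }
        return new int[] {capacity, resizes};
    }

    public static void main(String[] args) {
        for (double loadFactor : new double[] {0.75, 0.01}) {
            int[] result = simulate(100, loadFactor);
            System.out.println("loadFactor=" + loadFactor
                    + " capacity=" + result[0] + " resizes=" + result[1]);
        }
    }
}
```

Under this model, 100 inserts end with 256 slots at load factor 0.75 but 16384 slots at 0.01, which is the memory cost the paragraph above warns about.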

Java's enum

The old way: defining constants in an interface

public interface IConstants {
    String MON = "Mon";
    String TUE = "Tue";
    String WED = "Wed";
    String THU = "Thu";
    String FRI = "Fri";
    String SAT = "Sat";
    String SUN = "Sun";
}

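An enum type is created with the enum keyword, which implies that the type is a subclass of java.lang.Enum (an abstract class). Enum types follow the general pattern Class Enum<E extends Enum<E>>, where E is the name of the enum type. Each value of the enum type is passed to the protected Enum(String name, int ordinal) constructor: the name of the value is turned into a string, and the ordinal records the order in which the value was declared.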

public enum EnumTest {
    MON, TUE, WED, THU, FRI, SAT, SUN;
}

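Conceptually, this code calls Enum(String name, int ordinal) seven times. The lines below are pseudocode, since Enum is abstract and cannot be instantiated directly: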

new Enum<EnumTest>("MON",0);
new Enum<EnumTest>("TUE",1);
new Enum<EnumTest>("WED",2);
    ... ...

public class Test {
    public static void main(String[] args) {
        for (EnumTest e : EnumTest.values()) {
            System.out.println(e.toString());
        }

        System.out.println("---------------- separator ------------------");

        EnumTest test = EnumTest.TUE;
        switch (test) {
        case MON:
            System.out.println("Today is Monday");
            break;
        case TUE:
            System.out.println("Today is Tuesday");
            break;
        // ... ...
        default:
            System.out.println(test);
            break;
        }
    }
}

Output:

MON
TUE
WED
THU
FRI
SAT
SUN
---------------- separator ------------------
Today is Tuesday

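An enum can be viewed as an ordinary class: both can define fields and methods. The difference is that an enum cannot use the extends keyword to inherit from another class, because it already extends java.lang.Enum (Java has single inheritance).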
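
As a small illustration of an enum carrying its own fields and methods, here is a sketch. `WeekdayDemo`, `Weekday`, `label()` and `isWeekend()` are hypothetical names chosen for this example; the labels are borrowed from the IConstants interface shown earlier.

```java
final class WeekdayDemo {

    // Hypothetical enum: each constant passes its own data to the constructor.
    enum Weekday {
        MON("Mon", false),
        TUE("Tue", false),
        WED("Wed", false),
        THU("Thu", false),
        FRI("Fri", false),
        SAT("Sat", true),
        SUN("Sun", true);

        private final String label;
        private final boolean weekend;

        // Enum constructors are implicitly private.
        Weekday(String label, boolean weekend) {
            this.label = label;
            this.weekend = weekend;
        }

        String label() {
            return label;
        }

        boolean isWeekend() {
            return weekend;
        }
    }

    public static void main(String[] args) {
        for (Weekday day : Weekday.values()) {
            System.out.println(day + " " + day.label() + " weekend=" + day.isWeekend());
        }
    }
}
```

Compared with the interface of String constants, the enum keeps the label and the weekend flag next to each constant, and the compiler rejects any value that is not one of the seven.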

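Common methods of enum objects

int compareTo(E o)
Compares this enum with the specified object for order.

Class<E> getDeclaringClass()
Returns the Class object corresponding to this enum constant's enum type.

String name()
Returns the name of this enum constant, exactly as declared in its enum declaration.

int ordinal()
Returns the ordinal of this enum constant (its position in the enum declaration, where the initial constant is assigned an ordinal of zero).

String toString()
Returns the name of this enum constant, as contained in the declaration.

static <T extends Enum<T>> T valueOf(Class<T> enumType, String name)
Returns the enum constant of the specified enum type with the specified name.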

public class Test {
    public static void main(String[] args) {
        EnumTest test = EnumTest.TUE;

        // compareTo(E o): only the sign of the result is specified,
        // so normalize it to -1, 0 or 1 before switching on it
        switch (Integer.signum(test.compareTo(EnumTest.MON))) {
        case -1:
            System.out.println("TUE comes before MON");
            break;
        case 1:
            System.out.println("TUE comes after MON");
            break;
        default:
            System.out.println("TUE and MON are in the same position");
            break;
        }

        // getDeclaringClass()
        System.out.println("getDeclaringClass(): " + test.getDeclaringClass().getName());

        // name() and toString()
        System.out.println("name(): " + test.name());
        System.out.println("toString(): " + test.toString());

        // ordinal(): the value starts at 0
        System.out.println("ordinal(): " + test.ordinal());
    }
}

Output:

TUE comes after MON
getDeclaringClass(): com.test.EnumTest
name(): TUE
toString(): TUE
ordinal(): 1
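
The demo above does not exercise valueOf, so here is a short complementary sketch. `EnumLookupDemo`, `Day` and `parseOrDefault` are hypothetical names; the only JDK calls used are `Enum.valueOf(Class, String)` and the compiler-generated `Day.valueOf(String)`.

```java
final class EnumLookupDemo {

    enum Day { MON, TUE, WED, THU, FRI, SAT, SUN }

    // Hypothetical helper: returns the fallback when no constant has that exact name.
    static Day parseOrDefault(String name, Day fallback) {
        try {
            return Enum.valueOf(Day.class, name);
        } catch (IllegalArgumentException e) {
            return fallback;   // the lookup is case-sensitive, so "tue" ends up here
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("TUE", Day.MON));
        System.out.println(parseOrDefault("tue", Day.MON));
        System.out.println(Day.valueOf("SUN").ordinal());
    }
}
```

Catching IllegalArgumentException is one way to handle untrusted input; whether a silent fallback is appropriate depends on the application.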

Modifier and Type	Method and Description
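`static int`	`charCount(int codePoint)` Determines the number of `char` values needed to represent the specified character (Unicode code point).
`char`	`charValue()` Returns the value of this `Character` object.
`static int`	`codePointAt(char[] a, int index)` Returns the code point at the given index of the `char` array.
`static int`	`codePointAt(char[] a, int index, int limit)` Returns the code point at the given index of the `char` array, where only array elements with `index` less than `limit` can be used.
`static int`	`codePointAt(CharSequence seq, int index)` Returns the code point at the given index of the `CharSequence`.
`static int`	`codePointBefore(char[] a, int index)` Returns the code point preceding the given index of the `char` array.
`static int`	`codePointBefore(char[] a, int index, int start)` Returns the code point preceding the given index of the `char` array, where only array elements with `index` greater than or equal to `start` can be used.
`static int`	`codePointBefore(CharSequence seq, int index)` Returns the code point preceding the given index of the `CharSequence`.
`static int`	`codePointCount(char[] a, int offset, int count)` Returns the number of Unicode code points in a subarray of the `char` array argument.
`static int`	`codePointCount(CharSequence seq, int beginIndex, int endIndex)` Returns the number of Unicode code points in the text range of the specified char sequence.
`static int`	`compare(char x, char y)` Compares two `char` values numerically.
`int`	`compareTo(Character anotherCharacter)` Compares two `Character` objects numerically.
`static int`	`digit(char ch, int radix)` Returns the numeric value of the character `ch` in the specified radix.
`static int`	`digit(int codePoint, int radix)` Returns the numeric value of the specified character (Unicode code point) in the specified radix.
`boolean`	`equals(Object obj)` Compares this object against the specified object.
`static char`	`forDigit(int digit, int radix)` Determines the character representation for a specific digit in the specified radix.
`static byte`	`getDirectionality(char ch)` Returns the Unicode directionality property for the given character.
`static byte`	`getDirectionality(int codePoint)` Returns the Unicode directionality property for the given character (Unicode code point).
`static String`	`getName(int codePoint)` Returns the Unicode name of the specified character `codePoint`, or null if the code point is unassigned.
`static int`	`getNumericValue(char ch)` Returns the `int` value that the specified Unicode character represents.
`static int`	`getNumericValue(int codePoint)` Returns the `int` value that the specified character (Unicode code point) represents.
`static int`	`getType(char ch)` Returns a value indicating a character's general category.
`static int`	`getType(int codePoint)` Returns a value indicating a character's general category.
`int`	`hashCode()` Returns a hash code for this `Character`; equal to the result of invoking `charValue()`.
`static int`	`hashCode(char value)` Returns a hash code for a `char` value; compatible with `Character.hashCode()`.
`static char`	`highSurrogate(int codePoint)` Returns the leading surrogate (a high surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding.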
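`static boolean`	`isAlphabetic(int codePoint)` Determines if the specified character (Unicode code point) is alphabetic.
`static boolean`	`isBmpCodePoint(int codePoint)` Determines whether the specified character (Unicode code point) is in the Basic Multilingual Plane (BMP).
`static boolean`	`isDefined(char ch)` Determines if a character is defined in Unicode.
`static boolean`	`isDefined(int codePoint)` Determines if a character (Unicode code point) is defined in Unicode.
`static boolean`	`isDigit(char ch)` Determines if the specified character is a digit.
`static boolean`	`isDigit(int codePoint)` Determines if the specified character (Unicode code point) is a digit.
`static boolean`	`isHighSurrogate(char ch)` Determines if the given `char` value is a Unicode high-surrogate code unit (also known as a leading-surrogate code unit).
`static boolean`	`isIdentifierIgnorable(char ch)` Determines if the specified character should be regarded as an ignorable character in a Java identifier or a Unicode identifier.
`static boolean`	`isIdentifierIgnorable(int codePoint)` Determines if the specified character (Unicode code point) should be regarded as an ignorable character in a Java identifier or a Unicode identifier.
`static boolean`	`isIdeographic(int codePoint)` Determines if the specified character (Unicode code point) is a CJKV (Chinese, Japanese, Korean and Vietnamese) ideograph, as defined by the Unicode Standard.
`static boolean`	`isISOControl(char ch)` Determines if the specified character is an ISO control character.
`static boolean`	`isISOControl(int codePoint)` Determines if the referenced character (Unicode code point) is an ISO control character.
`static boolean`	`isJavaIdentifierPart(char ch)` Determines if the specified character may be part of a Java identifier as other than the first character.
`static boolean`	`isJavaIdentifierPart(int codePoint)` Determines if the character (Unicode code point) may be part of a Java identifier as other than the first character.
`static boolean`	`isJavaIdentifierStart(char ch)` Determines if the specified character is permissible as the first character in a Java identifier.
`static boolean`	`isJavaIdentifierStart(int codePoint)` Determines if the character (Unicode code point) is permissible as the first character in a Java identifier.
`static boolean`	`isJavaLetter(char ch)` Deprecated. Replaced by `isJavaIdentifierStart(char)`.
`static boolean`	`isJavaLetterOrDigit(char ch)` Deprecated. Replaced by `isJavaIdentifierPart(char)`.
`static boolean`	`isLetter(char ch)` Determines if the specified character is a letter.
`static boolean`	`isLetter(int codePoint)` Determines if the specified character (Unicode code point) is a letter.
`static boolean`	`isLetterOrDigit(char ch)` Determines if the specified character is a letter or digit.
`static boolean`	`isLetterOrDigit(int codePoint)` Determines if the specified character (Unicode code point) is a letter or digit.
`static boolean`	`isLowerCase(char ch)` Determines if the specified character is a lowercase character.
`static boolean`	`isLowerCase(int codePoint)` Determines if the specified character (Unicode code point) is a lowercase character.
`static boolean`	`isLowSurrogate(char ch)` Determines if the given `char` value is a Unicode low-surrogate code unit (also known as a trailing-surrogate code unit).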
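`static boolean`	`isMirrored(char ch)` Determines whether the character is mirrored according to the Unicode specification.
`static boolean`	`isMirrored(int codePoint)` Determines whether the specified character (Unicode code point) is mirrored according to the Unicode specification.
`static boolean`	`isSpace(char ch)` Deprecated. Replaced by `isWhitespace(char)`.
`static boolean`	`isSpaceChar(char ch)` Determines if the specified character is a Unicode space character.
`static boolean`	`isSpaceChar(int codePoint)` Determines if the specified character (Unicode code point) is a Unicode space character.
`static boolean`	`isSupplementaryCodePoint(int codePoint)` Determines whether the specified character (Unicode code point) is in the supplementary character range.
`static boolean`	`isSurrogate(char ch)` Determines if the given `char` value is a Unicode surrogate code unit.
`static boolean`	`isSurrogatePair(char high, char low)` Determines whether the specified pair of `char` values is a valid Unicode surrogate pair.
`static boolean`	`isTitleCase(char ch)` Determines if the specified character is a titlecase character.
`static boolean`	`isTitleCase(int codePoint)` Determines if the specified character (Unicode code point) is a titlecase character.
`static boolean`	`isUnicodeIdentifierPart(char ch)` Determines if the specified character may be part of a Unicode identifier as other than the first character.
`static boolean`	`isUnicodeIdentifierPart(int codePoint)` Determines if the specified character (Unicode code point) may be part of a Unicode identifier as other than the first character.
`static boolean`	`isUnicodeIdentifierStart(char ch)` Determines if the specified character is permissible as the first character in a Unicode identifier.
`static boolean`	`isUnicodeIdentifierStart(int codePoint)` Determines if the specified character (Unicode code point) is permissible as the first character in a Unicode identifier.
`static boolean`	`isUpperCase(char ch)` Determines if the specified character is an uppercase character.
`static boolean`	`isUpperCase(int codePoint)` Determines if the specified character (Unicode code point) is an uppercase character.
`static boolean`	`isValidCodePoint(int codePoint)` Determines whether the specified code point is a valid Unicode code point value.
`static boolean`	`isWhitespace(char ch)` Determines if the specified character is white space according to Java.
`static boolean`	`isWhitespace(int codePoint)` Determines if the specified character (Unicode code point) is white space according to Java.
`static char`	`lowSurrogate(int codePoint)` Returns the trailing surrogate (a low surrogate code unit) of the surrogate pair representing the specified supplementary character (Unicode code point) in the UTF-16 encoding.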
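`static int`	`offsetByCodePoints(char[] a, int start, int count, int index, int codePointOffset)` Returns the index within the given `char` subarray that is offset from the given `index` by `codePointOffset` code points.
`static int`	`offsetByCodePoints(CharSequence seq, int index, int codePointOffset)` Returns the index within the given char sequence that is offset from the given `index` by `codePointOffset` code points.
`static char`	`reverseBytes(char ch)` Returns the value obtained by reversing the order of the bytes in the specified `char` value.
`static char[]`	`toChars(int codePoint)` Converts the specified character (Unicode code point) to its UTF-16 representation stored in a `char` array.
`static int`	`toChars(int codePoint, char[] dst, int dstIndex)` Converts the specified character (Unicode code point) to its UTF-16 representation.
`static int`	`toCodePoint(char high, char low)` Converts the specified surrogate pair to its supplementary code point value.
`static char`	`toLowerCase(char ch)` Converts the character argument to lowercase using case mapping information from the UnicodeData file.
`static int`	`toLowerCase(int codePoint)` Converts the character (Unicode code point) argument to lowercase using case mapping information from the UnicodeData file.
`String`	`toString()` Returns a `String` object representing this `Character`'s value.
`static String`	`toString(char c)` Returns a `String` object representing the specified `char`.
`static char`	`toTitleCase(char ch)` Converts the character argument to titlecase using case mapping information from the UnicodeData file.
`static int`	`toTitleCase(int codePoint)` Converts the character (Unicode code point) argument to titlecase using case mapping information from the UnicodeData file.
`static char`	`toUpperCase(char ch)` Converts the character argument to uppercase using case mapping information from the UnicodeData file.
`static int`	`toUpperCase(int codePoint)` Converts the character (Unicode code point) argument to uppercase using case mapping information from the UnicodeData file.
`static Character`	`valueOf(char c)` Returns a `Character` instance representing the specified `char` value.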
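
Several rows in this table deal with code points and surrogate pairs, which are easier to follow with a concrete character. The sketch below uses U+1F600, a supplementary character outside the BMP; `CodePointDemo` and `describe` are hypothetical names, while every `Character` call is one of the methods listed above.

```java
import java.util.Locale;

final class CodePointDemo {

    // Hypothetical helper: formats a code point with the number of chars it needs.
    static String describe(int codePoint) {
        return String.format(Locale.ROOT, "U+%04X needs %d char(s)",
                codePoint, Character.charCount(codePoint));
    }

    public static void main(String[] args) {
        int grinning = 0x1F600;                        // outside the BMP
        char[] units = Character.toChars(grinning);    // UTF-16 surrogate pair

        System.out.println(describe('A'));
        System.out.println(describe(grinning));
        System.out.println(Character.isSurrogatePair(units[0], units[1]));
        System.out.println(Character.toCodePoint(units[0], units[1]) == grinning);

        // One supplementary character occupies two chars but counts as one code point.
        String text = "a" + new String(units) + "b";
        System.out.println(text.length() + " chars, "
                + Character.codePointCount(text, 0, text.length()) + " code points");

        // digit and forDigit are inverses within a radix.
        System.out.println(Character.digit('f', 16) + " " + Character.forDigit(11, 16));
    }
}
```

The `char` overloads only see one UTF-16 code unit at a time, so code that may meet supplementary characters is usually safer with the `int codePoint` overloads.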

Java source code

package java.lang;

import java.util.Arrays;
import java.util.Map;
import java.util.HashMap;
import java.util.Locale;


public final
class Character implements java.io.Serializable, Comparable<Character> {

    public static final int MIN_RADIX = 2;

    public static final int MAX_RADIX = 36;
	
    public static final char MIN_VALUE = '\u0000';	
	
    public static final char MAX_VALUE = '\uFFFF';
	
    @SuppressWarnings("unchecked")
    public static final Class<Character> TYPE = (Class<Character>) Class.getPrimitiveClass("char");

    public static final byte UNASSIGNED = 0;

    public static final byte UPPERCASE_LETTER = 1;	

    public static final byte LOWERCASE_LETTER = 2;
	
	......
	
    public static final byte DECIMAL_DIGIT_NUMBER = 9;
	
    static final int ERROR = 0xFFFFFFFF;
	
    public static final byte DIRECTIONALITY_UNDEFINED = -1;	
	
    public static final char MIN_HIGH_SURROGATE = '\uD800';
	
    public static final char MIN_LOW_SURROGATE  = '\uDC00';
	
    public static final char MIN_SURROGATE = MIN_HIGH_SURROGATE;
	
    public static final char MAX_SURROGATE = MAX_LOW_SURROGATE;
	
    public static final int MIN_SUPPLEMENTARY_CODE_POINT = 0x010000;

    private final char value;

    private static final long serialVersionUID = 3786198910865385080L;
	
    public Character(char value) {
        this.value = value;
    }

    private static class CharacterCache {
        private CharacterCache(){}

        static final Character cache[] = new Character[127 + 1];

        static {
            for (int i = 0; i < cache.length; i++)
                cache[i] = new Character((char)i);
        }
    }

    public static Character valueOf(char c) {
        if (c <= 127) { // must cache
            return CharacterCache.cache[(int)c];
        }
        return new Character(c);
    }

    public char charValue() {
        return value;
    }
	
    @Override
    public int hashCode() {
        return Character.hashCode(value);
    }	
	
    public static int hashCode(char value) {
        return (int)value;
    }

    public boolean equals(Object obj) {
        if (obj instanceof Character) {
            return value == ((Character)obj).charValue();
        }
        return false;
    }
	
    public String toString() {
        char buf[] = {value};
        return String.valueOf(buf);
    }

    public static String toString(char c) {
        return String.valueOf(c);
    }
	
    public static int getType(char ch) {
        return getType((int)ch);
    }

    public static int getType(int codePoint) {
        return CharacterData.of(codePoint).getType(codePoint);
    }
	
    public static int compare(char x, char y) {
        return x - y;
    }	
	
    public int compareTo(Character anotherCharacter) {
        return compare(this.value, anotherCharacter.value);
    }	
	
    public static final int SIZE = 16;	
	
    public static final int BYTES = SIZE / Byte.SIZE;	
	
    public static String getName(int codePoint) {
        if (!isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException();
        }
        String name = CharacterName.get(codePoint);
        if (name != null)
            return name;
        if (getType(codePoint) == UNASSIGNED)
            return null;
        UnicodeBlock block = UnicodeBlock.of(codePoint);
        if (block != null)
            return block.toString().replace('_', ' ') + " "
                   + Integer.toHexString(codePoint).toUpperCase(Locale.ENGLISH);
        // should never come here
        return Integer.toHexString(codePoint).toUpperCase(Locale.ENGLISH);
    }
	
    public static class Subset  {

        private String name;	
	
        protected Subset(String name) {
            if (name == null) {
                throw new NullPointerException("name");
            }
            this.name = name;
        }
        public final boolean equals(Object obj) {
            return (this == obj);
        }	
        public final int hashCode() {
            return super.hashCode();
        }	
        public final String toString() {
            return name;
        }
    }
	
    public static final class UnicodeBlock extends Subset {

        private static Map<String, UnicodeBlock> map = new HashMap<>(256);

        /**
         * Creates a UnicodeBlock with the given identifier name.
         * This name must be the same as the block identifier.
         */
        private UnicodeBlock(String idName) {
            super(idName);
            map.put(idName, this);
        }

        private UnicodeBlock(String idName, String alias) {
            this(idName);
            map.put(alias, this);
        }

        private UnicodeBlock(String idName, String... aliases) {
            this(idName);
            for (String alias : aliases)
                map.put(alias, this);
        }

        public static final UnicodeBlock  BASIC_LATIN =
            new UnicodeBlock("BASIC_LATIN",
                             "BASIC LATIN",
                             "BASICLATIN");		
							 
        public static final UnicodeBlock ARMENIAN =
            new UnicodeBlock("ARMENIAN");		

        public static final UnicodeBlock PHAGS_PA =
            new UnicodeBlock("PHAGS_PA",
                             "PHAGS-PA");
							 
        private static final int blockStarts[] = {
            0x0000,   // 0000..007F; Basic Latin
            0x0080,   // 0080..00FF; Latin-1 Supplement
            0x0100,   // 0100..017F; Latin Extended-A
            0x0180,   // 0180..024F; Latin Extended-B
            0x0250,   // 0250..02AF; IPA Extensions
        };

        private static final UnicodeBlock[] blocks = {
            BASIC_LATIN,
            LATIN_1_SUPPLEMENT,
            LATIN_EXTENDED_A,
            LATIN_EXTENDED_B,							 
        };	
		
        public static UnicodeBlock of(char c) {
            return of((int)c);
        }

        public static UnicodeBlock of(int codePoint) {
            if (!isValidCodePoint(codePoint)) {
                throw new IllegalArgumentException();
            }

            int top, bottom, current;
            bottom = 0;
            top = blockStarts.length;
            current = top/2;

            // invariant: top > current >= bottom && codePoint >= unicodeBlockStarts[bottom]
            while (top - bottom > 1) {
                if (codePoint >= blockStarts[current]) {
                    bottom = current;
                } else {
                    top = current;
                }
                current = (top + bottom) / 2;
            }
            return blocks[current];
        }	

        public static final UnicodeBlock forName(String blockName) {
            UnicodeBlock block = map.get(blockName.toUpperCase(Locale.US));
            if (block == null) {
                throw new IllegalArgumentException();
            }
            return block;
        }		
    }

    public static enum UnicodeScript {
        /**
         * Unicode script "Common".
         */
        COMMON,

        /**
         * Unicode script "Latin".
         */
        LATIN,

        /**
         * Unicode script "Greek".
         */
        GREEK,
		/**
         * Unicode script "Takri".
         */
        TAKRI,

        /**
         * Unicode script "Miao".
         */
        MIAO,

        /**
         * Unicode script "Unknown".
         */
        UNKNOWN;

        private static final int[] scriptStarts = {
            0x0000,   // 0000..0040; COMMON
            0x0041,   // 0041..005A; LATIN
            0x005B,   // 005B..0060; COMMON
            0x0061,   // 0061..007A; LATIN
            0x20000,  // 20000..E0000; HAN
            0xE0001,  // E0001..E00FF; COMMON
            0xE0100,  // E0100..E01EF; INHERITED
            0xE01F0   // E01F0..10FFFF; UNKNOWN
        };

        private static final UnicodeScript[] scripts = {
            COMMON,
            LATIN,
            COMMON,
            LATIN,
            HAN,
            COMMON,
            INHERITED,
            UNKNOWN
        };
		
        private static HashMap<String, Character.UnicodeScript> aliases;
        static {
            aliases = new HashMap<>(128);
            aliases.put("ARAB", ARABIC);
            aliases.put("ZINH", INHERITED);
            aliases.put("ZYYY", COMMON);
            aliases.put("ZZZZ", UNKNOWN);
        }	
	
        public static UnicodeScript of(int codePoint) {
            if (!isValidCodePoint(codePoint))
                throw new IllegalArgumentException();
            int type = getType(codePoint);
            // leave SURROGATE and PRIVATE_USE for table lookup
            if (type == UNASSIGNED)
                return UNKNOWN;
            int index = Arrays.binarySearch(scriptStarts, codePoint);
            if (index < 0)
                index = -index - 2;
            return scripts[index];
        }
	
        public static final UnicodeScript forName(String scriptName) {
            scriptName = scriptName.toUpperCase(Locale.ENGLISH);
                                 //.replace(' ', '_'));
            UnicodeScript sc = aliases.get(scriptName);
            if (sc != null)
                return sc;
            return valueOf(scriptName);
        }
    }
	
    public static boolean isJavaIdentifierStart(char ch) {
        return isJavaIdentifierStart((int)ch);
    }	
	
    public static boolean isJavaIdentifierStart(int codePoint) {
        return CharacterData.of(codePoint).isJavaIdentifierStart(codePoint);
    }

    public static boolean isJavaIdentifierPart(char ch) {
        return isJavaIdentifierPart((int)ch);
    }
	
    public static boolean isJavaIdentifierPart(int codePoint) {
        return CharacterData.of(codePoint).isJavaIdentifierPart(codePoint);
    }
	
    public static boolean isUnicodeIdentifierStart(char ch) {
        return isUnicodeIdentifierStart((int)ch);
    }

    public static boolean isUnicodeIdentifierStart(int codePoint) {
        return CharacterData.of(codePoint).isUnicodeIdentifierStart(codePoint);
    }

    public static boolean isUnicodeIdentifierPart(char ch) {
        return isUnicodeIdentifierPart((int)ch);
    }

    public static boolean isUnicodeIdentifierPart(int codePoint) {
        return CharacterData.of(codePoint).isUnicodeIdentifierPart(codePoint);
    }

    public static boolean isIdentifierIgnorable(char ch) {
        return isIdentifierIgnorable((int)ch);
    }
    public static boolean isIdentifierIgnorable(int codePoint) {
        return CharacterData.of(codePoint).isIdentifierIgnorable(codePoint);
    }
	
    public static char toLowerCase(char ch) {
        return (char)toLowerCase((int)ch);
    }

    public static int toLowerCase(int codePoint) {
        return CharacterData.of(codePoint).toLowerCase(codePoint);
    }	

    public static char toUpperCase(char ch) {
        return (char)toUpperCase((int)ch);
    }

    public static int toUpperCase(int codePoint) {
        return CharacterData.of(codePoint).toUpperCase(codePoint);
    }	
	
    public static char toTitleCase(char ch) {
        return (char)toTitleCase((int)ch);
    }	
	
    public static int toTitleCase(int codePoint) {
        return CharacterData.of(codePoint).toTitleCase(codePoint);
    }
	
    public static int digit(char ch, int radix) {
        return digit((int)ch, radix);
    }	
	
    public static int digit(int codePoint, int radix) {
        return CharacterData.of(codePoint).digit(codePoint, radix);
    }
	
    public static int getNumericValue(char ch) {
        return getNumericValue((int)ch);
    }
	
    public static int getNumericValue(int codePoint) {
        return CharacterData.of(codePoint).getNumericValue(codePoint);
    }
	
    @Deprecated
    public static boolean isSpace(char ch) {
        return (ch <= 0x0020) &&
            (((((1L << 0x0009) |
            (1L << 0x000A) |
            (1L << 0x000C) |
            (1L << 0x000D) |
            (1L << 0x0020)) >> ch) & 1L) != 0);
    }
	
    public static boolean isSpaceChar(char ch) {
        return isSpaceChar((int)ch);
    }
	
    public static boolean isSpaceChar(int codePoint) {
        return ((((1 << Character.SPACE_SEPARATOR) |
                  (1 << Character.LINE_SEPARATOR) |
                  (1 << Character.PARAGRAPH_SEPARATOR)) >> getType(codePoint)) & 1)
            != 0;
    }
	
    public static boolean isWhitespace(char ch) {
        return isWhitespace((int)ch);
    }
	
    public static boolean isWhitespace(int codePoint) {
        return CharacterData.of(codePoint).isWhitespace(codePoint);
    }

    public static boolean isISOControl(char ch) {
        return isISOControl((int)ch);
    }
	
    public static boolean isISOControl(int codePoint) {
        // Optimized form of:
        //     (codePoint >= 0x00 && codePoint <= 0x1F) ||
        //     (codePoint >= 0x7F && codePoint <= 0x9F);
        return codePoint <= 0x9F &&
            (codePoint >= 0x7F || (codePoint >>> 5 == 0));
    }
	
    public static char forDigit(int digit, int radix) {
        if ((digit >= radix) || (digit < 0)) {
            return '\0';
        }
        if ((radix < Character.MIN_RADIX) || (radix > Character.MAX_RADIX)) {
            return '\0';
        }
        if (digit < 10) {
            return (char)('0' + digit);
        }
        return (char)('a' - 10 + digit);
    }	
	
    public static byte getDirectionality(char ch) {
        return getDirectionality((int)ch);
    }
	
    public static byte getDirectionality(int codePoint) {
        return CharacterData.of(codePoint).getDirectionality(codePoint);
    }

    public static boolean isMirrored(char ch) {
        return isMirrored((int)ch);
    }	
	
    public static boolean isMirrored(int codePoint) {
        return CharacterData.of(codePoint).isMirrored(codePoint);
    }
	
    static int toUpperCaseEx(int codePoint) {
        assert isValidCodePoint(codePoint);
        return CharacterData.of(codePoint).toUpperCaseEx(codePoint);
    }
	
    static char[] toUpperCaseCharArray(int codePoint) {
        // As of Unicode 6.0, 1:M uppercasings only happen in the BMP.
        assert isBmpCodePoint(codePoint);
        return CharacterData.of(codePoint).toUpperCaseCharArray(codePoint);
    }

    public static char reverseBytes(char ch) {
        return (char) (((ch & 0xFF00) >> 8) | (ch << 8));
    }
}
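
Both UnicodeBlock.of and UnicodeScript.of in the listing above map a code point to a range by searching a sorted array of range starts. The sketch below imitates the `Arrays.binarySearch` plus `-index - 2` variant on a five-block table and then calls the real `Character.UnicodeBlock` API. `RangeLookupSketch`, `STARTS`, `NAMES` and `nameOf` are hypothetical; the extra 0x02B0 entry is a sentinel added here so the small table has a defined end.

```java
import java.util.Arrays;

final class RangeLookupSketch {

    // Sorted range starts; range i covers STARTS[i] .. STARTS[i + 1] - 1.
    private static final int[] STARTS = {
        0x0000, 0x0080, 0x0100, 0x0180, 0x0250, 0x02B0
    };

    // Parallel array; null marks "outside this sketch's table".
    private static final String[] NAMES = {
        "Basic Latin", "Latin-1 Supplement", "Latin Extended-A",
        "Latin Extended-B", "IPA Extensions", null
    };

    static String nameOf(int codePoint) {
        if (codePoint < 0) {
            throw new IllegalArgumentException("negative code point");
        }
        int index = Arrays.binarySearch(STARTS, codePoint);
        if (index < 0) {
            // Not an exact start: binarySearch returned -(insertionPoint) - 1,
            // and the containing range is the one just before the insertion point.
            index = -index - 2;
        }
        return NAMES[index];
    }

    public static void main(String[] args) {
        System.out.println(nameOf('A'));
        System.out.println(nameOf(0x00E9));

        // The real API: of() finds the block, forName() looks it up by name or alias.
        System.out.println(Character.UnicodeBlock.of('A'));
        System.out.println(Character.UnicodeBlock.forName("Basic Latin")
                == Character.UnicodeBlock.BASIC_LATIN);
    }
}
```

Because Subset declares equals as a final identity comparison, comparing blocks with `==` as in the last line is consistent with how the class itself defines equality.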

package java.lang;

abstract class CharacterData {
    abstract int getProperties(int ch);
    abstract int getType(int ch);
    abstract boolean isWhitespace(int ch);
    abstract boolean isMirrored(int ch);
    abstract boolean isJavaIdentifierStart(int ch);
    abstract boolean isJavaIdentifierPart(int ch);
    abstract boolean isUnicodeIdentifierStart(int ch);
    abstract boolean isUnicodeIdentifierPart(int ch);
    abstract boolean isIdentifierIgnorable(int ch);
    abstract int toLowerCase(int ch);
    abstract int toUpperCase(int ch);
    abstract int toTitleCase(int ch);
    abstract int digit(int ch, int radix);
    abstract int getNumericValue(int ch);
    abstract byte getDirectionality(int ch);

    //need to implement for JSR204
    int toUpperCaseEx(int ch) {
        return toUpperCase(ch);
    }

    char[] toUpperCaseCharArray(int ch) {
        return null;
    }

    boolean isOtherLowercase(int ch) {
        return false;
    }

    boolean isOtherUppercase(int ch) {
        return false;
    }

    boolean isOtherAlphabetic(int ch) {
        return false;
    }

    boolean isIdeographic(int ch) {
        return false;
    }

    // Character <= 0xff (basic latin) is handled by internal fast-path
    // to avoid initializing large tables.
    // Note: performance of this "fast-path" code may be sub-optimal
    // in negative cases for some accessors due to complicated ranges.
    // Should revisit after optimization of table initialization.

    static final CharacterData of(int ch) {
        if (ch >>> 8 == 0) {     // fast-path
            return CharacterDataLatin1.instance;
        } else {
            switch(ch >>> 16) {  //plane 00-16
            case(0):
                return CharacterData00.instance;
            case(1):
                return CharacterData01.instance;
            case(2):
                return CharacterData02.instance;
            case(14):
                return CharacterData0E.instance;
            case(15):   // Private Use
            case(16):   // Private Use
                return CharacterDataPrivateUse.instance;
            default:
                return CharacterDataUndefined.instance;
            }
        }
    }
}
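
The dispatch in CharacterData.of above comes down to two shifts: `ch >>> 8 == 0` selects the Latin-1 fast path, and `ch >>> 16` yields the Unicode plane number. The sketch below mirrors only the listing shown here and returns descriptive labels instead of table instances; `PlaneDispatchSketch`, `tableFor` and the label strings are made up for the illustration, and other JDK versions may dispatch differently.

```java
final class PlaneDispatchSketch {

    // Returns a label for the table that the listing above would pick.
    static String tableFor(int codePoint) {
        if (codePoint >>> 8 == 0) {
            return "Latin1";               // U+0000..U+00FF fast path
        }
        switch (codePoint >>> 16) {        // plane number
            case 0:
                return "Plane00";
            case 1:
                return "Plane01";
            case 2:
                return "Plane02";
            case 14:
                return "Plane0E";
            case 15:
            case 16:
                return "PrivateUse";
            default:
                return "Undefined";
        }
    }

    public static void main(String[] args) {
        int[] samples = {'A', 0x4E2D, 0x1F600, 0x20000, 0xF0000, 0x50000};
        for (int codePoint : samples) {
            System.out.println(Integer.toHexString(codePoint) + " -> " + tableFor(codePoint));
        }
    }
}
```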


class CharacterData00 extends CharacterData {
    int getProperties(int ch) {
        char offset = (char)ch;
        int props = A[Y[X[offset>>5]|((offset>>1)&0xF)]|(offset&0x1)];
        return props;
    }

    int getPropertiesEx(int ch) {
        char offset = (char)ch;
        int props = B[Y[X[offset>>5]|((offset>>1)&0xF)]|(offset&0x1)];
        return props;
    }

    int getType(int ch) {
        int props = getProperties(ch);
        return (props & 0x1F);
    }
	
    boolean isJavaIdentifierPart(int ch) {
        int props = getProperties(ch);
        return ((props & 0x00003000) != 0);
    }

    boolean isUnicodeIdentifierStart(int ch) {
        int props = getProperties(ch);
        return ((props & 0x00007000) == 0x00007000);
    }

    boolean isUnicodeIdentifierPart(int ch) {
        int props = getProperties(ch);
        return ((props & 0x00001000) != 0);
    }
	
    int toLowerCase(int ch) {
        int mapChar = ch;
        int val = getProperties(ch);

        if ((val & 0x00020000) != 0) {
          if ((val & 0x07FC0000) == 0x07FC0000) {
            switch(ch) {
              // map the offset overflow chars
            case 0x0130 : mapChar = 0x0069; break;
            case 0x2126 : mapChar = 0x03C9; break;
            case 0x212A : mapChar = 0x006B; break;
            case 0x212B : mapChar = 0x00E5; break;
            case 0xA78D : mapChar = 0x0265; break;
            case 0xA7AA : mapChar = 0x0266; break;
              // default mapChar is already set, so no
              // need to redo it here.
              // default       : mapChar = ch;
            }
          }
          else {
            int offset = val << 5 >> (5+18);
            mapChar = ch + offset;
          }
        }
        return mapChar;
    }	

    static {
            charMap = new char[][][] {
        { {'\u00DF'}, {'\u0053', '\u0053', } },
        { {'\u0130'}, {'\u0130', } },
        { {'\u0149'}, {'\u02BC', '\u004E', } },
        { {'\uFB13'}, {'\u0544', '\u0546', } },
        { {'\uFB14'}, {'\u0544', '\u0535', } },
        { {'\uFB15'}, {'\u0544', '\u053B', } },
        { {'\uFB16'}, {'\u054E', '\u0546', } },
        { {'\uFB17'}, {'\u0544', '\u053D', } },
    };
        { // THIS CODE WAS AUTOMATICALLY CREATED BY GenerateCharacter:
            char[] data = A_DATA.toCharArray();
            assert (data.length == (930 * 2));
            int i = 0, j = 0;
            while (i < (930 * 2)) {
                int entry = data[i++] << 16;
                A[j++] = entry | data[i++];
            }
        }

    }        
}
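
The expression `val << 5 >> (5+18)` in toLowerCase above is compact, so here is a sketch of what it does: it reads bits 18..26 of the property word as a signed 9-bit offset. The `pack` encoder, the class name and the sample property words are assumptions made for this example; they are not values taken from the JDK's generated tables.

```java
final class CaseOffsetSketch {

    // Flag bit tested by the listing before applying the offset.
    static final int HAS_LOWERCASE_MAPPING = 0x00020000;

    // Hypothetical encoder: stores a signed 9-bit offset in bits 18..26.
    static int pack(int offset) {
        return HAS_LOWERCASE_MAPPING | ((offset & 0x1FF) << 18);
    }

    // Same expression as the listing: the left shift moves bit 26 into the sign
    // bit, then the arithmetic right shift sign-extends the field.
    static int decodeOffset(int props) {
        return props << 5 >> (5 + 18);
    }

    static int toLowerCase(int ch, int props) {
        if ((props & HAS_LOWERCASE_MAPPING) == 0) {
            return ch;                     // no mapping recorded
        }
        return ch + decodeOffset(props);
    }

    public static void main(String[] args) {
        System.out.println(decodeOffset(pack(32)));
        System.out.println(decodeOffset(pack(-32)));
        System.out.println((char) toLowerCase('A', pack(32)));
    }
}
```

The field value with all nine bits set (mask 0x07FC0000) is the one pattern the listing treats specially: it marks the "offset overflow" characters handled by the switch statement, so the sketch avoids packing an offset of -1.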

package java.lang;

/** The CharacterData class encapsulates the large tables found in
    Java.lang.Character. */

class CharacterDataPrivateUse extends CharacterData {

    int getProperties(int ch) {
        return 0;
    }

    int getType(int ch) {
	return (ch & 0xFFFE) == 0xFFFE
	    ? Character.UNASSIGNED
	    : Character.PRIVATE_USE;
    }

    boolean isJavaIdentifierStart(int ch) {
		return false;
    }

    boolean isJavaIdentifierPart(int ch) {
		return false;
    }

    boolean isUnicodeIdentifierStart(int ch) {
		return false;
    }

    boolean isUnicodeIdentifierPart(int ch) {
		return false;
    }

    boolean isIdentifierIgnorable(int ch) {
		return false;
    }

    int toLowerCase(int ch) {
		return ch;
    }

    int toUpperCase(int ch) {
		return ch;
    }

    int toTitleCase(int ch) {
		return ch;
    }

    int digit(int ch, int radix) {
		return -1;
    }

    int getNumericValue(int ch) {
		return -1;
    }

    boolean isWhitespace(int ch) {
		return false;
    }

    byte getDirectionality(int ch) {
	return (ch & 0xFFFE) == 0xFFFE
	    ? Character.DIRECTIONALITY_UNDEFINED
	    : Character.DIRECTIONALITY_LEFT_TO_RIGHT;
    }

    boolean isMirrored(int ch) {
		return false;
    }

    static final CharacterData instance = new CharacterDataPrivateUse();
    private CharacterDataPrivateUse() {};
}
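
The effect of the `(ch & 0xFFFE) == 0xFFFE` test in CharacterDataPrivateUse can be observed through the public API. In the sketch below, `PrivateUseDemo` and `typeName` are hypothetical names; `Character.getType`, `Character.PRIVATE_USE` and `Character.UNASSIGNED` are the real members.

```java
final class PrivateUseDemo {

    // Hypothetical helper: names the two categories this class can report.
    static String typeName(int codePoint) {
        int type = Character.getType(codePoint);
        if (type == Character.PRIVATE_USE) {
            return "PRIVATE_USE";
        }
        if (type == Character.UNASSIGNED) {
            return "UNASSIGNED";
        }
        return "other(" + type + ")";
    }

    public static void main(String[] args) {
        System.out.println(typeName(0xF0000));    // first code point of plane 15
        System.out.println(typeName(0xFFFFE));    // low 16 bits are FFFE
        System.out.println(typeName(0x10FFFD));   // low 16 bits are FFFD
        System.out.println(typeName(0x10FFFF));   // low 16 bits are FFFF
    }
}
```

Only the last two code points of each private-use plane match the mask, which is why they come back as UNASSIGNED while the rest of planes 15 and 16 are PRIVATE_USE.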

package java.lang;

/** The CharacterData class encapsulates the large tables found in
    Java.lang.Character. */

class CharacterDataUndefined extends CharacterData {

    int getProperties(int ch) {
        return 0;
    }

    int getType(int ch) {
	return Character.UNASSIGNED;
    }

    boolean isJavaIdentifierStart(int ch) {
		return false;
    }

    boolean isJavaIdentifierPart(int ch) {
		return false;
    }

    boolean isUnicodeIdentifierStart(int ch) {
		return false;
    }

    boolean isUnicodeIdentifierPart(int ch) {
		return false;
    }

    boolean isIdentifierIgnorable(int ch) {
		return false;
    }

    int toLowerCase(int ch) {
		return ch;
    }

    int toUpperCase(int ch) {
		return ch;
    }

    int toTitleCase(int ch) {
		return ch;
    }

    int digit(int ch, int radix) {
		return -1;
    }

    int getNumericValue(int ch) {
		return -1;
    }

    boolean isWhitespace(int ch) {
		return false;
    }

    byte getDirectionality(int ch) {
		return Character.DIRECTIONALITY_UNDEFINED;
    }

    boolean isMirrored(int ch) {
		return false;
    }

    static final CharacterData instance = new CharacterDataUndefined();
    private CharacterDataUndefined() {};
}
