Analysis of Java String Class

1. What makes String special

1.1 Object Creation

public class Test {
	public static void main(String[] args) {
		String str1 = new String("abcd");
		String str2 = "abcd";
	}
}

1.1.1 The object creation process is explained below:

Java's special management of the String class:

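>Every Java class has a constant pool, which is described in its class file (viewable with javap -v <fully qualified class name>). It holds literal values, identifiers, field names, class names, method names, and so on. For example, given String a = "astr"; int b = 1;, the identifiers a and b and the value "astr" are constant-pool entries, whereas 1 is embedded directly in the bytecode instruction. (Names of local variables appear there only when the class is compiled with debug information.)
>Each JVM instance also maintains a String pool in the method area. When a class is loaded, its constant pool is resolved: the string constants are first created as objects on the Java heap, then references to them are stored in the String pool, and finally those references are handed to the class's constant pool. (In HotSpot since JDK 7, the pooled strings live in the heap rather than in the method area.)
>== compares addresses (references). String overrides equals() to compare content, checking whether each character of the underlying char[] is equal.
>Personal guess: the String pool stores references rather than the instances themselves, and the instances live in space allocated on the heap. I have not yet found a good way to verify this.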

1.1.1.1 The creation process of String str1 = "abcd":

Implementation process:

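    >First the reference str1 is created on the stack. The String pool is then searched for a reference to an object whose content is "abcd". If none exists, an object is created on the Java heap, its reference is stored in the String pool, and that pooled reference is assigned to str1. If one already exists, it is assigned to str1 directly.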

Inference and verification:

    >If a second variable str2 = "abcd" is defined later, str2 simply points to the "abcd" already in the String pool and no new object is created, so str1 == str2 is true.
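
The inference above can be checked with a minimal sketch. The class and method names here are illustrative and not part of the original example.

```java
class LiteralPoolDemo {
    // Returns true when two identical literals refer to the same pooled object.
    static boolean literalsShareOneObject() {
        String str1 = "abcd";
        String str2 = "abcd";
        return str1 == str2; // reference comparison, not content comparison
    }

    public static void main(String[] args) {
        System.out.println(literalsShareOneObject()); // true
    }
}
```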

But one thing to note:

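    >The Java language provides special support for the string concatenation operator ("+") and for converting other objects to strings.
        >String concatenation is implemented through the StringBuilder (or StringBuffer) class and its append method: a builder or buffer object is created on the Java heap, the operands are appended, and the result is returned. (This describes javac for JDK 8 and earlier; later compilers may use a different strategy.)
        >String conversion is implemented through the toString method, which is defined by Object and inherited by every class in Java.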

Validation of note points:

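        >If str2 holds "abc" and is then concatenated with str2 = str2 + "d", str2 now points to a new object with content "abcd" created on the heap. str1 == str2 therefore returns false, because the addresses differ.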
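
A small sketch of this point, contrasting concatenation evaluated at run time with concatenation of two literals, which the compiler folds into a single constant. The class and method names are assumptions made for illustration.

```java
class ConcatDemo {
    // Concatenation involving a non-constant variable produces a new object.
    static boolean runtimeConcatIsPooled() {
        String str1 = "abcd";
        String str2 = "abc";
        str2 = str2 + "d"; // evaluated at run time
        return str1 == str2;
    }

    // Concatenation of two literals is a compile-time constant and is pooled.
    static boolean constantConcatIsPooled() {
        String str1 = "abcd";
        String folded = "abc" + "d"; // folded into "abcd" by the compiler
        return str1 == folded;
    }

    public static void main(String[] args) {
        System.out.println(runtimeConcatIsPooled());  // false
        System.out.println(constantConcatIsPooled()); // true
    }
}
```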

1.1.1.2 The creation process of String str = new String("abcd"):

Implementation process:

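>An object is created directly on the heap and its reference is returned. If String str3 = new String("abcd") follows later, str3 does not point to the object in the String pool; another new object is created on the heap and str3 points to that.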

How to verify:

>At this point str2 == str3 and str1 == str3 both return false, because the objects have different addresses. str2.equals(str3) returns true, because the contents are the same.
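
The comparison can be sketched as follows. The variable names follow the text, while the class name, the method name, and the extra variable str4 are illustrative.

```java
import java.util.Arrays;

class NewStringDemo {
    // Returns { str2 == str3, str3 == str4, str2.equals(str3) }.
    static boolean[] compare() {
        String str2 = "abcd";             // pooled literal
        String str3 = new String("abcd"); // fresh heap object
        String str4 = new String("abcd"); // another fresh heap object
        return new boolean[] { str2 == str3, str3 == str4, str2.equals(str3) };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [false, false, true]
    }
}
```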

One thing to note:

>str.intern() looks up the content of the String that str points to in the String pool, adds it if it is not already there, and returns the pooled reference.
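
A minimal sketch of intern(), assuming the literal "abcd" is already pooled when the comparison runs. The class and method names are illustrative.

```java
import java.util.Arrays;

class InternDemo {
    // Returns { onHeap == pooled, onHeap.intern() == pooled }.
    static boolean[] compare() {
        String pooled = "abcd";             // literal, refers to the pooled object
        String onHeap = new String("abcd"); // distinct heap object, same content
        return new boolean[] { onHeap == pooled, onHeap.intern() == pooled };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(compare())); // [false, true]
    }
}
```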

1.2 The immutability of String objects

1.2.1 Why is it designed as final?

Because the creation and destruction of String objects is tied to JVM mechanisms such as the String pool, the class is declared final so that ordinary code cannot subclass it and change that behavior.
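
That String is declared final can be confirmed at run time with the standard reflection API. This is only a quick check with an illustrative class name; it does not explain the design decision.

```java
import java.lang.reflect.Modifier;

class FinalCheckDemo {
    // Reports whether java.lang.String carries the final modifier.
    static boolean isStringFinal() {
        return Modifier.isFinal(String.class.getModifiers());
    }

    public static void main(String[] args) {
        System.out.println(isStringFinal()); // true
    }
}
```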

1.2.2 We cannot use the API provided by String to change the content of the original object

public class Base {

	public static void main(String[] args) {
		String str1 = new String("abcd");
		String str2 = "abcd";
		String str3 = "abcd";
		String str4 = new String("abc") + "d";// concatenation; do not write "abc" + "d" directly, the compiler folds it into "abcd"

		System.out.println(str2 == str1);// false
		System.out.println(str2 == str3);// true
		System.out.println(str2 == str4);// false
		System.out.println(str2 == str1.intern() && str2 == str3.intern() && str2 == str4.intern());// true: all are references from the String pool

		String str5 = str1.replace("a", "z");// replace characters
		System.out.println(str1);
		System.out.println(str5);

		String str6 = str1.substring(2);// take a substring
		System.out.println(str1);
		System.out.println(str6);

		String str7 = str1.toLowerCase();// convert to lower case
		System.out.println(str1);
		System.out.println(str7);

		String str8 = str1.toUpperCase();// convert to upper case
		System.out.println(str1);
		System.out.println(str8);

		String str9 = str1.trim();// strip leading and trailing whitespace
		System.out.println(str1);
		System.out.println(str9);
	}

}

None of the methods above change str1 or the String object it points to. Each one builds its result from a copy of the underlying char[] and returns that result as a String, leaving the original untouched.

1.2.3 But reflection can do it

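import java.lang.reflect.Field;

public class Test {
	public static void main(String[] args) throws NoSuchFieldException, SecurityException, IllegalArgumentException, IllegalAccessException {
		String str1 = new String("abcd");
		// One way to change str1: overwrite its private backing array via reflection.
		// This relies on JDK 8 or earlier, where the field "value" is a char[];
		// newer JDKs store a byte[] and restrict reflective access to java.lang.
		Class<?> clazz = str1.getClass();
		Field valueField = clazz.getDeclaredField("value");
		valueField.setAccessible(true);
		char[] str1Changed = new char[]{'z','x','c','v'};
		valueField.set(str1, str1Changed);
		System.out.println(str1);//zxcv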
	}
}

This uses reflection to modify the content of the object that str1 points to.

1.2.4 Advantages of Immutable Objects

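1 Inherent thread safety (the object can only be read, never modified)
2 Better performance (instances can be cached, so memory does not have to be allocated and initialized every time)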
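
The first advantage can be illustrated with a minimal sketch: two threads read the same String without any locking, and each derives its own new String. The class and method names are illustrative.

```java
import java.util.Locale;

class SharedStringDemo {
    // Two threads read the same String concurrently; neither can alter it.
    static String[] deriveInParallel(String shared) {
        String[] results = new String[2];
        Thread upper = new Thread(() -> results[0] = shared.toUpperCase(Locale.ROOT));
        Thread reversed = new Thread(
                () -> results[1] = new StringBuilder(shared).reverse().toString());
        upper.start();
        reversed.start();
        try {
            // join() makes the writes to results visible to the calling thread.
            upper.join();
            reversed.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        String shared = "abcd";
        String[] results = deriveInParallel(shared);
        System.out.println(results[0]); // ABCD
        System.out.println(results[1]); // dcba
        System.out.println(shared);     // abcd
    }
}
```

No synchronization is needed around shared itself, because every operation on it returns a new object instead of modifying it.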

2. About characters, encoding and garbled characters

A String is a string of "characters". A character here is the abstract symbol we normally have in mind (such as "a", "b", "中"), but a computer cannot store an abstract symbol; it can only store numeric values. The correspondence between values and symbols is therefore defined by a mapping (a coded character set), and that mapping is what allows the characters to be displayed. Garbled text on a computer usually has one of the following causes:

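First, the byte sequence itself is damaged (the file is corrupted; this is rare and basically unrecoverable);
Second, the decoding is wrong (this is common: the data was encoded as UTF-8 but is decoded as GBK; the fix is to change the decoding);
Third, there is no way to display the characters (also common: the file is UTF-8 and is decoded as UTF-8, but some code points belong to foreign scripts that cannot be rendered; the fix is to install the missing font or character set support).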

Refer to the following code:

import java.io.UnsupportedEncodingException;

public class Test {
	public static void main(String[] args) throws UnsupportedEncodingException {
		String str = "wo是中国人";// source file assumed to be saved as UTF-8
		byte [] gbkbytes = str.getBytes("GBK");// encode as GBK
		byte [] utf8bytes = str.getBytes("UTF-8");// encode as UTF-8
		System.out.println(gbkbytes.length);
		assert gbkbytes.length == 10;// 1 byte per ASCII letter, 2 per Chinese character
		System.out.println(utf8bytes.length);
		assert utf8bytes.length == 14;// 1 byte per ASCII letter, 3 per Chinese character
		// decode gbkbytes as UTF-8 (mismatched charset)
		System.out.println(new String(gbkbytes, "UTF-8"));// garbled, e.g. wo���й�
		// decode gbkbytes as GBK (matching charset)
		System.out.println(new String(gbkbytes, "GBK"));//wo是中国人
	}
	
}
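
The mapping between a symbol, its numeric value, and its bytes can also be made visible directly. The sketch below is illustrative only: the class and the helpers hex and misdecodedLength are hypothetical, and ISO-8859-1 stands in for "the wrong charset" so that only charsets from StandardCharsets are needed.

```java
import java.nio.charset.StandardCharsets;

class CodePointDemo {
    // Hypothetical helper: renders bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Encodes as UTF-8, then decodes with the wrong charset (ISO-8859-1).
    static int misdecodedLength(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1).length();
    }

    public static void main(String[] args) {
        String s = "\u4E2D"; // the character 中, escaped so the source file encoding does not matter
        System.out.println(Integer.toHexString(s.codePointAt(0)));      // 4e2d
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_8)));    // E4 B8 AD
        System.out.println(hex(s.getBytes(StandardCharsets.UTF_16BE))); // 4E 2D
        System.out.println(misdecodedLength(s));                        // 3
    }
}
```

The same code point yields different byte sequences under different encodings, and decoding three UTF-8 bytes with a single-byte charset produces three unrelated characters instead of one.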
