I said I was good with String, and the interviewer actually asked me: does a Java String have a length limit?!

String is a very important data type in Java. Apart from the primitive types it is probably the most widely used, yet there are still many things about String that are easy to overlook.

In this article we discuss one such question: does a String in Java have a length limit?

The question has to be looked at in two phases, compile time and run time. The limit is different in each.

Compile time

First, let's make a reasonable inference: when we define a String object in code in the form String s = "";, is there a limit on the number of characters between the quotes?

A reasonable inference needs a sound basis, so we can start from the String source code. In the constructor public String(char value[], int offset, int count), count is of type int, so char value[] can hold at most Integer.MAX_VALUE, that is 2147483647, characters. (jdk1.8.0_73)
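As a small illustration of that constructor, here is a minimal sketch. The class name ConstructorDemo, the helper slice and the array contents are made up for this example; only the String constructor, length() and Integer.MAX_VALUE come from the JDK.

```java
// Hypothetical demo class, not part of the JDK or of the original article.
class ConstructorDemo {

    // Builds a String from `count` chars of `value`, starting at index `offset`.
    static String slice(char[] value, int offset, int count) {
        return new String(value, offset, count);
    }

    public static void main(String[] args) {
        char[] value = {'h', 'e', 'l', 'l', 'o'};
        String s = slice(value, 1, 3);
        System.out.println(s);                  // ell
        System.out.println(s.length());         // 3, and length() returns an int
        System.out.println(Integer.MAX_VALUE);  // 2147483647
    }
}
```

Because both count and the return type of length() are int, no String can report more than Integer.MAX_VALUE characters.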

However, experiments show that a literal in String s = ""; can hold at most 65,534 characters. If it exceeds that number, compilation fails.

public static void main(String[] args) {

    String s = "a...a"; // 65534 a's in total
    System.out.println(s.length());

    String s1 = "a...a"; // 65535 a's in total
    System.out.println(s1.length());
}

The code above fails to compile at String s1 = "a...a"; // 65535 a's in total:

✗ javac StringLenghDemo.java
StringLenghDemo.java:11: error: constant string too long
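The literals in the demo are elided, and typing 65,534 characters by hand is not practical. One way to reproduce the experiment is to generate the source file. The following is only a sketch: LiteralSourceGenerator and the generated class names are hypothetical and do not appear in the original demo.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of the original demo.
class LiteralSourceGenerator {

    // Returns the source of a class whose only string literal is n 'a' characters.
    static String generate(String className, int n) {
        StringBuilder sb = new StringBuilder(n + 256);
        sb.append("public class ").append(className).append(" {\n");
        sb.append("    public static void main(String[] args) {\n");
        sb.append("        String s = \"");
        for (int i = 0; i < n; i++) {
            sb.append('a');
        }
        sb.append("\";\n");
        sb.append("        System.out.println(s.length());\n");
        sb.append("    }\n");
        sb.append("}\n");
        return sb.toString();
    }

    // Writes Literal<n>.java to the current directory; n defaults to 65534.
    public static void main(String[] args) throws IOException {
        int n = args.length > 0 ? Integer.parseInt(args[0]) : 65534;
        String className = "Literal" + n;
        Files.write(Paths.get(className + ".java"),
                generate(className, n).getBytes(StandardCharsets.US_ASCII));
        System.out.println("wrote " + className + ".java");
    }
}
```

Running it once with 65534 and once with 65535, then compiling each generated file with javac, should show the same boundary as the demo, although the exact message depends on the compiler version and locale.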

We just established that the length limit is 2147483647, so why does a literal of 65535 characters fail to compile?

When we define a String directly with a string literal, the string is stored in the constant pool of the class file. The 65534 above is in fact a limit of the constant pool.

Each data item in the constant pool has its own type. In Java, a UTF-8 encoded Unicode string is represented in the constant pool by the CONSTANT_Utf8 type.

CONSTANT_Utf8_info is the constant pool data item of type CONSTANT_Utf8, and it stores a constant string. Almost all literals in the constant pool are described by CONSTANT_Utf8_info. CONSTANT_Utf8_info is defined as follows:

CONSTANT_Utf8_info {
    u1 tag;
    u2 length;
    u1 bytes[length];
}

Since CONSTANT_Utf8_info is not the focus of this article, we will not go into detail here. We only need to know that a string literal we define is stored in the class file as a CONSTANT_Utf8_info, and that CONSTANT_Utf8_info has a u2 length; field giving the length of the data stored in it.
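To make the layout concrete, here is a minimal sketch that writes one such entry by hand. It relies on DataOutputStream.writeUTF, which emits a two-byte length followed by the modified UTF-8 bytes, the same shape as the length and bytes fields above. The class name Utf8InfoSketch and the method encode are made up for this example, and this is not how javac itself builds the constant pool.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical sketch of the tag / length / bytes layout of CONSTANT_Utf8_info.
class Utf8InfoSketch {

    static final int CONSTANT_UTF8_TAG = 1; // tag value of CONSTANT_Utf8

    // Encodes s as: u1 tag, u2 length, u1 bytes[length].
    static byte[] encode(String s) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            out.writeByte(CONSTANT_UTF8_TAG); // u1 tag
            out.writeUTF(s);                  // u2 length, then the encoded bytes
        } catch (IOException e) {
            // writeUTF rejects strings whose encoding does not fit the u2 length.
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode("Hi"))); // [1, 0, 2, 72, 105]
    }
}
```

For "Hi" the entry is five bytes: the tag, the two-byte length 2, and the bytes for H and i.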

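u2 is an unsigned 16-bit integer, so the largest length it can express is 2^16 - 1 = 65535 bytes. Class files store strings in a variant of UTF-8 (modified UTF-8), in which the null character is encoded in two bytes and other characters can take up to three, so the number of characters that fit can be smaller than the number of bytes. The 65534 we observed for an all-ASCII literal is one short of 65535. This appears to come from javac itself, which only accepts a string constant whose length is strictly less than 65535.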

The class file format specification states this clearly as well:

The length of field and method names, field and method descriptors, and other constant string values is limited to 65535 characters by the 16-bit unsigned length item of the CONSTANT_Utf8_info structure (§4.4.7). Note that the limit is on the number of bytes in the encoding and not on the number of encoded characters. UTF-8 encodes some characters using two or three bytes. Thus, strings incorporating multibyte characters are further constrained.

In other words, in Java any data that has to be stored in the constant pool can be at most 65,535 bytes long, and that of course includes string literals.
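Because the limit is counted in encoded bytes rather than characters, it can help to compute the encoded size of a string. The sketch below does that under the modified UTF-8 rules (one byte for \u0001 to \u007F, two bytes for \u0000 and \u0080 to \u07FF, three bytes otherwise) and cross-checks the result against DataOutputStream.writeUTF, which enforces the same two-byte length. The class Utf8LengthLimit and its methods are hypothetical names for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UTFDataFormatException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper: checks whether a string's encoding fits a u2 length field.
class Utf8LengthLimit {

    static final int MAX_U2 = 0xFFFF; // 65535

    // Number of bytes s occupies in modified UTF-8.
    static long modifiedUtf8Length(String s) {
        long bytes = 0;
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c >= 0x0001 && c <= 0x007F) {
                bytes += 1;
            } else if (c <= 0x07FF) {
                bytes += 2; // includes the null character
            } else {
                bytes += 3;
            }
        }
        return bytes;
    }

    static boolean fitsInU2(String s) {
        return modifiedUtf8Length(s) <= MAX_U2;
    }

    // Cross-check: writeUTF throws UTFDataFormatException above 65535 bytes.
    static boolean writeUtfAccepts(String s) {
        try (DataOutputStream out = new DataOutputStream(new ByteArrayOutputStream())) {
            out.writeUTF(s);
            return true;
        } catch (UTFDataFormatException e) {
            return false;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a string of n copies of c.
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        System.out.println(fitsInU2(repeat('a', 65535)));      // true
        System.out.println(fitsInU2(repeat('a', 65536)));      // false
        System.out.println(fitsInU2(repeat('\u4e2d', 21845))); // true: 21845 * 3 = 65535
        System.out.println(fitsInU2(repeat('\u4e2d', 21846))); // false: 65538 bytes
        System.out.println(modifiedUtf8Length(String.valueOf((char) 0))); // 2
    }
}
```

A string made of three-byte characters therefore runs out of room after about a third as many characters as an ASCII string.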

Runtime

The length limit described above is a compile-time limit: it applies only when a String is defined with a literal, as in String s = "";.
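A string that is built while the program runs never passes through the constant pool, so the literal limit does not apply to it. A minimal sketch, with the hypothetical class name RuntimeStringDemo and an arbitrary length of 100,000:

```java
import java.util.Arrays;

// Hypothetical demo: a String longer than 65534 characters, built at run time.
class RuntimeStringDemo {

    // Builds a string of n copies of c without using a long literal.
    static String repeat(char c, int n) {
        char[] chars = new char[n];
        Arrays.fill(chars, c);
        return new String(chars);
    }

    public static void main(String[] args) {
        String s = repeat('a', 100_000);
        System.out.println(s.length()); // 100000
    }
}
```

This compiles and runs normally, because the source contains no string literal anywhere near the limit.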

So is there a limit on a String at run time? Yes, and it is the Integer.MAX_VALUE we mentioned earlier. A String of that length takes up roughly 4 GB. At run time, attempting to build a String longer than this may throw an exception. (Before JDK 9)

int is a 32-bit type. Counting only the positive part, the maximum is

2^31 - 1 = 2147483647 16-bit Unicode characters

2147483647 * 16 = 34359738352 (bits)
34359738352 / 8 = 4294967294 (Byte)
4294967294 / 1024 = 4194303.998046875 (KB)
4194303.998046875 / 1024 = 4095.9999980926513671875 (MB)
4095.9999980926513671875 / 1024 = 3.99999999813735485076904296875 (GB)

That is a capacity of nearly 4 GB.
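The same arithmetic can be checked in code. This is only a sketch: the class and method names are invented here, and it assumes two bytes per character, as the calculation above does for JDK versions before 9.

```java
// Hypothetical helper that repeats the size calculation above.
class MaxStringSize {

    // Bytes needed for Integer.MAX_VALUE chars at 2 bytes each.
    // The cast to long avoids int overflow in the multiplication.
    static long maxBytes() {
        return (long) Integer.MAX_VALUE * Character.BYTES;
    }

    // Converts bytes to gigabytes using 1024-based units, as above.
    static double toGigabytes(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    public static void main(String[] args) {
        System.out.println(maxBytes());                            // 4294967294
        System.out.printf("%.2f GB%n", toGigabytes(maxBytes()));   // 4.00 GB
    }
}
```

Without the cast to long the multiplication would overflow int, which is the same 32-bit ceiling that limits the length in the first place.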

Origin juejin.im/post/5d53653f5188257315539f9a