String s = new String("xyz") How many instances are created?

Quote

Problem: Java code

String s = new String("xyz");

How many String Objects have been created?

As posed, this question has no reasonable answer.

Quote

Answer: Two (one is "xyz" and the other is the reference object s pointing to "xyz")
(Yes, there is a lot to complain about in this answer... everyone stay calm.)

What's wrong with this problem? It does not define the meaning of "created".
What is "created"? When was it created?
And when this piece of Java code actually runs, will it really "create two String instances"?

If this is an interview question, you can ask the interviewer in person to clarify what "created" means, and then answer accordingly. At that point the interviewer will most likely let the candidate explain it in their own words, which makes things easy: just walk the interviewer through it.

If it is a written test, there is no opportunity to ask for clarification. However, the places that set this kind of question are mostly not very good anyway. The person who wrote it probably copied it from some book, so you can get by with writing the same wrong answer the books give.

=======================================================

Let's ask a different question instead:

Java code

String s = new String("xyz");

How many String instances are involved at runtime?

A reasonable answer is:

Answer: Two. One is the instance corresponding to the string literal "xyz", which is interned in a globally shared string constant pool; the other is the instance created and initialized by new String(String), with the same content as "xyz".
This is a reasonable answer that can be derived from the relevant provisions of the Java Language Specification. Note that the Java Language Specification explicitly states:

The Java Language Specification, Third Edition writes

The Java programming language is normally compiled to the bytecoded instruction set and binary format defined in The Java Virtual Machine Specification, Second Edition (Addison-Wesley, 1999).

In other words, the specification stipulates that the Java language is normally compiled into the Class file format defined by the Java Virtual Machine Specification, but it does not say "must", which leaves room for implementing the Java language without a JVM.
Turning to the Java Virtual Machine Specification, the only constant of type CONSTANT_String_info involved in this code is indeed the string constant "xyz". CONSTANT_String_info is the constant type used to express the values of constant expressions of type String (including string literals) in the Java language. Considered at this level alone, the answer holds up.
So this answer can be considered reasonable.
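
The two instances in this answer can be observed directly with reference comparison. The following is a minimal sketch, not part of the original post; the class name LiteralVersusNew and the helper sameInstance are made up for illustration.

```java
final class LiteralVersusNew {
    // Hypothetical helper: true only when both references point at the very same String instance.
    static boolean sameInstance(String a, String b) {
        return a == b;
    }

    public static void main(String[] args) {
        String literal = "xyz";        // the interned instance that represents the literal
        String s = new String("xyz");  // a second instance, created by the new expression

        System.out.println(sameInstance(s, literal));          // false: two distinct instances
        System.out.println(s.equals(literal));                 // true: same contents
        System.out.println(sameInstance(s.intern(), literal)); // true: intern() returns the pooled instance
    }
}
```

The first comparison is false because a class instance creation expression always yields a new object; the last is true because string literals are interned.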

It is worth noting that "at runtime" in the question includes both the class loading phase and the execution time of the code fragment itself. The relationship between this detail and the original question given by the original poster will be discussed below.

When you run into this kind of question, the first thing to do is consult the relevant specifications, here the Java Language Specification and the Java Virtual Machine Specification, along with the JavaDoc of the relevant APIs. Many people like to throw the word "reasonable" around. The specifications are what define the various "reasons": "Why does XXX mean YYY?" "Because the specification defines it that way!" That argument is unbeatable.

Related definitions in the Java Virtual Machine Specification are as follows:

The Java Virtual Machine Specification, Second Edition writes

2.3 Literals

A literal is the source code representation of a value of a primitive type (§2.4.1), the String type (§2.4.8), or the null type (§2.4). String literals and, more generally, strings that are the values of constant expressions are “interned” so as to share unique instances, using the method String.intern.

The null type has one value, the null reference, denoted by the literal null. The boolean type has two values, denoted by the literals true and false.

2.4.8 The Class String

Instances of class String represent sequences of Unicode characters (§2.1). A String object has a constant, unchanging value. String literals (§2.3) are references to instances of class String.

2.17.6 Creation of New Class Instances

A new class instance is explicitly created when one of the following situations occurs:

Evaluation of a class instance creation expression creates a new instance of the class whose name appears in the expression.
Invocation of the newInstance method of class Class creates a new instance of the class represented by the Class object for which the method was invoked.

A new class instance may be implicitly created in the following situations:

Loading of a class or interface that contains a String literal may create a new String object (§2.4.8) to represent that literal. This may not occur if the a String object has already been created to represent a previous occurrence of that literal, or if the String.intern method has been invoked on a String object representing the same string as the literal.
Execution of a string concatenation operator that is not part of a constant expression sometimes creates a new String object to represent the result. String concatenation operators may also create temporary wrapper objects for a value of a primitive type (§2.4.1).

Each of these situations identifies a particular constructor to be called with specified arguments (possibly none) as part of the class instance creation process.

5.1 The Runtime Constant Pool

● A string literal (§2.3) is derived from a CONSTANT_String_info structure (§4.4.3) in the binary representation of a class or interface. The CONSTANT_String_info structure gives the sequence of Unicode characters constituting the string literal.

● The Java programming language requires that identical string literals (that is, literals that contain the same sequence of characters) must refer to the same instance of class String. In addition, if the method String.intern is called on any string, the result is a reference to the same class instance that would be returned if that string appeared as a literal. Thus,
Java code

("a" + "b" + "c").intern() == "abc"

must have the value true.

● To derive a string literal, the Java virtual machine examines the sequence of characters given by the CONSTANT_String_info structure.

○ If the method String.intern has previously been called on an instance of class String containing a sequence of Unicode characters identical to that given by the CONSTANT_String_info structure, then the result of string literal derivation is a reference to that same instance of class String.

○ Otherwise, a new instance of class String is created containing the sequence of Unicode characters given by the CONSTANT_String_info structure; that class instance is the result of string literal derivation. Finally, the intern method of the new String instance is invoked.

The remaining structures in the constant_pool table of the binary representation of a class or interface, the CONSTANT_NameAndType_info (§4.4.6) and CONSTANT_Utf8_info (§4.4.7) structures are only used indirectly when deriving symbolic references to classes, interfaces, methods, and fields, and when deriving string literals.
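
Stepping outside the quoted specification for a moment, the difference between a constant expression and a concatenation performed at run time can be checked with a short program. This is a sketch only; the class name ConstantFolding and the helper concatAtRuntime are hypothetical names introduced for the illustration.

```java
final class ConstantFolding {
    // Hypothetical helper: the parameters are not constant expressions,
    // so this concatenation is evaluated while the program runs.
    static String concatAtRuntime(String left, String right) {
        return left + right;
    }

    public static void main(String[] args) {
        String folded = "a" + "b" + "c";           // constant expression, so the result is interned
        String built = concatAtRuntime("ab", "c"); // same characters, built at run time

        System.out.println(folded == "abc");         // true: same interned instance as the literal
        System.out.println(built == "abc");          // false: a separate, newly created instance
        System.out.println(built.intern() == "abc"); // true: intern() maps it back to the pooled one
    }
}
```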

Taking Sun's JDK as the reference implementation (RI), the JavaDoc for String.intern() reads:

JavaDoc writes

public String intern()

Returns a canonical representation for the string object.

A pool of strings, initially empty, is maintained privately by the class String.

When the intern method is invoked, if the pool already contains a string equal to this String object as determined by the equals(Object) method, then the string from the pool is returned. Otherwise, this String object is added to the pool and a reference to this String object is returned.

It follows that for any two strings s and t, s.intern() == t.intern() is true if and only if s.equals(t) is true.

All literal strings and string-valued constant expressions are interned. String literals are defined in §3.10.5 of the Java Language Specification

Returns:
a string that has the same contents as this string, but is guaranteed to be from a pool of unique strings.
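
The JavaDoc's statement that s.intern() == t.intern() holds exactly when s.equals(t) can be tried out as follows. This is an illustrative sketch; InternInvariant and holdsFor are assumed names, not anything from the JDK.

```java
final class InternInvariant {
    // Hypothetical helper: checks that identity of the interned strings
    // agrees with equality of the original strings.
    static boolean holdsFor(String s, String t) {
        return (s.intern() == t.intern()) == s.equals(t);
    }

    public static void main(String[] args) {
        String a = new String("xyz");
        String b = new String("xyz");
        String c = new String("abc");

        System.out.println(a == b);         // false: two separate instances
        System.out.println(holdsFor(a, b)); // true: equal strings, same interned instance
        System.out.println(holdsFor(a, c)); // true: unequal strings, different interned instances
    }
}
```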

=======================================================

Ask another question:

String s = new String("xyz");

How many user-declared variables of type String are involved?

The answer is also very simple:

Answer: One, namely String s.

Change the question to the following version, and the answer is the same:

String s = null;

How many user-declared variables of type String are involved?

Variables in Java are just variables. A variable of reference type holds only a reference to an object instance, or null; it is not the instance itself. The number of declared variables has no necessary relationship to the number of instances created. For example:

String s1 = "a";
String s2 = s1.concat("");
String s3 = null;
new String(s1);
This code involves 3 variables of type String:
1. s1, pointing to String instance 1 below
2. s2, pointing to the same instance as s1
3. s3, whose value is null and which points to no instance

And 3 String instances:

1. The interned String instance corresponding to the "a" literal

2. The interned String instance corresponding to the "" literal

(String.concat() is an interesting method: it returns this when the argument is an empty string, so no additional String instance is created here)

3. A new String instance created by new String(String); no variable points to it.
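
The two claims that matter in this count, that concat("") returns the receiver itself and that new String(String) yields a separate instance, can be verified with reference comparison. A minimal sketch follows; the class and method names are hypothetical and chosen only for this illustration.

```java
final class VariablesVersusInstances {
    // Hypothetical helper: String.concat is documented to return this String
    // when the argument has length 0, so no new instance appears.
    static boolean concatEmptyReturnsThis(String s) {
        return s.concat("") == s;
    }

    // Hypothetical helper: a new expression always produces a distinct instance.
    static boolean copyIsDistinct(String s) {
        return new String(s) != s;
    }

    public static void main(String[] args) {
        String s1 = "a";
        String s2 = s1.concat("");
        String s3 = null;

        System.out.println(s1 == s2);                   // true: both variables refer to one instance
        System.out.println(s3 == null);                 // true: a variable, but no instance behind it
        System.out.println(concatEmptyReturnsThis(s1)); // true
        System.out.println(copyIsDistinct(s1));         // true
    }
}
```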

Back to the question and the "standard answer" cited at the beginning:

String s = new String("xyz");

How many String Objects have been created?

Answer: Two (one is "xyz" and the other is the reference object s pointing to "xyz")

This can be refuted by reductio ad absurdum. Assume the question means "How many String instances are created while executing this code snippet?". If the "standard answer" were correct, the following code fragment would create 4 String instances when executed:

String s1 = new String("xyz");
String s2 = new String("xyz");
Someone will immediately jump in and point out that the two "xyz" literals refer to the same String object, so it cannot be that 4 objects are created.

How many should it be?

The class loading process at runtime and the actual execution of a certain code fragment must be discussed separately to be meaningful.

To execute the code fragment in the question, the class that contains it must be loaded first, and the same class is loaded at most once (note that for the JVM, having the same fully qualified name is not enough to make two classes "the same class"; the pair <fully qualified class name, defining class loader> must be the same).

According to the specification quoted above, a conforming JVM implementation should, during class loading, create and intern a String instance as the constant corresponding to the "xyz" literal; specifically, this is done in the resolution phase of class loading. This constant is shared globally, and a new String instance needs to be created only if no string with the same content has been interned before.

When the code fragment in the original question is actually executed, the bytecode that the JVM needs to execute is similar to this:

Java bytecode code:

0: new #2; //class java/lang/String
3: dup
4: ldc #3; //String xyz
6: invokespecial #4; //Method java/lang/String."<init>":(Ljava/lang/String;)V
9: astore_1
The number of String objects created is exactly the number of times new java/lang/String appears here. In other words, the code in the original question creates exactly one new String instance each time it is executed.
Here, the ldc instruction merely pushes a reference to the String object ("xyz") that was already created during class loading onto the top of the operand stack; it does not create a new String object.

So the code snippet used above for the reductio:

String s1 = new String("xyz");
String s2 = new String("xyz");
creates exactly 2 new String instances each time it is executed.
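
One way to check this count is to run the fragment repeatedly and collect the resulting references in an identity-based set, which distinguishes instances rather than contents. The sketch below is an illustration under that assumption; CountNewStrings and distinctInstances are hypothetical names.

```java
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Set;

final class CountNewStrings {
    // Hypothetical helper: executes the two-statement fragment `runs` times and
    // counts the distinct String instances it produced, compared by identity.
    static int distinctInstances(int runs) {
        Set<String> seen = Collections.newSetFromMap(new IdentityHashMap<String, Boolean>());
        for (int i = 0; i < runs; i++) {
            String s1 = new String("xyz");
            String s2 = new String("xyz");
            seen.add(s1);
            seen.add(s2);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctInstances(1)); // 2
        System.out.println(distinctInstances(3)); // 6
    }
}
```

The interned "xyz" instance never enters the set, because the fragment only passes it to the constructor; the count grows by two per execution and nothing else.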


To keep some readers from getting confused, let me emphasize once more:

In the Java language, the "new" expression is responsible for creating an instance, and as part of that the constructor is called to initialize the instance. The return type of the constructor itself is void; it is not the case that "the constructor returns a reference to the newly created object". It is the value of the new expression that is a reference to the newly created object.

Correspondingly, in the JVM, the "new" bytecode instruction is responsible only for creating the instance (allocating space, setting the type, setting default values for all fields, and so on) and pushing the reference to the newly created object onto the top of the operand stack. At this point the reference is in an uninitialized state and cannot be used directly; if a method a contains code that tries to call any instance method through a reference in the uninitialized state, method a fails JVM bytecode verification and the JVM refuses to execute it.

The only thing that can be done with a reference in the uninitialized state is to call an instance constructor through it, which at the Class file level is represented as the special initialization method "<init>". The actual call instruction is invokespecial, and the required arguments must be pushed onto the operand stack in order before the call. In the bytecode example above, two instructions push the arguments, dup and ldc: they push, respectively, the hidden argument (the reference to the newly created instance, which is "this" from the instance constructor's point of view) and the first explicitly declared argument (a reference to the "xyz" constant) onto the operand stack.

After the constructor returns, the reference of the newly created instance can be used normally.
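
At the Java language level, one visible consequence is that a new expression only yields its reference once the constructor has returned normally. The sketch below illustrates this with a constructor that may throw; the classes ConstructorCompletion and Guarded and the method tryCreate are hypothetical and exist only for this example.

```java
final class ConstructorCompletion {
    // Hypothetical class whose constructor rejects null.
    static final class Guarded {
        final String value;

        Guarded(String value) {
            if (value == null) {
                throw new IllegalArgumentException("value must not be null");
            }
            this.value = value;
        }
    }

    // Returns the reference produced by the new expression,
    // or null if the constructor threw before completing.
    static Guarded tryCreate(String value) {
        Guarded result = null;
        try {
            // The assignment happens only after the constructor returns normally.
            result = new Guarded(value);
        } catch (IllegalArgumentException e) {
            // The new expression completed abruptly, so it never produced a value.
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(tryCreate("xyz") != null); // true
        System.out.println(tryCreate(null) == null);  // true
    }
}
```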

Original link: https://www.iteye.com/blog/rednaxelafx-774673


Origin blog.csdn.net/weixin_46699878/article/details/110688570