Java Core Technology Interview Essentials (Lecture 7) | What is the difference between int and Integer?

Although Java is known as an object-oriented language, primitive data types are still an important part of it, so interviews often probe language features such as primitive data types and wrapper classes.

Today's question is: what is the difference between int and Integer? And what is the range of Integer's value cache?


Typical answer

int is the integer type we use every day. It is one of Java's 8 primitive data types (boolean, byte, short, char, int, float, double, long). Although the Java language is often said to treat everything as an object, primitive data types are the exception.

Integer is the wrapper class for int. It holds its data in a field of type int and provides basic operations such as arithmetic and conversion between int and String. Java 5 introduced autoboxing and auto-unboxing (boxing/unboxing), so Java can convert between the two automatically based on context, which greatly simplifies the related code.

The value cache of Integer is another improvement that came with Java 5. The traditional way to construct an Integer object was to call the constructor directly and create a new object. In practice, however, most operations turn out to involve a limited range of fairly small values, so Java 5 added the static factory method valueOf, which uses a caching mechanism when called and brings a noticeable performance improvement. According to the Javadoc, the default cache covers the values from -128 to 127.
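
As a minimal sketch of the caching behavior described above (the class name ValueOfCacheDemo is only an illustrative choice), the following compares the references returned by Integer.valueOf. Only the range from -128 to 127 is documented as always cached, so the sketch makes no claim about object identity outside it.

```java
class ValueOfCacheDemo {
    // True when two valueOf calls for the same int return the same cached object.
    static boolean sameInstance(int value) {
        return Integer.valueOf(value) == Integer.valueOf(value);
    }

    public static void main(String[] args) {
        // Inside the default cache range the same object is returned each time.
        System.out.println(sameInstance(127));
        System.out.println(sameInstance(-128));
        // Outside the range the result depends on the JVM and its settings,
        // so compare values with equals() rather than ==.
        System.out.println(Integer.valueOf(1000).equals(Integer.valueOf(1000)));
    }
}
```

The practical takeaway is to compare Integer objects with equals(), since == only happens to work for cached values.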

Key point analysis

Today's question covers two basic elements of Java: primitive data types and wrapper classes. From here it extends naturally to the autoboxing and auto-unboxing mechanisms, and then to the design and practice of the wrapper classes. Frankly, understanding the basic principles and usage is enough for everyday work, but applying them in specific scenarios still raises many questions that deserve careful thought.

The interviewer can combine this with other topics to examine the candidate's mastery and reasoning, for example:

  • In the first lecture of this column I introduced the different stages of using Java, such as compile time and runtime. At which stage does autoboxing/auto-unboxing happen?
  • I mentioned that the static factory method valueOf uses the caching mechanism. Does the cache also take effect when autoboxing?
  • Why do we need primitive data types at all? Java objects already seem very efficient. What are the concrete differences in practice?
  • Have you read the Integer source code? Analyze the design points of the class or of some of its methods.

That looks like a lot to cover, so let's go through it together.

Knowledge expansion

1. Understand autoboxing and unboxing

Autoboxing is really a kind of syntactic sugar. What is syntactic sugar? Put simply, the Java platform performs some conversions for us automatically so that different ways of writing the code are equivalent at runtime. These conversions happen at compile time, which means the generated bytecode is the same. (Netizen MaoYu's note: as I understand it, syntactic sugar is similar to arrow functions in JavaScript or lambda expressions in Java; it makes the code look simpler.)

Take the integers mentioned above: javac automatically turns boxing into a call to Integer.valueOf() and unboxing into a call to Integer.intValue(). That also answers another question along the way: because the call is Integer.valueOf, autoboxing naturally gets the benefit of the cache.

How to verify the above conclusion programmatically?

You can write a short program containing the following two lines and then decompile it. Of course, this approach infers the rule from observed behavior. In most cases it is more reliable to consult the specification directly, because software promises to follow the specification, not to preserve its current behavior.

Integer integer = 1;
int unboxing = integer++;

Decompilation output:

1: invokestatic  #2                  // Method java/lang/Integer.valueOf:(I)Ljava/lang/Integer;
8: invokevirtual #3                  // Method java/lang/Integer.intValue:()I
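
For reference, a complete class wrapping those two lines might look like the sketch below; the class and method names are illustrative assumptions. Compiling it and running javap -c on the class file should show the valueOf and intValue calls, and because boxing goes through valueOf, small boxed values share the cached instances.

```java
class BoxingDemo {
    // Returns the value read before the increment, mirroring the two lines above.
    static int unboxThenIncrement() {
        Integer integer = 1;        // boxing: compiled to Integer.valueOf(1)
        int unboxing = integer++;   // unboxing via intValue(), then the sum is boxed again
        return unboxing;
    }

    // Autoboxed values inside the cache range refer to the same object.
    static boolean boxedSmallValuesShareInstance() {
        Integer a = 100;
        Integer b = 100;
        return a == b;
    }

    public static void main(String[] args) {
        System.out.println(unboxThenIncrement());
        System.out.println(boxedSmallValuesShareInstance());
    }
}
```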

This caching mechanism is not unique to Integer. It also exists in some other wrapper classes, such as:

  • Boolean: the instances for true and false are cached. To be precise, only the two constant instances Boolean.TRUE and Boolean.FALSE are ever returned.
  • Short: values between -128 and 127 are cached as well.
  • Byte: the range of values is small, so all of them are cached.
  • Character: the cache range is '\u0000' to '\u007F'.
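
The list above can be checked with a small sketch like the following. The class and method names are illustrative; each method compares references returned by the corresponding valueOf factory for a value inside the documented cache range.

```java
class WrapperCacheDemo {
    // Boolean.valueOf only ever returns the two constant instances.
    static boolean booleanCached() {
        return Boolean.valueOf(true) == Boolean.TRUE;
    }

    // Short caches the values from -128 to 127.
    static boolean shortCached() {
        return Short.valueOf((short) 127) == Short.valueOf((short) 127);
    }

    // Every possible byte value is cached.
    static boolean byteCached() {
        return Byte.valueOf((byte) -128) == Byte.valueOf((byte) -128);
    }

    // Character caches '\u0000' to '\u007F', which includes 'A'.
    static boolean characterCached() {
        return Character.valueOf('A') == Character.valueOf('A');
    }

    public static void main(String[] args) {
        System.out.println(booleanCached());
        System.out.println(shortCached());
        System.out.println(byteCached());
        System.out.println(characterCached());
    }
}
```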

Autoboxing and auto-unboxing look cool. Is there anything to watch out for in practice?

In principle, avoid unintentional boxing and unboxing (Netizen MaoYu's note: in concrete terms, write the boxing and unboxing operations explicitly in the code instead of waiting for Java to box and unbox for you), especially in performance-sensitive code. The overhead of creating 100,000 Java objects and that of 100,000 ints are not of the same order of magnitude, in either memory usage or processing speed; the space taken by the object headers alone already accounts for an order-of-magnitude difference.
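
A typical case of unintentional boxing is an accumulator declared with a wrapper type. The sketch below (class and method names are illustrative) computes the same sum twice; the first version boxes a new value on each iteration, while the second stays entirely in primitives. No timings are claimed here; measure in your own environment.

```java
class BoxingInLoopDemo {
    // The Long accumulator is unboxed, added to, and boxed again on every iteration.
    static long sumWithBoxing(long n) {
        Long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // The primitive accumulator never creates wrapper objects.
    static long sumWithPrimitive(long n) {
        long sum = 0L;
        for (long i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWithBoxing(100_000));
        System.out.println(sumWithPrimitive(100_000));
    }
}
```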

We can take this point further: primitive data types, arrays, and even native code often have a considerable advantage in extremely performance-sensitive scenarios, and can replace wrapper classes, dynamic arrays (such as ArrayList), and so on as performance optimizations. Some products and libraries that pursue extreme performance try hard to avoid creating too many objects. Of course, most product code does not need this, and development efficiency takes priority. Take the counter we often use as an example. The following is a common thread-safe counter implementation.

import java.util.concurrent.atomic.AtomicLong;

class Counter {
    // The count lives in a separate AtomicLong object.
    private final AtomicLong counter = new AtomicLong();
    public void increase() {
        counter.incrementAndGet();
    }
}

If you use a primitive data type, you can change it to:

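import java.util.concurrent.atomic.AtomicLongFieldUpdater;

class CompactCounter {
    // The count is a primitive field of the counter itself; the updater requires it to be volatile.
    private volatile long counter;
    private static final AtomicLongFieldUpdater<CompactCounter> updater =
            AtomicLongFieldUpdater.newUpdater(CompactCounter.class, "counter");
    public void increase() {
        updater.incrementAndGet(this);
    }
}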

2. Source code analysis

Some interviewers focus on whether you have read and understood the JDK source code. This is not an unreasonable demand: reading and practicing high-quality code is a path every programmer has to take to grow. Let me analyze the source code of Integer below.

First, look at the responsibilities of Integer as a whole. It mainly contains basic constants, such as the maximum value, the minimum value, and the number of bits; the static factory method valueOf() mentioned earlier; methods for reading system property values; and conversion methods, for example to and from strings in different bases such as octal. Let's take a closer look at some interesting places.

First, let's keep digging into the cache. The default cache range of Integer is -128 to 127, but what if, in a particular application, we know that larger values will be used frequently?

The upper limit of the cache can in fact be adjusted as needed. The JVM provides a parameter for it:

-XX:AutoBoxCacheMax=N

This is reflected in the java.lang.Integer source code and implemented in the static initialization block of IntegerCache.

private static class IntegerCache {
    static final int low = -128;
    static final int high;
    static final Integer cache[];
    static {
        // high value may be configured by property
        int h = 127;
        String integerCacheHighPropValue =
                VM.getSavedProperty("java.lang.Integer.IntegerCache.high");
        ...
        // range [-128, 127] must be interned (JLS7 5.1.7)
        assert IntegerCache.high >= 127;
    }
    ...
}
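
To make the idea concrete, here is a simplified sketch of how such a cache can back a valueOf-style factory. It is not the JDK source: SmallIntCache, Boxed, LOW, and HIGH are hypothetical names, and the upper bound is a fixed constant here instead of being read from a JVM setting.

```java
final class SmallIntCache {
    // Hypothetical immutable wrapper standing in for Integer.
    static final class Boxed {
        private final int value;
        private Boxed(int value) { this.value = value; }
        int intValue() { return value; }
    }

    static final int LOW = -128;
    static final int HIGH = 127;   // assumption: fixed bound, not configurable in this sketch
    private static final Boxed[] CACHE = new Boxed[HIGH - LOW + 1];

    static {
        // Pre-create one shared instance per value in [LOW, HIGH].
        for (int i = 0; i < CACHE.length; i++) {
            CACHE[i] = new Boxed(LOW + i);
        }
    }

    // Return the shared instance for cached values, otherwise allocate a new one.
    static Boxed valueOf(int i) {
        if (i >= LOW && i <= HIGH) {
            return CACHE[i - LOW];
        }
        return new Boxed(i);
    }

    public static void main(String[] args) {
        System.out.println(valueOf(5) == valueOf(5));
        System.out.println(valueOf(1000) == valueOf(1000));
    }
}
```

Because the wrapper is immutable, handing the same instance to many callers is safe, which is what makes this kind of sharing possible in the first place.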

Second, when we analyzed the design and implementation of String, we noted that strings are immutable, which guarantees basic information security and thread safety in concurrent programming. If you look at the member variable "value" in the wrapper classes, you will find that in Integer, Boolean, and the rest it is declared "private final", so they are immutable types as well!

This design is understandable, and arguably a necessary choice. Imagine the following scenario: Integer provides the getInteger() method for conveniently reading system properties, and we might use a property to set the port of a server. If the value of the Integer object obtained this way could easily be changed, it would cause serious problems for product reliability.
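
The scenario can be sketched as follows. The property name demo.server.port and the default of 8080 are assumptions made up for this example; Integer.getInteger(String, int) itself is the standard method.

```java
class PortConfigDemo {
    // Reads the hypothetical "demo.server.port" system property, falling back to 8080.
    static int serverPort() {
        return Integer.getInteger("demo.server.port", 8080);
    }

    public static void main(String[] args) {
        // Without the property set, the default is returned.
        System.out.println(serverPort());
        // The same effect as passing -Ddemo.server.port=9090 on the command line.
        System.setProperty("demo.server.port", "9090");
        System.out.println(serverPort());
    }
}
```

Since Integer is immutable, code that receives the returned object cannot alter the configured value for anyone else holding the same reference.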

Third, wrapper classes such as Integer define constants like SIZE and BYTES. What design consideration does this reflect? If you have used other languages, such as C and C++, you know that the number of bits in comparable integer types is not fixed and may differ between platforms, for example between 32-bit and 64-bit platforms. So is there any difference in the number of bits between a 32-bit JDK and a 64-bit JDK? Put another way: if I develop and compile a program with a 32-bit JDK and run it on a 64-bit JDK, do I need to do any special porting work?

In fact, this kind of porting is relatively simple for Java, because primitive data types do not differ between platforms. Their sizes are clearly defined in the Java Language Specification, so whether the environment is 32-bit or 64-bit, developers do not need to worry about differences in the number of bits.
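
A quick way to see these fixed sizes is to print the SIZE and BYTES constants of the wrapper classes. The class name below is illustrative; the constants are part of the standard library (BYTES was added in Java 8).

```java
class PrimitiveSizeDemo {
    public static void main(String[] args) {
        // These values are the same on every platform and JVM bitness.
        System.out.println("byte:  " + Byte.SIZE + " bits, " + Byte.BYTES + " bytes");
        System.out.println("short: " + Short.SIZE + " bits, " + Short.BYTES + " bytes");
        System.out.println("char:  " + Character.SIZE + " bits, " + Character.BYTES + " bytes");
        System.out.println("int:   " + Integer.SIZE + " bits, " + Integer.BYTES + " bytes");
        System.out.println("long:  " + Long.SIZE + " bits, " + Long.BYTES + " bytes");
    }
}
```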

As for migrating applications, there are some differences in the underlying implementation. For example, objects in a 64-bit HotSpot JVM are larger than in a 32-bit HotSpot JVM (the exact difference depends on the choices made by the JVM implementation). In general, though, there is no difference in behavior, and migration can still live up to the proclaimed "write once, run anywhere". Application developers need to think more about differences in capacity and capability.

3. Primitive type thread safety

I mentioned thread-safe design earlier. Have you ever wondered whether operations on primitive data types are thread-safe?

There are problems at more than one level:

  • Variables of primitive data types obviously need concurrency mechanisms to be thread-safe. I will cover these in detail in the concurrency topics later in the column. If you need thread-safe computation, consider thread-safe classes such as AtomicInteger and AtomicLong.
  • In particular, for some relatively wide data types, such as long and double, even the atomicity of a single update is not guaranteed unless the variable is declared volatile. A program may read a value in which only half of the bits have been updated!
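
The first point can be illustrated with a small sketch that increments a plain int and an AtomicInteger from several threads. The class name, field names, and thread counts are illustrative assumptions. The plain counter may lose updates because ++ is a read-modify-write sequence; how many are lost, if any, varies from run to run.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedCounterDemo {
    static int plainCount = 0;                                   // unsynchronized primitive
    static final AtomicInteger atomicCount = new AtomicInteger();

    // Starts the given number of threads, each incrementing both counters.
    static void runWorkers(int threads, int incrementsPerThread) {
        plainCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    plainCount++;                    // not atomic: updates can be lost
                    atomicCount.incrementAndGet();   // atomic increment
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
    }

    public static void main(String[] args) {
        runWorkers(4, 100_000);
        System.out.println("plain:  " + plainCount);
        System.out.println("atomic: " + atomicCount.get());
    }
}
```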

4. Limitations of Java primitive data types and reference types

I have covered a lot of technical details. Finally, let's look at the limitations and evolution of primitive data types and objects from the perspective of the Java platform's development.

Java application developers have long taken designing complex and flexible type systems for granted. But frankly, this type system design stems from technical decisions made many years ago, and some side effects have gradually become apparent, for example:

  • Primitive data types and Java generics cannot be used together

This is because Java's generics are, to some extent, pseudo-generics. They are purely a compile-time technique: during compilation Java automatically converts type parameters to the corresponding concrete types (type erasure), which means that any type used with generics must be convertible to Object.

  • Data cannot be represented efficiently, and complex data structures such as vectors and tuples are awkward to express

We know that Java objects are all reference types. An array of a primitive data type occupies a contiguous block of memory, but an object array does not: it stores references, and the objects themselves are often scattered across different locations in the heap. This design brings great flexibility, but it also makes data operations inefficient, in particular because it cannot take full advantage of modern CPU caches.
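
Both limitations show up in a few lines of code. In the sketch below (the class and method names are illustrative), the list has to hold Integer objects because List<int> does not compile, and the default element values reveal that an int[] stores the values themselves while an Integer[] stores references.

```java
import java.util.ArrayList;
import java.util.List;

class GenericsAndArraysDemo {
    // Generic collections cannot hold int directly, so every element is boxed.
    static int sumBoxedList(int n) {
        List<Integer> boxed = new ArrayList<>();
        for (int i = 1; i <= n; i++) {
            boxed.add(i);             // autoboxing on every add
        }
        int sum = 0;
        for (int value : boxed) {     // auto-unboxing on every read
            sum += value;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumBoxedList(10));

        int[] primitives = new int[3];           // holds the int values, initially 0
        Integer[] references = new Integer[3];   // holds references, initially null
        System.out.println(primitives[0]);
        System.out.println(references[0]);
    }
}
```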

Java's objects come with built-in support for polymorphism, thread safety, and so on, but not every situation needs these. As data processing grows in importance, support for denser value types is a very real requirement.

Enhancements addressing these issues are currently being developed in the OpenJDK community. If you are interested, follow the related project: http://openjdk.java.net/projects/valhalla/.

One practice question per lesson

Do you have a clear picture of what we discussed today? Here is a question for you. As mentioned earlier, Java objects cost far more space than primitive data types. Do you know what the memory layout of an object looks like, for example the structure of the object header? How can you calculate or obtain the size of a Java object?


Other classic answers 

The following is the answer from the netizen kursk.ye:

This article is rather fragmented, and the overall line of thought does not hang together. I think the question can be understood along the following thread. Primitive types and generics cannot be mixed, so Java designed the auto-boxing/unboxing mechanism, which is really an implicit conversion between primitive values and objects. Without it, developers would have to write the conversion explicitly every time, which would be tedious. Primitive values and objects each have their own advantages. A primitive variable stores the value itself, so finding the variable's memory location gives you the value. An object variable stores a reference, so you must first find the reference and then follow it to another memory location, which costs an extra memory access and makes computation slower than with primitive values. On the other hand, objects work with generics, are more abstract, and make solving business problems more productive. So the Java designers' original intention was probably this: use primitive values for computation, and use objects and generics for business problems; the auto-boxing/unboxing mechanism handles the conversion, so developers do not need to think about it. Then, to make up for the weaker computational performance of objects, the static valueOf() method with its caching mechanism was added as compensation.

The following is the answer from the official account "Technology Sleeplessly":

1 int and Integer

JDK 1.5 introduced autoboxing and auto-unboxing. Java can convert automatically, based on context, between primitive types and their corresponding objects, such as int/Integer, double/Double, and boolean/Boolean, which is a great convenience during development.

Traditionally, Integer objects were constructed with new. However, since most operations involve a limited range of fairly small values, JDK 1.5 added the static factory method valueOf, which is backed by a cache of Integer objects whose int values lie between -128 and 127. Calls within that range are served directly from the cache, which improves the performance of object construction. In other words, with this method, if two objects have the same int value and it falls within the cached range, they are the same object. When values are small and frequently used, prefer this integer pool approach (good time and space performance).

2 Precautions

[1] Every primitive type has a limited range of values. Multiplying two large numbers can overflow that range.
[2] When converting between primitive types, declare the type explicitly. Example: long result = 1234567890 * 24 * 365; will certainly not give the value you expect, because 1234567890 * 24 already exceeds the range of int. Change it to long result = 1234567890L * 24 * 365; and the result is correct.
[3] Be careful with primitive types when storing currency. double often introduces rounding errors, so BigDecimal or an integer type is commonly used instead (to represent cents exactly, multiply the value by 100 and store it as an integer).
[4] Prefer primitive types. In principle, avoid unintentional boxing and unboxing, especially in performance-sensitive code.
[5] If you need thread-safe computation, consider thread-safe classes such as AtomicInteger and AtomicLong. For some relatively wide primitive data types, such as long and double, even the atomicity of a single update is not guaranteed, so a program may read a value in which only half of the bits have been updated.
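
Points [2] and [3] can be tried out with a short sketch. The class and method names are illustrative; the numbers are the ones used in the list above, plus a 0.1 + 0.2 comparison chosen here as a common example of double rounding.

```java
import java.math.BigDecimal;

class NumericPitfallsDemo {
    // All operands are int, so the product overflows before it is widened to long.
    static long overflowed() {
        int base = 1234567890;
        return base * 24 * 365;
    }

    // Making the first operand a long forces the whole product into long arithmetic.
    static long widened() {
        long base = 1234567890L;
        return base * 24 * 365;
    }

    // Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly.
    static boolean doubleSumIsExact() {
        return 0.1 + 0.2 == 0.3;
    }

    // BigDecimal built from strings keeps the decimal values exact.
    static boolean decimalSumIsExact() {
        BigDecimal sum = new BigDecimal("0.1").add(new BigDecimal("0.2"));
        return sum.compareTo(new BigDecimal("0.3")) == 0;
    }

    public static void main(String[] args) {
        System.out.println(overflowed());
        System.out.println(widened());
        System.out.println(doubleSumIsExact());
        System.out.println(decimalSumIsExact());
    }
}
```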

The following is the answer from the netizen Seven.Lin Ze Geng:

Do you know what the memory structure of an object looks like? For example, the structure of the object header. How to calculate or obtain the size of a Java object?
Java object memory structure (the byte counts below match a 32-bit HotSpot JVM; on a 64-bit JVM the MarkWord takes 8 bytes):
- Basic data type
- Object type
    - Object header (Header)
        - MarkWord, 4 bytes
        - Class object pointer, 4 bytes
    - Instance data (Instance Data)
    - Alignment data (Padding), aligned by 8 bytes
- Array type
    - Object header (Header)
        - MarkWord, 4 bytes
        - Class object pointer, 4 bytes
    - Array length, 4 bytes
    - Instance data (Instance Data)
    - Alignment data (Padding), aligned by 8 bytes
How to get the object size:
- Instrumentation + premain to implement a utility class: call Instrumentation.getObjectSize()
- Unsafe: use Unsafe.objectFieldOffset() to obtain field offsets

 

Origin blog.csdn.net/qq_39331713/article/details/114091965