Java object structure in memory (HotSpot VM)

First, the memory layout of objects

  In the HotSpot virtual machine, the layout of an object stored in memory can be divided into three areas: the object header (Header), the instance data (Instance Data), and the alignment padding (Padding).

In more detail, the in-memory structure of an object mainly comprises the following parts:

  • Mark Word (mark field): The Mark Word occupies 4 bytes on a 32-bit VM (8 bytes on a 64-bit VM). Its content is a series of flag bits, such as the lightweight lock flag and the biased locking flag.
  • Klass Pointer (class object pointer): The class pointer is 4 bytes on a 32-bit VM. It holds the memory address of the Class object (the corresponding metadata object) for this object.
  • Actual object data: This includes all member variables of the object, and its size is determined by the sizes of those member variables: byte and boolean are 1 byte, short and char are 2 bytes, int and float are 4 bytes, long and double are 8 bytes, and a reference is 4 bytes.
  • Alignment: The final part is alignment padding, which pads the object to a multiple of 8 bytes.
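The parts listed above can be added up to estimate an object's size. The following is a minimal sketch, assuming the 32-bit figures from the list (4-byte Mark Word, 4-byte Klass pointer, 4-byte references) and ignoring any gaps the VM may leave between fields. The class and method names are hypothetical, and this is not how HotSpot itself computes sizes.

```java
import java.util.Map;

/** Rough shallow-size estimate for the 32-bit layout described above. */
class ObjectSizeSketch {
    static final int MARK_WORD = 4;      // assumption: 32-bit VM
    static final int KLASS_POINTER = 4;  // assumption: 32-bit VM
    static final int ALIGNMENT = 8;

    // Field widths in bytes, as listed in the text ("reference" = 4 bytes).
    static final Map<String, Integer> WIDTH = Map.of(
        "byte", 1, "boolean", 1, "short", 2, "char", 2,
        "int", 4, "float", 4, "long", 8, "double", 8, "reference", 4);

    /** Header plus the sum of the field widths, rounded up to a multiple of 8. */
    static int estimate(String... fieldTypes) {
        int size = MARK_WORD + KLASS_POINTER;
        for (String type : fieldTypes) {
            size += WIDTH.get(type);
        }
        return (size + ALIGNMENT - 1) / ALIGNMENT * ALIGNMENT;
    }

    public static void main(String[] args) {
        System.out.println(estimate("int"));           // 8 + 4 = 12, padded to 16
        System.out.println(estimate("long", "byte"));  // 8 + 9 = 17, padded to 24
    }
}
```

For measurements on a real JVM, a tool that inspects the actual layout is more reliable than this kind of arithmetic.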

1.1, the object header

1.1.1, Mark Word (mark field)

The HotSpot object header contains two parts. The first part stores the object's own runtime data, such as its hash code (HashCode), GC generational age, lock state flags, the lock held by a thread, the biased thread ID, and the bias timestamp. This part is 32 bits long on a 32-bit virtual machine and 64 bits long on a 64-bit virtual machine (leaving compressed pointers aside), and is officially called the "Mark Word". An object needs to store a lot of runtime data, in fact more than a 32-bit or 64-bit bitmap can record. However, the header is an extra storage cost unrelated to the data the object itself defines, so for the sake of space efficiency the Mark Word is designed as a non-fixed data structure that stores as much information as possible in a very small space and reuses its own storage according to the object's state. For example, in the 32-bit HotSpot virtual machine, when the object is not locked, the 32 bits of the Mark Word hold 25 bits of object hash code (HashCode), 4 bits of generational age, 2 bits of lock flag, and 1 bit fixed to 0. The contents stored in the other states (lightweight locked, heavyweight locked, GC marked, biasable) are listed in Table 1.

If the object is an array, the header needs three machine words. The JVM can determine the size of an ordinary Java object from its metadata, but it cannot determine the size of an array from the array's metadata, so one extra word is used to record the array length.

 Because the Mark Word reuses its storage space according to the object's state, its contents change as the program runs. The possible states are as follows (32-bit virtual machine):

Table 1: Mark Word of the HotSpot virtual machine object header

 

Stored content                                       | Flag bits | State
Object hash code, object generational age            | 01        | Unlocked
Pointer to the lock record                           | 00        | Lightweight locked
Pointer to the heavyweight lock                      | 10        | Inflated (heavyweight locked)
Empty, nothing needs to be recorded                  | 11        | GC mark
Biased thread ID, bias timestamp, generational age   | 01        | Biasable

Note: the lock states described here (biased, lightweight, heavyweight) apply to JDK 1.6 and later.

 

 Biased locking and lightweight locking were added in Java 6 as optimizations of synchronized; we will analyze them briefly later. Here we look at the heavyweight lock, which is what is usually meant by the synchronized object lock. Its lock flag is 10, and the pointer points to the start address of the monitor object (also called a monitor lock). Every object has a monitor associated with it, and the relationship between an object and its monitor can be implemented in several ways: for example, the monitor may be created and destroyed together with the object, or it may be generated automatically when a thread tries to acquire the object's lock. Once a monitor is held by a thread, it is locked. In the Java virtual machine (HotSpot), the monitor is implemented by ObjectMonitor, whose main data structure is as follows (from the objectMonitor.hpp source file of the HotSpot VM, implemented in C++):

ObjectMonitor() {
    _header       = NULL;
    _count        = 0; // number of records
    _waiters      = 0,
    _recursions   = 0;
    _object       = NULL;
    _owner        = NULL;
    _WaitSet      = NULL; // threads in the wait state are added to _WaitSet
    _WaitSetLock  = 0 ;
    _Responsible  = NULL ;
    _succ         = NULL ;
    _cxq          = NULL ;
    FreeNext      = NULL ;
    _EntryList    = NULL ; // threads blocked waiting for the lock are added to this list
    _SpinFreq     = 0 ;
    _SpinClock    = 0 ;
    OwnerIsThread = 0 ;
  }

 ObjectMonitor has two queues, _WaitSet and _EntryList, which hold lists of ObjectWaiter objects (each thread waiting for the lock is wrapped in an ObjectWaiter). _owner points to the thread that holds the ObjectMonitor. When multiple threads reach a synchronized block at the same time, they first enter the _EntryList set. When a thread acquires the object's monitor, it enters the _owner area: the monitor's owner variable is set to the current thread and the monitor's count is incremented by 1. If the thread calls wait(), it releases the monitor it holds, owner is reset to null, count is decremented by 1, and the thread enters the _WaitSet set to wait to be woken up. When the current thread finishes, it also releases the monitor (lock) and resets the variables so that other threads can acquire the monitor (lock).

 

Seen this way, the monitor exists in the header of every Java object (as a stored pointer), and synchronized acquires the lock through it. This is why any Java object can be used as a lock, and also why the notify / notifyAll / wait methods are defined on the top-level class Object (this point will be analyzed later). With this background, we will later analyze the concrete semantics of synchronized at the bytecode level.
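To illustrate that any object can serve as a lock and that wait / notifyAll are available on every Object, here is a minimal sketch that uses a plain Object as the monitor. The class, field, and method names are hypothetical, and the comments map the steps to the ObjectMonitor description above only loosely.

```java
/** A plain Object used as a lock, with wait/notifyAll for hand-off. */
class MonitorSketch {
    private final Object lock = new Object(); // any object can act as the monitor
    private boolean ready = false;
    private String message;

    /** Blocks until publish() has been called, then returns the message. */
    String awaitMessage() throws InterruptedException {
        synchronized (lock) {        // acquire the monitor: this thread becomes the owner
            while (!ready) {
                lock.wait();         // release the monitor and join the wait set
            }
            return message;
        }
    }

    /** Stores a message and wakes up all waiting threads. */
    void publish(String msg) {
        synchronized (lock) {
            message = msg;
            ready = true;
            lock.notifyAll();        // waiters compete for the monitor again
        }
    }

    /** Runs one producer thread and waits for its message on the calling thread. */
    static String demo() {
        MonitorSketch sketch = new MonitorSketch();
        Thread producer = new Thread(() -> sketch.publish("hello"));
        producer.start();
        try {
            return sketch.awaitMessage();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The while loop around wait() re-checks the condition after every wake-up, which is the usual way to guard against spurious wake-ups.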


The other part of the object header is the type pointer, that is, the object's pointer to its class metadata; the virtual machine uses this pointer to determine which class the object is an instance of. Not every virtual machine implementation has to keep a type pointer in the object data; in other words, looking up an object's metadata does not necessarily go through the object itself. In addition, if the object is a Java array, the header must also contain a field recording the array length, because the virtual machine can determine the size of an ordinary Java object from its metadata but cannot determine the size of an array from the array's metadata.
The following is a code comment fragment from markOop.hpp in the HotSpot VM (C++), which describes how the 32-bit Mark Word stores each state:

 

// Bit-format of an object header (most significant first, big endian layout below):  
//  
//  32 bits:  
//  --------  
//  hash:25 ------------>| age:4    biased_lock:1 lock:2 (normal object)  
//  JavaThread*:23 epoch:2 age:4    biased_lock:1 lock:2 (biased object)  
//  size:32 ------------------------------------------>| (CMS free block)  
//  PromotedObject*:29 ---------->| promo_bits:3 ----->| (CMS promoted object)
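The "normal object" row of this comment can be decoded with a few shifts and masks. The sketch below is an illustration written for this article, not HotSpot code: the class and method names are hypothetical, and it only interprets the 32-bit normal-object layout (hash:25, age:4, biased_lock:1, lock:2), together with the flag values from Table 1.

```java
/** Decodes the 32-bit "normal object" Mark Word layout: hash:25 age:4 biased_lock:1 lock:2. */
class MarkWordSketch {
    static int lockBits(int mark)  { return mark & 0b11; }          // lowest 2 bits
    static int biasedBit(int mark) { return (mark >>> 2) & 0b1; }   // next 1 bit
    static int age(int mark)       { return (mark >>> 3) & 0b1111; } // next 4 bits
    static int hash(int mark)      { return mark >>> 7; }           // top 25 bits

    /** Packs the four fields into one 32-bit word (values are masked to their widths). */
    static int encode(int hash, int age, int biased, int lock) {
        return ((hash & 0x1FFFFFF) << 7) | ((age & 0xF) << 3) | ((biased & 1) << 2) | (lock & 3);
    }

    /** Maps the flag bits to the state names used in Table 1. */
    static String state(int mark) {
        switch (lockBits(mark)) {
            case 0b00: return "lightweight locked";
            case 0b10: return "heavyweight locked";
            case 0b11: return "GC mark";
            default:   return biasedBit(mark) == 1 ? "biasable" : "unlocked";
        }
    }

    public static void main(String[] args) {
        int mark = encode(0x123456, 5, 0, 0b01);
        System.out.println(state(mark) + ", age " + age(mark)
            + ", hash 0x" + Integer.toHexString(hash(mark)));
    }
}
```

The 4-bit age field also explains why the generational age of an object cannot exceed 15.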

1.2, instance data (Instance Data)

  Next, the instance data is the valid information the object actually stores, namely the contents of the fields of the various types defined in the program code. Fields are recorded whether they are inherited from superclasses or defined in the subclass. The storage order of this part is affected by the virtual machine's allocation strategy parameter (FieldsAllocationStyle) and by the order in which the fields are defined in the Java source code. HotSpot's default allocation strategy is longs/doubles, ints, shorts/chars, bytes/booleans, oops (Ordinary Object Pointers). As this strategy shows, fields of the same width are always allocated together. Subject to that condition, variables defined in a parent class appear before those of the subclass. If the CompactFields parameter is true (the default), narrower subclass variables may also be inserted into the gaps among the parent class's variables.
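The default ordering can be pictured as a stable sort by width group. The following is a simplified sketch for the fields of a single class only: it ignores inheritance, CompactFields, and FieldsAllocationStyle, and it assumes that float is grouped with int and that any unrecognized type is a reference (oop). The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Orders field declarations as longs/doubles, ints, shorts/chars, bytes/booleans, oops. */
class FieldOrderSketch {
    /** Position of a type in the default order described above. */
    static int rank(String type) {
        switch (type) {
            case "long": case "double":  return 0;
            case "int": case "float":    return 1; // assumption: same width, same group
            case "short": case "char":   return 2;
            case "byte": case "boolean": return 3;
            default:                     return 4; // everything else is treated as an oop
        }
    }

    /** Sorts "type name" declarations by rank; fields of equal rank keep source order. */
    static List<String> order(List<String> declarations) {
        List<String> sorted = new ArrayList<>(declarations);
        sorted.sort(Comparator.comparingInt((String d) -> rank(d.split(" ")[0])));
        return sorted;
    }

    public static void main(String[] args) {
        System.out.println(order(List.of("byte b", "Object o", "long l", "int i", "char c")));
    }
}
```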

1.3, alignment padding (Padding)

  Alignment padding does not necessarily exist and has no special meaning; it only acts as a placeholder. HotSpot's automatic memory management system requires the start address of an object to be an integer multiple of 8 bytes, which in other words means the object size must be an integer multiple of 8 bytes. The object header is exactly a multiple of 8 bytes (1 or 2 times), so when the instance data portion of an object is not aligned, alignment padding is added to complete it.
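Because 8 is a power of two, rounding up to the next multiple of 8 is commonly written with a bit mask. The sketch below shows that arithmetic; the class and method names are hypothetical and the code is illustrative rather than taken from HotSpot.

```java
/** Rounds sizes up to the 8-byte object alignment and reports the padding added. */
class AlignmentSketch {
    static final int ALIGNMENT = 8; // object alignment as described above

    /** Rounds size up to the next multiple of 8 by clearing the low three bits. */
    static int alignUp(int size) {
        return (size + ALIGNMENT - 1) & ~(ALIGNMENT - 1);
    }

    /** Number of padding bytes appended to an object of the given unaligned size. */
    static int padding(int size) {
        return alignUp(size) - size;
    }

    public static void main(String[] args) {
        System.out.println(padding(12)); // 12 bytes need 4 bytes of padding
        System.out.println(padding(16)); // already aligned: no padding
    }
}
```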

Second, the process of creating an object

Java is an object-oriented programming language, and objects are created all the time while a Java program runs. At the language level, creating an object (apart from cloning and deserialization) is usually just a new keyword. But inside the virtual machine, how is an object created? (The objects discussed in this article are limited to ordinary Java objects and do not include arrays, Class objects, and so on.)
When the virtual machine encounters a new instruction:
1, First the JVM checks whether the class has been loaded into memory, that is, whether the symbolic reference to the class is already in the constant pool and whether the class represented by this symbolic reference has been loaded, resolved, and initialized. If not, class loading, resolution, and initialization are triggered first. The object is then created on the heap.

2, Allocate memory for the new object.

  The amount of memory an object needs is fully determined once its class has been loaded, so allocating space for an object amounts to carving a block of a known size out of the Java heap. How is it carved out? Suppose the Java heap is perfectly regular: all used memory is on one side, free memory is on the other side, and a pointer in the middle marks the boundary. Allocating memory then just means moving the pointer toward the free side by a distance equal to the object's size. This allocation method is called "bump the pointer". If the Java heap is not regular and used memory and free memory are interleaved, a simple pointer bump is impossible. The virtual machine must maintain a list that records which memory blocks are available, find a block large enough for the object instance at allocation time, and update the records on the list. This allocation method is called a "free list". Which method is chosen depends on whether the Java heap is regular, and that in turn depends on whether the garbage collector in use compacts memory. So with collectors that have a compaction step, such as Serial and ParNew, the allocator uses pointer bumping, while with the CMS collector, which is based on the Mark-Sweep algorithm, a free list is usually used. (Note: the CMS collector can compact memory through UseCMSCompactAtFullCollection or CMSFullGCsBeforeCompaction.)
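The two allocation strategies can be sketched over a toy heap addressed by integer offsets. This is a simplified illustration, not HotSpot's allocator: the class names are hypothetical, the free list uses a plain first-fit search, and freeing or merging blocks is left out.

```java
import java.util.ArrayList;
import java.util.List;

class AllocationSketch {
    /** Bump the pointer: used memory on one side, free memory on the other. */
    static class BumpAllocator {
        private final int end;
        private int top = 0; // boundary between used and free memory

        BumpAllocator(int capacity) { this.end = capacity; }

        /** Returns the start offset of the new block, or -1 if the heap is full. */
        int allocate(int size) {
            if (top + size > end) return -1;
            int start = top;
            top += size; // move the pointer by the size of the object
            return start;
        }
    }

    /** Free list: first-fit search over a list of {offset, length} blocks. */
    static class FreeListAllocator {
        private final List<int[]> free = new ArrayList<>();

        FreeListAllocator(int[][] blocks) {
            for (int[] b : blocks) free.add(new int[] {b[0], b[1]});
        }

        /** Returns the start offset of the new block, or -1 if no block is large enough. */
        int allocate(int size) {
            for (int i = 0; i < free.size(); i++) {
                int[] block = free.get(i);
                if (block[1] >= size) {
                    int start = block[0];
                    block[0] += size; // shrink the block from the front
                    block[1] -= size;
                    if (block[1] == 0) free.remove(i); // update the list
                    return start;
                }
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        BumpAllocator bump = new BumpAllocator(32);
        System.out.println(bump.allocate(16) + " " + bump.allocate(8));
        FreeListAllocator list = new FreeListAllocator(new int[][] {{0, 8}, {32, 24}});
        System.out.println(list.allocate(16)); // skips the 8-byte block at offset 0
    }
}
```

The bump allocator does constant work per allocation, while the free list has to search, which is one reason a compacted heap is attractive.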
Besides how to carve up the available space, there is another issue to consider: object creation is a very frequent operation in a virtual machine, and even just changing where a pointer points is not thread-safe under concurrency. It can happen that memory is being allocated for object A, the pointer has not yet been updated, and object B allocates memory using the original pointer at the same time. There are two solutions to this problem. One is to synchronize the memory allocation operation; in practice the virtual machine uses CAS with retry on failure to make the update atomic. The other is to divide memory allocation among threads into separate spaces: each thread pre-allocates a small block of memory in the Java heap, called a thread-local allocation buffer (TLAB). A thread allocates memory in its own TLAB, and synchronization is needed only when the TLAB is used up and a new one has to be allocated. Whether the virtual machine uses TLAB can be set with the -XX:+/-UseTLAB parameter.
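Both solutions can be sketched with java.util.concurrent.atomic.AtomicInteger standing in for the shared heap pointer. This is a toy model under stated assumptions: offsets are plain ints, the class names SharedHeap and Tlab are hypothetical, and the unused tail of a retired TLAB is simply abandoned.

```java
import java.util.concurrent.atomic.AtomicInteger;

class ConcurrentAllocationSketch {
    /** Shared heap whose top pointer is advanced with CAS, retrying on failure. */
    static class SharedHeap {
        private final AtomicInteger top = new AtomicInteger(0);
        private final int end;

        SharedHeap(int capacity) { this.end = capacity; }

        /** Returns the start offset of the new block, or -1 if the heap is exhausted. */
        int allocate(int size) {
            while (true) {
                int oldTop = top.get();
                int newTop = oldTop + size;
                if (newTop > end) return -1;
                if (top.compareAndSet(oldTop, newTop)) return oldTop;
                // Another thread moved the pointer first: loop and retry.
            }
        }
    }

    /** Per-thread buffer: allocation inside it needs no synchronization. */
    static class Tlab {
        private final SharedHeap heap;
        private final int tlabSize;
        private int top = 0;
        private int end = 0; // empty until the first refill

        Tlab(SharedHeap heap, int tlabSize) {
            this.heap = heap;
            this.tlabSize = tlabSize;
        }

        int allocate(int size) {
            if (size > tlabSize) return heap.allocate(size); // too large for a TLAB
            if (top + size > end) {                  // TLAB used up: request a new one
                int start = heap.allocate(tlabSize); // the only contended step
                if (start < 0) return -1;
                top = start;
                end = start + tlabSize;
            }
            int result = top;
            top += size; // plain, unsynchronized pointer bump
            return result;
        }
    }

    public static void main(String[] args) {
        SharedHeap heap = new SharedHeap(64);
        Tlab tlab = new Tlab(heap, 16);
        System.out.println(tlab.allocate(8) + " " + tlab.allocate(8) + " " + tlab.allocate(8));
    }
}
```

Each Tlab instance is meant to be used by exactly one thread; only the refill touches the shared pointer.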

3, Initialize the instance data portion (to zero values).

  After memory allocation completes, the virtual machine initializes the allocated memory to zero values (excluding the object header). If TLAB is used, this work can be done ahead of time, when the TLAB is allocated. This step guarantees that an object's instance fields can be used in Java code without being assigned an initial value: the program reads the zero value that corresponds to each field's data type.
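The effect of this zeroing step is visible from ordinary Java code. The small sketch below (class and field names are hypothetical) reads instance fields that are never assigned and reports the zero value of each type.

```java
/** Instance fields that are never assigned still hold well-defined zero values. */
class ZeroValueSketch {
    int count;
    long total;
    double ratio;
    boolean flag;
    char letter;
    Object ref;

    /** Creates a fresh instance and renders every field without assigning any of them. */
    static String describe() {
        ZeroValueSketch z = new ZeroValueSketch();
        return z.count + " " + z.total + " " + z.ratio + " " + z.flag
            + " " + (int) z.letter + " " + z.ref;
    }

    public static void main(String[] args) {
        System.out.println(describe()); // prints "0 0 0.0 false 0 null"
    }
}
```

Local variables get no such guarantee: the compiler rejects reads of a local that has not been definitely assigned.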

4, Fill in the object header: the object's own runtime data, the type pointer, and other information.

  Next, the virtual machine applies the necessary settings to the object, such as which class the object is an instance of, how to find the class's metadata, the object's hash code, and the object's GC generational age. This information is stored in the object header (Object Header). Depending on the current running state of the virtual machine, such as whether biased locking is enabled, the object header is set up differently.

Once the above work is done, a new object has been created from the virtual machine's point of view. From the Java program's point of view, however, object creation has only just begun: the <init> method has not yet run, and all fields are still zero. So in general (depending on whether the bytecode contains a following invokespecial instruction), the new instruction is followed by execution of the <init> method, which initializes the object as the programmer intends. Only then is a truly usable object fully created.
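The gap between "allocated and zeroed" and "initialized by <init>" can be observed from Java when a superclass constructor calls an overridden method. The sketch below uses hypothetical class names; the Base constructor runs before the field initializers of Derived, so the override still sees zero values.

```java
class InitOrderSketch {
    static class Base {
        final String seenByBase;

        Base() {
            // Runs before Derived's field initializers, so the override sees zero values.
            seenByBase = describe();
        }

        String describe() { return "base"; }
    }

    static class Derived extends Base {
        int value = 42;        // assigned in Derived's <init>, after Base's constructor
        String name = "ready";

        @Override
        String describe() { return value + "/" + name; }
    }

    public static void main(String[] args) {
        Derived d = new Derived();
        System.out.println(d.seenByBase);  // prints "0/null"
        System.out.println(d.describe());  // prints "42/ready"
    }
}
```

This is also why calling overridable methods from a constructor is generally discouraged.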


The following code snippet is from bytecodeInterpreter.cpp in the HotSpot virtual machine. (This interpreter is rarely used in practice: most platforms use the template interpreter, and when code is executed by the JIT compiler the difference is even greater. The snippet is still fine for understanding how HotSpot works.)

// Make sure the constant pool entry is a resolved class
    if (!constants->tag_at(index).is_unresolved_klass()) {
      // Assert that these are a klassOop and an instanceKlassOop (covered in the next section)
      oop entry = (klassOop) *constants->obj_at_addr(index);
      assert(entry->is_klass(), "Should be resolved klass");
      klassOop k_entry = (klassOop) entry;
      assert(k_entry->klass_part()->oop_is_instance(), "Should be instanceKlass");
      instanceKlass* ik = (instanceKlass*) k_entry->klass_part();
      // Make sure the object's class has already gone through the initialization phase
      if ( ik->is_initialized() && ik->can_be_fastpath_allocated() ) {
        // Get the object size
        size_t obj_size = ik->size_helper();
        oop result = NULL;
        // Record whether all fields of the object need to be set to zero
        bool need_zero = !ZeroTLAB;
        // Whether to allocate the object in the TLAB
        if (UseTLAB) {
          result = (oop) THREAD->tlab().allocate(obj_size);
        }
        if (result == NULL) {
          need_zero = true;
          // Allocate the object directly in eden
    retry:
          HeapWord* compare_to = *Universe::heap()->top_addr();
          HeapWord* new_top = compare_to + obj_size;
          // cmpxchg is the CAS instruction on x86; here it is a C++ function. Space is allocated
          // with CAS, and if a concurrent update makes it fail, jump to retry until allocation succeeds
          if (new_top <= *Universe::heap()->end_addr()) {
            if (Atomic::cmpxchg_ptr(new_top, Universe::heap()->top_addr(), compare_to) != compare_to) {
              goto retry;
            }
            result = (oop) compare_to;
          }
        }
        if (result != NULL) {
          // If needed, initialize the object to zero values
          if (need_zero ) {
            HeapWord* to_zero = (HeapWord*) result + sizeof(oopDesc) / oopSize;
            obj_size -= sizeof(oopDesc) / oopSize;
            if (obj_size > 0 ) {
              memset(to_zero, 0, obj_size * HeapWordSize);
            }
          }
          // Set the object header according to whether biased locking is enabled
          if (UseBiasedLocking) {
            result->set_mark(ik->prototype_header());
          } else {
            result->set_mark(markOopDesc::prototype());
          }
          result->set_klass_gap(0);
          result->set_klass(k_entry);
          // Push the object reference onto the stack and continue with the next instruction
          SET_STACK_OBJECT(result, 0);
          UPDATE_PC_AND_TOS_AND_CONTINUE(3, 1);
        }
      }
    }

Third, locating and accessing objects

  Objects are created in order to be used. A Java program manipulates concrete objects on the heap through reference data on the stack. The Java Virtual Machine Specification only says that the reference type is a reference to an object; it does not define how that reference should locate and access the object in the heap, so the access method depends on the virtual machine implementation. The two mainstream access methods are handles and direct pointers.
  With handle access, a block of the Java heap is set aside as a handle pool. The reference stores the address of the object's handle, and the handle contains the addresses of the object's instance data and of its type data, as shown in Figure 1.
 

 
Figure 1: Accessing an object through a handle


  With direct pointer access, the layout of the object in the Java heap must account for where to place the information needed to access the type data, and the reference directly stores the object's address, as shown in Figure 2.

 
Figure 2: Accessing an object through a direct pointer


  Each of the two access methods has its advantages. The biggest benefit of handle access is that the reference stores a stable handle address: when the object is moved (moving objects is very common during garbage collection), only the instance data pointer in the handle changes, and the reference itself does not need to be modified.
  The biggest advantage of direct pointer access is speed: it saves the cost of one pointer dereference. Because object access is so frequent in Java, this overhead adds up to a very significant execution cost. As the object memory layout described above shows, the HotSpot virtual machine uses the second method, direct pointers. Across software development as a whole, though, handle access is also very common in languages and frameworks.
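The trade-off can be made concrete with a toy model in which the heap is an array and references are array indexes. This is only an illustration under those assumptions, with hypothetical names throughout; real handles also carry a pointer to the type data, which is omitted here.

```java
/** Toy heap comparing handle access (two lookups) with direct access (one lookup). */
class AccessSketch {
    final String[] heap = new String[8]; // each slot holds one object's instance data
    final int[] handlePool = new int[8]; // handle index -> current heap address

    /** Handle access: reference -> handle -> object. */
    String readViaHandle(int handle) {
        return heap[handlePool[handle]];
    }

    /** Direct access: the reference is the heap address itself. */
    String readDirect(int address) {
        return heap[address];
    }

    /** Moves an object as a compacting GC would, fixing up only its handle. */
    int relocate(int handle, int to) {
        int from = handlePool[handle];
        heap[to] = heap[from];
        heap[from] = null;
        handlePool[handle] = to; // references that hold the handle stay valid
        return to;
    }

    public static void main(String[] args) {
        AccessSketch vm = new AccessSketch();
        vm.heap[5] = "obj";
        vm.handlePool[0] = 5;
        int handleRef = 0;  // reference that stores a handle
        int directRef = 5;  // reference that stores an address

        int newAddress = vm.relocate(0, 1);
        System.out.println(vm.readViaHandle(handleRef)); // still finds the object
        System.out.println(vm.readDirect(directRef));    // stale address
        directRef = newAddress;                          // direct references must be patched
        System.out.println(vm.readDirect(directRef));
    }
}
```

With direct pointers the collector has to find and patch every reference to a moved object, which is the price paid for saving one lookup on each access.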

Fourth, an example

In the HotSpot JVM on a 32-bit machine, how many times the size of an int is an Integer object?

We all know that the Java language specification defines an int as four bytes, so how big is an Integer object? To know the size of an object, we need to know how objects are structured in the virtual machine. Based on the layout described above, we can work out the structure of an Integer object.

 

Integer has only one member variable, value, of type int, so the actual data portion of the object is 4 bytes. Together with the 8-byte header (a 4-byte Mark Word plus a 4-byte Klass pointer) this makes 12 bytes, and 4 bytes of padding follow to reach 8-byte alignment, so the size of an Integer object is 16 bytes.

Therefore, we can conclude that an Integer object is four times the size of the primitive int type.

Regarding object memory structure, note that the structure of an array differs slightly from that of an ordinary object: because an array has a length, the object header is followed by an additional length field of type int (4 bytes), and the array data comes next.
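The same arithmetic extends to arrays by adding the length field. The following is a rough sketch assuming the 32-bit layout used in this article (8-byte header, 4-byte length field, 8-byte alignment); the class and method names are hypothetical, and any extra alignment a real VM may apply to the element data is ignored.

```java
/** Rough array size estimate: header + int length field + elements, padded to 8 bytes. */
class ArraySizeSketch {
    static final int HEADER = 8;       // assumption: 4-byte Mark Word + 4-byte Klass pointer
    static final int LENGTH_FIELD = 4; // int length field that follows the header

    static int arraySize(int elementBytes, int length) {
        int size = HEADER + LENGTH_FIELD + elementBytes * length;
        return (size + 7) / 8 * 8; // round up to a multiple of 8
    }

    public static void main(String[] args) {
        System.out.println(arraySize(4, 0)); // empty int[]: 12 bytes, padded to 16
        System.out.println(arraySize(4, 3)); // int[3]: 12 + 12 = 24
        System.out.println(arraySize(1, 5)); // byte[5]: 12 + 5 = 17, padded to 24
    }
}
```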

 


Origin blog.csdn.net/weixin_42073629/article/details/104489252