[Flutter: Entry to Advanced] Dart Advanced Series --- The Dart VM's Single-Threaded Design Principles

1 Instruction Execution Design of Virtual Machines

1.1 Classification of virtual machines

        Stack-based virtual machines, such as the JVM

        Register-based virtual machines, such as the Dalvik VM

1.2 The concept of a virtual machine

        First, a basic question: what are the most basic functions of a virtual machine?

It should be able to simulate a physical CPU moving operands in and out. Ideally, it should include the following concepts:

(1) Compile the source code into the bytecode specified by the VM.

(2) A data structure containing instructions and operands (instructions operate on operands).

(3) A call stack for all function invocations.

(4) An "Instruction Pointer (IP)": points to the next instruction to be executed.

(5) A virtual "CPU" - the dispatcher of instructions:

        1) Fetch: get the next instruction (addressed by the IP)

        2) Decode: translate the instruction and determine what kind of operation to perform

        3) Execute: carry out the instruction

        The above is the classic three-stage pipeline of a CPU. A five-stage pipeline additionally includes memory access and write-back, i.e., the result produced by execution is written back to a register or to memory. A minimal sketch of such a dispatch loop follows.
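        To make the dispatcher concrete, here is a minimal fetch-decode-execute loop sketched in Java (the same language this article later uses for its event-loop simulation). The opcodes PUSH/ADD/HALT and their encoding are hypothetical, invented purely for illustration; the tiny VM is stack-based and computes the 20 + 7 example used again in section 1.4.

import java.util.ArrayDeque;
import java.util.Deque;

public class MiniVM {
    // hypothetical opcodes for this sketch
    static final int PUSH = 0, ADD = 1, HALT = 2;

    public static void main(String[] args) {
        // "bytecode" for: PUSH 20; PUSH 7; ADD; HALT
        int[] code = {PUSH, 20, PUSH, 7, ADD, HALT};
        Deque<Integer> stack = new ArrayDeque<>();
        int ip = 0;                                  // instruction pointer
        while (true) {
            int op = code[ip++];                     // 1) fetch
            switch (op) {                            // 2) decode
                case PUSH:                           // 3) execute
                    stack.push(code[ip++]); break;
                case ADD:
                    stack.push(stack.pop() + stack.pop()); break;
                case HALT:
                    System.out.println("result: " + stack.pop()); // prints 27
                    return;
            }
        }
    }
}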

There are two basic approaches to implementing virtual machines:

        Stack-based and register-based. Stack-based virtual machines include the JVM, the .NET CLR, and the recently popular Ethereum EVM; this is the more widely used approach. Register-based virtual machines include the Lua VM (the virtual machine of the Lua programming language) and the Dalvik VM.

        The difference between the two implementations lies mainly in the storage and retrieval mechanisms for operands and results.

        Note that because Dalvik is a register-based virtual machine, its compiled output (the .dex file) has a completely different format from the JVM's class files.

1.3 Core Concepts

1.3.1 The final form of an executed instruction

        ADD P1, P2 --- P1 and P2 are the operands (an operand is, in essence, an address)

1.3.2 The key point: different chips have different circuit design widths (including microcontrollers)

        1. Conventional CPUs (Intel, ARM, AMD, etc.): 32-bit, 64-bit

        2. MCUs: 8-bit, 16-bit

1.4 Stack-based virtual machine (its hallmark: lots of pops and pushes)

        The following is the typical process of computing 20 + 7 on the stack (exactly the sequence the MiniVM sketch in 1.2 executes):

        The instruction execution process is
           1. PUSH 20
           2. PUSH 7
           3. ADD // a single instruction is enough here - ADD implicitly pops 20 and 7 and pushes the result, so it does not need to name 20, 7, or the result

1.5 Register-based virtual machine (its hallmark: instructions carry more operands)

        In a register-based virtual machine, the operands are stored in CPU registers. There are no push and pop operations or concepts. However, each executed instruction must explicitly contain the addresses of its operands; unlike a stack machine, which can work implicitly through the stack pointer, the instruction itself must name where its operands live. For example, the following addition:

1.5.1 Instruction execution process

        ADD R1, R2, R3 ; just one instruction - add the contents of R2 and R3 and store the sum in R1
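        For comparison, here is a register-based sketch of the same ADD R1, R2, R3 (hypothetical encoding, Java again): the instruction itself names the destination and both source registers, so no stack traffic is needed.

public class MiniRegVM {
    static final int ADD = 0;   // hypothetical opcode

    public static void main(String[] args) {
        int[] regs = new int[4];
        regs[2] = 20;           // R2
        regs[3] = 7;            // R3
        // one instruction encoded as {opcode, dst, src1, src2}: ADD R1, R2, R3
        int[] insn = {ADD, 1, 2, 3};
        if (insn[0] == ADD) {
            regs[insn[1]] = regs[insn[2]] + regs[insn[3]];
        }
        System.out.println("R1 = " + regs[1]);  // 27
    }
}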

1.6 Summary

1.6.1 Because computer chip specifications are inconsistent

        Mainstream chips such as Intel, ARM, and AMD can support richer instructions, because their circuits are generally designed at 32 bits or more

        The cheapest microcontrollers, by contrast, may be designed along 8-bit lines

1.6.2 Therefore, considering compatibility

        JAVA scheme

                Stack --- compatible down to minimal 8-bit instruction execution (covers MCUs and similar devices)

        Android scheme

                Register --- no such compatibility required; Android devices generally rely on high-performance chips

1.7 Food for thought

        Building on the previous article's picture of instruction execution, thread resources also need to be considered

        What are thread resources?

                Reference: Thread Management, C++ STL chapter 11

                        https://www.processon.com/view/link/62ab388e5653bb5256dae2fd

                PS: it is best to go through that lesson first

        The nature of thread resources

                1. The execution of instructions is ultimately scheduled by the OS

                2. What the virtual machine needs to do is compile the instructions and hand them to the OS kernel

                3. The submission of instructions is completed through the OS's system-call interface

                4. When processing the code, the kernel organizes it into a task_struct object

                5. From the kernel's perspective, a task_struct is in fact a process|thread

                        To Linux, there is not much difference between a process and a thread, only slightly different bookkeeping

                6. The kernel-facing API pthread_create is provided so that we can dynamically create a thread resource. In essence, this function tells Linux: buddy, open a new task_struct, and the code it runs is {specified by yourself}. (A sketch follows.)
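        A minimal Java sketch of that idea (an assumption worth stating: the article's point is about pthread_create in C; on Linux, the JVM's Thread.start() is itself implemented on top of pthread_create, which asks the kernel via clone() for a new task_struct running the code we supply):

public class SpawnThread {
    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(() -> {
            // the "{specified by yourself}" code the new task_struct executes
            System.out.println("running on: " + Thread.currentThread().getName());
        }, "worker-1");
        worker.start();   // -> pthread_create -> clone() -> new task_struct
        worker.join();
    }
}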

2 Review: GC implementation plan and ideas

2.1 Virtual machine memory management ideas

2.1.1 Description

        A virtual machine's memory management is essentially to request a block of memory in advance and then maintain that block by itself. A concrete example follows

2.1.2 Examples

#include <stdio.h>
#include <stdlib.h>

// young generation size: 16 MB (the heap reserves 16 MB * 3 = 48 MB in total:
// 16 MB young generation, 32 MB old generation)
#define YOUNG_SIZE (1024 * 1024 * 16)

int main() {
    char* heap = malloc(YOUNG_SIZE * 3);
    printf("reserved space %p ~ %p\n", (void*)heap, (void*)(heap + YOUNG_SIZE * 3));

    // young generation layout: eden 80%, survivor1 10%, survivor2 10%
    char* young_begin = heap;
    char* young_end = heap + YOUNG_SIZE;
    char* eden_begin = young_begin;
    char* eden_end = young_begin + (int)(YOUNG_SIZE * 0.8) - 1;
    // survivor regions
    char* survivor1_begin = young_begin + (int)(YOUNG_SIZE * 0.8);
    char* survivor1_end = young_begin + (int)(YOUNG_SIZE * 0.9) - 1;
    char* survivor2_begin = young_begin + (int)(YOUNG_SIZE * 0.9);
    char* survivor2_end = young_end - 1;
    // old generation: the remaining 32 MB
    char* old_begin = heap + YOUNG_SIZE;
    char* old_end = heap + YOUNG_SIZE * 3;
    printf("eden region %p ~ %p\n", (void*)eden_begin, (void*)eden_end);
    printf("survivor1 region %p ~ %p\n", (void*)survivor1_begin, (void*)survivor1_end);
    printf("survivor2 region %p ~ %p\n", (void*)survivor2_begin, (void*)survivor2_end);
    printf("old region %p ~ %p\n", (void*)old_begin, (void*)old_end);

    // Java code: int i = 20;
    // bump-pointer allocation: index marks the next free byte
    char* index = heap;
    printf("current address %p\n", (void*)index);
    // length 4, value 20
    char* v1Index = index;
    *(int*)index = 20;
    index += sizeof(int);
    printf("position after allocation 1: %p\n", (void*)index);
    // length 4, value 30
    char* v2Index = index;
    *(int*)index = 30;
    index += sizeof(int);
    printf("position after allocation 2: %p\n", (void*)index);

    int value = *(int*)v1Index;
    printf("value:%d\n", value);
    int value2 = *(int*)v2Index;
    printf("value2:%d\n", value2);

    free(heap);
    return 0;
}

2.1.3 Running Results

2.2 Memory Fragmentation Solution

2.2.1 When writing memory management yourself like this, the first problem to consider is memory fragmentation

2.2.2 Traditional solutions generally rely on two stages

        1. Mark the data objects to be processed

        2. Clear the marked data objects (PS: in memory management the data is not actually wiped, only overwritten later)

2.2.3 Processing Algorithms in the Industry

        1. Garbage identification algorithms -- marking-phase algorithms

                1) Reference counting algorithm

                2) GC Roots reachability analysis algorithm

        2. Garbage removal algorithms -- clearing-phase algorithms

                1) Sweep algorithm: delete directly

                        Has a fragmentation problem

                2) Copying algorithm: sacrifices space in exchange for no fragments

                        No fragmentation problem

                3) Compaction/organizing algorithm: sacrifices time in exchange for no fragments

                        No fragmentation problem

        3. Comparison of the performance of the three algorithms

        In terms of speed, the copying algorithm is the fastest, but it wastes the most memory. The mark-compact algorithm balances the three metrics best, but its efficiency is not ideal: it has one more phase (marking) than the copying algorithm, and one more phase (defragmentation) than the sweep algorithm.
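        To make the two phases concrete, here is a minimal mark-sweep sketch in Java (illustrative only; the Obj layout and heap list are invented for this example): marking walks the object graph from a GC root, and sweeping drops whatever stayed unmarked.

import java.util.ArrayList;
import java.util.List;

public class MarkSweepDemo {
    static class Obj {
        boolean marked;
        List<Obj> refs = new ArrayList<>();  // outgoing references
    }

    static List<Obj> heap = new ArrayList<>();  // every allocated object

    // marking phase: depth-first reachability from a GC root
    static void mark(Obj o) {
        if (o == null || o.marked) return;
        o.marked = true;
        for (Obj child : o.refs) mark(child);
    }

    // sweeping phase: unmarked objects are garbage
    // (a real heap does not wipe them - the memory is simply reused)
    static void sweep() {
        heap.removeIf(o -> !o.marked);
        for (Obj o : heap) o.marked = false;  // reset marks for the next cycle
    }

    public static void main(String[] args) {
        Obj root = new Obj(), a = new Obj(), garbage = new Obj();
        root.refs.add(a);
        heap.add(root); heap.add(a); heap.add(garbage);
        mark(root);
        sweep();
        System.out.println("live objects: " + heap.size());  // 2
    }
}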

2.3 Ideas of Generational Design

Is there no single optimal algorithm?

2.3.1 Generational Collection Algorithm

        To achieve optimal garbage-collection efficiency, the generational collection algorithm came into being

        The generational collection algorithm is based on the fact that different objects have different life cycles. Objects with different life cycles can therefore be collected in different ways to improve recycling efficiency. Generally the heap is divided into a young generation and an old generation, so that a recycling algorithm suited to the characteristics of each generation can be used, which improves efficiency considerably.

        While a system runs, it produces a large number of objects. Some carry business information, such as HTTP Session objects, threads, and Socket connections; these are tied to the business and have long life cycles. Others are temporary variables produced during execution, such as Strings; these have short life cycles and can be recycled after being used only once

2.3.2 Generational Algorithm Derivation

1) Eden is used as the production area (where new objects are allocated), using the sweep algorithm

2) But some object data may need to survive, so a region is reserved to hold the surviving data (the old generation)

3) To guarantee that no fragments are produced - and since this surviving data is actually not large - the copying algorithm is used here

4) Pointer collision! (bump-the-pointer allocation assumes contiguous free space, which fragmentation breaks)

5) Treatment plan (a free-list sketch follows this list):

        Maintain a free list to resolve fragmentation

        Algorithms: sweep, copy, and compact/organize

        How to design such a scheme yourself:

                70-99% of objects - short-lived, "use once and discard"

                1-30% of objects - long-lived

                Suggested proportion between the two generations: 90% / 10%

        Region layout and the algorithm for each:

                Young generation: eden (production) - survivor (buffer)

                Old generation: old

                eden - sweep algorithm, clear everything

                survivor - copying buffer

                old - compacted/organized

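        A minimal sketch of the free-list idea in Java (the offsets and block headers simulate a heap; everything here is invented for illustration): freed blocks are kept on a list and reused first-fit instead of bumping a pointer, which is how an allocator copes once fragmentation exists.

import java.util.ArrayList;
import java.util.List;

public class FreeListDemo {
    static class Block {
        int offset, size;
        boolean free = true;
        Block(int offset, int size) { this.offset = offset; this.size = size; }
    }

    // the simulated heap starts as one big free block of 1024 bytes
    static List<Block> blocks = new ArrayList<>(List.of(new Block(0, 1024)));

    // first-fit: reuse the first free block that is large enough, splitting it
    static Block alloc(int size) {
        for (int i = 0; i < blocks.size(); i++) {
            Block b = blocks.get(i);
            if (b.free && b.size >= size) {
                if (b.size > size)  // split off the remainder as a new free block
                    blocks.add(i + 1, new Block(b.offset + size, b.size - size));
                b.size = size;
                b.free = false;
                return b;
            }
        }
        throw new OutOfMemoryError("no block fits");
    }

    static void free(Block b) { b.free = true; }

    public static void main(String[] args) {
        Block a = alloc(100);
        alloc(50);
        free(a);                 // leaves a 100-byte hole at offset 0
        Block d = alloc(80);     // first-fit reuses the hole
        System.out.println("d.offset = " + d.offset);  // 0: the hole was reused
    }
}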
2.4 Memory recovery scheme

2.4.1 Partitioning

2.4.2 Incremental

3 Dart memory structure design

3.1 Design idea of the stack area

3.1.1 JAVA

3.1.2 Dart

3.1.3 Description

1. The scheme adopted by JAVA

        Each thread has its own independent stack to drive instruction execution

        Heap data is shared

        Advantage

                No extra data-transfer overhead: to share data, threads simply access the shared heap directly

        Shortcoming

                Because of the sharing there are critical-section problems, so a locking mechanism must be introduced (see the sketch below)
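        A minimal sketch of the shared-heap-plus-lock pattern (standard Java, nothing article-specific): two threads increment one counter that lives on the shared heap; the synchronized block is the critical section the text warns about.

public class SharedHeapDemo {
    static int counter = 0;                    // lives on the shared heap
    static final Object lock = new Object();

    public static void main(String[] args) throws InterruptedException {
        Runnable work = () -> {
            for (int i = 0; i < 100_000; i++) {
                synchronized (lock) {          // critical section
                    counter++;
                }
            }
        };
        Thread t1 = new Thread(work), t2 = new Thread(work);
        t1.start(); t2.start();
        t1.join(); t2.join();
        System.out.println(counter);           // always 200000 with the lock
    }
}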

2. The scheme adopted by Dart

        GC isolation

        As shown in the figure above, from the perspective of memory-management design, the stack, heap, and other data are completely isolated between isolates, including the GC threads.

3.2 Data transfer problem between threads

3.2.1 JAVA

        Share the heap directly, but the locking logic must be designed

3.2.2 DART

        Memory is fully independent per isolate, but a communication mechanism between isolates must be designed (a sketch follows)
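        A minimal Java simulation of message-passing between two "isolates" (in the spirit of this article's Java event-loop simulation; Dart's actual mechanism is SendPort/ReceivePort): the two sides share no data and communicate only by posting messages to each other's queue.

import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class IsolateDemo {
    public static void main(String[] args) throws InterruptedException {
        // one inbox per "isolate" - the analog of a ReceivePort
        BlockingQueue<String> toWorker = new ArrayBlockingQueue<>(16);
        BlockingQueue<String> toMain = new ArrayBlockingQueue<>(16);

        Thread worker = new Thread(() -> {
            try {
                String msg = toWorker.take();   // receive a copy of the data
                toMain.put("echo: " + msg);     // reply by message, not by shared memory
            } catch (InterruptedException ignored) {}
        });
        worker.start();

        toWorker.put("hello isolate");
        System.out.println(toMain.take());      // prints "echo: hello isolate"
        worker.join();
    }
}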

4 Event polling mechanism

4.1 The purpose of establishing the event polling mechanism

        Establish an asynchronous system inside a single thread

        This solution generally suits event-driven business systems (typically UI systems)

        Server-side business systems, by contrast, are generally multi-threaded

        Please note the purpose of the design here:

        1. Dart was created to compete with JS at so-called cross-platform development

        2. Its later pivot to mobile did not change this essence, because the target is cross-platform UI

        3. Since this is UI development, the model of the whole business must be an event-driven model

        4. In Android, the Handler is what supports the event-driven model

4.2 Program operation design model

4.2.1 Single-threaded model

        In the single-threaded synchronous model, tasks are executed sequentially. If a task blocks on I/O, the other tasks can only wait until it completes. Three tasks with no dependencies on each other still have to execute one after another, which greatly reduces efficiency.

Explanation

        This is the state of a single thread. Many microcontrollers work like this: if there is a time-consuming operation, everything that follows waits.

4.2.2 Multithreading Model

        In the multithreading model, tasks are executed in independent threads without waiting for each other. On multi-processor systems they can run in parallel; on a single processor they run interleaved. This approach is more efficient, but developers must protect shared resources from simultaneous access by multiple threads. Multi-threaded programs are harder to reason about, because they have to handle thread safety through locks, reentrant functions, thread-local storage, or other synchronization mechanisms, which, if implemented incorrectly, lead to subtle and painful bugs.

        Typical multi-threaded pattern: when a branch of work appears, hand it to another thread to handle

4.2.3 Event-driven model

        In the event-driven model, the three tasks execute interleaved, yet still within a single thread of control. When an I/O or other expensive operation starts, a callback is registered with the event queue, and execution continues when the operation completes. The callback describes how to handle an event. The event queue is polled for events, and when an event occurs it is dispatched to its event handler. This approach lets the program get as much done as possible without needing extra threads, and developers using the event-driven model do not need to worry about thread safety.

        Running without waiting, all in a single thread

        1. A traditional single thread that has to wait for an event's response before handling a piece of business can only serve one event at a time; the next event must wait until the current one has been fully processed

4.3 The Event Polling Mechanism in Detail

4.3.1 Dart implementation process

Diagram

Explanation

        A Dart application has a message loop and two message queues -- an event queue and a microtask queue.

        The event queue contains all incoming events: I/O, mouse events, drawing events, timers, messages between isolates, etc.

        The microtask queue is necessary in Dart because an event handler sometimes wants to defer a task, yet have it complete before the next event message is executed.

        The event queue contains events from Dart and from elsewhere in the system. But the microtask queue only contains internal code from the current isolate.

        When a Dart application starts, its main isolate executes the main method. When the main method exits, the main isolate's thread begins processing the messages in the message queues, one by one.

        As shown in the flowchart below, once the main method exits, the event loop starts its work. First it executes the microtasks in FIFO order; when all microtasks have been executed, it fetches an event from the event queue and executes it. This repeats until both queues are empty.

4.3.2 JAVA simulation example

Task

package com.kerwin.event; 
public abstract class Task {     
    public abstract void run(); 
}

ScheduleMicrotask

package com.kerwin.event; 
public class ScheduleMicrotask extends Task {     
    @Override     
    public void run() {} 
}

Timer

package com.kerwin.event; 
public class Timer extends Task {     
    @Override     
    public void run() {} 
}

Main

package com.kerwin.event; 
import java.util.Queue; 
import java.util.concurrent.LinkedBlockingQueue; 

public class Main {     
    // microtask queue - high priority
    private static Queue<ScheduleMicrotask> scheduleMicrotasks 
= new LinkedBlockingQueue<>();         
    // event queue - low priority     
    private static Queue<Timer> timers = new LinkedBlockingQueue<>();     

    public static void processAsync(){
        // keep draining both queues; microtasks always run before events
        while(!scheduleMicrotasks.isEmpty() || !timers.isEmpty()){
            Task task = scheduleMicrotasks.poll();
            if(task == null){
                task = timers.poll();
            }
            task.run();
        }
    }
  
    public static void main(String[] args){         
        System.out.println("main start!");         
        timers.offer(new Timer(){            
            @Override             
            public void run() {                 
                System.out.println("timer - event - A");                 
                scheduleMicrotasks.offer(new ScheduleMicrotask(){                     
                    @Override                     
                    public void run() {                         
                        System.out.println("ScheduleMicrotask - A - in Timer A");      
                    }                 
                });                 

                scheduleMicrotasks.offer(new ScheduleMicrotask(){                     
                    @Override                     
                    public void run() {                         
                        System.out.println("ScheduleMicrotask - B - in Timer A");
                    }                 
                });             
            }         
        });       
 
        scheduleMicrotasks.offer(new ScheduleMicrotask(){             
            @Override             
            public void run() {                 
                System.out.println("ScheduleMicrotask - C - in MAIN ");                 
                timers.offer(new Timer(){                     
                    @Override                     
                    public void run() {                         
                        System.out.println("timer - event - B - in ScheduleMicrotask - C ");                     
                    }                 
                });             
            }         
        });         
        System.out.println("main end!");         
        processAsync();     
    } 
}

4.4 Dart Example

import 'dart:async';

void main(){
  print("main begin");

  Timer.run(() {
    print("timer - event - A");

    scheduleMicrotask(() {
      print("ScheduleMicrotask - A - in Timer A");
    });

    scheduleMicrotask(() {
      print("ScheduleMicrotask - B - in Timer A");
    });

  });

  scheduleMicrotask(() {
    print("ScheduleMicrotask - C - in MAIN ");

    Timer.run(() {
      print("timer - event - B - in ScheduleMicrotask - C ");
    });
  });

  print("main end");
}
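        Expected output, for both the Dart program and the Java simulation above (assuming the microtask-before-event rule described in 4.3.1): after the two "main" lines print, microtask C runs first, then event A, then the two microtasks A and B that it queued, and finally event B:

main begin
main end
ScheduleMicrotask - C - in MAIN
timer - event - A
ScheduleMicrotask - A - in Timer A
ScheduleMicrotask - B - in Timer A
timer - event - B - in ScheduleMicrotask - C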

Source: blog.csdn.net/u010687761/article/details/129022363