On the Java memory model and interaction

First, the Java runtime data areas

The Java virtual machine divides its runtime data area into six parts:


  1. Program counter: records which step the current thread has reached. When threads take turns on the CPU and the current thread's time slice expires, its current position is recorded; when the thread gets a time slice again, execution resumes from the recorded position.
  2. VM stack: what we usually call "the stack". Each stack frame holds the local variable table, the operand stack, dynamic linking information, and so on.
  3. Native method stack: a second stack that serves the native methods used by the virtual machine. Frequently used methods such as Thread's start method and the CAS operations behind the JUC package are ultimately implemented as native methods.
  4. Heap: the main storage area, where objects are normally created. It is divided into the young generation, the old generation, and the permanent generation (that is, the method area, removed in Java 8). The young generation is further divided into two Survivor spaces and one Eden space, and new objects are normally allocated in Eden. The garbage collector will be covered in a later article.
  5. Method area: stores the class information loaded by the JVM, symbolic references, and static variables. In Java 8 the permanent generation that implemented the method area was removed: class information is now stored in Metaspace, and the constant pool and related data were moved to the heap (in fact, the string constant pool and static variables had already moved to the heap in Java 7), so there is no longer any permanent generation. The reasons for removing it are:
     • It easily led to memory overflow or memory leaks, for example in web applications with many JSP pages.
     • The amount of class and method information is hard to predict, so the permanent generation was hard to size: too large and it squeezed the old generation, too small and it overflowed easily.
     • It was awkward for the GC to handle, collection efficiency was low, and tuning was difficult.
  6. Constant pool: stores final constants, strings defined directly as literals (e.g. String s = "test";), and the cached wrapper objects of six primitive types for values from -128 to 127. This explains why two wrapper objects obtained by autoboxing the same value in this range are the same object. Boolean caches only true and false, Character caches 0 to 127, and the floating-point types Double and Float are not cached at all because of precision issues.
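The caching behaviour described in the last item can be observed with reference comparisons. The following is only a small sketch: the class and method names are made up for illustration, and the result for 128 assumes a default JVM configuration, since caching beyond -128 to 127 is permitted but not required.

```java
// Sketch: observing the wrapper caches and the string pool with == (reference comparison).
class ConstantPoolDemo {
    // Autoboxing goes through Integer.valueOf, which reuses cached objects for -128..127.
    static boolean sameBoxedInteger(int value) {
        Integer a = value;
        Integer b = value;
        return a == b; // compares references on purpose, not values
    }

    // Double.valueOf keeps no cache, so each boxing creates a separate object.
    static boolean sameBoxedDouble(double value) {
        Double a = value;
        Double b = value;
        return a == b;
    }

    public static void main(String[] args) {
        String s1 = "test";
        String s2 = "test";
        String s3 = new String("test");
        System.out.println(s1 == s2);              // true: both literals refer to one pooled object
        System.out.println(s1 == s3);              // false: new String(...) allocates a fresh object
        System.out.println(s1 == s3.intern());     // true: intern() returns the pooled object
        System.out.println(sameBoxedInteger(127)); // true: inside the guaranteed cache range
        System.out.println(sameBoxedInteger(128)); // outside the guaranteed range, normally distinct objects
        System.out.println(sameBoxedDouble(1.0));  // Double values are not cached
    }
}
```

In application code, wrapper objects and strings should still be compared with equals(); the == comparisons here exist only to expose object identity.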

 Of the six areas above, the first three are thread-private: the values stored in them are not visible to other threads. The last three (strictly speaking, only the heap) are shared between threads, so a variable stored there is visible to every thread. The first three make up each thread's own memory and are independent of one another, while the heap can be understood as main memory shared by all threads (in fact main memory corresponds only to the object instance data in the heap; other parts such as the object header and alignment padding are not included).
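As a rough illustration of thread-private versus shared storage, the sketch below keeps a counter in a local variable (which lives in each thread's own VM stack) and then publishes it through an array on the heap. The class name and the array are hypothetical and exist only for this example.

```java
// Sketch: local variables are private to each thread, heap objects are shared.
class StackVersusHeapDemo {
    // A heap object reachable from every thread.
    static final int[] results = new int[2];

    static void count(int slot) {
        int local = 0;                  // lives in this thread's stack frame; no other thread can touch it
        for (int n = 0; n < 1000; n++) {
            local++;                    // no synchronization needed for thread-private data
        }
        results[slot] = local;          // writing to the heap is what makes the value shareable
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t0 = new Thread(() -> count(0));
        Thread t1 = new Thread(() -> count(1));
        t0.start();
        t1.start();
        t0.join();                      // after join, the thread's writes are visible here
        t1.join();
        System.out.println(results[0] + " " + results[1]); // 1000 1000
    }
}
```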


Second, variable interaction between Java memory areas

The variables discussed here are those that can live on the heap; others, such as local variables and method parameters, are not considered. The interaction of variables between thread memory and main memory is very important, and the Java virtual machine specification defines it as the following eight operations, each of which is atomic (except for double and long variables not declared volatile).

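  1. Lock: performed by a thread and acting on a variable in main memory. Once a variable is locked, other threads can use it only after the current thread unlocks it, and no other thread can unlock it.
  2. Unlock: likewise performed by a thread and acting on a variable in main memory. It releases a locked variable so that other threads can operate on it. A variable that is not locked cannot be unlocked.
  3. Read: the thread reads the variable's value from main memory; the load operation then uses this value to assign the variable copy in thread memory.
  4. Load: places the value obtained by read into the variable copy in thread memory, for the thread to use.
  5. Use: reads the value in thread memory in order to carry out the operations we define.
  6. Assign: when the thread's operations change the variable's value, this operation updates the value in thread memory.
  7. Store: transfers the variable's value from the current thread's memory towards main memory; it works together with the write operation.
  8. Write: writes the value delivered by store into main memory, changing the variable's value there.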

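Some readers may not see the difference between read and load, or between store and write, since each pair looks alike. You can think of one as submitting a request and the other as the approval that lets the assignment go through. For example, thread memory A submits a request to main memory to change a variable (the store operation), and once main memory accepts it, the variable's value is modified (the write operation).

Reference: 《深入理解Java虚拟机》 (Understanding the Java Virtual Machine in Depth)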
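To make the read/load and store/write pairs more concrete, here is a toy model of the operations for a single thread executing i = i + 1. It is purely illustrative: the JVM does not expose these operations as an API, and every name in the sketch (ToyMemoryModel, WorkingMemory, inTransit) is invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: mimics the conceptual operations, not how a real JVM is implemented.
class ToyMemoryModel {
    // Main memory, shared by all threads.
    static final Map<String, Integer> mainMemory = new HashMap<>();

    // One thread's private working memory.
    static class WorkingMemory {
        private final Map<String, Integer> copies = new HashMap<>();
        private Integer inTransit; // value travelling between the two memories

        void read(String name)              { inTransit = mainMemory.get(name); } // fetch from main memory
        void load(String name)              { copies.put(name, inTransit); }      // place into the private copy
        int use(String name)                { return copies.get(name); }          // hand the value to the computation
        void assign(String name, int value) { copies.put(name, value); }          // update the private copy
        void store(String name)             { inTransit = copies.get(name); }     // the "request": send towards main memory
        void write(String name)             { mainMemory.put(name, inTransit); }  // the "approval": commit to main memory
    }

    // Performs i = i + 1 step by step; returns what main memory held just before the write.
    static int incrementAndReportStale(WorkingMemory wm) {
        wm.read("i");
        wm.load("i");
        int result = wm.use("i") + 1;
        wm.assign("i", result);
        wm.store("i");
        int beforeWrite = mainMemory.get("i"); // main memory is still unchanged at this point
        wm.write("i");
        return beforeWrite;
    }

    public static void main(String[] args) {
        mainMemory.put("i", 0);
        int beforeWrite = incrementAndReportStale(new WorkingMemory());
        System.out.println(beforeWrite);         // 0
        System.out.println(mainMemory.get("i")); // 1
    }
}
```

The point of the sketch is that after store the new value has left working memory but main memory only changes once write runs.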

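For ordinary variables (those not declared volatile), the virtual machine only requires that read and load occur in relative order. For example, when reading two variables i and j from main memory, a possible sequence is read i => read j => load j => load i; the operations need not be consecutive. In addition, the virtual machine lays down rules for these eight operations: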

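  • read and load, and likewise store and write, may not appear alone. These operations always come in pairs: where there is a read there is a load, and where there is a store there is a write. A value cannot be read without being loaded into thread memory, nor stored without being written to main memory.
  • A thread may not discard its most recent assign. Once a thread has changed the variable copy in its private memory with assign, it must synchronize the change back to main memory with write.
  • A thread may not synchronize a variable from its private memory to main memory for no reason (that is, without any assign having happened).
  • A variable must originate in main memory: a thread may not use an uninitialized variable (one on which neither load nor assign has been performed) in its private memory. In other words, load must have been executed before use, and assign must have been executed before store. For example, given a member variable a and a local variable b, b must be initialized before a = b can be performed. (As stated at the start, "variable" here means a variable that can live on the heap.)
  • A variable may be locked by only one thread at a time. After one thread locks a main-memory variable, no other thread may operate on it until that thread unlocks it. The same thread may lock a variable several times, and the variable is released only once unlock has been performed as many times as lock.
  • When a thread performs lock on a variable, its private copy of that variable is cleared.
  • Before unlock is performed on a variable, the variable must first be synchronized back to main memory.
  • A variable that has not been locked may not be unlocked, and a thread may not unlock a variable locked by another thread.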
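The counting behaviour of repeated locking can be observed from Java code. The rules above describe the lock and unlock operations behind synchronized; the sketch below swaps in java.util.concurrent.locks.ReentrantLock instead, because it exposes the hold count directly. Treat it as an analogy under that assumption, with hypothetical class and method names.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch: a lock taken twice by one thread is released only after two unlocks.
class ReentrantLockDemo {
    static final ReentrantLock lock = new ReentrantLock();

    // Reports whether a different thread can acquire the lock right now.
    static boolean otherThreadCanLock() throws InterruptedException {
        boolean[] acquired = new boolean[1];
        Thread other = new Thread(() -> {
            if (lock.tryLock()) {       // non-blocking attempt
                acquired[0] = true;
                lock.unlock();
            }
        });
        other.start();
        other.join();                   // join makes acquired[0] visible to the caller
        return acquired[0];
    }

    public static void main(String[] args) throws InterruptedException {
        lock.lock();
        lock.lock();                                // the same thread locks a second time
        System.out.println(lock.getHoldCount());    // 2
        lock.unlock();
        System.out.println(otherThreadCanLock());   // false: still held once
        lock.unlock();
        System.out.println(otherThreadCanLock());   // true: lock count matched by unlock count
    }
}
```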

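Third, the volatile keyword, which changes the rules

As everyone knows, the volatile keyword is generally treated as a lightweight concurrency keyword, and it carries two important semantics: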

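  1. Guaranteeing memory visibility: when the value of a volatile variable changes, it is immediately synchronized to main memory and the copies held by other threads are invalidated.
  2. Forbidding instruction reordering: at the hardware level this is implemented by inserting memory barriers before and after the instructions that access a volatile variable; at the compiler level it is implemented through the rules below.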

  Both semantics follow from the special rules that the JMM applies to variables declared volatile:

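  1. A thread may perform use on the variable only if its previous operation on that variable was load, and may perform load only if its next operation on that variable is use. In other words, read, load and use on a volatile variable always appear consecutively, so every time the variable is used the latest value is read from main memory and replaces the copy in private memory (if they differ).
  2. A thread may perform assign on the variable only if its next operation on that variable is store, and may perform store only if its previous operation on that variable was assign. In other words, assign, store and write on the same variable always appear consecutively, so every change to the variable is immediately synchronized to main memory.
  3. Let a and b be two variables in main memory. Let action A be a use or assign on a by the current thread, action B the load or store associated with A, and action C the read or write associated with B. Let action D be a use or assign on b by the current thread, action E the load or store associated with D, and action F the read or write associated with E. If A precedes D, then C must precede F. In other words, if the current thread's use or assign on a is executed before its use or assign on b, then its read or write of a must be executed before its read or write of b.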

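  From these special rules we can see that rules 1 and 2 are the memory-visibility semantics of volatile, and rule 3 is the semantics of forbidding instruction reordering. There are other special rules as well. For example, for the two 64-bit types double and long when not declared volatile, the virtual machine is allowed to treat an access as two 32-bit operations, that is, to split it into two non-atomic operations. In practice this rarely happens: although the Java memory model permits it, it "strongly recommends" that virtual machines implement operations on these two types atomically, so you will not normally read "half a variable".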
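The visibility guarantee is commonly used for a stop flag. The sketch below assumes a hypothetical worker that spins on a volatile boolean; because the flag is volatile, the worker is guaranteed to see the write and terminate. Without volatile, the loop would be allowed to spin forever on a stale copy.

```java
// Sketch: a volatile flag written by one thread is guaranteed to become visible to another.
class VolatileFlagDemo {
    static volatile boolean running = true;

    // Starts a spinning worker, clears the flag, and reports whether the worker stopped in time.
    static boolean stopWorkerWithin(long timeoutMillis) throws InterruptedException {
        running = true;
        Thread worker = new Thread(() -> {
            while (running) {
                // busy-wait; each iteration re-reads the volatile field from main memory
            }
        });
        worker.start();
        running = false;                 // volatile write: published to main memory immediately
        worker.join(timeoutMillis);      // wait for the worker to notice the change
        return !worker.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(stopWorkerWithin(5000)); // true
    }
}
```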

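  volatile does not guarantee atomicity

Although a volatile variable forces memory to be refreshed, it does not provide atomicity. A little thought makes this clear: the rules require (read, load, use) and (assign, store, write) each to appear consecutively as a group, but the two groups are still separate from each other. Suppose two threads both complete the first group (read, load, use) before either starts the second group (assign, store, write). Nothing is wrong at that point. When both threads then perform the second group, one thread's result is overwritten by the other's, and the data becomes inaccurate. Consider the following code: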

public class TestForVolatile {

    public static volatile int i = 0;

    public static void main(String[] args) throws InterruptedException {
        // Create four threads, each incrementing i 10000 times
        Thread[] threads = new Thread[4];
        for (int t = 0; t < threads.length; t++) {
            threads[t] = new Thread(() -> {
                int k = 0;
                while (k++ < 10000) {
                    i++; // a volatile read followed by a volatile write: not atomic
                }
                System.err.println("Thread " + Thread.currentThread().getName() + " finished");
            });
            threads[t].start();
        }
        // Wait until all four threads have finished (join is more reliable than sleeping)
        for (Thread thread : threads) {
            thread.join();
        }
        // One run printed 33555 instead of the expected 40000
        System.out.println(i);
    }

}


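To explain: i++ is really i = i + 1. Suppose that when i = 99 in main memory, two threads both complete the first group of operations (read, load, use), that is, both have read the i on the right-hand side of the equals sign. Nothing is wrong so far. Each thread then computes i + 1 = 100 and assigns it to i, which starts the second group (assign, store, write). Whether the assignments happen at the same moment does not matter: both threads write i = 100 back to main memory, so one thread's result is overwritten and is effectively lost. This is why the final result of the program above is inaccurate.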

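To guarantee atomicity you can use the synchronized keyword, which guarantees both atomicity and memory visibility (but it does not have the semantics of forbidding instruction reordering, which is why the instance in a double-checked locking singleton must be declared volatile). Of course, you can also use the atomic classes of the JUC package, such as AtomicInteger.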
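Both remedies can be sketched side by side. The example below is one possible arrangement rather than the only one: the class name, the helper method and the lock object are assumptions made for illustration. With four threads each incrementing 10000 times, both counters reach 40000.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch: two ways to make the increment atomic.
class AtomicCounterDemo {
    static final AtomicInteger atomicCount = new AtomicInteger(); // CAS-based counter
    static final Object lock = new Object();                      // guards lockedCount
    static int lockedCount = 0;

    static void runIncrements(int threads, int perThread) throws InterruptedException {
        atomicCount.set(0);
        synchronized (lock) {
            lockedCount = 0;
        }
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int k = 0; k < perThread; k++) {
                    atomicCount.incrementAndGet();  // atomic read-modify-write
                    synchronized (lock) {           // only one thread at a time runs this block
                        lockedCount++;
                    }
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            worker.join();                          // wait for completion instead of sleeping
        }
    }

    public static void main(String[] args) throws InterruptedException {
        runIncrements(4, 10000);
        System.out.println(atomicCount.get()); // 40000
        System.out.println(lockedCount);       // 40000
    }
}
```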

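Fourth, the happens-before principle

Relying only on volatile and synchronized to keep a program ordered would inevitably become tedious. Most of the time we do not need to, because Java has a "happens-before principle": if operation A happens-before operation B, then the effects of A can be observed by B, that is, B can see the modifications A made to variables. The ordering meant here is an ordering of execution and has nothing to do with time. For example, in the pseudocode below: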

// Executed in thread A; call it operation A
i = 0;

// Executed in thread B; call it operation B
j = i;

// Executed in thread C; call it operation C
i = 1;

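Suppose operation A happens-before operation B, and ignore operation C for the moment: the result is then necessarily i = j = 0. If operation C is now added with no happens-before relation to A and B, the result becomes uncertain, because C may execute either before or after B, and the data may turn out wrong. And if the premise that A happens-before B is missing in the first place, the result is uncertain even without C.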

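Of course, operations that satisfy happens-before do not necessarily execute in that order. Happens-before is guaranteed to govern the actual order only when one operation depends on another (that is, the later operation uses a variable from the earlier one). In the pseudocode below, for example, happens-before holds, yet execution in order is not guaranteed.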

// The following operations run in the same thread
// Operation A
int i = 0;
// Operation B
int j = 1;

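This fully satisfies the program order rule (one of the happens-before rules), but the two operations do not depend on each other, so the virtual machine is free to reorder them and execute B before A. This has no effect on the correctness of the program.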

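  So how do we decide whether happens-before holds? Even the earlier examples obtained it by assumption. Don't panic: the Java memory model provides a set of rules, and satisfying any one of them establishes happens-before. By analogy, think of happens-before as an interface and the rules below as the classes that implement it.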

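  • Program order rule: within a single thread, an operation written earlier in the code happens-before an operation written later. (The order is taken from the compiled class file.)
  • Monitor lock rule: for the same lock, an unlock operation happens-before every later lock operation on that lock. "Later" here refers to order in time.
  • volatile variable rule: a write to a volatile variable happens-before every later read of that variable. "Later" again refers to order in time.
  • Thread start rule: a thread's start() method happens-before every action of that thread, that is, start() comes before any operation in the thread's run() method. In the example below, thread A changes the shared variable i and then starts thread B, whose run method reads and prints i. Over 10,000 runs, the value read is always 1: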

    public static int i = 0;

    public static void main(String[] args) {
        for (int k = 0; k < 10000; k++) testThread();
    }

    public static void testThread() {
        Thread threadB = new Thread(() -> {
            System.err.println("Value of i in thread B: " + i);
            System.err.println("Thread B finished");
        });
        new Thread(() -> {
            i = 1;
            // Start thread B only after modifying the shared variable i
            threadB.start();
            System.err.println("Value of i in thread A after running: " + i);
        }).start();
    }


  • Thread interruption rule: a call to a thread's interrupt() method happens-before the point at which the interrupted thread's code detects the interrupt.
  • Thread termination rule: all operations in a thread happen-before the detection of that thread's termination, that is, before join() on that thread returns. In the code below, thread A increments the shared variable i 1,000,000 times and then decrements it 999,999 times. Over 1000 runs, the value read after join is always 1.

    public static int i = 0;

    public static void main(String[] args) throws InterruptedException {
        // Run 1000 times
        for (int k = 0; k < 1000; k++) {
            i = 0;
            testThread();
        }
    }

    public static void testThread() throws InterruptedException {
        Thread threadA = new Thread(() -> {
            int k = 0;
            while (k++ < 100 * 100 * 100) {
                i++;
            }
            while (--k > 1) {
                i--;
            }
            System.err.println("Value of i in thread A after running: " + i);
        });
        threadA.start();
        // With the line below enabled, the i read before join may be 0 or greater than 0 (not necessarily 1),
        // because the reads and writes of i in main memory have no fixed order
        // TimeUnit.NANOSECONDS.sleep(1);
        System.out.println("Value of i in the main thread after starting thread A: " + i);
        // Wait for thread A to terminate
        threadA.join();
        // After join the result is always 1
        System.err.println("Value of i after join: " + i);
    }


  • Object finalization rule: the completion of an object's initialization (the end of its constructor) happens-before the start of its finalize() method.
  • Transitivity: if operation A happens-before operation B, and operation B happens-before operation C, then operation A happens-before operation C.
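Several of these rules often combine. The sketch below chains the program order rule, the volatile variable rule and transitivity so that a plain, non-volatile field becomes safely visible. The class and field names are hypothetical and chosen only for this example.

```java
// Sketch: a plain write becomes visible by "piggybacking" on a volatile write.
class PiggybackDemo {
    static int data = 0;                    // plain field, not volatile
    static volatile boolean ready = false;  // volatile flag that carries the ordering

    static int publishAndRead() throws InterruptedException {
        data = 0;
        ready = false;
        int[] seen = new int[1];
        Thread reader = new Thread(() -> {
            while (!ready) {
                // spin until the volatile write is observed
            }
            seen[0] = data;                 // program order: comes after the volatile read
        });
        reader.start();                     // thread start rule: the resets above are visible to reader
        data = 42;                          // program order: before the volatile write below
        ready = true;                       // volatile rule: happens-before the read that sees true
        reader.join();                      // thread termination rule: seen[0] is visible afterwards
        return seen[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(publishAndRead()); // 42
    }
}
```

By transitivity, the write data = 42 happens-before the read of data in the reader thread, even though data itself is not volatile.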

  These eight rules are the happens-before relations that Java provides naturally, without any synchronizer. If two operations satisfy one of the eight rules, happens-before holds between them; otherwise it does not. The following example illustrates this:

// The object has a field i
private int i = 0;
public int getI() {
    return i;
}

public void setI(int i) {
    this.i = i;
}
// Thread A performs the set; call it operation A
setI(1);

// Thread B performs the get on the same object; call it operation B
int j = getI();

Suppose operation A is executed earlier in time and operation B afterwards. What value of i does B obtain?

Let us try the rules above one by one. The operations are in different threads, so the program order rule is out. There is no lock and no volatile keyword, so the monitor lock rule and the volatile variable rule are out. The three thread rules and the object finalization rule do not apply, so they are out. With no rule to chain together, transitivity is out as well. In summary, these operations do not satisfy happens-before, so the outcome is not guaranteed: the value of i that B obtains may be 1 or may be 0, which means the code is not thread-safe. Whether code is thread-safe is judged by happens-before and has little to do with order in time.

To correct a situation like this, make the code satisfy one of the rules: for example, declaring the field volatile or guarding both methods with the same lock solves the problem.
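Both fixes might look like the following sketch. The holder classes and the helper method are assumptions for illustration. VolatileHolder relies on the volatile variable rule, LockedHolder on the monitor lock rule (both methods synchronize on the same object, this).

```java
// Sketch: two ways to give getI()/setI() a happens-before relation.
class SafeAccessDemo {
    // Fix 1: declare the field volatile.
    static class VolatileHolder {
        private volatile int i = 0;
        int getI() { return i; }
        void setI(int i) { this.i = i; }
    }

    // Fix 2: guard both accessors with the same lock (the holder object itself).
    static class LockedHolder {
        private int i = 0;
        synchronized int getI() { return i; }
        synchronized void setI(int i) { this.i = i; }
    }

    // Returns true if a reader thread observed both writes within the timeout.
    static boolean readerSeesWrites(long timeoutMillis) throws InterruptedException {
        VolatileHolder volatileHolder = new VolatileHolder();
        LockedHolder lockedHolder = new LockedHolder();
        Thread reader = new Thread(() -> {
            while (volatileHolder.getI() != 1 || lockedHolder.getI() != 1) {
                // spin until both writes have become visible
            }
        });
        reader.start();
        volatileHolder.setI(1);   // volatile write
        lockedHolder.setI(1);     // write released by the unlock at method exit
        reader.join(timeoutMillis);
        return !reader.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(readerSeesWrites(5000)); // true
    }
}
```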


Origin blog.51cto.com/14230003/2453660