The thread principle in Android concurrent programming

1. The concept of process and thread

Setting the formal definitions aside, a rough way to put it is this: a process is an application running on the phone, and each running app is one process (some apps use several processes, but we will leave that aside for now). A thread is the flow of execution for one task inside a process. If a process is a workshop, then each production line in the workshop is a thread.

The concept of the high-speed buffer (cache):

In the earliest days, computers only had processes, and only one process operated on memory at a time. This was inefficient: during a file read or write, for example, we had to wait for the operation to finish before doing anything else. Once concurrency was introduced, multiple processes could be switched in rotation, so that memory operations appear "simultaneous" in a relative sense.

But this approach has a problem: switching between processes is expensive. To be more efficient, threads were introduced, and data from memory is kept in a high-speed buffer. Each thread works on its own copy of the data in that buffer, so what used to be a switch between processes becomes a switch between threads, which is more lightweight and consumes fewer resources.

This high-speed storage can be understood as memory inside the CPU: the registers and the CPU caches. Each core runs one thread at any instant, so a 4-core phone can run four threads truly in parallel. Main memory, such as the phone's 8 GB of RAM, is not divided among the cores; all cores share it.

We know that the Java source we write is compiled by javac from a .java file into a .class file, and the .class file is then loaded into memory by a class loader (ClassLoader). When the JVM runs this bytecode, it partitions memory into several areas.
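As a small illustration of the class-loading step, the sketch below asks two classes which loader defined them. The class name `LoaderDemo` and the helper `loaderOf` are made up for this example, and the behaviour described is that of a desktop JVM, where classes loaded by the bootstrap loader report `null`.

```java
// Hypothetical demo class; not part of the article's code.
final class LoaderDemo {
    /** Returns a readable name for the loader that defined the given class. */
    static String loaderOf(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // On a desktop JVM, getClassLoader() returns null for bootstrap-loaded classes.
        return loader == null ? "bootstrap" : loader.getClass().getName();
    }

    public static void main(String[] args) {
        System.out.println("String     -> " + loaderOf(String.class));
        System.out.println("LoaderDemo -> " + loaderOf(LoaderDemo.class));
    }
}
```

Core library classes and application classes come from different loaders, which is why the two lines differ.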

There are two main divisions here: the thread-shared area and the thread-private area. The shared area includes the method area and the heap, while the private area includes the Java virtual machine stack, the native method stack, and the program counter. So remember: what is on the heap is shared, and what is on the stack is private. Data defined on one thread's virtual machine stack therefore cannot be accessed directly by other threads. If we declare a local variable final, it can be captured by an anonymous class or lambda: its value is copied out of thread A's stack into the shared area, so that thread B can read it from there.
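The split between shared and private areas can be sketched as follows. This is only an illustration under simple assumptions: the class `SharedVsPrivate` and its `run` method are invented for the example. Each worker keeps its running total in a local variable (its own stack) and publishes the result into an array on the shared heap.

```java
// Hypothetical demo class; names are illustrative only.
final class SharedVsPrivate {
    /** Each worker sums 1..n locally, then publishes the sum to a shared array. */
    static long[] run(int n) throws InterruptedException {
        long[] results = new long[2];        // the array object lives on the shared heap
        Thread[] workers = new Thread[2];
        for (int id = 0; id < workers.length; id++) {
            final int slot = id;             // captured: the value is copied into the lambda
            workers[id] = new Thread(() -> {
                long localSum = 0;           // lives in this thread's own stack frame
                for (int i = 1; i <= n; i++) {
                    localSum += i;
                }
                results[slot] = localSum;    // each thread writes only its own slot
            });
            workers[id].start();
        }
        for (Thread worker : workers) {
            worker.join();                   // join() makes the workers' writes visible here
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] results = run(100);
        System.out.println(results[0] + " " + results[1]); // 5050 5050
    }
}
```

Neither thread can see the other's `localSum`; only what is written to the heap array is visible to both the workers and the caller.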

Thread life cycle:

New -> Runnable -> Running -> Terminated.

From Running, a thread can also enter the Waiting state by calling wait(). A thread in the Waiting state can be woken by notify() or notifyAll() and then re-enters the Runnable state.

Without locks, once a Thread is created and start() is called, it enters the Runnable state, meaning it is ready to execute. When the thread gets a time slice, it enters the Running state and is actually executing. When execution completes, it reaches Terminated, which is the end. If wait() is called before the end, the thread enters the Waiting state and stays there until another thread wakes it by calling notify() or notifyAll(). It then returns to the Runnable state, and once it gets a time slice again it goes back to Running.
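The transitions described above can be observed with `Thread.getState()`. The following is a minimal sketch; the class `WaitNotifyStates` is invented for this example. One thing to keep in mind: the Java API reports both "Runnable" and "Running" as the single state `RUNNABLE`, so the sketch records only the states that can be observed reliably.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; not taken from the article.
final class WaitNotifyStates {
    /** Records the states one worker passes through around wait()/notifyAll(). */
    static List<Thread.State> observe() throws InterruptedException {
        final Object lock = new Object();
        final boolean[] ready = {false};      // guarded by lock
        Thread worker = new Thread(() -> {
            synchronized (lock) {
                while (!ready[0]) {           // loop guards against spurious wakeups
                    try {
                        lock.wait();          // releases the lock and waits
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }
        });

        List<Thread.State> seen = new ArrayList<>();
        seen.add(worker.getState());          // before start(): NEW
        worker.start();

        Thread.State state;
        do {                                  // poll until the worker is parked in wait()
            Thread.sleep(1);
            state = worker.getState();
        } while (state != Thread.State.WAITING);
        seen.add(state);

        synchronized (lock) {
            ready[0] = true;
            lock.notifyAll();                 // wake the worker
        }
        worker.join();
        seen.add(worker.getState());          // after run() returns: TERMINATED
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observe());        // [NEW, WAITING, TERMINATED]
    }
}
```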

Of course, this is the picture without locks. With locks it is still simple: there is one extra state, Blocked.

When one thread holds the lock, other threads that try to acquire it enter the Blocked state. After the holder releases the lock, a blocked thread that acquires it returns to the Runnable state.
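A minimal sketch of the Blocked state, assuming a hypothetical class `BlockedState`: the calling thread holds a lock while a second thread tries to enter a `synchronized` block on the same object.

```java
// Hypothetical demo class; names are illustrative only.
final class BlockedState {
    /** Returns the state of a thread that is stuck waiting for a held lock. */
    static Thread.State observe() throws InterruptedException {
        final Object lock = new Object();
        Thread contender = new Thread(() -> {
            synchronized (lock) {
                // Nothing to do; the point is only to acquire the lock.
            }
        });

        Thread.State seen;
        synchronized (lock) {                 // the calling thread holds the lock
            contender.start();
            do {
                Thread.sleep(1);              // sleep() does not release the lock
                seen = contender.getState();
            } while (seen != Thread.State.BLOCKED);
        }                                     // lock released: contender can proceed
        contender.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observe());        // BLOCKED
    }
}
```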

The above is the state diagram of a thread. Note also that calling Thread.sleep() puts the thread into the TIMED_WAITING state, that is, waiting for a limited time.
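The effect of Thread.sleep() on the thread state can be checked in the same way. This is a sketch with an invented class name, `SleepState`; the 60-second sleep is arbitrary and is cut short by an interrupt.

```java
// Hypothetical demo class; not taken from the article.
final class SleepState {
    /** Returns the state of a thread while it is inside Thread.sleep(). */
    static Thread.State observe() throws InterruptedException {
        Thread sleeper = new Thread(() -> {
            try {
                Thread.sleep(60_000);         // long enough to be observed
            } catch (InterruptedException e) {
                // Woken early by interrupt(); just finish.
            }
        });
        sleeper.start();

        Thread.State seen;
        do {
            Thread.sleep(1);
            seen = sleeper.getState();
        } while (seen != Thread.State.TIMED_WAITING);

        sleeper.interrupt();                  // end the sleep so the thread terminates
        sleeper.join();
        return seen;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(observe());        // TIMED_WAITING
    }
}
```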

We often create new threads. Underneath, such a thread is actually a Linux thread. We can trace its start() method to see this:

    public synchronized void start() {
        /**
         * This method is not invoked for the main method thread or "system"
         * group threads created/set up by the VM. Any new functionality added
         * to this method in the future may have to also be added to the VM.
         *
         * A zero status value corresponds to state "NEW".
         */
        // Android-changed: Replace unused threadStatus field with started field.
        // The threadStatus field is unused on Android.
        if (started)
            throw new IllegalThreadStateException();

        /* Notify the group that this thread is about to be started
         * so that it can be added to the group's list of threads
         * and the group's unstarted count can be decremented. */
        group.add(this);

        // Android-changed: Use field instead of local variable.
        // It is necessary to remember the state of this across calls to this method so that it
        // can throw an IllegalThreadStateException if this method is called on an already
        // started thread.
        started = false;
        try {
            // Android-changed: Use Android specific nativeCreate() method to create/start thread.
            // start0();
            nativeCreate(this, stackSize, daemon);
            started = true;
        } finally {
            try {
                if (!started) {
                    group.threadStartFailed(this);
                }
            } catch (Throwable ignore) {
                /* do nothing. If start0 threw a Throwable then
                  it will be passed up the call stack */
            }
        }
    }

We see a key line in the try block:

    // Android-changed: Use Android specific nativeCreate() method to create/start thread.
    // start0();
    nativeCreate(this, stackSize, daemon);

As you can see, Android has modified this method: the thread is created by a native method.

// Android-changed: Use Android specific nativeCreate() method to create/start thread.
    // The upstream native method start0() only takes a reference to this object and so must obtain
    // the stack size and daemon status directly from the field whereas Android supplies the values
    // explicitly on the method call.
    // private native void start0();
    private native static void nativeCreate(Thread t, long stackSize, boolean daemon);

In other words, according to the comment above: Android creates and starts threads through its own nativeCreate() method. The upstream native method start0() receives only a reference to the Thread object, so it has to read the stack size and daemon status from the object's fields, whereas Android passes both values explicitly as arguments to the call.

In short, start0() is the upstream (OpenJDK) native method, and current Android uses nativeCreate() in its place.

We trace the native method start0():

static JNINativeMethod methods[] = {
    {"start0",           "(JZ)V",        (void *)&JVM_StartThread},
    {"setPriority0",     "(I)V",       (void *)&JVM_SetThreadPriority},
    {"yield",            "()V",        (void *)&JVM_Yield},
    {"sleep",            "(Ljava/lang/Object;J)V",       (void *)&JVM_Sleep},
    {"currentThread",    "()" THD,     (void *)&JVM_CurrentThread},
    {"interrupt0",       "()V",        (void *)&JVM_Interrupt},
    {"isInterrupted",    "(Z)Z",       (void *)&JVM_IsInterrupted},
    {"holdsLock",        "(" OBJ ")Z", (void *)&JVM_HoldsLock},
    {"setNativeName",    "(" STR ")V", (void *)&JVM_SetNativeThreadName},
};

Here start0 is mapped to JVM_StartThread(). As the name suggests, this is a function inside the JVM. Here it is:

JVM_ENTRY(void, JVM_StartThread(JNIEnv* env, jobject jthread))
  JVMWrapper("JVM_StartThread");
  JavaThread *native_thread = NULL;

  // We cannot hold the Threads_lock when we throw an exception,
  // due to rank ordering issues. Example:  we might need to grab the
  // Heap_lock while we construct the exception.
  bool throw_illegal_thread_state = false;

  // We must release the Threads_lock before we can post a jvmti event
  // in Thread::start.
  {
    // Ensure that the C++ Thread and OSThread structures aren't freed before
    // we operate.
    MutexLocker mu(Threads_lock);

    // Since JDK 5 the java.lang.Thread threadStatus is used to prevent
    // re-starting an already started thread, so we should usually find
    // that the JavaThread is null. However for a JNI attached thread
    // there is a small window between the Thread object being created
    // (with its JavaThread set) and the update to its threadStatus, so we
    // have to check for this
    if (java_lang_Thread::thread(JNIHandles::resolve_non_null(jthread)) != NULL) {
      throw_illegal_thread_state = true;
    } else {
      // We could also check the stillborn flag to see if this thread was already stopped, but
      // for historical reasons we let the thread detect that itself when it starts running

      jlong size =
             java_lang_Thread::stackSize(JNIHandles::resolve_non_null(jthread));
      // Allocate the C++ Thread structure and create the native thread.  The
      // stack size retrieved from java is signed, but the constructor takes
      // size_t (an unsigned type), so avoid passing negative values which would
      // result in really large stacks.
      size_t sz = size > 0 ? (size_t) size : 0;
      native_thread = new JavaThread(&thread_entry, sz);

      // At this point it may be possible that no osthread was created for the
      // JavaThread due to lack of memory. Check for this situation and throw
      // an exception if necessary. Eventually we may want to change this so
      // that we only grab the lock if the thread was created successfully -
      // then we can also do this check and throw the exception in the
      // JavaThread constructor.
      if (native_thread->osthread() != NULL) {
        // Note: the current thread is not being used within "prepare".
        native_thread->prepare(jthread);
      }
    }
  }

Let's focus on this part:

 jlong size =
             java_lang_Thread::stackSize(JNIHandles::resolve_non_null(jthread));
      // Allocate the C++ Thread structure and create the native thread.  The
      // stack size retrieved from java is signed, but the constructor takes
      // size_t (an unsigned type), so avoid passing negative values which would
      // result in really large stacks.
      size_t sz = size > 0 ? (size_t) size : 0;
      native_thread = new JavaThread(&thread_entry, sz);

To paraphrase the comment above:

// Allocate the C++ Thread structure and create the native thread. The stack size retrieved from Java is signed, but the constructor takes an unsigned type, so negative values must not be passed through, or the result would be an enormous stack size.

(In a signed number the highest bit is the sign bit. If a negative value is reinterpreted as unsigned, that bit is treated as part of the magnitude, which makes the value extremely large.)
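The same effect can be shown from Java. The sketch below is only an illustration: `SignedToUnsigned` and `clampStackSize` are invented names, and the clamp mirrors the `size > 0 ? size : 0` expression in the C++ excerpt rather than calling anything in the VM.

```java
// Hypothetical demo class; clampStackSize imitates the C++ expression quoted above.
final class SignedToUnsigned {
    /** Negative requests become 0, which the VM treats as "use the default". */
    static long clampStackSize(long requested) {
        return requested > 0 ? requested : 0;
    }

    public static void main(String[] args) {
        long negative = -1L;
        // The same 64 bits read as an unsigned number: 18446744073709551615.
        System.out.println(Long.toUnsignedString(negative));
        System.out.println(clampStackSize(negative));       // 0
        System.out.println(clampStackSize(512 * 1024));     // 524288
    }
}
```

Without the clamp, a stack size of -1 would be read as roughly 1.8 × 10^19 bytes.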

A JavaThread is created here, and the virtual machine stack size is passed in, so the size of the thread's virtual machine stack is fixed at the moment the thread is created. Let's continue into the JavaThread() constructor:

JavaThread::JavaThread(ThreadFunction entry_point, size_t stack_sz) :
  Thread()
#if INCLUDE_ALL_GCS
  , _satb_mark_queue(&_satb_mark_queue_set),
  _dirty_card_queue(&_dirty_card_queue_set)
#endif // INCLUDE_ALL_GCS
{
  if (TraceThreadEvents) {
    tty->print_cr("creating thread %p", this);
  }
  initialize();
  _jni_attach_state = _not_attaching_via_jni;
  set_entry_point(entry_point);
  // Create the native thread itself.
  // %note runtime_23
  os::ThreadType thr_type = os::java_thread;
  thr_type = entry_point == &compiler_thread_entry ? os::compiler_thread :
                                                     os::java_thread;
  os::create_thread(this, thr_type, stack_sz);
  _safepoint_visible = false;
  // The _osthread may be NULL here because we ran out of memory (too many threads active).
  // We need to throw and OutOfMemoryError - however we cannot do this here because the caller
  // may hold a lock and all locks must be unlocked before throwing the exception (throwing
  // the exception consists of creating the exception object & initializing it, initialization
  // will leave the VM via a JavaCall and then all locks must be unlocked).
  //
  // The thread is still suspended when we reach here. Thread must be explicit started
  // by creator! Furthermore, the thread must also explicitly be added to the Threads list
  // by calling Threads:add. The reason why this is not done here, is because the thread
  // object must be fully initialized (take a look at JVM_Start)
}

You can see that the thread is actually created by os::create_thread(), and the size passed in is the stack size stack_sz. Let's continue into this function:

bool os::create_thread(Thread* thread, ThreadType thr_type, size_t stack_size) {
  assert(thread->osthread() == NULL, "caller responsible");

  ...

  // stack size
  if (os::Linux::supports_variable_stack_size()) {
    // calculate stack size if it's not specified by caller
    if (stack_size == 0) {
      stack_size = os::Linux::default_stack_size(thr_type);

      switch (thr_type) {
      case os::java_thread:
        // Java threads use ThreadStackSize which default value can be
        // changed with the flag -Xss
        assert (JavaThread::stack_size_at_create() > 0, "this should be set");
        stack_size = JavaThread::stack_size_at_create();
        break;
      case os::compiler_thread:
        if (CompilerThreadStackSize > 0) {
          stack_size = (size_t)(CompilerThreadStackSize * K);
          break;
        } // else fall through:
          // use VMThreadStackSize if CompilerThreadStackSize is not defined
      case os::vm_thread:
      case os::pgc_thread:
      case os::cgc_thread:
      case os::watcher_thread:
        if (VMThreadStackSize > 0) stack_size = (size_t)(VMThreadStackSize * K);
        break;
      }
    }

    stack_size = MAX2(stack_size, os::Linux::min_stack_allowed);
    pthread_attr_setstacksize(&attr, stack_size);
  } else {
    // let pthread_create() pick the default value.
  }

  ...

    pthread_t tid;
    int ret = pthread_create(&tid, &attr, (void* (*)(void*)) java_start, thread);

    pthread_attr_destroy(&attr);

    ...
}

As the comment in this function shows, a Java thread's stack size defaults to ThreadStackSize, and it can be changed with the -Xss flag.

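The following constant is sometimes taken to be this default:

const int os::Linux::_vm_default_page_size = (8 * K);

That is, 8K. Note, however, that as its name says, this constant is the VM's default page size, not the thread stack size. In the code above, the stack size of a Java thread comes from JavaThread::stack_size_at_create(), that is, ThreadStackSize, which on typical Linux builds is far larger than 8K (from a few hundred kilobytes up to about 1M, depending on the platform).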

Note that pthread_create is called later in this function. This is the function that Linux and other Unix-like systems use to create threads, so a Java thread is ultimately a thread of the underlying Linux system.
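From the Java side, the stack size can also be requested per thread through the four-argument `Thread` constructor. The sketch below, with the invented class name `StackDepthDemo`, recurses until the stack overflows and reports the depth reached. Treat the numbers as indicative only: the Javadoc describes `stackSize` as a hint that some platforms may ignore, and the depth also depends on the VM and on JIT compilation.

```java
// Hypothetical demo class; the stack sizes used in main are arbitrary.
final class StackDepthDemo {
    /** Recurses on a thread with the requested stack size and returns the depth reached. */
    static int maxDepth(long stackSizeBytes) throws InterruptedException {
        final int[] depth = {0};
        Runnable probe = new Runnable() {
            private void recurse() {
                depth[0]++;
                recurse();                    // never returns normally
            }

            @Override
            public void run() {
                try {
                    recurse();
                } catch (StackOverflowError expected) {
                    // The thread's stack is exhausted; depth[0] holds the result.
                }
            }
        };
        // The stackSize argument is a hint to the VM, not a guarantee.
        Thread thread = new Thread(null, probe, "stack-probe", stackSizeBytes);
        thread.start();
        thread.join();
        return depth[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("256 KB stack: depth " + maxDepth(256L * 1024));
        System.out.println("16 MB stack:  depth " + maxDepth(16L * 1024 * 1024));
    }
}
```

On a VM that honours the hint, the second depth is usually much larger than the first.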

Next, let's trace the nativeCreate() method. In the java_lang_Thread.cc file:

static JNINativeMethod gMethods[] = {
  FAST_NATIVE_METHOD(Thread, currentThread, "()Ljava/lang/Thread;"),
  FAST_NATIVE_METHOD(Thread, interrupted, "()Z"),
  FAST_NATIVE_METHOD(Thread, isInterrupted, "()Z"),
  NATIVE_METHOD(Thread, nativeCreate, "(Ljava/lang/Thread;JZ)V"),
  NATIVE_METHOD(Thread, nativeGetStatus, "(Z)I"),
  NATIVE_METHOD(Thread, nativeHoldsLock, "(Ljava/lang/Object;)Z"),
  FAST_NATIVE_METHOD(Thread, nativeInterrupt, "()V"),
  NATIVE_METHOD(Thread, nativeSetName, "(Ljava/lang/String;)V"),
  NATIVE_METHOD(Thread, nativeSetPriority, "(I)V"),
  FAST_NATIVE_METHOD(Thread, sleep, "(Ljava/lang/Object;JI)V"),
  NATIVE_METHOD(Thread, yield, "()V"),
};

Note this line:

NATIVE_METHOD(Thread, nativeCreate, "(Ljava/lang/Thread;JZ)V"),

nativeCreate is mapped to the Thread_nativeCreate function in the java_lang_Thread.cc file:

static void Thread_nativeCreate(JNIEnv* env, jclass, jobject java_thread, jlong stack_size,
                                jboolean daemon) {
  // There are sections in the zygote that forbid thread creation.
  Runtime* runtime = Runtime::Current();
  if (runtime->IsZygote() && runtime->IsZygoteNoThreadSection()) {
    jclass internal_error = env->FindClass("java/lang/InternalError");
    CHECK(internal_error != nullptr);
    env->ThrowNew(internal_error, "Cannot create threads in zygote");
    return;
  }

  Thread::CreateNativeThread(env, java_thread, stack_size, daemon == JNI_TRUE);
}

Finally, the Thread::CreateNativeThread function is called, which is in the thread.cc file:

void Thread::CreateNativeThread(JNIEnv* env, jobject java_peer, size_t stack_size, bool is_daemon) {
  CHECK(java_peer != nullptr);
  Thread* self = static_cast<JNIEnvExt*>(env)->self;

  ...

  Thread* child_thread = new Thread(is_daemon);
  // Use global JNI ref to hold peer live while child thread starts.
  child_thread->tlsPtr_.jpeer = env->NewGlobalRef(java_peer);
  stack_size = FixStackSize(stack_size);

  ...

  int pthread_create_result = 0;
  if (child_jni_env_ext.get() != nullptr) {
    pthread_t new_pthread;
    pthread_attr_t attr;
    child_thread->tlsPtr_.tmp_jni_env = child_jni_env_ext.get();
    CHECK_PTHREAD_CALL(pthread_attr_init, (&attr), "new thread");
    CHECK_PTHREAD_CALL(pthread_attr_setdetachstate, (&attr, PTHREAD_CREATE_DETACHED),
                       "PTHREAD_CREATE_DETACHED");
    CHECK_PTHREAD_CALL(pthread_attr_setstacksize, (&attr, stack_size), stack_size);
    pthread_create_result = pthread_create(&new_pthread,
                                           &attr,
                                           Thread::CreateCallback,
                                           child_thread);
    CHECK_PTHREAD_CALL(pthread_attr_destroy, (&attr), "new thread");

    if (pthread_create_result == 0) {
      // pthread_create started the new thread. The child is now responsible for managing the
      // JNIEnvExt we created.
      // Note: we can't check for tmp_jni_env == nullptr, as that would require synchronization
      //       between the threads.
      child_jni_env_ext.release();
      return;
    }
  }

There is this line:

stack_size = FixStackSize(stack_size);

This adjusts the stack size. Let's look at the FixStackSize function:

static size_t FixStackSize(size_t stack_size) {
  // A stack size of zero means "use the default".
  if (stack_size == 0) {
    stack_size = Runtime::Current()->GetDefaultStackSize();
  }

  // Dalvik used the bionic pthread default stack size for native threads,
  // so include that here to support apps that expect large native stacks.
  stack_size += 1 * MB;

  ...

  return stack_size;
}

As you can see here, 1 MB is added on top of the requested (or default) size, so the stack of a thread on Android is at least 1 MB.

Let's look back at this code in Thread::CreateNativeThread above:

pthread_create_result = pthread_create(&new_pthread,
                                           &attr,
                                           Thread::CreateCallback,
                                           child_thread);

Here, once again, pthread_create is called in the end. This too is the thread creation mechanism of Linux.

Next, let's look at how synchronized works (this is very important):

Let's simulate two threads incrementing a at the same time:

package com.example.myapplication.Thread;

public class LockRunnable implements Runnable {
    // Shared by both threads; a++ on it is not atomic.
    private static int a = 0;

    @Override
    public void run() {
        for (int i = 0; i < 100000; i++) {
            a++;
        }
    }

    public static void main(String[] args) {
        LockRunnable lockRunnable = new LockRunnable();
        Thread thread1 = new Thread(lockRunnable);
        Thread thread2 = new Thread(lockRunnable);
        thread1.start();
        thread2.start();
        try {
            // Wait for both threads to finish before reading a.
            thread2.join();
            thread1.join();
        } catch (InterruptedException e) {
            e.printStackTrace();
        }
        System.out.println(a);
    }
}

Let's guess what the result is:

Not 200000, but 118161 (the exact number varies from run to run).

This is a thread-safety problem. What is the reason? Let's analyze:

The two threads are not synchronized. Thread 2 does not wait for thread 1 to finish the run method; it starts executing partway through. The statement a++ is really three steps: read a, add 1, and write the result back. If thread 1 has incremented its copy but has not yet written the new value back to the method area, the value thread 2 reads from the method area is still the old one. For example, suppose a is 1. Thread 1 increments it to 2 but has not written it back yet. Thread 2 still reads 1, adds 1 to get 2, and writes 2 to the method area. Then thread 1 writes its own 2, and a is still 2. a has been incremented twice, but it has only gone from 1 to 2.
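The interleaving described above can be replayed deterministically in a single thread. This sketch uses an invented class, `LostUpdate`, and ordinary local variables to stand in for the two threads' private copies; it is a simulation of the schedule, not real concurrency.

```java
// Hypothetical demo class: replays the "lost update" schedule step by step.
final class LostUpdate {
    /** Applies two increments in the unlucky order and returns the final value. */
    static int simulate(int start) {
        int a = start;            // the shared value
        int seenByThread1 = a;    // thread 1 reads a
        int seenByThread2 = a;    // thread 2 reads a before thread 1 writes back
        a = seenByThread2 + 1;    // thread 2 writes its result
        a = seenByThread1 + 1;    // thread 1 overwrites it with the same value
        return a;                 // two increments, but a grew by only 1
    }

    public static void main(String[] args) {
        System.out.println(simulate(1)); // 2
    }
}
```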

Now let's try locking:

    @Override
    public void run() {
        for (int i = 0; i < 100000; i++) {
            add();
        }
    }

    private synchronized static void add() {
        a++;
    }

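Take a look at the result: with the lock in place, the output is 200000.

Note what is being locked here. Because add() is static, the lock is the LockRunnable class object; for a non-static synchronized method it would be the object the method is called on. Locking only works when the threads lock the same object.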
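Which object a synchronized method locks can be checked with `Thread.holdsLock`. The class `WhichLock` and its method names below are made up for this sketch.

```java
// Hypothetical demo class; shows which monitor each kind of synchronized method holds.
final class WhichLock {
    /** A static synchronized method holds the lock of the Class object. */
    static synchronized boolean staticHoldsClassLock() {
        return Thread.holdsLock(WhichLock.class);
    }

    /** An instance synchronized method holds the lock of the instance itself. */
    synchronized boolean instanceHoldsThisLock() {
        return Thread.holdsLock(this);
    }

    /** An instance synchronized method does not, by itself, hold the class lock. */
    synchronized boolean instanceHoldsClassLock() {
        return Thread.holdsLock(WhichLock.class);
    }

    public static void main(String[] args) {
        WhichLock demo = new WhichLock();
        System.out.println(staticHoldsClassLock());        // true
        System.out.println(demo.instanceHoldsThisLock());  // true
        System.out.println(demo.instanceHoldsClassLock()); // false
    }
}
```

This is why a static synchronized method and an instance synchronized method do not exclude each other: they lock different objects.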

Let's analyze the synchronized keyword. If you lock a method, you can see these instructions in the final compiled bytecode:

monitor-enter v1 and monitor-exit v1.

Let's look at the function in the virtual machine that handles monitor entry:

IRT_ENTRY_NO_ASYNC(void, InterpreterRuntime::monitorenter(JavaThread* thread, BasicObjectLock* elem))
#ifdef ASSERT
  thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
  if (PrintBiasedLockingStatistics) {
    Atomic::inc(BiasedLocking::slow_path_entry_count_addr());
  }
  Handle h_obj(thread, elem->obj());
  assert(Universe::heap()->is_in_reserved_or_null(h_obj()),
         "must be NULL or an object");
  if (UseBiasedLocking) {
    // Retry fast entry if bias is revoked to avoid unnecessary inflation
    ObjectSynchronizer::fast_enter(h_obj, elem->lock(), true, CHECK);
  } else {
    ObjectSynchronizer::slow_enter(h_obj, elem->lock(), CHECK);
  }
  assert(Universe::heap()->is_in_reserved_or_null(elem->obj()),
         "must be NULL or an object");
#ifdef ASSERT
  thread->last_frame().interpreter_frame_verify_monitor(elem);
#endif
IRT_END

As you can see, it checks UseBiasedLocking: if biased locking is enabled, it takes the fast path for acquiring the lock (fast_enter); otherwise it takes slow_enter.

We know that if we mark a method with the synchronized keyword, a thread that calls the method first acquires the lock object and then executes. Other threads have to wait until that thread finishes before they can compete for the lock. If the method's logic is time-consuming, execution becomes very inefficient. To improve on this, we define an Object and lock only the code that actually needs to be synchronized:

    private final Object lockObject = new Object();

    @Override
    public void run() {
        for (int i = 0; i < 100000; i++) {
            add();
        }
    }

    private void add() {
        // ... other time-consuming code: start
        // ... other time-consuming code: end
        synchronized (lockObject) {
            // code that needs to be synchronized
            a++;
        }
    }
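For reference, here is a self-contained variant of the snippet above that can be run directly. It is a sketch rather than the article's exact class: the name `BlockCounter`, the instance field for the counter, and the helper `countWithTwoThreads` are assumptions made so that the example can be run repeatedly.

```java
// Hypothetical runnable variant of the lock-object snippet above.
final class BlockCounter implements Runnable {
    private final Object lockObject = new Object();
    private final int iterations;
    private int a = 0;                        // guarded by lockObject

    BlockCounter(int iterations) {
        this.iterations = iterations;
    }

    @Override
    public void run() {
        for (int i = 0; i < iterations; i++) {
            synchronized (lockObject) {       // only the increment is locked
                a++;
            }
        }
    }

    /** Runs the same counter on two threads and returns the final count. */
    static int countWithTwoThreads(int iterations) throws InterruptedException {
        BlockCounter counter = new BlockCounter(iterations);
        Thread thread1 = new Thread(counter);
        Thread thread2 = new Thread(counter);
        thread1.start();
        thread2.start();
        thread1.join();                       // join() makes the final value visible here
        thread2.join();
        return counter.a;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(countWithTwoThreads(100000)); // 200000
    }
}
```

Both threads share one `BlockCounter`, and therefore one `lockObject`, which is what makes the block exclusive.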

This is more efficient. Since an Object can be locked, any object can be locked. Let's work out how an object's lock state is recorded:

Objects live in the heap, so we can look at the memory layout of an object:

The lock state information of the object is recorded in the object header.

The object header records quite a lot. Besides the lock, it holds information used by the garbage collector, such as the object's generational age (young or old generation), as well as the hash code. Whether the lock-state record is 32 or 64 bits wide depends on the system:

In general, there is a lock flag here. A biased lock records the ID of the thread that holds it, and the epoch acts as a timestamp that determines whether the bias is still valid. If it has expired, the object goes back to the unlocked state; if not, other threads have to compete for the lock. When only two or three threads compete, the lock is a lightweight lock. If too many threads compete, it becomes a heavyweight lock.
