SafePoint in Java

What is a SafePoint

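A SafePoint is a position in Java code at which a thread may pause execution. At a SafePoint the JVM has runtime information that is not available at other positions: everything in the thread's context is described, including objects and internal pointers to objects or non-objects.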

What can the JVM do while it is at a SafePoint?

  1. Garbage collection pauses
  2. Code deoptimization
  3. Flushing code cache
  4. Class redefinition (e.g. hot swap or instrumentation)
  5. Biased lock revocation
  6. Various debug operations (e.g. deadlock check or stacktrace dump)

A comment in the JVM source code notes that when the system needs to reach a SafePoint, Java threads in different states are paused by different mechanisms:

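  1. Threads that are running Java code must block.
  2. Threads running native code that does not interact with the JVM can keep executing. (If such a thread needs to access Java objects through JNI, call a Java method, or return from native code to Java, it must wait until the SafePoint ends.)
  3. Threads that are already blocked must wait until the SafePoint ends before they can return.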
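
The three rules above, together with the "in VM or transitioning" case described in the HotSpot comment quoted later in this post, can be sketched as a small decision function. This is only an illustration: the ThreadState enum and both function names are hypothetical and much simpler than HotSpot's real thread states.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified thread states (not HotSpot's real JavaThreadState).
enum class ThreadState { RunningJava, RunningNative, Blocked, InVmOrTransition };

// Does the VM thread have to wait for a thread in this state to stop itself?
bool vm_thread_must_wait_for(ThreadState s) {
  switch (s) {
    case ThreadState::RunningJava:      return true;   // must reach a check and block
    case ThreadState::InVmOrTransition: return true;   // blocks itself at its next transition
    case ThreadState::RunningNative:    return false;  // may continue; checked on return to Java
    case ThreadState::Blocked:          return false;  // already stopped; held until the end
  }
  return true;  // be conservative for unknown values
}

// Number of threads the VM thread still has to wait for.
std::size_t count_threads_to_wait_for(const std::vector<ThreadState>& threads) {
  std::size_t n = 0;
  for (ThreadState s : threads) {
    if (vm_thread_must_wait_for(s)) ++n;
  }
  return n;
}
```

With one thread in each of the four states, only the thread running Java code and the transitioning thread are counted.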

How does a Java thread enter a SafePoint?

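  1. Each thread checks the state of the SafePoint data structure to decide whether it needs to pause itself and enter the safe state.
  2. For compiled code, the JIT inserts SafePoint checks at suitable positions in the code (usually at method returns and loop back-edges).
    • The JIT places a SafePoint before every method return and before the back-edge of every loop that is not a counted loop (that is, a potentially unbounded loop), so that a thread cannot run indefinitely without pausing when a GC needs to stop the world (STW).
  3. For interpreted code, the JVM keeps two bytecode dispatch tables and switches between them when SafePoint checks are needed.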
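
The idea of polling at loop back-edges can be illustrated outside the JVM. The following is a minimal sketch, assuming ordinary C++ threads stand in for Java threads and an atomic flag stands in for the SafePoint data structure. CooperativeSafepoint and workers_stay_paused are hypothetical names, and nothing here reflects how HotSpot actually implements its checks.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <thread>

// Hypothetical helper, not a HotSpot class.
class CooperativeSafepoint {
 public:
  explicit CooperativeSafepoint(std::size_t workers) : workers_(workers) {}

  // Worker side: called at every loop back-edge.
  void poll() {
    if (!requested_.load(std::memory_order_acquire)) return;  // fast path: one flag read
    std::unique_lock<std::mutex> lock(mu_);
    ++parked_;
    cv_.notify_all();  // tell the coordinator that this worker has stopped
    cv_.wait(lock, [this] { return !requested_.load(); });
    --parked_;
  }

  // Coordinator side: request a pause and wait until every worker has parked.
  void begin() {
    std::unique_lock<std::mutex> lock(mu_);
    requested_.store(true, std::memory_order_release);
    cv_.wait(lock, [this] { return parked_ == workers_; });
  }

  // Coordinator side: let the parked workers continue.
  void end() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      requested_.store(false, std::memory_order_release);
    }
    cv_.notify_all();
  }

 private:
  const std::size_t workers_;
  std::atomic<bool> requested_{false};
  std::size_t parked_ = 0;
  std::mutex mu_;
  std::condition_variable cv_;
};

// Returns true if no worker made progress while the pause was in effect.
bool workers_stay_paused() {
  CooperativeSafepoint safepoint(2);
  std::atomic<bool> stop{false};
  std::atomic<long> iterations{0};
  auto loop = [&] {
    while (!stop.load()) {
      ++iterations;      // the "useful work" of the loop body
      safepoint.poll();  // check at the back-edge
    }
  };
  std::thread a(loop), b(loop);
  safepoint.begin();  // returns once both workers are parked
  const long before = iterations.load();
  std::this_thread::sleep_for(std::chrono::milliseconds(20));
  const long after = iterations.load();
  stop.store(true);
  safepoint.end();
  a.join();
  b.join();
  return before == after;
}
```

If the loop body never called poll(), begin() would never return. That is the situation the JIT avoids by placing a check on the back-edge of loops that are not counted loops.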

A safepoint is usually in one of three states:

  1. _not_synchronized: the normal running state.
  2. _synchronizing: the VM Thread is waiting for all threads to reach the safepoint.
  3. _synchronized: all threads have reached the safepoint, and the VM Thread can start its work.
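
The transitions between these three states can be sketched as a small single-threaded state machine. The state names come from the list above. The SafepointSketch class, its counter, and the state_after_arrivals helper are assumptions made for illustration; the real SafepointSynchronize code is concurrent and considerably more involved.

```cpp
#include <cstddef>

// Hypothetical, single-threaded model of the three states.
class SafepointSketch {
 public:
  enum State { _not_synchronized, _synchronizing, _synchronized };

  // VM thread: start a safepoint and record how many threads still have to stop.
  void begin(std::size_t threads_to_wait_for) {
    waiting_ = threads_to_wait_for;
    state_ = (waiting_ == 0) ? _synchronized : _synchronizing;
  }

  // A Java thread reports that it has blocked itself.
  void thread_blocked() {
    if (state_ != _synchronizing) return;
    if (--waiting_ == 0) state_ = _synchronized;
  }

  // VM thread: the safepoint operation is done.
  void end() {
    waiting_ = 0;
    state_ = _not_synchronized;
  }

  // Same test as the do_call_back check quoted later in this post.
  bool do_call_back() const { return state_ != _not_synchronized; }

  State state() const { return state_; }

 private:
  State state_ = _not_synchronized;
  std::size_t waiting_ = 0;
};

// Helper: the state after `arrivals` of `threads` running threads have blocked.
SafepointSketch::State state_after_arrivals(std::size_t threads, std::size_t arrivals) {
  SafepointSketch sp;
  sp.begin(threads);
  for (std::size_t i = 0; i < arrivals; ++i) sp.thread_blocked();
  return sp.state();
}
```

For two running threads, the sketch stays in _synchronizing after the first arrival and only becomes _synchronized after the second.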

At run time, you can set the JVM flags -XX:+PrintSafepointStatistics -XX:PrintSafepointStatisticsCount=1 to print SafePoint statistics.

Source code analysis

When the JVM needs to pause threads for an STW operation, it calls the SafepointSynchronize::begin method, which is defined in safepoint.cpp.

void SafepointSynchronize::begin() {
  Thread* myThread = Thread::current();
  assert(myThread->is_VM_thread(), "Only VM thread may execute a safepoint");
  
  if (PrintSafepointStatistics || PrintSafepointStatisticsTimeout > 0) {  // record start times for the statistics output
    _safepoint_begin_time = os::javaTimeNanos();
    _ts_of_current_safepoint = tty->time_stamp().seconds();
  }
  // ... (rest of the method omitted)
}

safepoint.cpp contains a comment describing how threads in different states are brought to a safepoint:

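  1. For threads executing bytecode in the interpreter, the dispatch table is changed to force safepoint checks. The begin method does this by calling Interpreter::notice_safepoints().
  2. A thread in native code must check the safepoint state when it returns to Java code and block if necessary.
  3. A thread executing compiled code periodically reads the polling page; once the page is no longer readable, the thread blocks.
  4. A thread that is already blocked cannot return from its block condition until the safepoint operation has finished.
  5. A thread that is transitioning between states also checks the safepoint state and blocks itself.

The original comment reads:
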
  // Begin the process of bringing the system to a safepoint.
  // Java threads can be in several different states and are
  // stopped by different mechanisms:
  //
  //  1. Running interpreted
  //     The interpreter dispatch table is changed to force it to
  //     check for a safepoint condition between bytecodes.
  //  2. Running in native code
  //     When returning from the native code, a Java thread must check
  //     the safepoint _state to see if we must block.  If the
  //     VM thread sees a Java thread in native, it does
  //     not wait for this thread to block.  The order of the memory
  //     writes and reads of both the safepoint state and the Java
  //     threads state is critical.  In order to guarantee that the
  //     memory writes are serialized with respect to each other,
  //     the VM thread issues a memory barrier instruction
  //     (on MP systems).  In order to avoid the overhead of issuing
  //     a memory barrier for each Java thread making native calls, each Java
  //     thread performs a write to a single memory page after changing
  //     the thread state.  The VM thread performs a sequence of
  //     mprotect OS calls which forces all previous writes from all
  //     Java threads to be serialized.  This is done in the
  //     os::serialize_thread_states() call.  This has proven to be
  //     much more efficient than executing a membar instruction
  //     on every call to native code.
  //  3. Running compiled Code
  //     Compiled code reads a global (Safepoint Polling) page that
  //     is set to fault if we are trying to get to a safepoint.
  //  4. Blocked
  //     A thread which is blocked will not be allowed to return from the
  //     block condition until the safepoint operation is complete.
  //  5. In VM or Transitioning between states
  //     If a Java thread is currently running in the VM or transitioning
  //     between states, the safepointing code will wait for the thread to
  //     block itself when it attempts transitions to a new state.

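Threads executing bytecode are brought to a safepoint by replacing the dispatch table, which the interpreter uses to look up the address of the code to jump to for each bytecode. The template interpreter has three DispatchTable instances: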

  1. _active_table: the dispatch table currently in use.
  2. _normal_table: the dispatch table used during normal execution.
  3. _safept_table: the dispatch table used while a safepoint is in progress.

The code that modifies the dispatch table is shown below. On entering a safepoint, notice_safepoints is called and copies _safept_table into _active_table. Its counterpart, ignore_safepoints, is called on leaving the safepoint and copies _normal_table back into _active_table.

DispatchTable TemplateInterpreter::_active_table;
DispatchTable TemplateInterpreter::_normal_table;
DispatchTable TemplateInterpreter::_safept_table;

void TemplateInterpreter::notice_safepoints() {
  if (!_notice_safepoints) {
    // switch to safepoint dispatch table
    _notice_safepoints = true;
    copy_table((address*)&_safept_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
  }
}

void TemplateInterpreter::ignore_safepoints() {
  if (_notice_safepoints) {
    if (!JvmtiExport::should_post_single_step()) {
      // switch to normal dispatch table
      _notice_safepoints = false;
      copy_table((address*)&_normal_table, (address*)&_active_table, sizeof(_active_table) / sizeof(address));
    }
  }
}
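
To make the table switch concrete, here is a toy interpreter that dispatches through an active table and swaps in a second table whose handlers perform a safepoint check first. It is a sketch under simplifying assumptions: the two opcodes, MiniInterpreter, the handler functions, and count_safepoint_checks are all hypothetical, and the "check" merely increments a counter instead of blocking.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Hypothetical two-instruction bytecode set, for illustration only.
enum Opcode : std::size_t { OP_INC = 0, OP_DEC = 1, OP_COUNT = 2 };

struct MiniInterpreter;
using Handler = void (*)(MiniInterpreter&);
using DispatchTable = std::array<Handler, OP_COUNT>;

struct MiniInterpreter {
  int accumulator = 0;
  int safepoint_checks = 0;  // how many times a safepoint check ran
  bool notice = false;
  DispatchTable normal_table;
  DispatchTable safept_table;
  DispatchTable active_table;

  MiniInterpreter();

  // Switch to the table whose handlers check for a safepoint.
  void notice_safepoints() {
    if (!notice) { notice = true; active_table = safept_table; }
  }
  // Switch back to the plain handlers.
  void ignore_safepoints() {
    if (notice) { notice = false; active_table = normal_table; }
  }
  // Dispatch every instruction through whichever table is active.
  void run(const std::vector<Opcode>& code) {
    for (Opcode op : code) active_table[op](*this);
  }
};

static void do_inc(MiniInterpreter& vm) { ++vm.accumulator; }
static void do_dec(MiniInterpreter& vm) { --vm.accumulator; }
// Safepoint variants: check first, then do the normal work.
static void safept_inc(MiniInterpreter& vm) { ++vm.safepoint_checks; do_inc(vm); }
static void safept_dec(MiniInterpreter& vm) { ++vm.safepoint_checks; do_dec(vm); }

MiniInterpreter::MiniInterpreter()
    : normal_table{{do_inc, do_dec}},
      safept_table{{safept_inc, safept_dec}},
      active_table(normal_table) {}

// Runs `code` and reports how many safepoint checks were executed.
int count_safepoint_checks(const std::vector<Opcode>& code, bool safepoint_requested) {
  MiniInterpreter vm;
  if (safepoint_requested) vm.notice_safepoints();
  vm.run(code);
  return vm.safepoint_checks;
}
```

The point of the design is that the normal path pays nothing for safepoint support: the check only exists in the handlers of the second table, so no per-bytecode flag test is needed while no safepoint is pending.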

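For a thread that is executing native code, the VM Thread does not wait for it to finish; the thread checks the safepoint state when it returns to Java code. The check is in the check_safepoint_and_suspend_for_native_trans method in thread.cpp, which calls SafepointSynchronize::do_call_back and blocks if the current state is not _not_synchronized.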

// in thread.cpp
void JavaThread::check_safepoint_and_suspend_for_native_trans(JavaThread *thread) {
  JavaThread *curJT = JavaThread::current();
  // ... (other checks omitted)
  if (SafepointSynchronize::do_call_back()) {
    // If we are safepointing, then block the caller which may not be
    // the same as the target thread (see above).
    SafepointSynchronize::block(curJT);
  }
}

// in safepoint.hpp
  inline static bool do_call_back() {
    return (_state != _not_synchronized);
  }

Threads executing JIT-compiled code decide whether to enter a safepoint by checking whether the polling page is readable. SafepointSynchronize::begin calls make_polling_page_unreadable, which ultimately uses mprotect to change the access permissions of the polling page.

// in safepoint.cpp
void SafepointSynchronize::begin() {
  // ...
      if (UseCompilerSafepoints && int(iterations) == DeferPollingPageLoopCount) {
         guarantee (PageArmed == 0, "invariant") ;
         PageArmed = 1 ;
         os::make_polling_page_unreadable();
      }
  // ...
}

// in os_bsd.cpp
void os::make_polling_page_unreadable(void) {
  if( !guard_memory((char*)_polling_page, Bsd::page_size()) )
    fatal("Could not disable polling page");
}

bool os::guard_memory(char* addr, size_t size) {
  return bsd_mprotect(addr, size, PROT_NONE);
}

static bool bsd_mprotect(char* addr, size_t size, int prot) {
  char* bottom = (char*)align_size_down((intptr_t)addr, os::Bsd::page_size());
  assert(addr == bottom, "sanity check");

  size = align_size_up(pointer_delta(addr, bottom, 1) + size, os::Bsd::page_size());
  return ::mprotect(bottom, size, prot) == 0;
}

When compiled code reads the unreadable polling page, a SIGSEGV signal is raised. After the signal is handled, SafepointSynchronize::handle_polling_page_exception is eventually called, and it in turn calls SafepointSynchronize::block to block the thread.
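
The polling-page mechanism can be tried on a POSIX system with a few lines of code. The sketch below maps one page, makes it unreadable with mprotect, and turns the resulting fault into a "safepoint pending" result. All names are hypothetical. Recovering with siglongjmp is a shortcut chosen to keep the example short and is not how HotSpot resumes a trapped thread. SIGBUS is handled as well because some systems report this kind of fault that way.

```cpp
#include <cstddef>
#include <setjmp.h>
#include <signal.h>
#include <sys/mman.h>
#include <unistd.h>

static char* g_polling_page = nullptr;  // hypothetical stand-in for the polling page
static std::size_t g_page_size = 0;
static sigjmp_buf g_poll_env;

// Fault handler: if the fault hit the polling page, jump back into poll_traps().
static void on_fault(int sig, siginfo_t* info, void*) {
  char* addr = static_cast<char*>(info->si_addr);
  if (addr >= g_polling_page && addr < g_polling_page + g_page_size) {
    siglongjmp(g_poll_env, 1);
  }
  signal(sig, SIG_DFL);  // unrelated fault: let it crash normally on retry
}

bool init_polling_page() {
  g_page_size = static_cast<std::size_t>(sysconf(_SC_PAGESIZE));
  void* p = mmap(nullptr, g_page_size, PROT_READ, MAP_PRIVATE | MAP_ANON, -1, 0);
  if (p == MAP_FAILED) return false;
  g_polling_page = static_cast<char*>(p);
  struct sigaction sa = {};
  sa.sa_sigaction = on_fault;
  sa.sa_flags = SA_SIGINFO;
  sigemptyset(&sa.sa_mask);
  return sigaction(SIGSEGV, &sa, nullptr) == 0 && sigaction(SIGBUS, &sa, nullptr) == 0;
}

// Arm (unreadable) or disarm (readable) the page, as guard_memory does with PROT_NONE.
bool set_polling_page_readable(bool readable) {
  return mprotect(g_polling_page, g_page_size, readable ? PROT_READ : PROT_NONE) == 0;
}

// One poll: a plain read of the page. Returns true if the read trapped.
bool poll_traps() {
  if (sigsetjmp(g_poll_env, 1) != 0) return true;  // arrived here from on_fault
  volatile char probe = *static_cast<volatile char*>(g_polling_page);
  (void)probe;
  return false;
}

// Returns true if polls trap exactly while the page is armed.
bool polling_page_demo() {
  if (!init_polling_page()) return false;
  const bool quiet_before = !poll_traps();
  const bool armed = set_polling_page_readable(false);
  const bool trapped = poll_traps();
  const bool disarmed = set_polling_page_readable(true);
  const bool quiet_after = !poll_traps();
  return quiet_before && armed && trapped && disarmed && quiet_after;
}
```

The attraction of this scheme is that a poll in the common case is a single memory read with no branch; the cost of deciding to stop is paid only when the page has been armed.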


Reposted from blog.csdn.net/hustspy1990/article/details/92217432