muduo library study notes, essentials of C++ multithreaded system programming, 02: thread safety of the C/C++ system libraries

Original link: https://blog.csdn.net/qq_41453285/article/details/105047602

One, C/C++ thread library

  • The original C/C++ standards (C89/C99/C++03) do not address threads at all. The newer standards (C11 and C++11) specify the semantics of programs under multithreading, and C++11 also defines a thread library (std::thread)

Memory model

  • For the standards, the key point is not the definition of a thread library but the specification of a memory model
  • In particular, the memory model specifies when one thread's modification of a shared variable becomes visible to other threads. This is called memory ordering or memory visibility.
  • In theory, without a suitable memory model, writing a correct multithreaded program is a matter of luck; see Hans-J. Boehm's paper "Threads Cannot be Implemented as a Library": http://www.hpl.hp.com/techreports/2004/HPL-2004-209.pdf. However, I don't think we need to worry about the problems raised in that paper, because the lag of the standard has not held back practice. Nearly 20 years have passed since operating systems began to support multithreading, and people have written countless multithreaded programs that run in critical production environments. Even the Linux kernel itself is preemptible.
  • Therefore, the C/C++ compiler that ships with each operating system that supports multithreading can be considered good enough at supporting multithreading on that platform. Nowadays, faults in multithreaded programs can hardly ever be blamed on compiler bugs; after all, the POSIX threads standard was established in the mid-1990s. The positive significance of the new standards is that they make it safer to write cross-platform multithreaded programs
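As a minimal sketch of what "visibility" means under the C++11 memory model, the function below publishes an ordinary variable through an atomic flag. The function name publishAndRead and the value 42 are illustrative assumptions, not something taken from the original article.

```cpp
#include <atomic>
#include <thread>

// Publishes a value from one thread and reads it from another.
// The release store and acquire load on `ready` are what make the
// plain write to `payload` visible to the reader thread.
int publishAndRead() {
    int payload = 0;                    // ordinary, non-atomic data
    std::atomic<bool> ready(false);     // synchronization flag
    int seen = -1;

    std::thread reader([&] {
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();  // wait until the writer publishes
        }
        seen = payload;                 // ordered after the writer's store
    });

    payload = 42;                                   // 1. write the data
    ready.store(true, std::memory_order_release);   // 2. publish it

    reader.join();                      // join makes `seen` safe to read here
    return seen;
}
```

Without the release/acquire pair (for example, with a plain bool flag), the program would contain a data race and the standard would give it no defined meaning.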

Two, the impact of the emergence of threads on the standard library

  • The interface style of the Unix system libraries (libc and system calls) was established in the early 1970s, while the first Unix operating systems to support user-mode threads appeared in the early 1990s

  • The emergence of threads immediately had an impact on the system libraries, breaking programming conventions and assumptions that had held for 20 years.

    For example:

    • errno is no longer a global variable, because different threads may be executing different system library functions
    • Some "pure functions" are unaffected, such as memset/strcpy/snprintf
    • Some functions that affect global state or have side effects can be made thread-safe by locking, such as malloc/free, printf and fread/fseek
    • Some functions that return or use static storage cannot be made thread-safe, so alternative versions have to be provided, such as asctime_r/ctime_r/gmtime_r, strerror_r and strtok_r
    • The traditional fork() concurrency model no longer suits multithreaded programs (see the later "multithreading and fork()" article)
  • Today Linux glibc defines errno as a macro. Note that errno is an lvalue, so it cannot simply be defined as the return value of a function; it has to be defined as a dereference of the pointer that a function returns

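The idea of "a macro that dereferences a function's returned pointer" can be sketched as follows. The names myErrnoLocation and MY_ERRNO are hypothetical stand-ins chosen for this example; glibc's own header uses a macro of the same shape, along the lines of (*__errno_location ()).

```cpp
#include <thread>

// Hypothetical stand-in for the library function that returns the
// address of the calling thread's private error variable.
inline int* myErrnoLocation() {
    thread_local int value = 0;   // one copy per thread
    return &value;
}

// Dereferencing the returned pointer yields an lvalue, so both
// `MY_ERRNO = 5` and `&MY_ERRNO` are valid, just as with errno.
#define MY_ERRNO (*myErrnoLocation())

// Sets MY_ERRNO in the calling thread and in a second thread, then
// returns the calling thread's value to show it was not overwritten.
int errnoIsPerThread() {
    MY_ERRNO = 5;
    std::thread other([] { MY_ERRNO = 7; });
    other.join();
    return MY_ERRNO;
}
```

Had the macro expanded to a function returning int by value, the assignment MY_ERRNO = 5 would not compile, which is why the pointer indirection is needed.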

Three, most system calls are safe

  • It is worth mentioning that, nearly 20 years after operating systems began to support multithreading, the early performance shortcomings have largely been remedied. For example:

    • The earliest SGI STL implemented its own memory allocator, but the STL that ships with g++ now calls malloc directly to allocate memory, and std::allocator has become of little practical value (see the later article "C++ experience: don't overload the global ::operator new()")
    • Google's tcmalloc originally offered a large performance improvement over ptmalloc2 in glibc 2.3, but the latest ptmalloc3 in glibc has narrowed the gap considerably.
  • We need not worry about the thread safety of system calls, because a system call is atomic from the point of view of a user-mode program. Note, however, that the change a system call makes to kernel state may affect other threads; that topic is left to the later "multithreading and IO" article

Blacklist of non-thread-safe functions

  • Counterintuitively, the POSIX standard gives a blacklist of functions that need not be thread-safe, rather than a whitelist of thread-safe functions: "All functions defined by this volume of POSIX.1-2008 shall be thread-safe, except that the following functions need not be thread-safe"
  • On this blacklist, functions such as system and getenv/putenv/setenv are not safe
  • For the blacklist, see: http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html#tag_15_09
  • It can therefore be said that most glibc library functions today are thread-safe. The FILE* family of functions in particular is safe; glibc even provides non-thread-safe versions (fread_unlocked, fwrite_unlocked, etc., see man unlocked_stdio) to meet the performance needs of special cases

Four, the use of system calls is not thread-safe

Thread safety is not composable

  • Even if each individual function is thread-safe, combining two or more of them is not necessarily safe

Example

  • For example, fseek() and fread() are each thread-safe, but the two-step operation "seek, then read" on a file can be interrupted between the steps: another thread may change the file's current position in the meantime, so the program logic no longer executes correctly
  • In this case we can lock explicitly with the flockfile(FILE*) and funlockfile(FILE*) functions. Because the FILE* lock is reentrant, calling fread() while holding the lock does not deadlock
  • If the program uses the lseek and read system calls directly for random file access, the same "seek, then read" race exists, and there seems to be no efficient way to put a lock around system calls. The solution is to use the pread system call instead, which does not change the file's current position
  • This shows one of the difficulties of writing thread-safe programs: thread safety is not composable (just as C++ exception safety is not composable). A function foo() that calls two thread-safe functions is itself quite possibly not thread-safe. Even though most glibc library functions are thread-safe, we still cannot write code the way we would for a single-threaded program.
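The two remedies described above can be sketched as a pair of helpers. The names readAtLocked and readAt are assumptions made for this example; flockfile, funlockfile and pread are the POSIX calls named in the text.

```cpp
#include <cstdio>
#include <string>
#include <sys/types.h>
#include <unistd.h>

// "Seek, then read" on a FILE*, made atomic with respect to other
// threads using the same FILE* by holding its lock across both calls.
// The lock is recursive, so fseek() and fread() can still take it.
std::string readAtLocked(FILE* fp, long offset, size_t len) {
    std::string buf(len, '\0');
    size_t n = 0;
    flockfile(fp);
    if (fseek(fp, offset, SEEK_SET) == 0) {
        n = fread(&buf[0], 1, len, fp);
    }
    funlockfile(fp);
    buf.resize(n);
    return buf;
}

// Positional read on a file descriptor: pread() takes the offset as
// an argument and leaves the descriptor's current position untouched,
// so no lock is needed for the seek-and-read step.
std::string readAt(int fd, off_t offset, size_t len) {
    std::string buf(len, '\0');
    ssize_t n = pread(fd, &buf[0], len, offset);
    buf.resize(n > 0 ? static_cast<size_t>(n) : 0);
    return buf;
}
```

Note that the FILE* lock only helps if every thread that touches the stream goes through stdio; it does nothing for code that uses the underlying descriptor directly.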

Demo case

  • For example, in a single-threaded program, to switch the time zone temporarily we can use the tzset() function, which changes the program's global "current time zone"


  • But in a multithreaded program this is not thread-safe, even though tzset() itself is thread-safe: it changes global state (the current time zone), which may affect time conversions in other threads, or be affected by other threads doing the same thing
  • The solution is to use the muduo::TimeZone class: each immutable instance corresponds to one time zone, so a time conversion does not need to modify any global state

  • For authors of C/C++ libraries, designing a thread-safe interface has likewise become a major challenge, and there are few examples worth emulating. One basic idea is to make classes immutable wherever possible, so that users do not have to worry about thread safety
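The contrast between the two approaches can be sketched as below. hourViaGlobalTz shows the single-threaded pattern of switching TZ and calling tzset(); FixedOffsetZone is a hypothetical, much simplified immutable class written for this example (a fixed UTC offset, no daylight-saving rules). It is not the muduo::TimeZone interface, which should be looked up in muduo's own header. The sketch assumes a POSIX-style TZ string such as "CST-8", meaning 8 hours east of UTC.

```cpp
#include <cstdlib>
#include <ctime>
#include <string>

// Single-threaded approach: temporarily change the process-wide TZ.
// Not safe if any other thread converts times concurrently.
int hourViaGlobalTz(time_t t, const char* tz) {
    const char* old = getenv("TZ");
    bool hadTz = (old != NULL);
    std::string saved = hadTz ? old : "";  // copy before setenv changes it

    setenv("TZ", tz, 1);
    tzset();                               // global state changes here
    struct tm local;
    localtime_r(&t, &local);

    if (hadTz) {
        setenv("TZ", saved.c_str(), 1);
    } else {
        unsetenv("TZ");
    }
    tzset();                               // restore global state
    return local.tm_hour;
}

// Hypothetical immutable zone with a fixed UTC offset.
// All state is set in the constructor; conversions touch no globals.
class FixedOffsetZone {
public:
    explicit FixedOffsetZone(int eastOfUtcSeconds)
        : offset_(eastOfUtcSeconds) {}

    struct tm toLocalTime(time_t t) const {
        time_t shifted = t + offset_;
        struct tm result;
        gmtime_r(&shifted, &result);       // reentrant, no shared buffer
        return result;
    }

private:
    const int offset_;
};
```

Because a FixedOffsetZone never changes after construction, any number of threads can share one instance and call toLocalTime() without locking.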

Five, thread safety of the C++ standard library

  • Neither standard library containers nor strings are thread-safe
  • Most generic algorithms are thread-safe
  • iostream is not thread safe
  • Although the C++03 standard says nothing explicit about the thread safety of the standard library, we can rely on two conventions:
    • One basic principle: all non-shared objects are independent of one another. If an object is only ever used by one thread, it is safe
    • Another de facto standard: read-only operations on a shared object are safe, provided there are no concurrent writes. (This implies that standard library containers cannot use self-adjusting data structures such as splay trees, which modify their state even on reads; see http://www.cs.au.dk/~gerth/aa11/slides/selfadjusting.pdf.) For example, it is safe for two threads to each access their own local vector objects; it is also safe for them to access a shared const vector at the same time, as long as no third thread modifies that vector. Once there is a writer, even read-only operations such as vector::size() must be protected by a lock
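A minimal sketch of the last point, that readers must lock too once a writer exists. The class SharedInts and the function writeWhileReading are hypothetical names invented for this example.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical wrapper: once any thread may write, every access,
// including a "read-only" size(), takes the same mutex.
class SharedInts {
public:
    void append(int x) {
        std::lock_guard<std::mutex> lock(mutex_);
        data_.push_back(x);
    }
    std::size_t size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<int> data_;
};

// One writer appends `count` items while one reader polls size();
// returns the final size. `count` is assumed to be non-negative.
std::size_t writeWhileReading(int count) {
    SharedInts shared;
    const std::size_t target = static_cast<std::size_t>(count);
    std::thread writer([&] {
        for (int i = 0; i < count; ++i) shared.append(i);
    });
    std::thread reader([&] {
        while (shared.size() < target) std::this_thread::yield();
    });
    writer.join();
    reader.join();
    return shared.size();
}
```

If size() read data_ without the lock, it could run while push_back() is reallocating the vector, which is a data race even though size() itself modifies nothing.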

Neither standard library containers nor strings are thread-safe

  • By the definition of thread safety given in the first article, neither the C++ standard library containers nor std::string are thread-safe; only std::allocator is guaranteed to be thread-safe.
  • There are two reasons:
    • One is to avoid unnecessary performance overhead
    • The other is that the thread safety of individual member functions is not composable
  • Suppose there were a safe_vector class whose interface is the same as std::vector's but whose every member function is thread-safe (like a Java synchronized method). Using safe_vector would still not guarantee thread-safe code. For example, after an if statement has determined that vec is not empty, another thread may clear its elements before the next statement runs, so that a subsequent vec[0] fails
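The check-then-act problem can be sketched with a hypothetical SafeVector class, written for this example in the spirit of the safe_vector described above. The member tryFront is an assumed addition that shows one way out: perform the check and the access under a single lock.

```cpp
#include <mutex>
#include <vector>

// Hypothetical safe_vector-style class: every member function locks.
template <typename T>
class SafeVector {
public:
    bool empty() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_.empty();
    }
    void push_back(const T& x) {
        std::lock_guard<std::mutex> lock(mutex_);
        data_.push_back(x);
    }
    void clear() {
        std::lock_guard<std::mutex> lock(mutex_);
        data_.clear();
    }
    // Compound operation: the emptiness check and the element access
    // happen under one lock, so no clear() can slip in between.
    bool tryFront(T* out) const {
        std::lock_guard<std::mutex> lock(mutex_);
        if (data_.empty()) return false;
        *out = data_.front();
        return true;
    }

private:
    mutable std::mutex mutex_;
    std::vector<T> data_;
};

// The racy pattern, shown as a comment only:
//   if (!vec.empty()) {   // lock is released when empty() returns
//       use(vec[0]);      // another thread may have called clear()
//   }
```

The cost is that the interface no longer matches std::vector, which illustrates why per-method locking alone does not give callers composable thread safety.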

Most generic algorithms are thread-safe

  • Most of the generic algorithms in the C++ standard library are thread-safe, because they are stateless pure functions (std::random_shuffle() may be an exception, since it uses a random number generator)
  • As long as the input range is thread-safe, the generic function is thread-safe
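As a small illustration, the sketch below runs std::accumulate concurrently on two disjoint halves of a vector that nobody modifies during the call. The function name parallelSum is an assumption made for this example.

```cpp
#include <numeric>
#include <thread>
#include <vector>

// Sums a vector by running std::accumulate on two disjoint halves in
// two threads. The algorithm keeps no hidden state, and neither range
// is written while it is being read.
long parallelSum(const std::vector<int>& v) {
    std::vector<int>::const_iterator mid = v.begin() + v.size() / 2;
    long left = 0;
    long right = 0;
    std::thread t1([&] { left = std::accumulate(v.begin(), mid, 0L); });
    std::thread t2([&] { right = std::accumulate(mid, v.end(), 0L); });
    t1.join();
    t2.join();
    return left + right;
}
```

The safety here comes entirely from the input: if another thread were appending to v at the same time, the same call would be a data race.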

iostream is not thread safe

  • C++ iostream is not thread-safe, because a streaming output expression that chains two << operators (for instance std::cout << a << b) is equivalent to two function calls, one operator<<() call per operand
  • Even if ostream::operator<<() were thread-safe, there is no guarantee that another thread will not write other characters to stdout between the two calls
  • To get thread-safe stdout output we can use printf instead, which makes each output call safe and atomic. But this amounts to using a global lock: only one thread can call printf at any moment, which may not be efficient. Efficient logging in a multithreaded program needs a dedicated design; see the later "Efficient Multithreaded Logging" column
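If iostream has to be used, one common workaround (not something the original article prescribes) is to format each line into a thread-private buffer and emit it in a single guarded step. The names writeLine and twoWriters, and the use of an explicit std::mutex, are assumptions of this sketch.

```cpp
#include <mutex>
#include <ostream>
#include <sstream>
#include <string>
#include <thread>

// Formats "name=value" into a private buffer first, then writes the
// finished line while holding a mutex, so pieces from different
// threads cannot interleave. The mutex only protects writers that
// go through this function.
void writeLine(std::ostream& out, std::mutex& outMutex,
               const std::string& name, int value) {
    std::ostringstream line;            // thread-private, not shared
    line << name << '=' << value << '\n';
    std::lock_guard<std::mutex> lock(outMutex);
    out << line.str();                  // single operator<< call
}

// Two threads write 100 lines each into one shared stream and the
// combined text is returned.
std::string twoWriters() {
    std::ostringstream sink;
    std::mutex sinkMutex;
    auto job = [&](const std::string& name, int value) {
        for (int i = 0; i < 100; ++i) {
            writeLine(sink, sinkMutex, name, value);
        }
    };
    std::thread a(job, "a", 1);
    std::thread b(job, "b", 2);
    a.join();
    b.join();
    return sink.str();
}
```

Like printf, this serializes all writers on one lock, so it addresses correctness of the output rather than logging throughput.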


Origin blog.csdn.net/qq_22473333/article/details/113521786