thread-local static variable

Why have TLS? Global variables in a process and static variables defined in functions are shared: every thread can access them, and a change made by one thread is visible to all the others. This is both an advantage and a disadvantage. The advantage is that exchanging data between threads is very fast. The disadvantage is that a fault in one thread can bring down every other thread in the process, and that access to shared data from multiple threads requires costly synchronization and is prone to synchronization bugs.

  If you need variables that every function called within a thread can access but that other threads cannot (thread-local static variables, that is, static memory local to a thread), a new mechanism is needed. That mechanism is TLS.

  Thread local storage has different implementations on different platforms, and the portability is not very good. Fortunately, it is not difficult to implement thread local storage. The easiest way is to create a global table and query the corresponding data through the current thread ID. Because the IDs of each thread are different, the data found will naturally be different.
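The "global table indexed by thread ID" idea can be sketched in a few lines. This is only a minimal illustration using POSIX threads, not how any real platform implements TLS; the names `table_set`, `table_get`, `table_demo` and the capacity `MAX_THREADS` are assumptions made for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16              /* assumed table capacity */

struct tls_entry {
    int       used;                 /* 1 if this entry belongs to a thread */
    pthread_t owner;                /* thread that owns the entry */
    void     *value;                /* that thread's private value */
};

static struct tls_entry g_table[MAX_THREADS];
static pthread_mutex_t  g_table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Find the calling thread's entry; if it has none and want_free is set,
   return the first free entry instead. Returns -1 when nothing fits.
   The caller must hold g_table_lock. */
static int find_slot(pthread_t self, int want_free)
{
    int free_slot = -1;
    for (int i = 0; i < MAX_THREADS; i++) {
        if (g_table[i].used && pthread_equal(g_table[i].owner, self))
            return i;
        if (!g_table[i].used && free_slot < 0)
            free_slot = i;
    }
    return want_free ? free_slot : -1;
}

/* Store a value for the calling thread; 0 on success, -1 if the table is full. */
int table_set(void *value)
{
    pthread_mutex_lock(&g_table_lock);
    int slot = find_slot(pthread_self(), 1);
    if (slot >= 0) {
        g_table[slot].used  = 1;
        g_table[slot].owner = pthread_self();
        g_table[slot].value = value;
    }
    pthread_mutex_unlock(&g_table_lock);
    return slot >= 0 ? 0 : -1;
}

/* Fetch the calling thread's value, or NULL if it never stored one. */
void *table_get(void)
{
    pthread_mutex_lock(&g_table_lock);
    int slot = find_slot(pthread_self(), 0);
    void *value = slot >= 0 ? g_table[slot].value : NULL;
    pthread_mutex_unlock(&g_table_lock);
    return value;
}

/* Worker: store the argument, then read it back through the table. */
static void *worker(void *arg)
{
    table_set(arg);
    return table_get();
}

/* Run two workers and check that each read back only its own value. */
void table_demo(void)
{
    int a = 1, b = 2;
    pthread_t t1, t2;
    void *r1 = NULL, *r2 = NULL;

    if (pthread_create(&t1, NULL, worker, &a) != 0)
        return;
    if (pthread_create(&t2, NULL, worker, &b) != 0) {
        pthread_join(t1, NULL);
        return;
    }
    pthread_join(t1, &r1);
    pthread_join(t2, &r2);
    assert(r1 == &a && r2 == &b);
}
```

Calling `table_demo()` from `main` exercises the table. Note the costs of this naive design: every access takes a lock and scans the table, and entries are never removed when a thread exits, which is part of why platforms provide TLS natively.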

  Most platforms provide thread-local storage methods without requiring us to implement them ourselves:

  Linux (POSIX threads):

  int pthread_key_create(pthread_key_t *key, void (*destructor)(void*));

  int pthread_key_delete(pthread_key_t key);

  void *pthread_getspecific(pthread_key_t key);

  int pthread_setspecific(pthread_key_t key, const void *value);
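As a hedged illustration of how these four calls fit together, the sketch below gives each thread a private heap-allocated counter. The helper names (`my_counter`, `count_in_new_thread`) are invented for this example; `pthread_once` is used so the key is created exactly once, and the destructor passed to `pthread_key_create` frees each thread's counter when that thread exits.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <stdlib.h>

static pthread_key_t  g_key;
static pthread_once_t g_key_once = PTHREAD_ONCE_INIT;

/* Destructor: runs at thread exit for every thread whose value is non-NULL. */
static void free_counter(void *p) { free(p); }

/* Create the key; pthread_once guarantees this runs a single time. */
static void make_key(void)
{
    int rc = pthread_key_create(&g_key, free_counter);
    assert(rc == 0);
    (void)rc;
}

/* Return the calling thread's private counter, allocating it on first use. */
int *my_counter(void)
{
    pthread_once(&g_key_once, make_key);
    int *p = pthread_getspecific(g_key);
    if (p == NULL) {
        p = calloc(1, sizeof *p);
        if (p != NULL && pthread_setspecific(g_key, p) != 0) {
            free(p);
            p = NULL;
        }
    }
    return p;
}

/* Bump this thread's counter n times and return the final count (-1 on error). */
static void *worker(void *arg)
{
    int n = (int)(intptr_t)arg;
    for (int i = 0; i < n; i++) {
        int *c = my_counter();
        if (c == NULL)
            return (void *)(intptr_t)-1;
        ++*c;
    }
    int *c = my_counter();
    return (void *)(intptr_t)(c ? *c : -1);
}

/* Run one worker thread and report its final count. */
int count_in_new_thread(int n)
{
    pthread_t t;
    void *result = NULL;
    if (pthread_create(&t, NULL, worker, (void *)(intptr_t)n) != 0)
        return -1;
    pthread_join(t, &result);
    return (int)(intptr_t)result;
}
```

Each call to `count_in_new_thread(n)` starts counting from zero, because the new thread sees no value under the key until it stores one itself.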

  Win32:

  Method 1: When each thread is created, the system allocates an array of LPVOID pointers for it (called the TLS array). This array is hidden from C code and cannot be accessed directly; it is reached through C API function calls. First define a DWORD global variable or function-static variable that will hold the index each thread uses to access its own TLS array. The first step in using TLS is to call the TlsAlloc() function, which reserves a slot in the TLS array and returns its index, for example:

  global_dwTLSindex = TlsAlloc();

  Note that the index itself is not thread-specific: TlsAlloc reserves the same slot in every thread's TLS array, so all threads use the same DWORD index value. What differs from thread to thread is the content of the slot, because each thread reads and writes only the slot in its own TLS array. In effect, the index gives every thread a pointer-sized thread-local static variable. C/C++ originally had no mechanism for defining thread-local static variables directly, which is why this roundabout approach was needed.

  The second step is to dynamically allocate a memory area for the current thread (using the LocalAlloc() function call), and then put the pointer to this memory area into the corresponding slot of the TLS array (using the TlsSetValue() function call).

  In the third step, any function running in the current thread can call TlsGetValue() with the TLS index to retrieve the pointer stored in the previous step, and then read and write that memory area. The result is a variable that is accessible everywhere within the thread but invisible to other threads.

  Finally, if the above thread-local static variables are no longer needed, dynamically free the memory area (using the LocalFree() function), and then discard the corresponding slot from the TLS array (using the TlsFree() function).
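The remark that C/C++ originally lacked thread-local static variables no longer holds for current compilers: C11 added the `_Thread_local` storage-class specifier (C++11 has `thread_local`). The sketch below is not the Win32 API described above but shows the same effect with far less ceremony; it assumes a C11 compiler with POSIX threads, and the function names are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* One independent copy per thread, zero-initialised when the thread starts. */
static _Thread_local int t_calls;

/* Any function running in a thread sees that thread's own copy. */
static void note_call(void) { t_calls++; }

static void *worker(void *arg)
{
    int n = (int)(intptr_t)arg;
    for (int i = 0; i < n; i++)
        note_call();
    return (void *)(intptr_t)t_calls;
}

/* Count n calls in a fresh thread; returns that thread's total, or -1. */
int calls_in_new_thread(int n)
{
    assert(n >= 0);
    pthread_t t;
    void *result = NULL;
    if (pthread_create(&t, NULL, worker, (void *)(intptr_t)n) != 0)
        return -1;
    pthread_join(t, &result);
    return (int)(intptr_t)result;
}

/* Record one call in the calling thread and return its own total. */
int calls_in_this_thread(void)
{
    note_call();
    return t_calls;
}
```

However many calls other threads have recorded, `calls_in_this_thread()` counts only the calls made by the thread that invokes it.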

 

 

TLS is a nice Win32 feature that makes multithreaded programming easier. TLS is a mechanism through which programs can have global variables, but in a "different per thread" state. In other words, all threads in a process can have global variables, but these variables are actually specific to a thread. For example, you might have a multithreaded program, where each thread writes to a different file (and therefore uses a different file handle). In this case, it would be very convenient to store the file handle used by each thread in TLS. When a thread needs to know the handle used, it can be obtained from TLS. The point is: the piece of code that the thread uses to get the file handle is the same in all cases, but the file handle fetched from TLS is different. Pretty neat, isn't it? The convenience of having global variables, but belonging to each thread.  
 

  While TLS is convenient, it is not without limitations. In Windows NT and Windows 95, each thread has 64 DWORD slots available. This means that a process can have at most 64 DWORDs whose meaning differs from thread to thread. Although a TLS slot can hold a single value such as a file handle, it is more commonly used to hold a pointer to thread-private data. Multithreaded programs often need to keep a whole set of data for each thread. Many programmers wrap these variables in a C struct and store a pointer to the struct in TLS. When a new thread is born, the program allocates memory for the structure and stores the pointer in the TLS slot reserved for it. When the thread ends, the program releases all the allocated blocks.

  Since each thread has 64 slots for its own data, where does this space come from? As seen in the discussion of threads, the thread database (TDB) structure contains 64 DWORDs for TLS use. When you set or retrieve data with the TLS functions, you are actually working with those 64 DWORDs. So the "global variables with different meanings for each thread" are stored in each thread's own TDB.
 

    Next you may ask: how do I access these 64 DWORDs, and how do I know which are occupied and which are free? The system provides TLS in order to implement "global variables with different meanings for each thread". Because the effect of a global variable is wanted, every thread must be able to use the variable. Occupancy therefore does not need to be tracked separately for each thread: once one of the 64 slots is occupied, it is occupied in every thread. KERNEL32 accordingly uses two DWORDs (64 bits in total) to record which slots are free and which are in use. The two DWORDs can be thought of as a 64-bit array in which a set bit means that the corresponding TLS slot is in use. This 64-bit array is stored in the process database (the two DWORDs were listed in the PDB structure in the section on processes).
 

The following four functions operate on TLS:  

(1) TlsAlloc

As said above, KERNEL32 uses two DWORDs (64 bits in total) to record which slots are free and which are in use. When you need a TLS slot, call this function: it finds a free slot, sets the corresponding bit to 1, and returns the slot's index.

(2) TlsSetValue

TlsSetValue puts data into a previously allocated TLS slot. Its two parameters are the TLS slot index and the data to write. TlsSetValue stores the data at the appropriate position in the array of 64 DWORDs located in the current thread database.

(3) TlsGetValue

This function is almost a mirror image of TlsSetValue; the main difference is that it fetches data instead of storing it. Like TlsSetValue, it first checks whether the TLS index is valid. If so, TlsGetValue uses the index to find the corresponding item in the array of 64 DWORDs (located in the thread database) and returns its contents.

(4) TlsFree

This function undoes all the work of TlsAlloc and TlsSetValue. TlsFree first checks whether the index you pass was actually allocated. If so, it clears the corresponding bit in the 64-bit TLS slot array. Then, so that stale contents cannot be used, TlsFree walks every thread in the process and writes 0 into the slot that was just freed. If the TLS index is later allocated again, every thread using it is therefore guaranteed to read 0 until it calls TlsSetValue again.
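The bookkeeping described for these four functions can be modelled in portable C. The following is a toy simulation, not KERNEL32's actual code: the `sim_*` names are hypothetical, threads are represented by a plain integer ID, and only the bitmap and slot-array behaviour described above is reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SLOT_COUNT         64
#define SIM_THREADS        4            /* hypothetical fixed set of simulated threads */
#define SIM_OUT_OF_INDEXES 0xFFFFFFFFu

static uint32_t g_in_use[2];                       /* process-wide: one bit per slot */
static void    *g_slots[SIM_THREADS][SLOT_COUNT];  /* one slot array per thread */

/* Nonzero if the index is in range and its bit is set. */
static int slot_in_use(uint32_t index)
{
    return index < SLOT_COUNT &&
           ((g_in_use[index / 32] >> (index % 32)) & 1u);
}

/* Like TlsAlloc: claim the lowest free bit and return its index. */
uint32_t sim_alloc(void)
{
    for (uint32_t i = 0; i < SLOT_COUNT; i++) {
        if (!slot_in_use(i)) {
            g_in_use[i / 32] |= 1u << (i % 32);
            return i;
        }
    }
    return SIM_OUT_OF_INDEXES;
}

/* Like TlsSetValue: write into the given thread's own array. */
int sim_set(int thread, uint32_t index, void *value)
{
    assert(thread >= 0 && thread < SIM_THREADS);
    if (!slot_in_use(index))
        return 0;
    g_slots[thread][index] = value;
    return 1;
}

/* Like TlsGetValue: read from the given thread's own array. */
void *sim_get(int thread, uint32_t index)
{
    assert(thread >= 0 && thread < SIM_THREADS);
    return slot_in_use(index) ? g_slots[thread][index] : NULL;
}

/* Like TlsFree: clear the bit, then zero that slot in every thread. */
int sim_free(uint32_t index)
{
    if (!slot_in_use(index))
        return 0;
    g_in_use[index / 32] &= ~(1u << (index % 32));
    for (int t = 0; t < SIM_THREADS; t++)
        g_slots[t][index] = NULL;
    return 1;
}
```

Because `sim_free` zeroes the slot in every simulated thread, a later `sim_alloc` that hands out the same index starts every thread from NULL, which mirrors the guarantee described for TlsFree.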

 

 

A mutex is a very versatile kernel object that guarantees mutually exclusive access to a shared resource by multiple threads. As with a critical section, only the thread that owns the mutex object has the right to access the resource. Because there is only one mutex object, the shared resource can never be accessed by more than one thread at a time. The thread that currently holds the resource should release the mutex object when it has finished its work, so that other threads can acquire it and access the resource. Unlike other kernel objects, mutex objects receive special treatment in the operating system, which even lets them perform some unconventional operations that other kernel objects cannot. For ease of understanding, refer to the working model of the mutex kernel object in Figure 3.8:

 

Figure 3.8 Protection of shared resources using mutex kernel objects

In figure (a), the arrows are threads that want to access the resource (the rectangle). Only the second thread owns the mutex (the black dot) and may enter the shared resource; the other threads are kept out, as shown in figure (b). When that thread has finished with the shared resource and is about to leave, it releases the mutex object it owns, as shown in figure (c), and any other thread trying to access the resource then has a chance to acquire the mutex.

The functions that may be used to maintain thread synchronization with mutex kernel objects mainly include CreateMutex, OpenMutex, ReleaseMutex, WaitForSingleObject and WaitForMultipleObjects. Before using a mutex object, first create or open a mutex object through CreateMutex or OpenMutex. The prototype of the CreateMutex function is as follows:

HANDLE CreateMutex(
 LPSECURITY_ATTRIBUTES lpMutexAttributes, // security attribute pointer
 BOOL bInitialOwner, // initial owner
 LPCTSTR lpName // mutex object name
);

The parameter bInitialOwner is mainly used to control the initial state of the mutex object. It is generally set to FALSE to indicate that the mutex object is not occupied by any thread when it is created. If the object name is specified when the mutex object is created, the handle of the mutex object can be obtained elsewhere in the process or through the OpenMutex function in other processes. The OpenMutex function prototype is:

HANDLE OpenMutex(
 DWORD dwDesiredAccess, // access flag
 BOOL bInheritHandle, // inheritance flag
 LPCTSTR lpName // mutex object name
);

When the thread that currently has access to the resource no longer needs to access the resource and wants to leave, it must release the mutex object it owns through the ReleaseMutex function. Its function prototype is:

BOOL ReleaseMutex(HANDLE hMutex);

Its only parameter, hMutex, is the handle of the mutex object to release. The wait functions WaitForSingleObject and WaitForMultipleObjects work with mutex objects much as they do with other kernel objects: the call returns when the mutex kernel object is signaled. One special case needs pointing out. If the thread that owned the mutex terminated without releasing it, the wait function does not return the usual WAIT_OBJECT_0 (for WaitForSingleObject) or a value between WAIT_OBJECT_0 and WAIT_OBJECT_0+nCount-1 (for WaitForMultipleObjects). Instead it returns WAIT_ABANDONED (for WaitForSingleObject) or a value between WAIT_ABANDONED_0 and WAIT_ABANDONED_0+nCount-1 (for WaitForMultipleObjects), indicating that the waiting thread now owns a mutex that was abandoned by a terminated thread, so the protected resource may be in an inconsistent state. Mutexes also differ from other kernel objects in how waiting interacts with ownership. Normally, a thread that waits on a nonsignaled object is suspended and is not schedulable until the object is signaled. A thread that already owns a mutex, however, can wait on it again without blocking: the system increments the mutex's recursion count, and the thread must call ReleaseMutex once for each successful wait. This is one of the unconventional operations that mutex objects support.
  In practice, mutex objects are mostly used to protect memory blocks that are accessed by multiple threads, ensuring that any thread working on the block has reliable exclusive access to it. The sample code below uses the mutex kernel object hMutex to give threads exclusive access to the shared memory block g_cArray[]:

// mutex object; it must be created before the threads start,
// for example: hMutex = CreateMutex(NULL, FALSE, NULL);
HANDLE hMutex = NULL;
char g_cArray[10];

UINT ThreadProc1(LPVOID pParam)
{
 // wait for mutex notification
 WaitForSingleObject(hMutex, INFINITE);
 // write to the shared resource
 for (int i = 0; i < 10; i++)
 {
  g_cArray[i] = 'a';
  Sleep(1);
 }
 // release the mutex object
 ReleaseMutex(hMutex);
 return 0;
}

UINT ThreadProc2(LPVOID pParam)
{
 // wait for mutex notification
 WaitForSingleObject(hMutex, INFINITE);
 // write to the shared resource
 for (int i = 0; i < 10; i++)
 {
  g_cArray[10 - i - 1] = 'b';
  Sleep(1);
 }
 // release the mutex object
 ReleaseMutex(hMutex);
 return 0;
}
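For readers without a Windows toolchain, the same pattern can be tried with POSIX threads. This is a portable analog, not the Win32 mutex kernel object: `pthread_mutex_t` stands in for hMutex and `sched_yield()` stands in for Sleep(1), while the function names are chosen for this sketch only.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

#define ARRAY_LEN 10

static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static char g_array[ARRAY_LEN];

/* Fill the array front to back with 'a' while holding the lock. */
static void *writer_a(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&g_lock);
    for (int i = 0; i < ARRAY_LEN; i++) {
        g_array[i] = 'a';
        sched_yield();              /* invite the other thread to run */
    }
    pthread_mutex_unlock(&g_lock);
    return NULL;
}

/* Fill the array back to front with 'b' while holding the lock. */
static void *writer_b(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&g_lock);
    for (int i = 0; i < ARRAY_LEN; i++) {
        g_array[ARRAY_LEN - i - 1] = 'b';
        sched_yield();
    }
    pthread_mutex_unlock(&g_lock);
    return NULL;
}

/* Run both writers; returns 1 if the array is uniform, 0 if mixed, -1 on error. */
int run_writers(void)
{
    pthread_t t1, t2;
    if (pthread_create(&t1, NULL, writer_a, NULL) != 0)
        return -1;
    if (pthread_create(&t2, NULL, writer_b, NULL) != 0) {
        pthread_join(t1, NULL);
        return -1;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);

    /* Both writers touched every element, so only 'a' or 'b' can remain. */
    assert(g_array[0] == 'a' || g_array[0] == 'b');
    for (int i = 1; i < ARRAY_LEN; i++)
        if (g_array[i] != g_array[0])
            return 0;
    return 1;
}
```

With the lock in place, whichever writer runs second overwrites the whole array, so the result is ten identical characters. Removing the lock and unlock calls makes a mixed array possible, which is the failure the mutex prevents.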

Threads make a program more flexible, but that flexibility also introduces uncertainty, especially when several threads access the same shared variable. Code without thread synchronization may look logically correct, yet to make the program run correctly and reliably, synchronization must be applied wherever it is needed.

3.2.6 Thread Local Storage

Thread-local storage (TLS) is a convenient mechanism for storing thread-local data. With TLS, a piece of data can be associated with every thread in the process, and each thread accesses the data associated with itself through a global index allocated by TLS. In this way, each thread can have its own thread-local static data.

The data structure used to manage TLS is very simple. Windows maintains just one bit array for each process in the system, and allocates an array of the same length for each thread in the process, as shown in Figure 3.9.

 

Figure 3.9 Data structures used internally by the TLS mechanism

Each process running in the system has a bit array as shown in Figure 3.9. Each member of the bit array is a flag whose value is FREE or INUSE, indicating whether the array index corresponding to the flag is in use. Windows guarantees that at least TLS_MINIMUM_AVAILABLE (defined in the WinNT.h file) flags are available.

The typical steps for using TLS dynamically are as follows.

(1) The main thread calls the TlsAlloc function to allocate an index for thread local storage. The function prototype is:

DWORD TlsAlloc(void); // returns a TLS index

As mentioned above, the system maintains a bit array of length TLS_MINIMUM_AVAILABLE for each process, and the return value of TlsAlloc is an index into that array. The only purpose of the bit array is to remember which indexes are in use. Initially every member is FREE, meaning unused. When TlsAlloc is called, the system checks the members of the array one by one until it finds one whose value is FREE, changes that member from FREE to INUSE, and returns its index. If no FREE member can be found, TlsAlloc returns TLS_OUT_OF_INDEXES (defined as -1 in the WinBase.h file), which indicates failure.

For example, when calling TlsAlloc for the first time, the system finds that the value of the first member in the bit array is FREE, it changes the value of this member to INUSE, and then returns 0.

When a thread is created, Windows allocates an array of length TLS_MINIMUM_AVAILABLE for the thread in the process address space and initializes its members to 0. Internally, the system associates this array with the thread and guarantees that its data can only be accessed from within that thread. As shown in Figure 3.9, each thread has its own array, and the array members can store any data.

(2) Each thread calls TlsSetValue and TlsGetValue to set or read the value in the thread array. The function prototypes are:

BOOL TlsSetValue(
 DWORD dwTlsIndex, // TLS index
 LPVOID lpTlsValue // value to set
);

LPVOID TlsGetValue(DWORD dwTlsIndex); // TLS index

The TlsSetValue function stores the value given by lpTlsValue in the member of the calling thread's array whose index is dwTlsIndex. The value is thereby associated with the thread that called TlsSetValue. The function returns TRUE on success.

With TlsSetValue, a thread can change only members of its own thread array; there is no way to set a TLS value for another thread. So far, the only way shown to pass data from one thread to another is through the thread function's parameter when the thread is created.

The role of the TlsGetValue function is to obtain the value of the member whose index is dwTlsIndex in the thread array.

TlsSetValue and TlsGetValue set and get the value of a specific member of the thread array, and the index they use is the value returned by TlsAlloc. This shows the relationship between the single bit array of the process and the per-thread arrays. For example, if TlsAlloc returns 3, then every thread in the process, whether already running or created later, uses index 3 to access the corresponding member of its own thread array.

(3) The main thread calls TlsFree to release the local storage index. The only parameter to the function is the index returned by TlsAlloc.

With TLS, a piece of data can be associated with a specific thread. The following example associates each thread's start time with the thread, so that the thread's running time can be obtained when it terminates. The complete code for tracking thread running time is as follows:

#include <stdio.h> // under the 03UseTLS project
#include <windows.h>
#include <process.h>

// Use TLS to track the running time of the thread
DWORD g_tlsUsedTime;
void InitStartTime();
DWORD GetUsedTime();

UINT __stdcall ThreadFunc(LPVOID)
{
    int i;

    // Initialize start time
    InitStartTime();

    // simulate long work
    i = 10000*10000;
    while(i--){}

    // print out the running time of this thread (DWORD is unsigned long)
    printf(" This thread is coming to end. Thread ID: %-5lu, Used Time: %lu \n",
        ::GetCurrentThreadId(), GetUsedTime());
    return 0;
}

int main(int argc, char* argv[])
{
    UINT uId;
    int i;
    HANDLE h[10];

    // Initialize the thread runtime recording system by requesting an index in the process bit array
    g_tlsUsedTime = ::TlsAlloc();
    if(g_tlsUsedTime == TLS_OUT_OF_INDEXES)
        return 1;

    // Let ten threads run at the same time and wait for their respective output results
    for(i=0; i<10; i++)
    {
        h[i] = (HANDLE)::_beginthreadex(NULL, 0, ThreadFunc, NULL, 0, &uId);
    }
    for(i=0; i<10; i++)
    {
        ::WaitForSingleObject(h[i], INFINITE);
        ::CloseHandle(h[i]);
    }

    // By releasing the thread local storage index, the resources occupied by the time recording system are released
    ::TlsFree(g_tlsUsedTime);
    return 0;
}

// Initialize the start time of the thread
void InitStartTime()
{
    // Get the current time and store it in this thread's TLS slot
    // (cast through DWORD_PTR so the conversion is also clean in 64-bit builds)
    DWORD dwStart = ::GetTickCount();
    ::TlsSetValue(g_tlsUsedTime, (LPVOID)(DWORD_PTR)dwStart);
}

// Get the time a thread has been running
DWORD GetUsedTime()
{
    // Get the current time, return the difference between the current time and the start time
    DWORD dwElapsed = ::GetTickCount();
    dwElapsed = dwElapsed - (DWORD)(DWORD_PTR)::TlsGetValue(g_tlsUsedTime);
    return dwElapsed;
}

The GetTickCount function returns the number of milliseconds that have elapsed since Windows was started.
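One detail worth knowing when subtracting tick counts: a 32-bit millisecond counter wraps around to zero after roughly 49.7 days. The small sketch below, with a hypothetical helper name `elapsed_ms` and made-up readings, shows why plain unsigned subtraction of the kind used in GetUsedTime still gives the right answer across a single wraparound.

```c
#include <assert.h>
#include <stdint.h>

/* Milliseconds between two 32-bit tick readings. Unsigned subtraction is
   modulo 2^32, so the result stays correct across a single wraparound. */
uint32_t elapsed_ms(uint32_t start, uint32_t now)
{
    return now - start;
}

/* Check an ordinary interval and one that straddles the wraparound. */
void elapsed_ms_selfcheck(void)
{
    assert(elapsed_ms(1000u, 1250u) == 250u);
    assert(elapsed_ms(0xFFFFFFF0u, 0x00000010u) == 0x20u);
}
```

The second check starts 16 ms before the counter wraps and ends 16 ms after it, and the difference still comes out as 32 ms.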

Normally, a TLS index should be allocated in the main thread and stored in a global variable so that every thread can reach it, as the example code shows. The main thread first calls TlsAlloc to obtain an index for the time-tracking system and saves it in the global variable g_tlsUsedTime. It then creates 10 threads to demonstrate the TLS mechanism. Each of the 10 threads finally prints its own running time, as shown in Figure 3.10.

 

Figure 3.10 The running time of each thread

This simple system for recording thread running time offers users just two functions, InitStartTime and GetUsedTime. InitStartTime should be called at the start of the thread; it gets the current time and calls TlsSetValue to save it in the thread array member indexed by g_tlsUsedTime. To check how long the thread has been running, call GetUsedTime. It uses TlsGetValue to fetch the thread's start time and returns the difference between the current time and the start time.
