Article directory

Linux process VS thread
- Shared by multiple threads of a process
The relationship between processes and threads

Linux process VS thread

A process is the basic unit of resource allocation.
A thread is the basic unit of OS scheduling.

Threads share the data of their process, but each thread also has its own private data:

Thread ID
A set of registers, which holds the thread's context so that the thread can be switched out and resumed correctly.
Stack: the temporary variables a thread creates as it pushes and pops stack frames must live on that thread's own stack, so each thread has a private stack.
errno
Signal mask
Scheduling priority

Shared by multiple threads of a process

Because they are in the same address space, the so-called code segment and data segment are shared.

If you define a function, each thread can call it.
If you define a global variable, it can be accessed by multiple execution streams in a process.
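
As a minimal sketch of this sharing, the new thread below writes to a global variable and the main thread reads the new value afterwards. The names g_counter, SetCounter, and ReadCounterAfterThread are hypothetical and chosen only for illustration.

```cpp
#include <cassert>
#include <pthread.h>

int g_counter = 0;  // global: lives in the data segment shared by all threads

// Thread routine: writes to the global without being handed a pointer to it.
void* SetCounter(void*)
{
    g_counter = 42;
    return nullptr;
}

// Starts one thread, waits for it, and returns what the main thread then sees.
int ReadCounterAfterThread()
{
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, SetCounter, nullptr);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, nullptr);  // the join orders the thread's write before our read
    return g_counter;
}
```

The join matters here: without it, the read in the main thread would race with the write in the new thread.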

In addition, each thread also shares the following resources and environment:

File descriptor table (a file opened by one thread can be seen and accessed by the other threads).
Signal dispositions (SIG_IGN, SIG_DFL, or a custom signal handler function).
Current working directory.
User ID and group ID.
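
The shared descriptor table can be illustrated with a pipe: the main thread opens it, and the new thread writes to it knowing only the descriptor number. This is a sketch under that setup; WriteToPipe and ReadWhatThreadWrote are hypothetical helper names.

```cpp
#include <cassert>
#include <pthread.h>
#include <string>
#include <unistd.h>

// Thread routine: receives only the descriptor number, yet can write to the pipe
// because the descriptor table belongs to the process, not to one thread.
void* WriteToPipe(void* arg)
{
    int fd = *static_cast<int*>(arg);
    const char msg[] = "hello";
    ssize_t n = write(fd, msg, sizeof msg - 1);
    (void)n;
    return nullptr;
}

// The main thread opens the pipe, the new thread writes, the main thread reads.
std::string ReadWhatThreadWrote()
{
    int fds[2];
    int rc = pipe(fds);
    assert(rc == 0);

    pthread_t tid;
    rc = pthread_create(&tid, nullptr, WriteToPipe, &fds[1]);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, nullptr);

    char buf[16] = {0};
    ssize_t n = read(fds[0], buf, sizeof buf - 1);
    close(fds[0]);
    close(fds[1]);
    return n > 0 ? std::string(buf, static_cast<size_t>(n)) : std::string();
}
```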

The relationship between processes and threads

So far we have studied only single-threaded processes, in which one process contains exactly one thread. From here on we also study multi-threaded processes, in which one process contains several threads.

Thread creation pthread_create

The function to create a thread is pthread_create, and the prototype is as follows:

int pthread_create(pthread_t *thread, const pthread_attr_t *attr, void *(*start_routine) (void *), void *arg);

Parameter description :

thread: Get the successfully created thread ID. This parameter is an output parameter.
attr: used to set the attributes of the creation thread. Passing in NULL means using the default attributes.
start_routine: a function pointer whose return value and parameters are both void*. This parameter represents the thread routine, that is, the function to be executed after the thread is started.
arg: Parameter passed to the thread routine.

Return value description :

If the thread is successfully created, 0 is returned, and if it fails, an error code is returned.
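
Note that pthread functions report failure through their return value rather than through errno. A minimal sketch of checking that value follows; StartThread and Noop are hypothetical names, not part of the pthread API.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <pthread.h>

// Trivial routine used only to have something to start.
void* Noop(void*) { return nullptr; }

// Hypothetical helper: wraps pthread_create and reports failures.
// The error number comes back as the return value, so it is passed
// to strerror directly instead of reading errno.
bool StartThread(pthread_t* tid, void* (*routine)(void*), void* arg)
{
    int rc = pthread_create(tid, nullptr, routine, arg);
    if (rc != 0) {
        std::fprintf(stderr, "pthread_create: %s\n", std::strerror(rc));
        return false;
    }
    return true;
}
```

A caller would typically join the thread afterwards, for example with pthread_join(tid, nullptr).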

Note :
The Linux kernel does not provide a dedicated thread interface of its own; threads are offered to programs by the native thread library (NPTL). To use the pthread functions, you must compile and link with the -pthread option.

In the following example, we let the main thread create a new thread, and it is expected that the main thread and the new thread will execute the corresponding function code respectively.

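#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

// Thread routine: prints the name it was given once per second.
void* Routine(void* arg)
{
	const char* msg = (const char*)arg;
	while (1){
		cout << " i am a " << msg << endl;
		sleep(1);
	}
}
int main()
{
	pthread_t tid;
	pthread_create(&tid, NULL, Routine, (void*)"thread 1");
	// The main thread keeps running its own loop alongside the new thread.
	while (1){
		cout << " I am a main thread " << endl;
		sleep(2);
	}
	return 0;
}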

When the program runs, the two messages are interleaved: the new thread prints once per second and the main thread prints once every two seconds.
We can also use the ps -ajx command to view the current process information. However, this command shows only the mythread process as a whole and does not list its individual threads.

To display the threads, use the ps -aL command. The LWP (Light Weight Process) column is the ID of each thread. The two threads have the same PID, which means that they belong to the same process.
When we studied processes, we assumed that the OS schedules by PID. In fact, the OS schedules by LWP. The LWP of the main thread equals the PID, so for a single-threaded process, scheduling by LWP and scheduling by PID are the same thing.
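
The LWP can also be read from inside the program. The sketch below uses the Linux-specific gettid system call through syscall(SYS_gettid); CurrentLwp, StoreLwp, and LwpOfNewThread are hypothetical helper names.

```cpp
#include <cassert>
#include <pthread.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

// Returns the kernel thread ID (the LWP column of ps -aL) of the calling thread.
pid_t CurrentLwp()
{
    return static_cast<pid_t>(syscall(SYS_gettid));
}

// Thread routine: stores its own LWP where the creator can read it after join.
void* StoreLwp(void* arg)
{
    *static_cast<pid_t*>(arg) = CurrentLwp();
    return nullptr;
}

// Creates a thread and returns that thread's LWP.
pid_t LwpOfNewThread()
{
    pid_t lwp = 0;
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, StoreLwp, &lwp);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, nullptr);
    return lwp;
}
```

Called from the main thread, CurrentLwp() should equal getpid(), while LwpOfNewThread() returns a different number even though getpid() is the same in both threads.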

Get thread ID pthread_self

We can call the pthread_self function to obtain the ID of the calling thread. This is the pthread_t ID used by the thread library, not the LWP shown by ps -aL.

The function prototype is as follows :

pthread_t pthread_self(void);

In the following code, we use the pthread_self function to print the PID and thread ID of each new thread and of the main thread.

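#include <cstdio>
#include <iostream>
#include <pthread.h>
#include <string>
#include <unistd.h>
using namespace std;

void *threadRun( void *args )
{
    // Copy the name at once: main reuses the same buffer for the next thread.
    const string name = ( char * )args;
    int count = 0;

    while( count < 5 )
    {
        cout << name << " pid: " << getpid()  << " tid: "<< pthread_self()<<  endl;
        sleep(1);
        ++count;
    }
    return nullptr;
}
int main()
{
    pthread_t tid[5];
    char name[64];

    for ( int i = 0; i < 5; ++i )
    {
        snprintf( name, sizeof name, "%s - %d", "thread", i );
        pthread_create( tid + i,NULL,threadRun, (void *)name );
        sleep(1);   // gives the new thread time to copy name before it is overwritten
    }

    cout << " i am a main thread " << " getpid: " << getpid() << " tid: " << pthread_self() << endl;
    // Returning from main ends the process, so threads that are still running stop here.
    return 0;
}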
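
Because pthread_t is an opaque type, the portable way to compare two thread IDs is pthread_equal rather than ==. The following sketch passes the creator's ID to a new thread, which checks that its own pthread_self() value differs; SameAsCreator and NewThreadHasDistinctId are hypothetical names.

```cpp
#include <cassert>
#include <pthread.h>

// Thread routine: receives the creator's ID and reports whether it
// equals its own ID (it should not).
void* SameAsCreator(void* arg)
{
    pthread_t creator = *static_cast<pthread_t*>(arg);
    bool same = pthread_equal(creator, pthread_self()) != 0;
    return reinterpret_cast<void*>(static_cast<long>(same));
}

// Returns true if the new thread saw an ID different from its creator's.
bool NewThreadHasDistinctId()
{
    pthread_t self = pthread_self();
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, SameAsCreator, &self);
    assert(rc == 0);
    (void)rc;
    void* ret = nullptr;
    pthread_join(tid, &ret);
    return ret == nullptr;   // nullptr encodes "not the same ID"
}
```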

In the output, every line shows the same PID, while each thread prints its own thread ID.

Thread waiting pthread_join

Like a process, a thread that has been created needs to be waited for. If the main thread does not wait for the new thread, the new thread's resources are not reclaimed after it exits. The pthread_join function waits for a thread to terminate and reclaims its resources.

The function prototype is as follows :

int pthread_join(pthread_t thread, void **retval);

Parameter description :

thread: ID of the thread being waited for.
retval: a pointer to a pointer (void**). On success, the exit value of the thread (a void*) is stored in the location that retval points to.

Return value description :
Returns 0 if the wait succeeds, and an error code if it fails.

The value stored through retval depends on how the thread terminated:

If the thread returned from its routine with return, the location pointed to by retval holds the return value of the thread function.
If the thread was canceled by another thread calling pthread_cancel, the location pointed to by retval holds the constant PTHREAD_CANCELED, which is (void*)-1 on Linux.
If the thread terminated itself by calling pthread_exit, the location pointed to by retval holds the argument passed to pthread_exit.
If you are not interested in the termination status of the thread, you can pass NULL as the retval argument.
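
The three cases above can be sketched side by side. The routine names and the JoinResult helper are hypothetical; the values 1 and 2 are arbitrary markers chosen for the example.

```cpp
#include <cassert>
#include <pthread.h>
#include <unistd.h>

// Case 1: terminate by returning from the routine.
void* ExitByReturn(void*) { return reinterpret_cast<void*>(1L); }

// Case 2: terminate by calling pthread_exit.
void* ExitByPthreadExit(void*)
{
    pthread_exit(reinterpret_cast<void*>(2L));
}

// Case 3: never finishes on its own, so it has to be canceled.
void* SleepUntilCanceled(void*)
{
    while (true) sleep(1);   // sleep is a cancellation point
}

// Runs routine in a new thread and returns what pthread_join stores through retval.
// If cancel is true, a cancellation request is sent before joining.
void* JoinResult(void* (*routine)(void*), bool cancel)
{
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, routine, nullptr);
    assert(rc == 0);
    if (cancel) pthread_cancel(tid);
    void* ret = nullptr;
    rc = pthread_join(tid, &ret);
    assert(rc == 0);
    (void)rc;
    return ret;
}
```

Comparing the result against PTHREAD_CANCELED, rather than against a literal -1, keeps the check independent of how the constant is defined.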

For example, in the following code the main thread creates a new thread and then blocks in pthread_join. The new thread prints 11 times (i runs from 0 through 10) and exits, after which the main thread also exits.


#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void* threadRoutine( void* args )
{
    int i = 0;
    while( true )
    {
        cout << "new thread: " << ( char* )args << " running... " << endl;
        sleep(1);
        if( i++ == 10 ) break;   // leaves the loop after the 11th print
    }

    cout << "new thread quit... " << endl;

    return nullptr;
}

int main()
{
    pthread_t tid;

    pthread_create( &tid,nullptr,threadRoutine,(void*)"thread 1 ");

    pthread_join( tid,nullptr );   // blocks until the new thread has exited

    cout<< " main thread wait done ... main quit " << endl;
}

The new thread prints its message repeatedly and then prints "new thread quit... "; only after that does the main thread print its own message and exit.

The second parameter of pthread_join

When the new thread exits, it can hand a value back to the thread that joins it. The value is returned as a void*, and the main thread receives it in a pointer variable, ret. Because pthread_join has to modify ret itself, we must pass the address of ret (a void**), just as any function that needs to change a caller's variable must receive that variable's address.

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void* threadRoutine( void* args )
{
    int i = 0;
    while( true )
    {
        cout << "new thread: " << ( char* )args << " running... " << endl;
        sleep(1);
        if( i++ == 10 ) break;
    }

    cout << "new thread quit... " << endl;

    return (void*)10;   // the integer 10 is carried inside the pointer value
}

int main()
{
    pthread_t tid;

    pthread_create( &tid,nullptr,threadRoutine,(void*)"thread 1 ");

    void* ret = nullptr;

    pthread_join( tid,&ret );   // pthread_join writes the thread's return value into ret

    cout<< " main thread wait done ... main quit " << " exitcode: " <<  (long long )ret<<  endl;
}

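The main thread prints exitcode: 10, the value returned by the new thread.
Each thread's stack is private, so a routine must not return the address of one of its own local variables: that stack is gone once the thread has exited. The heap, however, is shared by all threads, so a routine can return a pointer to heap-allocated data, and the main thread can receive it through the second parameter of pthread_join. This is another way to pass data between the new thread and the main thread.
For example: the threadRoutine routine below allocates an array on the heap and returns it, and the main thread receives it through ret.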

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void* threadRoutine( void* args )
{
    int i = 0;
    int* data = new int[11];   // heap memory stays valid after the thread exits
    while( true )
    {
        cout << "new thread: " << ( char* )args << " running... " << endl;
        sleep(1);
        data[i] = i;
        if( i++ == 10 ) break;
    }

    cout << "new thread quit... " << endl;

    return (void*)data;
}

int main()
{
    pthread_t tid;

    pthread_create( &tid,nullptr,threadRoutine,(void*)"thread 1 ");

    void* ret = nullptr;

    pthread_join( tid,&ret );

    int* data = (int*)ret;   // the array allocated by the new thread
    for( int i = 0; i < 11; i++  )
    {
        cout << data[i] << endl;
    }
    delete[] data;   // the main thread now owns the array and must free it
    return 0;
}
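
Another common way to exchange data is to pass a pointer to a struct as the thread argument and let the routine fill in a result field. The sketch below assumes a hypothetical SumTask record; the struct and both function names are illustrative only.

```cpp
#include <cassert>
#include <pthread.h>

// Hypothetical task record: inputs and output travel in one struct.
struct SumTask {
    int lo;
    int hi;
    long sum;   // filled in by the worker thread
};

// Thread routine: sums the integers lo..hi and stores the result in the task.
void* SumRange(void* arg)
{
    SumTask* task = static_cast<SumTask*>(arg);
    task->sum = 0;
    for (int v = task->lo; v <= task->hi; ++v) task->sum += v;
    return task;   // hands the same pointer back through pthread_join
}

// Runs SumRange in a new thread and returns the computed sum.
long SumInThread(int lo, int hi)
{
    SumTask task{lo, hi, 0};   // on the caller's stack, which outlives the join
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, SumRange, &task);
    assert(rc == 0);
    (void)rc;
    void* ret = nullptr;
    pthread_join(tid, &ret);
    return static_cast<SumTask*>(ret)->sum;
}
```

This works because all threads share one address space: the new thread can write into the caller's stack through the pointer, and the caller does not leave SumInThread until the join has completed.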

After the new thread quits, the main thread prints the values stored in the array, starting from 0, one per line.

An exception in one thread crashes the entire process

In the code below, we add a divide-by-zero error to the routine. When the thread crashes, the entire process crashes with it, so there is no thread exit code left to obtain. (Integer division by zero is undefined behavior in C++; on x86 Linux it typically raises SIGFPE, whose default action terminates the process.)

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void* threadRoutine( void* args )
{
    int i = 0;
    int* data = new int[11];
    while( true )
    {
        cout << "new thread: " << ( char* )args << " running... " << endl;
        sleep(1);
        data[i] = i;
        if( i++ == 10 ) break;

        // volatile forces the division to be performed instead of optimized away
        volatile int a = 100;
        volatile int zero = 0;
        a = a / zero;   // integer divide by zero: typically SIGFPE on x86 Linux
    }

    cout << "new thread quit... " << endl;

    return (void*)data;
}

int main()
{
    pthread_t tid;

    pthread_create( &tid,nullptr,threadRoutine,(void*)"thread 1 ");

    void* ret = nullptr;

    pthread_join( tid,&ret );   // never returns normally: the process is killed first

    int* data = (int*)ret;
    for( int i = 0; i < 11; i++  )
    {
        cout << data[i] << endl;
    }
    delete[] data;
    return 0;
}

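The new thread prints once, and then the whole process is killed; on x86 Linux the shell typically reports a floating point exception. The main thread never gets to print the array.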

Terminate thread

If you need to terminate only a certain thread instead of terminating the entire process, there are three methods:

Return from thread function.
The thread can terminate itself by calling the pthread_exit function.
A thread can call the pthread_cancel function to terminate another thread in the same process.

Terminate thread pthread_exit

The function of the pthread_exit function is to terminate the thread. The function prototype of the pthread_exit function is as follows:

void pthread_exit(void *retval);

Parameter description :
retval: the exit value of the thread, which pthread_join hands to the joining thread.

For example: We use the pthread_exit function to terminate the thread and set the exit code to 10.

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void* threadRoutine( void* args )
{
    int i = 0;
    while( true )
    {
        cout << "new thread: " << ( char* )args << " running... " << endl;
        sleep(1);
        if( i++ == 10 ) break;
    }

    cout << "new thread quit... " << endl;

    pthread_exit((void*)10);   // terminates only this thread
}

int main()
{
    pthread_t tid;

    pthread_create( &tid,nullptr,threadRoutine,(void*)"thread 1 ");

    void* ret = nullptr;

    pthread_join( tid,&ret );

    cout<< " main thread wait done ... main quit " << " exitcode: " << ( long long ) ret  <<  endl;

    return 0;
}

The main thread prints exitcode: 10, the value passed to pthread_exit.

Note :
The exit function terminates the process. If any thread calls exit, the entire process terminates, including all of its other threads.

Cancel thread pthread_cancel

We can cancel a thread through the pthread_cancel function. The function prototype is as follows:

int pthread_cancel(pthread_t thread);

Parameter description :

thread: ID of the thread to cancel.

Return value description :

If the cancellation request is sent successfully, 0 is returned, and if it fails, an error code is returned.

For example: We let the new thread run for a while, and then the main thread calls the pthread_cancel function to cancel it. Having the main thread cancel a new thread is the usual way to use pthread_cancel.


#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void* threadRoutine( void* args )
{
    int i = 0;
    while( true )
    {
        cout << "new thread: " << ( char* )args << " running... " << endl;
        sleep(1);   // sleep is a cancellation point
        if( i++ == 10 ) break;
    }

    cout << "new thread quit... " << endl;

    pthread_exit((void*)10);
}

int main()
{
    pthread_t tid;

    pthread_create( &tid,nullptr,threadRoutine,(void*)"thread 1 ");

    int count = 0;

    while( true )
    {
        cout << "main thread: " << "running..." << endl;
        sleep(1);
        count++;
        if( count >= 5 ) break;
    }
    pthread_cancel(tid);   // request cancellation after about five seconds

    void* ret = nullptr;

    pthread_join( tid,&ret );

    cout<< " main thread wait done ... main quit " << " exitcode: " << ( long long ) ret  <<  endl;

    return 0;
}

The new thread is canceled after about five seconds, before it reaches pthread_exit. The value obtained by pthread_join is therefore no longer the 10 we set but PTHREAD_CANCELED, which is (void*)-1, so the main thread prints exitcode: -1.

Thread detachment

By default, newly created threads are joinable. After a joinable thread exits, another thread must call pthread_join on it; otherwise its resources are not released, which causes a resource leak.
If you do not care about the return value of the thread, joining is a burden. In that case, we can tell the system to release the thread's resources automatically when the thread exits.

The pthread_detach function prototype is as follows:

int pthread_detach(pthread_t thread);

A thread can be detached by another thread in the same process, or it can detach itself. In the example below, the new thread detaches itself.

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void* threadRoutine( void* args )
{
    pthread_detach(pthread_self());   // the thread detaches itself
    while( true )
    {
        cout << "new thread: " << ( char* )args << endl;
        sleep(1);
    }

    cout << "new thread quit... " << endl;

    pthread_exit((void*)10);
}

int main()
{
    pthread_t tid;

    pthread_create( &tid,nullptr,threadRoutine,(void*)"thread 1 ");

    int count = 0;

    while( true )
    {
        cout << " main thread " << endl;
        sleep(1);
        count++;
        if( count >= 5 ) break;
    }

    // No pthread_join here: the new thread is detached.
    cout<< " main thread done ... main quit " <<  endl;

    return 0;
}

Note :
Joinable and detached are mutually exclusive: a thread cannot be both. In a typical detach scenario, the main thread creates new threads to process tasks and is usually the last to exit. If the main thread returns from main first, the process exits, and the new threads are terminated immediately.
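
A thread can also be created in the detached state from the start by setting the detach state in a pthread_attr_t. The sketch below assumes a hypothetical completion flag, g_done, because a detached thread cannot be joined; StartDetached and WaitUntilDone are illustrative names.

```cpp
#include <atomic>
#include <cassert>
#include <pthread.h>
#include <unistd.h>

std::atomic<bool> g_done{false};   // set by the detached thread when it finishes

void* MarkDone(void*)
{
    g_done.store(true);
    return nullptr;
}

// Starts MarkDone in a thread that is detached from the moment it is created.
// Returns 0 on success or the error number from the failing pthread call.
int StartDetached()
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0) return rc;
    rc = pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    if (rc == 0) {
        pthread_t tid;
        rc = pthread_create(&tid, &attr, MarkDone, nullptr);
    }
    pthread_attr_destroy(&attr);
    return rc;
}

// A detached thread cannot be joined, so completion is observed through the flag.
// Polls roughly once per millisecond, up to max_ms times.
bool WaitUntilDone(int max_ms)
{
    for (int waited = 0; waited < max_ms && !g_done.load(); ++waited) usleep(1000);
    return g_done.load();
}
```

Creating the thread detached avoids the short window in which a thread that detaches itself is still joinable.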

Thread ID and process address space layout

A thread ID is essentially an address

The pthread_create function generates a thread ID and stores it at the address given by its first parameter. This thread ID is different from the LWP mentioned earlier.
The LWP belongs to the domain of kernel scheduling: a thread is a lightweight process and the smallest unit handled by the operating system scheduler, so the kernel needs a number that uniquely identifies it.
The thread ID written through the first parameter of pthread_create belongs to the domain of the NPTL thread library. It is the address of a memory unit in the virtual address space of the process, and the thread library's subsequent operations on the thread are based on this ID.
NPTL provides the pthread_self function so that a thread can obtain its own ID.

When the process is running, the pthread shared library is loaded into physical memory, and then mapped to the shared area in the process address space according to the page table.


The main thread and each new thread have their own independent stacks. The main thread uses the native stack region of the process address space, while the stack of each new thread is allocated by the pthread library inside the shared region. To manage per-thread attributes, the library follows the "describe first, then organize" approach: it keeps a struct pthread for every thread, which records the thread's stack, context, and other data. In NPTL, the thread ID (tid) is the starting address of that thread's struct pthread, and the library uses tid to find the corresponding thread.
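
One way to observe the per-thread stack is to ask the library for the calling thread's stack region and check that a local variable lies inside it. This sketch relies on pthread_getattr_np, a GNU extension available in glibc (g++ on Linux defines _GNU_SOURCE by default); the three function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <pthread.h>

// Returns true if the address of a local variable lies inside the stack
// region that the thread library reports for the calling thread.
// pthread_getattr_np is a GNU extension (the _np suffix means non-portable).
bool LocalIsOnOwnStack()
{
    pthread_attr_t attr;
    if (pthread_getattr_np(pthread_self(), &attr) != 0) return false;

    void* base = nullptr;   // lowest address of the stack region
    size_t size = 0;
    int rc = pthread_attr_getstack(&attr, &base, &size);
    pthread_attr_destroy(&attr);
    if (rc != 0) return false;

    int local = 0;
    uintptr_t p = reinterpret_cast<uintptr_t>(&local);
    uintptr_t lo = reinterpret_cast<uintptr_t>(base);
    return p >= lo && p < lo + size;
}

// Thread routine: runs the check on the new thread's own stack.
void* CheckInThread(void*)
{
    return LocalIsOnOwnStack() ? reinterpret_cast<void*>(1L) : nullptr;
}

// Creates a thread and reports whether its local variable was on its own stack.
bool NewThreadLocalIsOnOwnStack()
{
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, CheckInThread, nullptr);
    assert(rc == 0);
    (void)rc;
    void* ret = nullptr;
    pthread_join(tid, &ret);
    return ret != nullptr;
}
```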


Print thread ID

We can now print the thread ID.

#include <cstdio>
#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

void* threadRoutine( void* args )
{
    int i = 0;
    while( true )
    {
        cout << "new thread: " << ( char* )args << " running... " << endl;
        sleep(1);
        if( i++ == 10 ) break;
    }

    cout << "new thread quit... " << endl;

    pthread_exit((void*)10);
}

int main()
{
    pthread_t tid;

    pthread_create( &tid,nullptr,threadRoutine,(void*)"thread 1 ");

    // Print the same ID twice: as a decimal number and as a pointer.
    // The casts match the %lu and %p conversions (pthread_t is unsigned long on Linux).
    printf( " %lu , %p \n", (unsigned long)tid, (void*)tid );

    int count = 0;

    while( true )
    {
        cout << "main thread: " << "running..." << endl;
        sleep(1);
        count++;
        if( count >= 5 ) break;
    }
    pthread_cancel(tid);

    void* ret = nullptr;

    pthread_join( tid,&ret );

    cout<< " main thread wait done ... main quit " << " exitcode: " << ( long long ) ret  <<  endl;

    return 0;
}

The same value is printed twice, once in decimal and once in hexadecimal. The hexadecimal form looks like an address in the process address space, which shows that the thread ID is essentially an address.

Thread local storage

We know that global variables, initialized data, uninitialized data, and so on are all shared between threads. However, we can add __thread before a global variable so that each thread gets its own copy of that variable, stored in the thread's local storage.

For example: We print the value and address of the global variable g_val through the main thread and the new thread respectively.

#include <iostream>
#include <pthread.h>
#include <unistd.h>
using namespace std;

__thread  int g_val = 0;   // each thread gets its own copy of g_val
void* threadRoutine( void* args )
{
    while( true )
    {
        cout << "new thread: " << ( char* )args << " g_val: " << g_val <<  " &g_val "  << &g_val <<  endl;

        ++g_val;   // changes only the new thread's copy

        sleep(1);
    }
}

int main()
{
    pthread_t tid;

    pthread_create( &tid,nullptr,threadRoutine,(void*)"thread 1 ");

    int count = 0;

    while( true )
    {
        cout << "main thread: " << " g_val " << g_val << " &g_val " << &g_val <<  endl;
        sleep(1);
        count++;
        if( count >= 5 ) break;
    }
    pthread_cancel(tid);

    void* ret = nullptr;

    pthread_join( tid,&ret );

    cout<< " main thread wait done ... main quit " << " exitcode: " << ( long long ) ret  <<  endl;

    return 0;
}

The value of g_val printed by the main thread stays at 0, while the value printed by the new thread increases by 1 each time. The two threads also print different addresses for g_val, because each thread has its own copy.
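
__thread is a GCC extension; since C++11 the standard keyword thread_local gives the same per-thread storage. The following sketch uses thread_local and checks both the value and the address from inside the new thread. The Seen struct and the function names are hypothetical.

```cpp
#include <cassert>
#include <pthread.h>

thread_local int t_val = 0;   // each thread gets its own copy

// Hypothetical record used to report what the new thread observed.
struct Seen {
    const int* creator_addr;  // address of the creator's copy of t_val
    int value;                // the new thread's t_val after incrementing
    bool distinct_addr;       // true if the new thread's copy lives elsewhere
};

// Thread routine: increments its own copy three times and records what it saw.
void* BumpLocal(void* arg)
{
    Seen* seen = static_cast<Seen*>(arg);
    for (int i = 0; i < 3; ++i) ++t_val;
    seen->value = t_val;
    seen->distinct_addr = (&t_val != seen->creator_addr);
    return nullptr;
}

// Runs BumpLocal in a new thread and returns its observations.
Seen RunBumpLocal()
{
    Seen seen{&t_val, 0, false};
    pthread_t tid;
    int rc = pthread_create(&tid, nullptr, BumpLocal, &seen);
    assert(rc == 0);
    (void)rc;
    pthread_join(tid, nullptr);
    return seen;
}
```

After RunBumpLocal returns, the calling thread's own t_val is unchanged, since the increments touched only the new thread's copy.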

Linux Multithreading (Processes VS Threads | Thread Control)
