Barrier synchronization

Barrier synchronization fits this scenario: a task may begin only after N other tasks have completed, and those N tasks are generally carried out by N threads, one task per thread.

Related functions:
int pthread_barrier_init(pthread_barrier_t *restrict barrier,
                         const pthread_barrierattr_t *restrict attr,
                         unsigned count);
The count parameter must be greater than 0. It specifies the number of threads to synchronize: no thread returns from pthread_barrier_wait until count threads have called it.
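The constraint on the count can be checked directly. The following is a minimal sketch, assuming a POSIX system with barrier support; try_barrier_init is a hypothetical helper name, not part of pthreads. POSIX specifies that pthread_barrier_init fails with EINVAL when the count is zero.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical helper: tries to create a barrier for `count` threads and
 * returns the error code from pthread_barrier_init (0 on success). */
int try_barrier_init(unsigned count)
{
    pthread_barrier_t barrier;
    int rc = pthread_barrier_init(&barrier, NULL, count);
    if (rc == 0) {
        /* Release the barrier again; no thread ever waited on it. */
        int drc = pthread_barrier_destroy(&barrier);
        assert(drc == 0);
        (void)drc;
    }
    return rc;
}
```

Calling try_barrier_init(3) should return 0, while try_barrier_init(0) should return EINVAL.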

pthread_barrier_wait: synchronizes the calling thread at the barrier object. When the number of threads that have called pthread_barrier_wait on the barrier reaches the preset count, one of them (which one is unspecified) receives the return value PTHREAD_BARRIER_SERIAL_THREAD and every other thread receives 0. The barrier object is then reset to the state it had after the most recent init, so it can be reused.
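The return-value rule can be observed with a small experiment. This is a sketch under stated assumptions, not a definitive implementation: struct waiter, wait_once, and count_serial_threads are hypothetical names introduced here. Each thread records what pthread_barrier_wait returned, and the caller counts how many threads saw PTHREAD_BARRIER_SERIAL_THREAD.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical per-thread record: which barrier to wait on, and what
 * pthread_barrier_wait returned. */
struct waiter {
    pthread_barrier_t *barrier;
    int rc;
};

static void *wait_once(void *arg)
{
    struct waiter *w = arg;
    w->rc = pthread_barrier_wait(w->barrier);   /* blocks until all arrive */
    return NULL;
}

/* Runs `count` threads through one barrier and returns how many of them
 * received PTHREAD_BARRIER_SERIAL_THREAD, or -1 if any call failed. */
int count_serial_threads(unsigned count)
{
    assert(count > 0);
    pthread_barrier_t barrier;
    pthread_t *threads = malloc(count * sizeof *threads);
    struct waiter *waiters = malloc(count * sizeof *waiters);
    if (!threads || !waiters ||
        pthread_barrier_init(&barrier, NULL, count) != 0) {
        free(threads);
        free(waiters);
        return -1;
    }

    for (unsigned i = 0; i < count; ++i) {
        waiters[i].barrier = &barrier;
        waiters[i].rc = 0;
        /* A missing thread would leave the others blocked forever,
         * so this sketch simply aborts if creation fails. */
        if (pthread_create(&threads[i], NULL, wait_once, &waiters[i]) != 0)
            abort();
    }

    int serial = 0;
    int failed = 0;
    for (unsigned i = 0; i < count; ++i) {
        if (pthread_join(threads[i], NULL) != 0)
            failed = 1;
        else if (waiters[i].rc == PTHREAD_BARRIER_SERIAL_THREAD)
            ++serial;
        else if (waiters[i].rc != 0)
            failed = 1;
    }

    pthread_barrier_destroy(&barrier);
    free(threads);
    free(waiters);
    return failed ? -1 : serial;
}
```

For any positive count, count_serial_threads should return 1. The serial thread is a convenient place for work that must happen exactly once per phase, such as printing a progress message.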


Barriers


Some parallel computations need to "meet up" at certain points before continuing. This can, of course, be accomplished with semaphores, but another construct is often more convenient: the barrier (pthread_barrier_t in the pthreads library). As a motivating example, take this program:


#define _XOPEN_SOURCE 600


#include <pthread.h>
#include <stdlib.h>
#include <stdio.h>




#define ROWS 10000
#define COLS 10000
#define THREADS 10


double initial_matrix[ROWS][COLS];
double final_matrix[ROWS][COLS];
// Barrier variable
pthread_barrier_t barr;


extern void DotProduct(int row, int col,
                       double source[ROWS][COLS],
                       double destination[ROWS][COLS]);
extern double determinant(double matrix[ROWS][COLS]);


void * entry_point(void *arg)
{
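    int rank = (int)(long)arg;  // the rank was passed by value inside the pointer
    // Each thread handles its own contiguous block of ROWS / THREADS rows
    for(int row = rank * ROWS / THREADS; row < (rank + 1) * ROWS / THREADS; ++row)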
        for(int col = 0; col < COLS; ++col)
            DotProduct(row, col, initial_matrix, final_matrix);


    // Synchronization point
    int rc = pthread_barrier_wait(&barr);
    if(rc != 0 && rc != PTHREAD_BARRIER_SERIAL_THREAD)
    {
        printf("Could not wait on barrier\n");
        exit(-1);
    }


    for(int row = rank * ROWS / THREADS; row < (rank + 1) * ROWS / THREADS; ++row)
        for(int col = 0; col < COLS; ++col)
            DotProduct(row, col, final_matrix, initial_matrix);
    return NULL;
}


int main(int argc, char **argv)
{
    pthread_t thr[THREADS];


    // Barrier initialization
    if(pthread_barrier_init(&barr, NULL, THREADS))
    {
        printf("Could not create a barrier\n");
        return -1;
    }


    for(int i = 0; i < THREADS; ++i)
    {
        if(pthread_create(&thr[i], NULL, &entry_point, (void*)(long)i))
        {
            printf("Could not create thread %d\n", i);
            return -1;
        }
    }


    for(int i = 0; i < THREADS; ++i)
    {
        if(pthread_join(thr[i], NULL))
        {
            printf("Could not join thread %d\n", i);
            return -1;
        }
    }


    double det = determinant(initial_matrix);
    printf("The determinant of M^4 = %f\n", det);


    return 0;
}
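This program spawns a number of threads and assigns each one a block of rows of a matrix multiplication. Each thread then uses the result of that computation in the next phase: another matrix multiplication. Because the second phase reads entries of final_matrix that other threads produced, no thread may start it until every thread has finished the first phase, which is what the barrier enforces.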


There are a few things to note here:


The barrier declaration at the top
The barrier initialization in main
The point where each thread waits for its peers to finish.
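The same three elements can be seen in a much smaller program. The following is a minimal sketch, assuming a POSIX system; WORKERS, phase_barrier, squares, sums, worker, and run_two_phases are hypothetical names chosen for illustration. In phase one each thread fills its own slot, and in phase two it reads a neighbour's slot, which is only safe after the barrier.

```c
#define _XOPEN_SOURCE 600
#include <assert.h>
#include <pthread.h>

#define WORKERS 4                       /* hypothetical thread count */

static pthread_barrier_t phase_barrier; /* barrier declaration */
static int ranks[WORKERS];
static int squares[WORKERS];            /* written in phase one */
static int sums[WORKERS];               /* written in phase two */

static void *worker(void *arg)
{
    int rank = *(int *)arg;

    /* Phase one: each thread writes only its own slot. */
    squares[rank] = (rank + 1) * (rank + 1);

    /* Synchronization point: wait until every slot has been written. */
    int rc = pthread_barrier_wait(&phase_barrier);
    assert(rc == 0 || rc == PTHREAD_BARRIER_SERIAL_THREAD);
    (void)rc;

    /* Phase two: reads a slot that another thread produced. */
    sums[rank] = squares[rank] + squares[(rank + 1) % WORKERS];
    return NULL;
}

/* Returns 0 on success, -1 if the barrier or a thread could not be set up. */
int run_two_phases(void)
{
    pthread_t threads[WORKERS];
    int started = 0;
    int status = 0;

    /* Barrier initialization: all WORKERS threads must arrive. */
    if (pthread_barrier_init(&phase_barrier, NULL, WORKERS) != 0)
        return -1;

    for (int i = 0; i < WORKERS; ++i) {
        ranks[i] = i;
        if (pthread_create(&threads[i], NULL, worker, &ranks[i]) != 0) {
            status = -1;   /* the running threads would now wait forever */
            break;
        }
        ++started;
    }
    if (status != 0)
        return status;     /* sketch only: a real program needs a recovery plan */

    for (int i = 0; i < started; ++i)
        if (pthread_join(threads[i], NULL) != 0)
            status = -1;

    pthread_barrier_destroy(&phase_barrier);
    return status;
}
```

After run_two_phases returns 0, squares holds 1, 4, 9, 16 and sums holds 5, 13, 25, 17. Without the barrier, a thread could read a neighbour's slot before it was written.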


Original link: http://pages.cs.wisc.edu/~travitch/pthreads_primer.html

Origin blog.csdn.net/juan190755422/article/details/41748463