Jizhi Coding | Using OpenMP Multithreading


Welcome to follow my WeChat official account [极智视界] to get more of my notes and shares.

  Hello everyone, I am Jizhi Vision. This article explains how to use OpenMP for multithreaded programming.

  OpenMP is a multithreaded programming solution for shared-memory parallel systems. It provides a high-level, abstract description of parallel algorithms and is especially well suited to parallel programming on multi-core CPU machines. The compiler automatically parallelizes the program according to the pragma directives added to the code, which greatly reduces the difficulty and complexity of parallel programming. When the compiler does not support OpenMP, the program simply degenerates into a normal serial program: the OpenMP directives in the code do not affect its normal compilation and execution.

1 Common OpenMP directives

  When using OpenMP, you need to include the omp.h header file and add the -fopenmp flag at compile time. In the parts that should run in parallel, use #pragma omp directive [clause] to tell the compiler how to execute the corresponding statements in parallel.

  Commonly used directives are as follows:

  • parallel: the code block enclosed in {} following #pragma omp parallel is executed in parallel;
  • parallel for: followed directly by a for statement; no extra code block is required;
  • sections;
  • parallel sections;
  • single: indicates the enclosed code is executed by only a single thread;
  • critical: a critical section; only one OpenMP thread may enter at a time;
  • barrier: used for thread synchronization within a parallel region; each thread stops when it reaches the barrier and does not continue until all threads have reached it;

  Commonly used clauses are as follows:

  • num_threads: specifies the number of threads in the parallel region;
  • shared: specifies one or more variables as shared among multiple threads;
  • private: specifies that one or more variables have a separate copy in each thread;

  In addition, OpenMP provides a series of API functions to query the state of parallel threads or control their behavior. Common APIs are as follows:

  • omp_in_parallel: determines whether execution is currently inside a parallel region;
  • omp_get_thread_num: gets the current thread's number;
  • omp_set_num_threads: sets the number of threads for subsequent parallel regions;
  • omp_get_num_threads: returns the number of threads in the current parallel region;
  • omp_get_dynamic: determines whether dynamic adjustment of the number of threads is supported;
  • omp_get_max_threads: gets the maximum number of threads available to a parallel region;
  • omp_get_num_procs: returns the number of processors in the system;

2 OpenMP usage example

  Here is an example of calculating pi:

#include <stdio.h>
#include <omp.h>
 
#define MAX_THREADS 4
 
static long num_steps = 100000000;
double step;

void spmd(void) {
    int i, j;
    double pi, full_sum = 0.0;
    double start_time, run_time;
    step = 1.0/(double) num_steps;
 
    for (j=1; j<=MAX_THREADS; j++){
        omp_set_num_threads(j);
        full_sum = 0.0;
        start_time = omp_get_wtime();
      
#pragma omp parallel private(i)
      {
        int id = omp_get_thread_num();
        int numthreads = omp_get_num_threads();
        double x;
        double partial_sum = 0;
#pragma omp single
        printf(" num_threads = %d", numthreads);
        /* cyclic distribution: thread id handles i = id, id+n, id+2n, ... */
        for (i=id; i< num_steps; i+=numthreads){
            x = (i+0.5)*step;
            partial_sum += 4.0/(1.0+x*x);
        }
        /* one thread at a time folds its partial sum into the total */
#pragma omp critical
        full_sum += partial_sum;
      }
      
        pi = step * full_sum;
        run_time = omp_get_wtime() - start_time;
        printf("\n pi is %f in %f seconds %d threads \n ", pi, run_time, j);
    }
}

void openMP(void) {
    int i, j;
    double x, pi, sum = 0.0;
    double start_time, run_time;
 
    step = 1.0/(double) num_steps;
    for (i=1; i<=4; i++){
        sum = 0.0;
        omp_set_num_threads(i);
        start_time = omp_get_wtime();
/* x must be private: every thread writes it in each iteration */
#pragma omp parallel private(x)
      {
#pragma omp single
        printf(" num_threads = %d", omp_get_num_threads());
/* j, as the omp for loop variable, is implicitly private */
#pragma omp for reduction(+:sum)
        for (j=1; j<= num_steps; j++){
            x = (j-0.5)*step;
            sum = sum + 4.0/(1.0+x*x);
        }
      }
        pi = step * sum;
        run_time = omp_get_wtime() - start_time;
        printf("\n pi is %f in %f seconds and %d threads\n", pi, run_time, i);
    }
}

int main() {
    spmd();
    printf("openMP Loop Parallelism:\n");
    openMP();
    return 0;
}

  The results are as follows (the screenshot of the program's timing output in the original post is not reproduced here):


  Well, the above shared how to use OpenMP multithreading. I hope my sharing helps you a little in your study.



Origin juejin.im/post/7113887114059579400