MPI Concepts, Principles, and Programming

Communication functions that involve all the processes in a communicator are called collective communications. To distinguish them from collective communications, functions such as MPI_Send and MPI_Recv are often called point-to-point communications.

Collective Communications:

  1. All the processes in the communicator must call the same collective function.
  2. The arguments passed by each process to an MPI collective communication must be “compatible”.
  3. The output_data_p argument is only used on dest_process (see the sketch after this list).
  4. Point-to-point communications are matched on the basis of tags and communicators.
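
A minimal sketch of these rules using MPI_Reduce (the program and variable names are my own, not from the text): every process calls the same collective function with compatible arguments, and the output buffer is only meaningful on dest_process 0.

#include <stdio.h>
#include <mpi.h>

int main(void) {
    int    my_rank;
    double local_sum, total_sum = 0.0;

    MPI_Init(NULL, NULL);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    local_sum = my_rank + 1.0;   /* stand-in for some local computation */

    /* Every process calls MPI_Reduce with compatible arguments (rules 1-2);
       total_sum is only used on dest_process 0 (rule 3) */
    MPI_Reduce(&local_sum, &total_sum, 1, MPI_DOUBLE, MPI_SUM, 0, MPI_COMM_WORLD);

    if (my_rank == 0)
        printf("Global sum = %f\n", total_sum);

    MPI_Finalize();
    return 0;
}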

3.4.4 MPI_Allreduce

The argument list for MPI_Allreduce is identical to that for MPI_Reduce, except that there is no dest_process, since all the processes should get the result.
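
A sketch of the call, assuming my_rank and a local value local_sum as in the earlier example:

double total_sum;

/* Same reduction as MPI_Reduce, but with no dest_process argument:
   every process receives the global sum in total_sum */
MPI_Allreduce(&local_sum, &total_sum, 1, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);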

3.4.5 Broadcast
A collective communication in which data belonging to a single process is sent to all of the processes in the communicator is called a broadcast, and you’ve probably guessed that MPI provides a broadcast function.
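
For example, a root process might read an input value and broadcast it to everyone; the variable names below are assumptions for illustration, and my_rank is set up as before.

double a;

if (my_rank == 0)
    scanf("%lf", &a);
/* Every process calls MPI_Bcast; afterwards, a on every process
   holds the value that process 0 (the root) passed in */
MPI_Bcast(&a, 1, MPI_DOUBLE, 0, MPI_COMM_WORLD);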

3.4.7 Scatter
If the communicator comm contains comm_sz processes, MPI_Scatter divides the data referenced by send_buf_p into comm_sz pieces: the first piece goes to process 0, the second to process 1, and so on.
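
A sketch of the call, assuming the usual headers and that my_rank, comm_sz, and local_n (the number of elements per process) have already been set up; the variable names are illustrative:

double *a = NULL;
double *local_a = malloc(local_n * sizeof(double));

if (my_rank == 0)
    a = malloc(comm_sz * local_n * sizeof(double));   /* filled with the full vector elsewhere */

/* send_count (the second argument) is the number of elements sent to EACH
   process, not the total size of the data referenced by send_buf_p */
MPI_Scatter(a, local_n, MPI_DOUBLE,
            local_a, local_n, MPI_DOUBLE, 0, MPI_COMM_WORLD);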

3.4.8 Gather

We need to write a function for printing out a distributed vector. Our function can collect all of the components of the vector onto process 0, and then process 0 can print all of the components.

int MPI_Gather(
void*   send_buf_p,
int     send_count,
MPI_Datatype    send_type,
void*   recv_buf_p,
int     recv_count,
MPI_Datatype    recv_type,
int     dest_proc,
MPI_Comm    comm);
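
One possible print function along these lines (the function name and argument names are my own; it assumes the usual headers, with local_n components per process and n = comm_sz * local_n):

void Print_vector(double local_b[], int local_n, int n, MPI_Comm comm) {
    int     my_rank;
    double* b = NULL;

    MPI_Comm_rank(comm, &my_rank);
    if (my_rank == 0) {
        b = malloc(n * sizeof(double));
        /* recv_count is the number of elements received from EACH process */
        MPI_Gather(local_b, local_n, MPI_DOUBLE, b, local_n, MPI_DOUBLE, 0, comm);
        for (int i = 0; i < n; i++)
            printf("%f ", b[i]);
        printf("\n");
        free(b);
    } else {
        MPI_Gather(local_b, local_n, MPI_DOUBLE, b, local_n, MPI_DOUBLE, 0, comm);
    }
}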

3.5 MPI derived Datatypes

In virtually all distributed-memory systems, communication can be much more expensive than local computation. The cost of sending a fixed amount of data in multiple messages is usually much greater than the cost of sending a single message with the same amount of data.

In MPI, a derived datatype can be used to represent any collection of data items in memory by storing both the types of the items and their relative locations in memory.
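
A sketch of building one such datatype with MPI_Type_create_struct, for three input values a, b, and n stored at separate addresses (the variable names are assumptions for illustration):

int          blocklengths[3] = {1, 1, 1};
MPI_Datatype types[3]        = {MPI_DOUBLE, MPI_DOUBLE, MPI_INT};
MPI_Aint     displacements[3];
MPI_Aint     a_addr, b_addr, n_addr;
MPI_Datatype input_mpi_t;

/* Record each item's location relative to the first one */
MPI_Get_address(&a, &a_addr);
MPI_Get_address(&b, &b_addr);
MPI_Get_address(&n, &n_addr);
displacements[0] = 0;
displacements[1] = b_addr - a_addr;
displacements[2] = n_addr - a_addr;

MPI_Type_create_struct(3, blocklengths, displacements, types, &input_mpi_t);
MPI_Type_commit(&input_mpi_t);

/* A single broadcast now moves all three items in one message */
MPI_Bcast(&a, 1, input_mpi_t, 0, MPI_COMM_WORLD);

MPI_Type_free(&input_mpi_t);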

3.6 Performance Evaluation of MPI Programs

MPI provides a function, MPI_Wtime, that returns the number of seconds that have elapsed since some time in the past:

double MPI_Wtime(void);

Both MPI_Wtime and GET_TIME of “timer.h” return wall clock time.

The MPI collective communication function MPI_Barrier ensures that no process will return from calling it until every process in the communicator has started calling it.
The following code can be used to time a block of MPI code and report a single elapsed time.

double local_start, local_finish, local_elapsed, elapsed;
MPI_Barrier(comm);
local_start = MPI_Wtime();
/* Code here */
local_finish = MPI_Wtime();
local_elapsed = local_finish - local_start;
MPI_Reduce(&local_elapsed, &elapsed, 1, MPI_DOUBLE, MPI_MAX, 0, comm);

if (my_rank == 0)
   printf("Elapsed time = %e seconds\n", elapsed);

3.6.2 Results

$T_{\text{parallel}}(n,p) = T_{\text{serial}}(n)/p + T_{\text{overhead}}$

Speedup is the ratio of the serial run-time to the parallel run-time:
$S(n,p) = \dfrac{T_{\text{serial}}(n)}{T_{\text{parallel}}(n,p)}$

Parallel efficiency is “per process” speedup:
$E(n,p) = \dfrac{S(n,p)}{p}$
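
A quick numerical illustration (the numbers are made up): if $T_{\text{serial}}(n) = 16$ seconds and $T_{\text{parallel}}(n,4) = 5$ seconds on $p = 4$ processes, then

$S(n,4) = \dfrac{16}{5} = 3.2, \qquad E(n,4) = \dfrac{3.2}{4} = 0.8$

i.e. the program runs 3.2 times faster than the serial version, at 80% efficiency per process.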

3.6.4 Scalability

Very roughly speaking, a program is scalable if the problem size can be increased at a rate such that the efficiency doesn’t decrease as the number of processes increases.

Programs that can maintain a constant efficiency without increasing the problem size are sometimes said to be strongly scalable.

Programs that can maintain a constant efficiency if the problem size increases at the same rate as the number of processes are sometimes said to be weakly scalable.

Reposted from blog.csdn.net/lilele12211104/article/details/79061656