The PRAM Model and Amdahl's Law

PRAM model

 

In parallel computing, the parallel random access machine (PRAM) is an idealized model of a parallel architecture, originally proposed by Fortune and Wyllie in 1978.

 

The PRAM model can be described as follows: it contains p processors, each of them the same RAM with its own private memory, and all of them sharing a large global memory. In one unit of time, each processor can read one global or local memory address, execute a single RAM operation, and write one global or local memory address.

 

The PRAM (Parallel Random Access Machine) model is a single-instruction-stream, multiple-data-stream (SIMD) parallel machine model with shared memory. It assumes a shared memory of unbounded capacity and a number of processors with identical capabilities, any of which can access any shared memory cell at any time. Its disadvantage is that it is unrealistic: first, memory of infinite capacity does not exist, and for various reasons global memory access is usually slower than the model assumes; second, it ignores the impact of communication bandwidth. Its advantages are a simple structure and ease of theoretical analysis.

 

The PRAM model has been very successful as a basis for parallel algorithm design. The model ignores the algorithmic complexity introduced by machine connectivity, communication contention, data locality, synchronization, and reliability.

 

In particular the PRAM model is generally classified into four sub-categories which relate to the use of shared memory:

  • EREW : exclusive read, exclusive write
  • CREW : concurrent read, exclusive write
  • CRCW : concurrent read, concurrent write
  • ERCW : exclusive read, concurrent write (shown only for completeness)

An EREW PRAM does not allow simultaneous access to a memory location for read or for write operations.

A CREW PRAM allows simultaneous access for reading but not for writing.

A CRCW PRAM allows simultaneous access for reading and for writing.
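
The access rules above can be checked mechanically for a single step. The following is a minimal sketch, not part of the original article: the function name `is_legal_step` and the representation of a step as two lists of addresses are assumptions made for illustration.

```python
from collections import Counter

def is_legal_step(reads, writes, model):
    """Check one PRAM step against an access model (hypothetical helper).

    reads / writes: lists of memory addresses, one entry per processor
    that reads or writes in this step.
    model: one of "EREW", "CREW", "CRCW", "ERCW".
    """
    # An address that appears more than once is accessed concurrently.
    concurrent_read = any(c > 1 for c in Counter(reads).values())
    concurrent_write = any(c > 1 for c in Counter(writes).values())
    # model[0] describes reads, model[2] describes writes ("E" or "C").
    read_ok = model[0] == "C" or not concurrent_read
    write_ok = model[2] == "C" or not concurrent_write
    return read_ok and write_ok

# Two processors read cell 1 at the same time, writes go to distinct cells.
print(is_legal_step([1, 1], [3, 4], "EREW"))  # False
print(is_legal_step([1, 1], [3, 4], "CREW"))  # True
```
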

 

 

For concurrent writes, we further need to define how conflicts are resolved:

COMMON: a value is written if and only if all the conflicting processors write the same value (otherwise an error condition may be flagged);
ARBITRARY: one of the conflicting processors is chosen arbitrarily to complete the write (this of course requires that the algorithm be correct no matter which processor is chosen);
PRIORITY: among the conflicting processors, the one with the smallest processor identifier (i.e., the highest priority) performs the write;
COMBINING: a function of the conflicting values is written; the model must define this combining operation.
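
The four policies can be sketched for a single memory cell as follows. This is an illustrative sketch only: the function `resolve_write`, the `(processor_id, value)` request format, and the default combining function (addition) are assumptions, and ARBITRARY is modelled by simply taking the first request listed.

```python
import operator
from functools import reduce

def resolve_write(requests, policy, combine=operator.add):
    """Resolve concurrent writes to ONE memory cell (hypothetical helper).

    requests: list of (processor_id, value) pairs targeting the same cell.
    Returns the value that ends up in the cell.
    """
    values = [value for _, value in requests]
    if policy == "COMMON":
        # Legal only if every conflicting processor writes the same value.
        if len(set(values)) != 1:
            raise ValueError("COMMON: processors wrote different values")
        return values[0]
    if policy == "ARBITRARY":
        # Any request may win; a correct algorithm must not care which.
        return values[0]
    if policy == "PRIORITY":
        # The smallest processor identifier has the highest priority.
        return min(requests, key=lambda r: r[0])[1]
    if policy == "COMBINING":
        # A function of all conflicting values is written.
        return reduce(combine, values)
    raise ValueError(f"unknown policy: {policy}")

requests = [(3, 10), (1, 20), (2, 30)]
print(resolve_write(requests, "PRIORITY"))   # 20, written by processor 1
print(resolve_write(requests, "COMBINING"))  # 60, the sum of all values
```
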

 

 

Simulation between PRAM models: simulating a PRIORITY CRCW PRAM on an EREW PRAM

 

One instruction of a PRIORITY CRCW PRAM with n processors can be simulated on an EREW PRAM with the same number n of processors in only O(log n) time. (This assumes that we can sort n numbers on an EREW PRAM with n processors in O(log n) time.)

 

Proof: Let Q1, Q2, ..., Qn be the processors of the PRIORITY CRCW PRAM, where Qk wants to read the contents of cell Mk (or write to that address). Let P1, P2, ..., Pn be the n simulating processors of the EREW PRAM; processor Pk simulates Qk, 1 <= k <= n. The global memory cells a1, a2, ..., an of the EREW PRAM are reserved for this purpose (some sources refer to these n cells as an array a[n], i.e., the array element a[k] denotes ak).

 

For k = 1, ..., n, the following step is performed in parallel:

        On the EREW PRAM, Pk forms the pair (Mk, k) and stores it in cell a[k], i.e., a[k] ← (Mk, k).

 

        (* Other sources state this step as follows:

                 If Qk wants to access Mk, processor Pk writes the pair (Mk, k) into a[k].

                 If Qk does not want to access any PRIORITY cell, processor Pk writes the pair (0, k) into a[k].)

 

(Note that this step is legal on an EREW PRAM and takes O(1) time.)

 

 

Next, all n processors of the EREW PRAM sort the pairs (Mk, k), 1 <= k <= n, stored in a1, a2, ..., an (first by memory location Mk, then by k). By the assumption made at the beginning, this step takes O(log n) time.

 

For k = 1, ..., n, the following step is performed in parallel:

       (* Other sources state this step as follows: each Pk appends a flag f to cell a[k]:

                                                   f = 1, if the first component of a[k] ≠ the first component of a[k-1]

                                                              or the first component of a[k] 0

                                                   f = 0,otherwise)

The pairs (Mk, k) (or the first two components of the triples) are now organized into blocks such that all pairs in a block have the same first component (Mk, i.e., the global memory address). The representative of each block (the entry at the front of the block in array a) is the pair with the smallest second component, and it can be identified in O(1) time. (The example at the end of this section makes this easier to follow.) Thus, on the EREW PRAM, the processors Pk can carry out the read or write on the cells named by the triples in parallel in O(1) time.

PRIORITY WRITE: each Pk reads a triple (Mj, j, f) from cell a[k] (after sorting, a[k] generally holds the pair of some other processor j) and writes into Mj iff f = 1.

 

Finally, an example: after sorting, the pairs fall into four blocks, namely [(0,7)], [(1,4)], [(2,1), (2,3), (2,6)], [(4,2), (4,5)]. Within each block the first component is the same; it is the global address to be read or written. The second component gives the priority, and the pair at the front of each block has the highest priority and represents the block.
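
The whole simulation can be traced serially on this example. The sketch below is illustrative only and makes several assumptions: the function name `simulate_priority_write` and the values being written are invented, Python's built-in sort stands in for the assumed O(log n) EREW sort, and address 0 is treated as an ordinary cell (some sources instead use 0 to mean "no access").

```python
def simulate_priority_write(requests, memory):
    """Serial sketch of one PRIORITY CRCW write step on an EREW PRAM.

    requests[k-1] = (address, value) is the write wanted by processor k.
    memory is a dict standing in for global memory (an assumption).
    """
    # Step 1: each P_k stores the pair (M_k, k) in a[k]; O(1) and exclusive.
    a = [(addr, k) for k, (addr, _) in enumerate(requests, start=1)]
    # Step 2: sort by address, then by processor id.
    # Stand-in for the O(log n) EREW sort assumed by the proof.
    a.sort()
    # Step 3: flag the first pair of every block of equal addresses.
    flags = [i == 0 or a[i][0] != a[i - 1][0] for i in range(len(a))]
    # Step 4: only flagged pairs write; their addresses are all distinct,
    # so no two writes hit the same cell.
    for (addr, k), f in zip(a, flags):
        if f:
            memory[addr] = requests[k - 1][1]
    return a, flags

# Addresses M1..M7 reproduce the blocks of the example; values are made up.
requests = [(2, 10), (4, 20), (2, 30), (1, 40), (4, 50), (2, 60), (0, 70)]
memory = {}
a, flags = simulate_priority_write(requests, memory)
print(a)       # [(0, 7), (1, 4), (2, 1), (2, 3), (2, 6), (4, 2), (4, 5)]
print(memory)  # {0: 70, 1: 40, 2: 10, 4: 20}
```

For address 2 the winner is processor 1, and for address 4 it is processor 2, which matches the PRIORITY rule that the smallest identifier wins.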

 

 

Work and speedup

 

 

Now let S be a problem with input size n that can be solved on a PRAM with p(n) processors by a parallel algorithm in t(n) steps.

The work of this parallel algorithm is then w(n) = t(n) p(n). Any PRAM algorithm with work w(n) can be transformed into a serial algorithm that runs in w(n) time, by letting a single processor simulate every parallel step of the PRAM.
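
As a small illustration of this transformation, the sketch below serially simulates a pairwise (tree) summation, a standard PRAM example chosen here for illustration and not taken from the original text. It assumes n is a power of two and p(n) = n/2 processors; the function name is hypothetical. The single simulating processor spends one time unit per processor per step, so its running time equals w(n) = t(n) p(n).

```python
def simulate_parallel_sum(values):
    """Serially simulate a PRAM tree sum; len(values) must be a power of two."""
    x = list(values)
    n = len(x)
    p = n // 2        # p(n): number of PRAM processors
    steps = 0         # t(n): number of parallel steps
    serial_time = 0   # time used by the single simulating processor
    stride = 1
    while stride < n:
        for proc in range(p):          # simulate each processor in turn
            i = 2 * stride * proc
            if i + stride < n:
                x[i] += x[i + stride]  # this processor is active in this step
            serial_time += 1           # an idle processor still costs one slot
        steps += 1
        stride *= 2
    return x[0], steps, p, serial_time

total, t, p, w = simulate_parallel_sum(range(1, 9))
print(total, t, p, w)  # 36 3 4 12, and 12 == t * p
```
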

Intuitively, for a given program, parallel execution usually means less running time. This can be expressed by the speedup factor:

$$S(p) = \frac{t_s}{t_p}$$

Here p is the number of processors, t_s is the serial execution time, and t_p is the parallel execution time. In practice, time generally means wall-clock time, but for PRAM algorithms time usually refers to the number of steps of the algorithm.

 

Amdahl's Law

 

Amdahl's Law is a basic rule of parallel computing that describes the limit of the speedup that can be achieved. It was formulated by the American computer scientist Gene Amdahl in 1967. Put differently, Amdahl's Law is a method for predicting (or estimating) the maximum achievable speedup under the premise of a fixed problem size.

 

If the serial execution time is t_s and f is the fraction of it that cannot be parallelized, then with p processors the maximum speedup is

$$S(p) = \frac{t_s}{f\,t_s + (1-f)\,t_s/p} = \frac{p}{1 + (p-1)f}$$

Note that the following expression (the one used in reference [2]) is consistent with the one above (just divide the numerator and the denominator by p):

$$S(p) = \frac{1}{f + \dfrac{1-f}{p}}$$

Amdahl's Law gives the maximum speedup that a computer program can theoretically obtain from additional resources, based on the relative shares of its serial and parallel parts.
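
A short numerical sketch may help here. It assumes the common formulation of Amdahl's Law, speedup = 1 / (f + (1 - f) / p), with f the non-parallelizable fraction and p the number of processors, together with the algebraically equivalent form p / (1 + (p - 1) f). The function names are chosen for illustration.

```python
def amdahl_speedup(f, p):
    """Maximum speedup with serial fraction f on p processors."""
    return 1.0 / (f + (1.0 - f) / p)

def amdahl_speedup_alt(f, p):
    """Equivalent form, obtained by multiplying top and bottom by p."""
    return p / (1.0 + (p - 1) * f)

# With 10% serial code the speedup can never exceed 1 / f = 10,
# no matter how many processors are added.
for p in (2, 16, 1024):
    print(p, round(amdahl_speedup(0.1, p), 2), round(amdahl_speedup_alt(0.1, p), 2))
```
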

 


Recommended reading and references:

 

 

[1] Chen Guoliang, Design and Analysis of Parallel Algorithms (3rd Edition), Higher Education Press, 2009

[2] Dr Aaron Harwood, The University of Melbourne, course materials on parallel and multicore computing

[3] Zuo Fei, Code Secrets: Exploring Computer Systems from a C/C++ Perspective, Publishing House of Electronics Industry

[4] http://pages.cs.wisc.edu/~tvrdik/2/html/Section2.html#Simulation2

 

Origin blog.csdn.net/baimafujinji/article/details/6488199