Architecture 10: Vector Processors

1. What is a vector processor?

A pipelined processor that supports vector data representation and provides the corresponding vector instructions is called a vector pipeline processor, or simply a vector processor.

Its counterpart is the scalar processor, which does not support vector data representation and provides no vector instructions.

2. Example: a simple FORTRAN loop

      DO 10 i = 1, N

10    d(i) = a(i) * (b(i) + c(i))

(1) Horizontal processing

Compute the elements one at a time, producing ki and then di for each i in turn

.....  ......

ki = bi + ci

di = ki * ai

..... .......

Each loop iteration has 1 data dependence (di needs ki) and 1 control dependence (the loop branch), and requires 2 function switches (between add and multiply)
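As a rough illustration, the horizontal method can be sketched in Python (used here for readability; the original loop is FORTRAN). The names `horizontal` and `count_switches`, and the choice to count a function switch as any change between consecutive operations, are assumptions of this sketch.

```python
def count_switches(ops):
    """Count positions where two consecutive operations differ."""
    return sum(1 for prev, cur in zip(ops, ops[1:]) if prev != cur)


def horizontal(a, b, c):
    """Compute d[i] = a[i] * (b[i] + c[i]) one element at a time."""
    d, ops = [], []
    for a_i, b_i, c_i in zip(a, b, c):
        k_i = b_i + c_i          # add
        ops.append("add")
        d.append(k_i * a_i)      # multiply: must wait for k_i (data dependence)
        ops.append("mul")
    return d, count_switches(ops)


d, switches = horizontal([1, 2, 3], [4, 5, 6], [7, 8, 9])
```

With N elements the operation stream alternates add, mul, add, mul, so this count gives 2N - 1 switches, which is roughly two per iteration.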

(2) Vertical processing

   First compute every element of vector k, then compute every element of vector d. This requires support for vector data types and vector instructions

   K = B + C

   D = K * A

   No branch; only 1 data dependence; only 1 function switch

   Requires a memory-to-memory operation pipeline
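A minimal Python sketch of the vertical method follows. The function name `vertical` is hypothetical, and returning the size of the intermediate vector is only a way to show why a full-length K has to be held somewhere between the two vector instructions.

```python
def vertical(a, b, c):
    """Compute D = A * (B + C) as two whole-vector operations."""
    # Vector instruction 1: K = B + C, over all N elements
    k = [b_i + c_i for b_i, c_i in zip(b, c)]
    # Single function switch (add -> multiply), then
    # vector instruction 2: D = K * A, over all N elements
    d = [k_i * a_i for k_i, a_i in zip(k, a)]
    return d, len(k)   # len(k): intermediate words that must be stored


d, intermediate_words = vertical([1, 2, 3], [4, 5, 6], [7, 8, 9])
```

The intermediate storage grows with N, which is consistent with the memory-to-memory pipeline requirement stated above.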

(3) Grouped (combined vertical and horizontal) processing

    Divide the vector of length N into m groups of n elements each. Process the elements within each group vertically, and process the groups one after another

    This takes m iterations; each iteration executes two vector instructions, has 1 data dependence, and requires 2 function switches

    Requires a register-to-register operation pipeline

    This technique is called vector looping, or strip mining
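Strip mining can be sketched in Python as below. The function name `strip_mined` and the default group size n = 64 are assumptions of this sketch, as is the handling of a final group shorter than n when N is not a multiple of n.

```python
def strip_mined(a, b, c, n=64):
    """Compute d[i] = a[i] * (b[i] + c[i]) in groups of at most n elements."""
    d = []
    iterations = 0
    for start in range(0, len(a), n):
        stop = min(start + n, len(a))        # last group may be shorter
        # Vector instruction 1 on this group: K = B + C (K fits in a vector register)
        k = [b[i] + c[i] for i in range(start, stop)]
        # Vector instruction 2 on this group: D = K * A
        d.extend(k[j] * a[start + j] for j in range(stop - start))
        iterations += 1
    return d, iterations


N = 150
a = list(range(N))
b = [2 * i for i in range(N)]
c = [1] * N
d, m = strip_mined(a, b, c, n=64)
```

Because each intermediate K holds at most n elements, it can live in a vector register, which is why a register-to-register pipeline is sufficient here.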

3. Examples:

(1)

(2)

4. Measuring the speed of a vector processor

  Since a scalar instruction produces at most one result, the speed of a scalar processor is usually measured in millions of instructions per second (MIPS)

  The speed of a vector processor is measured by the number of floating-point results produced per second, in units of MFLOPS (millions of floating-point operations per second)

  Measuring in MFLOPS leaves load, store, branch, test and other non-arithmetic instructions out of the count
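The two measures can be written out as a small Python sketch. The function names and the timing figure are hypothetical; the only thing taken from the example loop is that each element costs one add and one multiply, so N elements yield 2N floating-point results.

```python
def mips(instructions, seconds):
    """Millions of instructions executed per second."""
    return instructions / seconds / 1e6


def mflops(fp_results, seconds):
    """Millions of floating-point results produced per second."""
    return fp_results / seconds / 1e6


n = 1000
elapsed = 1e-5                      # hypothetical run time in seconds
rate = mflops(2 * n, elapsed)       # 2 floating-point results per element
```

Loads, stores and branches executed during the same interval do not enter the MFLOPS figure at all; only the 2N arithmetic results do.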

5. Case study of a vector processor

 Example: CRAY-1

  1. Performance indicators

          About 160 MFLOPS peak, clock frequency 80 MHz, vector length 64

  2. Basic structure

       -Vector functional units

       -Vector register group (V0-V7)

       -Vector length register

       -Vector mask register

3. Vector instruction types

  V stands for vector; S stands for scalar

Functional unit conflict: more than one of the vector instructions executing in parallel needs the same functional unit

Vi conflict: vector instructions executing in parallel use the same vector register Vi as a source vector or as a result vector
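The two conflict definitions can be expressed as simple checks in Python. The `VecInstr` record and the functional unit names ("add", "mul", "logical") are hypothetical; only vector registers are listed as sources, and the case where one instruction reads the other's result is deliberately left out, since that is the read-after-write case handled by chaining.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VecInstr:
    unit: str      # functional unit the instruction needs
    dest: str      # result vector register
    srcs: tuple    # source vector registers


def unit_conflict(x, y):
    """Both instructions need the same functional unit."""
    return x.unit == y.unit


def vi_conflict(x, y):
    """Both instructions share a source register or a result register."""
    return x.dest == y.dest or bool(set(x.srcs) & set(y.srcs))


i1 = VecInstr("add", "V0", ("V1", "V2"))
i2 = VecInstr("mul", "V3", ("V4", "V5"))        # independent of i1
i3 = VecInstr("logical", "V5", ("V1", "V3"))    # shares source V1 with i1
i4 = VecInstr("add", "V6", ("V4", "V5"))        # needs the same unit as i1
```

In this model `i1` and `i2` can run in parallel, `i1` and `i3` have a Vi conflict, and `i1` and `i4` have a functional unit conflict.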

4. CRAY-1 architectural features

   4.1 Connection paths between vector registers and functional units

  Each vector register Vi has its own bus that can connect to any vector functional unit, and each vector functional unit has its own bus for sending its results back to the vector register group.

   4.2 Vector chaining

   Sending the result of one vector functional unit directly to the operand register of another vector functional unit is called chaining.

   When two vector instructions have a read-after-write dependence and do not conflict over functional units or vector registers (source or destination), the functional units they use can be connected end to end to form a chained pipeline, so that the two instructions execute in pipelined fashion.

   Chaining is essentially the pipeline technique of forwarding applied to vector execution.
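The benefit of chaining can be estimated with a simplified timing model, sketched in Python. The model assumes each instruction delivers its first result after a fixed latency and one further result per cycle; the function names and the latency values 8 and 9 are illustrative assumptions, not figures from the original.

```python
def unchained_cycles(latencies, n):
    """Dependent vector instructions run one after another.

    Each takes its latency plus n - 1 further cycles for the remaining elements.
    """
    return sum(lat + n - 1 for lat in latencies)


def chained_cycles(latencies, n):
    """Dependent vector instructions linked into one long pipeline.

    The latencies add up once, then one final result emerges per cycle.
    """
    return sum(latencies) + n - 1


n = 64
latencies = [8, 9]                            # assumed: add unit, multiply unit
serial = unchained_cycles(latencies, n)       # (8 + 63) + (9 + 63)
linked = chained_cycles(latencies, n)         # (8 + 9) + 63
```

Under these assumptions the chained pair finishes in 80 cycles instead of 143, because the second unit starts consuming elements as soon as the first one produces them.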

6. Issues to consider in vector chaining

    Provide suitable vector functional units and operand registers

    Chaining timing

       -Chaining is possible only in the clock cycle in which the first result element of the preceding vector instruction is delivered to the result vector register

       -If that cycle is missed, the following vector instruction can execute only after the preceding vector instruction has completed entirely and released the corresponding vector register

       -All vector instructions that are chained together must have the same vector length
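The timing rules above can be put into a small Python sketch. The function `second_start` and its cycle-numbering convention (the first result element arrives at issue cycle plus latency) are assumptions made for illustration.

```python
def second_start(first_issue, first_latency, n_first, second_ready, n_second):
    """Return (start cycle of the dependent instruction, whether it chained)."""
    # Cycle in which the first result element reaches the result vector register
    chain_slot = first_issue + first_latency
    # Cycle in which the last result element is written
    done = chain_slot + n_first - 1
    if n_first == n_second and second_ready <= chain_slot:
        return chain_slot, True          # caught the chaining cycle
    # Missed the cycle or lengths differ: wait for the register to be released
    return max(second_ready, done + 1), False


on_time = second_start(0, 8, 64, second_ready=3, n_second=64)
too_late = second_start(0, 8, 64, second_ready=9, n_second=64)
wrong_len = second_start(0, 8, 64, second_ready=3, n_second=32)
```

In this model an instruction that is ready one cycle too late starts at cycle 72 rather than cycle 8, which shows how costly a missed chaining cycle is.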


Origin blog.csdn.net/weixin_42596333/article/details/104200986