UPC Quick Study Notes: Collective Operations

This article is original. Copying or reprinting it in any form without the author's consent is prohibited.

These notes cover some of the collective operation functions, which are declared in the header file <upc_collective.h>.

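The first category: relocalization operations (upc_all_broadcast, upc_all_scatter, upc_all_gather, upc_all_gather_all, upc_all_exchange, upc_all_permute)

The second category: computational operations (upc_all_reduce, upc_all_prefix_reduce, upc_all_sort)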

Basic definitions:

1. Collective:

A constraint placed on some language operations, which requires that calls to such operations be matched across all threads. Unless all threads execute the same sequence of collective operations, the behavior of the collective operations is undefined.

2. Single-valued:

An operand to a collective operation that has the same value on every thread. Otherwise, the behavior of the operation is undefined.

Some rules and requirements:

1. The following requirements apply to all of the functions defined in the specification whose names begin with upc_all_.

2. All of these functions are collective.

3. All arguments of the collective functions are single-valued.

4. Collective functions may not be called between a upc_notify and the corresponding upc_wait.

5. The last argument of each collective function is the variable sync_mode, of type upc_flag_t.

The value of sync_mode is formed by OR-ing together a constant of the form UPC_IN_XSYNC and a constant of the form UPC_OUT_YSYNC (where X and Y may each be NO, MY, or ALL).

If sync_mode has the value (UPC_IN_XSYNC | UPC_OUT_YSYNC), then if X is:

NO: The collective function may begin to read or write data as soon as the first thread has entered the collective function call.

MY: The collective function may begin to read or write only data that has affinity to threads that have already entered the collective function call.

ALL: The collective function may begin to read or write data only after all threads have entered the collective function call.

If Y is:

NO: The collective function may keep reading and writing data until the last thread has returned from the collective function call.

MY: The collective function call may return in a thread only after all reads and writes of data with affinity to that thread are complete.

ALL: The collective function call may return only after all reads and writes of data are complete.
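The entry-side rules (the X part of sync_mode) can be sketched as a small predicate. This is a plain C model rather than UPC code: the enum `in_sync`, the function `may_touch`, the `entered` array, and the fixed `THREADS` value are all hypothetical names introduced for illustration, and the real UPC_IN_XSYNC constants are not used.

```c
#include <assert.h>
#include <stdbool.h>

#define THREADS 4  /* assumption: four threads, T0..T3 */

/* Hypothetical stand-ins for the X in UPC_IN_XSYNC; not the real UPC constants. */
typedef enum { IN_NOSYNC, IN_MYSYNC, IN_ALLSYNC } in_sync;

/* entered[t] is true once thread t has entered the collective call.
 * Returns true if the collective may already read or write data
 * that has affinity to thread `owner`. */
bool may_touch(in_sync mode, const bool entered[THREADS], int owner)
{
    int count = 0;

    assert(owner >= 0 && owner < THREADS);
    for (int t = 0; t < THREADS; t++)
        if (entered[t])
            count++;

    switch (mode) {
    case IN_NOSYNC:  return count >= 1;        /* the first thread has entered */
    case IN_MYSYNC:  return entered[owner];    /* the owning thread has entered */
    case IN_ALLSYNC: return count == THREADS;  /* every thread has entered */
    }
    return false;
}
```

With only T0 inside the call, this model allows access to any thread's data under NO, only to T0's data under MY, and to nothing under ALL. The exit-side rules (the Y part) mirror this with "returned" in place of "entered".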

6. In addition to its definition, each function in the specification comes with a figure that roughly illustrates how blocks of data are copied from one thread to another.

Each figure shows four threads labeled T0, T1, T2, and T3, together with an appropriate number of data blocks labeled Di.

The figures are intended to illustrate the general behavior of each function.
Unless otherwise stated, the figures do not correspond to the example code fragments.
The figures and the example code fragments should not be considered part of the formal specification.

4.2.1 upc_all_broadcast function

1. The upc_all_broadcast function copies a block of memory with affinity to a single thread to a block of shared memory on each thread. The number of bytes in each block is nbytes.

If copying takes place between objects that overlap, the behavior is undefined.

2. nbytes must be strictly greater than 0.

3. upc_all_broadcast function ... (omitted)

(To save time, from here on I go straight to the figures: only the function prototype, an example figure, and notes are given.)
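The data movement of a broadcast can be sketched in plain C. This is a single-process model under stated assumptions, not UPC code: `local_block`, `model_broadcast`, `THREADS`, and `CAP` are hypothetical names, each array element stands in for the memory with affinity to one thread, and no synchronization is modeled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define THREADS 4   /* assumption: four threads, T0..T3 */
#define CAP 16      /* hypothetical capacity of each thread's destination block */

/* Models the destination memory that has affinity to one thread. */
typedef struct { unsigned char bytes[CAP]; } local_block;

/* Copy the nbytes-byte source block (owned by a single thread)
 * into the destination block of every thread. */
void model_broadcast(local_block dst[THREADS], const unsigned char *src,
                     size_t nbytes)
{
    assert(nbytes > 0 && nbytes <= CAP);  /* nbytes must be strictly positive */
    for (int t = 0; t < THREADS; t++)
        memcpy(dst[t].bytes, src, nbytes);
}
```

After the call, every thread's block holds the same nbytes bytes as the source block.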

 

4.2.2 upc_all_scatter function

Description: The upc_all_scatter function copies the i-th block of an area of shared memory with affinity to a single thread to a block of shared memory with affinity to the i-th thread. The number of bytes in each block is nbytes.

If copying takes place between objects that overlap, the behavior is undefined.
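A scatter can be modeled the same way in plain C. Again this is only a sketch: `local_block`, `model_scatter`, `THREADS`, and `CAP` are hypothetical names, the source buffer stands in for the area owned by the single source thread, and the bound on nbytes is a restriction of the model.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define THREADS 4   /* assumption: four threads, T0..T3 */
#define CAP 16      /* hypothetical capacity of each thread's destination block */

/* Models the destination memory that has affinity to one thread. */
typedef struct { unsigned char bytes[CAP]; } local_block;

/* src holds THREADS consecutive blocks of nbytes bytes on one thread;
 * block i is copied to the destination block of thread i. */
void model_scatter(local_block dst[THREADS], const unsigned char *src,
                   size_t nbytes)
{
    assert(nbytes > 0 && nbytes <= CAP);  /* bound imposed by this model */
    for (int i = 0; i < THREADS; i++)
        memcpy(dst[i].bytes, src + (size_t)i * nbytes, nbytes);
}
```

Unlike a broadcast, each thread receives a different slice of the source area.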

 

4.2.3 upc_all_gather function

Description: The upc_all_gather function copies a block of shared memory with affinity to the i-th thread to the i-th block of a shared memory area with affinity to a single thread.
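A gather is the reverse of a scatter, and a plain C model makes that visible. The names `local_block`, `model_gather`, `THREADS`, and `CAP` are hypothetical, and the destination buffer stands in for the area owned by the single receiving thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define THREADS 4   /* assumption: four threads, T0..T3 */
#define CAP 16      /* hypothetical capacity of each thread's source block */

/* Models the source memory that has affinity to one thread. */
typedef struct { unsigned char bytes[CAP]; } local_block;

/* The nbytes-byte block of thread i is copied to block i of dst,
 * which must have room for THREADS * nbytes bytes. */
void model_gather(unsigned char *dst, const local_block src[THREADS],
                  size_t nbytes)
{
    assert(nbytes > 0 && nbytes <= CAP);  /* bound imposed by this model */
    for (int i = 0; i < THREADS; i++)
        memcpy(dst + (size_t)i * nbytes, src[i].bytes, nbytes);
}
```

After the call, the receiving area holds the blocks of T0, T1, T2, and T3 in thread order.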

 

4.2.4 upc_all_gather_all function

4.2.5 upc_all_exchange function

4.2.6 upc_all_permute function


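Computational operations

A variable of type upc_op_t can take the following values:

UPC_ADD: Addition.
UPC_MULT: Multiplication.
UPC_AND: Bitwise AND for integer and character variables. Results are undefined for floating-point numbers.
UPC_OR: Bitwise OR for integer and character variables. Results are undefined for floating-point numbers.
UPC_XOR: Bitwise XOR for integer and character variables. Results are undefined for floating-point numbers.
UPC_LOGAND: Logical AND for all variable types.
UPC_LOGOR: Logical OR for all variable types.
UPC_MIN: For all data types, find the minimum value.
UPC_MAX: For all data types, find the maximum value.
UPC_FUNC: At each step, use the specified commutative function func to operate on the data in the src array.
UPC_NONCOMM_FUNC: At each step, use the specified non-commutative function func to operate on the data in the src array.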

4.3.2 upc_all_reduce function

Description: The upc_all_reduce function performs a user-specified operation, such as UPC_ADD, on all the elements of an array and returns the result to a single thread.
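What a reduction computes can be sketched in plain C for a few of the operations. This is a model under stated assumptions: `model_op` and `model_reduce` are hypothetical names, the element type is fixed to long, and the result is simply returned instead of being delivered to one particular thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a few upc_op_t values. */
typedef enum { OP_ADD, OP_MULT, OP_MIN, OP_MAX } model_op;

/* Combine all nelems elements of src with op and return the single result. */
long model_reduce(const long *src, size_t nelems, model_op op)
{
    assert(nelems > 0);
    long acc = src[0];

    for (size_t i = 1; i < nelems; i++) {
        switch (op) {
        case OP_ADD:  acc += src[i]; break;
        case OP_MULT: acc *= src[i]; break;
        case OP_MIN:  if (src[i] < acc) acc = src[i]; break;
        case OP_MAX:  if (src[i] > acc) acc = src[i]; break;
        }
    }
    return acc;
}
```

For the array {3, 1, 4, 1, 5}, this model yields 14 for addition, 60 for multiplication, 1 for the minimum, and 5 for the maximum.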

 

 

 

4.3.3 upc_all_prefix_reduce function

 

 

 

 

4.3.4 upc_all_sort function

Description: The upc_all_sort function takes a shared array A of nelems elements, each of size elem_size bytes, and sorts them in place in ascending order, using the function func to compare elements.
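The effect of the sort on the array can be modeled with the C standard library's qsort. This is only a sketch: `model_sort` and `compare_ints` are hypothetical names, the array is an ordinary local array rather than a shared one, and the comparison function is assumed to follow the qsort convention (negative, zero, or positive result).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical comparison function playing the role of func:
 * negative, zero, or positive when a is less than, equal to,
 * or greater than b. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sort the nelems elements of A in place in ascending order,
 * as ordered by func. */
void model_sort(void *A, size_t elem_size, size_t nelems,
                int (*func)(const void *, const void *))
{
    assert(A != NULL && elem_size > 0);
    qsort(A, nelems, elem_size, func);
}
```

Calling `model_sort(a, sizeof a[0], 4, compare_ints)` on an int array {5, 2, 9, 1} leaves it as {1, 2, 5, 9}.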

 

 

All content in this article comes from the UPC Collective Operations Specifications V1.0. These notes are the result of last-minute cramming for a midterm exam; if you find any mistakes, corrections are welcome.

If this article violates your copyright, contact me immediately at [email protected]. Thank you.


Origin www.cnblogs.com/mrlonely2018/p/11992916.html