std::execution::seq/par/par_unseq

C++17 added execution policies (declared in the <execution> header) to the functions in the <algorithm> header, for example:

template< class ExecutionPolicy, class ForwardIt, class UnaryFunction2 >
void for_each( ExecutionPolicy&& policy, ForwardIt first, ForwardIt last, UnaryFunction2 f );

One of the following execution policies can be specified:

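std::execution::seq: sequential execution. The algorithm runs as it does when no execution policy is given, using only one thread.
std::execution::par: parallel execution. The algorithm may use multiple threads, so the code it invokes must be free of data races.
std::execution::par_unseq: parallel or unsequenced execution. Besides using multiple threads, the algorithm may also interleave independent loop iterations within a single thread, so the code it invokes must additionally be vectorization-safe.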
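
As a minimal sketch of how a policy is passed, the helper below forwards any of the three policies to std::sort as its first argument. The name sorted_with is hypothetical and not part of the standard library. The sketch assumes a standard library that implements the C++17 parallel algorithms; some toolchains need an extra backend for par and par_unseq (libstdc++, for example, typically relies on Intel TBB).

```cpp
#include <algorithm>
#include <execution>
#include <utility>
#include <vector>

// Hypothetical helper: sorts a copy of `v` under the given execution policy.
// The policy is always the first argument of the parallel overloads.
template <class Policy>
std::vector<int> sorted_with(Policy&& policy, std::vector<int> v) {
    std::sort(std::forward<Policy>(policy), v.begin(), v.end());
    return v;
}
```

Calling sorted_with(std::execution::par, data) should produce the same sorted result as calling it with std::execution::seq; only the execution strategy the library is permitted to use differs.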

What is the difference between `par` and `par_unseq`?

par_unseq requires stronger guarantees than par, but allows additional optimizations. Specifically, with par_unseq the implementation must be free to interleave the execution of multiple function calls in the same thread.

Let us illustrate the difference with an example. Suppose you want to parallelize this loop:

std::vector<int> v = { 1, 2, 3 };
int sum = 0;
std::for_each(std::execution::seq, std::begin(v), std::end(v), [&](int i) {
  sum += i*i;
});

You cannot directly parallelize the code above, because concurrent updates would introduce a data race on the sum variable. To avoid that, you can introduce a lock:

int sum = 0;
std::mutex m;
std::for_each(std::execution::par, std::begin(v), std::end(v), [&](int i) {
  std::lock_guard<std::mutex> lock{m};
  sum += i*i;
});

Now all function calls can be safely executed in parallel, and the code will not break when you switch to par. But what would happen if you use par_unseq instead, where one thread could potentially execute multiple function calls not in sequence but concurrently?

It can result in a deadlock, for instance, if the code is reordered like this:

 m.lock();    // iteration 1 (constructor of std::lock_guard)
 m.lock();    // iteration 2
 sum += ...;  // iteration 1
 sum += ...;  // iteration 2
 m.unlock();  // iteration 1 (destructor of std::lock_guard)
 m.unlock();  // iteration 2

In the standard, the term is vectorization-unsafe. To quote from P0024R2:

A standard library function is vectorization-unsafe if it is specified to synchronize with another function invocation, or another function invocation is specified to synchronize with it, and if it is not a memory allocation or deallocation function. Vectorization-unsafe standard library functions may not be invoked by user code called from parallel_vector_execution_policy algorithms.

One way to make the code above vectorization-safe is to replace the mutex with an atomic:

std::atomic<int> sum{0};
std::for_each(std::execution::par_unseq, std::begin(v), std::end(v), [&](int i) {
  sum.fetch_add(i*i, std::memory_order_relaxed);
});
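
Another option, offered here as a sketch rather than as part of the quoted answer, is to avoid shared mutable state altogether. std::transform_reduce from <numeric> lets the library combine the partial results itself, so the lambda touches no shared variable and needs neither a mutex nor an atomic. The function name sum_of_squares is hypothetical.

```cpp
#include <execution>
#include <functional>
#include <numeric>
#include <vector>

// Hypothetical helper: sums the squares of the elements of `v`.
// Each element is squared independently and the library performs
// the reduction, so no user-visible shared state is written.
int sum_of_squares(const std::vector<int>& v) {
    return std::transform_reduce(
        std::execution::par_unseq,
        v.begin(), v.end(),
        0,                             // initial value of the sum
        std::plus<>{},                 // how partial results are combined
        [](int i) { return i * i; });  // applied to each element
}
```

Because integer addition is associative and commutative, the result does not depend on the order in which the library groups the partial sums.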

Source: https://stackoverflow.com/questions/39954678/difference-between-execution-policies-and-when-to-use-them