Overview of Concurrent Programming and Parallel Programming (Parallel Framework)

Related articles in this series:

  • Task
  • Asynchronous programming (async & await)

An Overview of Concurrent Programming

Foreword

To be honest, in my first two years of software development I hardly thought about concurrent programming. A simple request/response cycle completed the business logic quickly enough; a week's worth of work could be finished in two days without slipping to three (the rest of the time went to various kinds of overhead), and performance stayed within an acceptable range. But as my work changed, I ran into problems whose solutions made the topic of concurrent programming unavoidable. To settle it once and for all, this series of articles on concurrent programming was born. I hope it is helpful to everyone.

Basic terms

  • Synchronization: coordinating activity between threads or processes and ensuring that data accessed by multiple threads or processes remains valid; synchronization lets threads and processes operate in unison.
  • Concurrency: doing more than one thing at a time; the parts of a program cooperate, possibly interleaved, to achieve a goal.
  • Process: a running instance of a program with some independent function, operating on some set of data; the basic unit to which the operating system allocates resources and schedules work. Starting a program creates one or more processes, and a process can contain multiple threads.
  • Thread: a single flow of execution within a program; an independently scheduled execution path, often described as a lightweight process.
  • Multithreading: using multiple threads to execute a program. A form of concurrency, but not the only one.
  • Parallel processing: splitting a large amount of work into pieces and assigning the pieces to multiple threads that run at the same time. A kind of multithreading.
  • Asynchronous programming: a form of concurrency that uses futures or callbacks to avoid creating unnecessary threads.

Asynchronous programming

Asynchronous programming does not have to be implemented with multithreading; multithreading is just one possible implementation. In .NET, the modern future types are Task and Task<TResult>. Older asynchronous APIs use callbacks or events instead. The core idea of asynchronous programming is the asynchronous operation.

  • Asynchronous operation: an operation that is started and completes some time later. While it runs, the thread that started it is not blocked and can continue with other work. When the operation completes, its future is signaled or a callback is invoked so the program knows it has finished.
  • Reactive programming: a declarative programming style in which the program responds to events. It differs from asynchronous programming in that it is based on asynchronous events. It is also a form of concurrent programming.
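As a sketch of these ideas, here is a minimal C# example (DownloadLengthAsync is a hypothetical helper, and the URL is a placeholder) showing an asynchronous operation that does not block the thread that started it:

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

class AsyncDemo
{
    // Hypothetical helper: starts an I/O operation and returns a Task<int>
    // (a future) that completes later.
    static async Task<int> DownloadLengthAsync(string url)
    {
        using var client = new HttpClient();
        string body = await client.GetStringAsync(url); // does not block the calling thread
        return body.Length;
    }

    static async Task Main()
    {
        Task<int> future = DownloadLengthAsync("https://example.com");
        Console.WriteLine("The caller keeps running while the download is in flight...");
        int length = await future;  // the continuation runs when the operation completes
        Console.WriteLine($"Downloaded {length} characters.");
    }
}
```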

I/O-bound vs. compute-bound

  • I/O-bound: an operation is I/O-bound if it spends most of its time waiting for an external condition (disk, network, a timer) rather than using the CPU.
  • Compute-bound (CPU-bound): an operation is compute-bound if it spends most of its time performing CPU computation.
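The distinction matters for how work is offloaded. A rough sketch, using Task.Delay as a stand-in for real I/O:

```csharp
using System;
using System.Threading.Tasks;

class BoundDemo
{
    // I/O-bound: mostly waiting. Await a naturally asynchronous API;
    // no thread is tied up during the wait.
    static async Task IoBoundAsync()
    {
        await Task.Delay(1000);  // stands in for a network or disk operation
    }

    // Compute-bound: mostly CPU work. Move it to a thread-pool thread with
    // Task.Run so the calling thread (e.g. a UI thread) stays responsive.
    static Task<long> ComputeBoundAsync()
    {
        return Task.Run(() =>
        {
            long sum = 0;
            for (long i = 0; i < 100_000_000; i++) sum += i;
            return sum;
        });
    }

    static async Task Main()
    {
        await IoBoundAsync();
        Console.WriteLine(await ComputeBoundAsync());
    }
}
```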

Concurrent programming

A key feature of good software is concurrency: the program does several things at the same time, rather than following the single-request/single-response pattern of the past. Responsive programs with a good user experience depend on concurrent programming.

Parallel Programming (Parallel Framework)

Foreword

Parallel programming: exploiting multiple cores or multiple processors through code. It is a subset of the broader concept of multithreading.

Parallel processing: splitting a large amount of work into pieces and assigning the pieces to multiple threads that run at the same time. A kind of multithreading.

Parallel programming comprises the following constructs:

1. Parallel LINQ, or PLINQ

2. The Parallel class

3. The task parallelism constructs

4. Concurrent collections

5. SpinLock and SpinWait

These features were introduced in .NET 4.0 and are collectively known as PFX (Parallel Framework).

The Parallel class together with the task parallelism constructs is called the TPL (Task Parallel Library).

Parallel Framework (PFX)

1. Parallel Framework Basics

Standard single-threaded code no longer gets faster automatically: CPU clock speeds have hit a bottleneck, and manufacturers have shifted their focus to increasing the number of cores.
Using multiple cores to improve performance usually requires some processing of computationally intensive code:
1. Partition the code into blocks.
2. Execute those blocks in parallel on multiple threads.
3. Collate the results as they become available, in a thread-safe and performant manner.
Traditional multithreading constructs can do all of this, but they are awkward and inconvenient, especially the partitioning and collating steps (the essential problem: when multiple threads work on the same data at the same time, the usual strategy of locking for thread safety causes heavy contention).
The Parallel Framework (PFX) is designed to help in exactly these scenarios.

2. Parallel framework composition

PFX has two layers. The upper layer provides two structured data parallelism APIs: PLINQ and the Parallel class. The lower layer contains the task parallelism classes plus a set of additional constructs that help with parallel programming.

Basic Parallel Language-Integrated Query (PLINQ)

Language Integrated Query (LINQ) provides a simple syntax for querying data collections. Processing a data set sequentially on one thread in this way is called a sequential query.

Parallel Language Integrated Query (Parallel LINQ) is the parallel version of LINQ. It converts sequential queries into parallel queries, internally using tasks to spread the processing of the data items in a collection across multiple CPUs, so that multiple items are processed concurrently.

PLINQ automatically parallelizes ordinary LINQ queries. The System.Linq.ParallelEnumerable class (defined in System.Core.dll, in the System.Linq namespace) exposes parallel versions of all the standard LINQ operators. All of these methods are extension methods on ParallelQuery<TSource>.

1. LINQ to PLINQ

To make a LINQ query run in parallel, convert the sequential query (based on IEnumerable or IEnumerable<T>) into a parallel query (based on ParallelQuery or ParallelQuery<T>) by calling the AsParallel extension method defined on ParallelEnumerable.
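For instance, a sequential query can be parallelized by inserting AsParallel (a minimal sketch; note that PLINQ does not guarantee output order unless AsOrdered is added):

```csharp
using System;
using System.Linq;

class PlinqDemo
{
    static void Main()
    {
        // Sequential LINQ query: one thread processes all items.
        var sequential = Enumerable.Range(1, 20).Where(n => n % 2 == 0).ToList();

        // Parallel query: AsParallel converts IEnumerable<int> into ParallelQuery<int>,
        // and PLINQ partitions the work across multiple CPUs.
        var parallel = Enumerable.Range(1, 20)
                                 .AsParallel()
                                 .Where(n => n % 2 == 0)
                                 .ToList();

        // Both queries produce the same set, though PLINQ may yield it in a different order.
        Console.WriteLine(string.Join(", ", parallel.OrderBy(n => n)));
    }
}
```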
2. PLINQ execution model

(figure: PLINQ execution model)

Parallel class

The Parallel class is a nice abstraction over threads. This class is located in the System.Threading.Tasks namespace and provides data and task parallelism.

PFX provides a basic form of structured parallelism through three static methods in the Parallel class:

1. Parallel.Invoke

Parallel.Invoke: executes a group of delegates in parallel and waits for all of them to complete.
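A minimal sketch of Parallel.Invoke running three delegates in parallel and blocking until all of them finish:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class InvokeDemo
{
    static void Main()
    {
        // Parallel.Invoke blocks until every supplied delegate has completed.
        Parallel.Invoke(
            () => Console.WriteLine($"Task A on thread {Thread.CurrentThread.ManagedThreadId}"),
            () => Console.WriteLine($"Task B on thread {Thread.CurrentThread.ManagedThreadId}"),
            () => Console.WriteLine($"Task C on thread {Thread.CurrentThread.ManagedThreadId}"));

        Console.WriteLine("All delegates finished.");
    }
}
```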

Task parallelism

For task parallelism, see the companion articles Task and Asynchronous programming (async & await).

2. Parallel.For

Parallel.For: executes the parallel equivalent of a C# for loop. Example:

```csharp
class ParallelDemo
{
    static void Main(string[] args)
    {
        // Sequential loop
        for (int i = 0; i < 10; i++)
        {
            Test(i);
        }

        Console.WriteLine("Parallel For starts");

        // The sequential loop converted to a parallel one
        Parallel.For(0, 10, i => Test(i));

        // The same conversion, written more simply with a method group
        Parallel.For(0, 10, Test);

        Console.ReadKey();
    }

    static void Test(int i)
    {
        Console.WriteLine($"Current thread id: {Thread.CurrentThread.ManagedThreadId}, output: {i}");
    }
}
```

3. Parallel.ForEach

Parallel.ForEach: executes the parallel equivalent of a C# foreach loop. Example:

```csharp
static void Main(string[] args)
{
    string[] data = { "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine" };

    // Sequential loop
    foreach (string num in data)
    {
        Test(num);
    }

    Console.WriteLine("Parallel ForEach starts");

    // The sequential loop converted to a parallel one
    Parallel.ForEach(data, num => Test(num));

    Console.ReadKey();
}

static void Test(string i)
{
    Console.WriteLine($"Current thread id: {Thread.CurrentThread.ManagedThreadId}, output: {i}");
}
```

4. Index and early exit (ParallelLoopState)

Sometimes the iteration index is useful, but it must not be obtained the way a sequential loop would obtain it (for example, incrementing a shared variable inside the loop body), because writes to a shared variable are not thread-safe in a parallel context.

Similarly, because the loop body of a parallel For or ForEach is a delegate, the break statement cannot be used to exit the loop early; instead, call the Break or Stop method on a ParallelLoopState object.

Taking ForEach as an example, one of its overloads takes an Action with three parameters (TSource = the current element, ParallelLoopState = the loop state, long = the index):

```csharp
public static ParallelLoopResult ForEach<TSource>(IEnumerable<TSource> source, Action<TSource, ParallelLoopState, long> body)
```

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

namespace ConsoleApp2
{
    class ParallelDemo
    {
        static void Main(string[] args)
        {
            string[] data = { "zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine" };

            Console.WriteLine("Parallel ForEach starts");

            Parallel.ForEach(data, (num, state, i) =>
            {
                Console.WriteLine($"Current index: {i}, state: {state}");
                Test(num);
                if (num == "six")
                    state.Break();   // finish iterations below this index, then stop
            });

            Console.ReadKey();
        }

        static void Test(string i)
        {
            Console.WriteLine($"Current thread id: {Thread.CurrentThread.ManagedThreadId}, output: {i}");
        }
    }
}
```
    


Concurrent Collections Overview

.NET 4.0 provides a new set of collections in the System.Collections.Concurrent namespace, all of which are fully thread-safe.

These collections are more than a shortcut for wrapping an ordinary collection in a lock, and they can also be used in general multithreaded code. A few caveats:

1. Concurrent collections are tuned for parallel programming; conventional collections outperform them in all but highly concurrent scenarios.

2. A thread-safe collection does not guarantee that the code using it is also thread-safe.

3. If one thread modifies a concurrent collection while another thread is enumerating it, no exception is thrown; instead, the enumerator may observe a mixture of old and new content.

4. There is no concurrent version of List<T>.

5. They use memory less efficiently than the non-concurrent Stack and Queue classes, but they are better suited to concurrent access.

1. Structural overview
What distinguishes these concurrent collections from traditional collections is that they expose special methods that perform tests and actions atomically; these methods are defined by the IProducerConsumerCollection<T> interface.

IProducerConsumerCollection<T> represents a thread-safe producer/consumer collection. Three classes implement it:

ConcurrentStack<T>, ConcurrentQueue<T> and ConcurrentBag<T>.

Their TryAdd and TryTake methods test whether an add/remove operation can be performed and, if so, perform it. Because the test and the action form a single atomic operation, there is no need for the locking you would use around a traditional collection.
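A small sketch of these atomic test-and-act methods, using a ConcurrentStack<int> through the IProducerConsumerCollection<int> interface:

```csharp
using System;
using System.Collections.Concurrent;

class TryTakeDemo
{
    static void Main()
    {
        IProducerConsumerCollection<int> stack = new ConcurrentStack<int>();

        stack.TryAdd(1);   // test and add in one atomic step
        stack.TryAdd(2);

        // TryTake atomically checks for an element and removes it if one exists,
        // so no external lock is needed even with many threads.
        if (stack.TryTake(out int item))
            Console.WriteLine($"Took {item}");   // a stack yields 2 first (LIFO)

        Console.WriteLine($"Remaining count: {stack.Count}");
    }
}
```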

ConcurrentBag<T> stores an unordered collection of objects and is suitable when you don't care which element you get back from Take or TryTake.

Compared with a concurrent queue or stack, there is no contention when multiple threads call Add on a ConcurrentBag at the same time, whereas parallel calls to Add on a queue or stack do cause some contention. Calling Take is likewise very efficient when a thread takes elements that it added itself.

BlockingCollection<T> is a blocking collection, suited to waiting for new elements to appear. It can be seen as a container: it wraps any collection that implements IProducerConsumerCollection<T> and lets elements be taken from the wrapped collection; if no element is available, the operation blocks.
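A minimal producer/consumer sketch with BlockingCollection<T>: the consumer blocks until elements appear, and GetConsumingEnumerable ends once CompleteAdding has been called and the collection has drained.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class BlockingDemo
{
    static void Main()
    {
        // Wraps a ConcurrentQueue<int> by default; bounded to 5 pending items.
        using var queue = new BlockingCollection<int>(boundedCapacity: 5);

        var producer = Task.Run(() =>
        {
            for (int i = 0; i < 10; i++)
                queue.Add(i);          // blocks if the bound of 5 is reached
            queue.CompleteAdding();    // signal that no more items are coming
        });

        var consumer = Task.Run(() =>
        {
            // Blocks while waiting for items; exits once adding is complete
            // and the collection is empty.
            foreach (int item in queue.GetConsumingEnumerable())
                Console.WriteLine($"Consumed {item}");
        });

        Task.WaitAll(producer, consumer);
    }
}
```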

2. Basic methods
Some commonly used methods, compiled from zy__'s notes:

ConcurrentQueue<T>: completely lock-free, but it may spin and retry an operation when it loses a race for a resource.

Enqueue: inserts an element at the tail of the queue.

TryDequeue: tries to remove the head element and return it through an out parameter.

TryPeek: tries to return the head element through an out parameter without removing it.

ConcurrentStack<T>: completely lock-free, but it may spin and retry an operation when it loses a race for a resource.

Push: pushes an element onto the top of the stack.

TryPop: pops the top element and returns it through an out parameter.

TryPeek: returns the top element without popping it.

ConcurrentBag<T>: an unordered collection into which a program can insert or remove elements. Inserting and removing elements on the same thread is very efficient.

Add: inserts an element into the collection.

TryTake: takes an element from the collection and removes it.

TryPeek: returns an element from the collection without removing it.

BlockingCollection<T>: a container that supports bounding and blocking.

Add: inserts an element into the container.

TryTake: takes an element out of the container and removes it.

TryPeek: returns an element from the container without removing it.

CompleteAdding: tells the container that no more elements will be added; attempting to add after this throws an exception.

IsCompleted: tells consumer threads whether the producers are finished; it returns true once CompleteAdding has been called and the container has been emptied.

ConcurrentDictionary<TKey, TValue>: read operations are completely lock-free; fine-grained locks are used when many threads modify the data.

AddOrUpdate: adds the key and value to the container if the key does not exist, or updates the existing value if it does.

GetOrAdd: adds the key and value to the container if the key does not exist; if it does exist, returns the existing value without adding a new one.

TryAdd: tries to add a new key and value to the container.

TryGetValue: tries to get the value for the specified key.

TryRemove: tries to remove the specified key.

TryUpdate: conditionally updates the value for the given key.

GetEnumerator: returns an enumerator that traverses the entire container.
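A short sketch of the atomic dictionary methods; AddOrUpdate is the usual way to build a thread-safe counter:

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

class DictionaryDemo
{
    static void Main()
    {
        var counts = new ConcurrentDictionary<string, int>();

        // Many threads increment the same key; AddOrUpdate makes each
        // read-modify-write atomic, so no updates are lost.
        Parallel.For(0, 1000, _ =>
            counts.AddOrUpdate("hits", 1, (key, old) => old + 1));

        counts.TryGetValue("hits", out int total);
        Console.WriteLine(total);    // 1000

        // GetOrAdd returns the existing value without overwriting it.
        int existing = counts.GetOrAdd("hits", 0);
        Console.WriteLine(existing); // still 1000
    }
}
```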

Origin: blog.csdn.net/kalvin_y_liu/article/details/128205405