Data Mining: Introduction to Parallelism, Concurrency, Synchronous and Asynchronous Execution

 

The requests and spynner libraries mentioned earlier fetch pages sequentially in a single process (single thread). Fetching asynchronously, with concurrent or parallel execution, can greatly improve fetching efficiency.

 

Parallelism and Concurrency

Concurrency and parallelism are two similar concepts. Concurrency means that several events occur within the same period of time, while parallelism means that several events occur at the same instant.

We can illustrate these two concepts in terms of how CPUs work.

 

Under a single-core CPU, a multitasking operating system runs its tasks concurrently. Because there is only one processor, the tasks share the CPU in time slices, each executing for a short period in turn. If a task does not finish within its time slice, it waits until it is next given the CPU and then continues, and this repeats until the whole task is complete. Because the CPU switches between tasks very quickly, it feels as if multiple tasks are running at the same time.
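
The time-sharing behaviour described above can be sketched as a toy round-robin scheduler. This is only an illustration, not how a real operating system is implemented: the names task and run_round_robin are made up for this example, and each call to next() stands in for one time slice.

```python
from collections import deque


def task(name, steps):
    """A toy task that needs `steps` time slices to finish."""
    for i in range(1, steps + 1):
        yield f"{name}: step {i}/{steps}"


def run_round_robin(tasks):
    """Give each task one time slice in turn until all tasks are finished."""
    ready = deque(tasks)
    log = []
    while ready:
        current = ready.popleft()
        try:
            log.append(next(current))  # run the task for one time slice
        except StopIteration:
            continue                   # task has finished, drop it
        ready.append(current)          # not finished yet: back of the queue
    return log


timeline = run_round_robin([task("A", 3), task("B", 2)])
for entry in timeline:
    print(entry)
```

Only one task runs at any moment, yet the steps of A and B are interleaved in the log, which is what concurrency on a single core looks like.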

 

Under a multi-core CPU, two or more cores can work at the same time, so tasks running on different cores can truly execute simultaneously. This is called parallelism.

 

This picture should help you understand the difference (serial execution is outside the scope of our discussion).

 

 

Synchronous and Asynchronous

The concepts of synchronous and asynchronous execution generally involve multiple tasks or events. Both can be understood in the context of concurrency or parallelism.

 

Synchronous means that tasks running concurrently or in parallel do not run in isolation: one task may need the result produced by another task before it can proceed.

In other words, a task can continue only after another task has completed or produced a result that it depends on.

In short, the tasks constrain one another, and their pace must be carefully coordinated, otherwise errors will occur.
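
A minimal sketch of this dependency, assuming two threads and Python's standard queue.Queue as the hand-off point. The function names download_page and parse_page and the HTML string are invented for illustration, and no real download takes place.

```python
import queue
import threading

results = queue.Queue()  # hand-off point between the two tasks
output = []


def download_page():
    # Stand-in for a real download; the HTML string is made up.
    results.put("<html><title>demo</title></html>")


def parse_page():
    html = results.get()  # blocks until download_page has produced a result
    title = html.split("<title>")[1].split("</title>")[0]
    output.append(title)


parser = threading.Thread(target=parse_page)
downloader = threading.Thread(target=download_page)
parser.start()       # started first, but it has to wait for the downloader
downloader.start()
parser.join()
downloader.join()
print(output)
```

Even though the parser thread is started first, it cannot make progress until the downloader has delivered its result, so the two tasks are synchronized.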

 

Asynchronous means that tasks running concurrently or in parallel are independent of each other and do not affect one another. This is the main difference between asynchronous and synchronous execution.
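
As a rough illustration of independent tasks, the sketch below uses Python's asyncio module. The coroutine name fetch, the task names, and the delays are all assumptions made for this example, and asyncio.sleep stands in for waiting on a network response.

```python
import asyncio


async def fetch(name, delay):
    # asyncio.sleep stands in for waiting on a network response.
    await asyncio.sleep(delay)
    return f"{name} done"


async def main():
    # Start all three tasks; none of them waits for another task's result.
    return await asyncio.gather(
        fetch("task-1", 0.03),
        fetch("task-2", 0.01),
        fetch("task-3", 0.02),
    )


results = asyncio.run(main())
print(results)
```

The tasks finish at different times, and asyncio.gather simply collects their return values in the order the tasks were passed in.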

 

 

How to choose in practice

When multiple tasks must cooperate to finish a job in a concurrent or parallel environment, they need to run synchronously.

Operating systems use mechanisms such as semaphores to achieve synchronization, and programming languages also provide synchronization mechanisms for parallel or concurrent programming.

One example is Python's lock mechanism for multi-threaded programs.
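
A small sketch of such a lock, using threading.Lock from the standard library. The SafeCounter class and the thread and iteration counts are made up for this example.

```python
import threading


class SafeCounter:
    """A counter whose updates are protected by a lock."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Only one thread at a time may execute the read-modify-write below.
        with self._lock:
            self.value += 1


def add_many(counter, times):
    for _ in range(times):
        counter.increment()


counter = SafeCounter()
threads = [
    threading.Thread(target=add_many, args=(counter, 10_000))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)  # 4 threads x 10,000 increments = 40000
```

Without the lock, two threads could read the same old value and one update could be lost. The lock forces the threads to take turns on the shared counter.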

 

 

When a large task can be broken down into several small subtasks, and each subtask can be completed independently without cooperating with the others, we can consider executing them asynchronously, either in parallel or concurrently.

This is a common pattern in data scraping. The links to be scraped are divided into groups, and each group is crawled by its own subtask. Once every subtask has finished crawling, the results are combined and the task is complete.
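
The group, crawl, and combine pattern could look roughly like the sketch below, which uses ThreadPoolExecutor from the standard library. The helper names split_into_groups and crawl_group and the example.com links are hypothetical, and crawl_group does not download anything: it only records the length of each link as a placeholder result.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_groups(links, group_size):
    """Divide the list of links into groups of at most `group_size`."""
    return [links[i:i + group_size] for i in range(0, len(links), group_size)]


def crawl_group(group):
    # Placeholder subtask: a real crawler would download each link here.
    return {link: len(link) for link in group}


links = [f"http://example.com/page/{n}" for n in range(1, 8)]
groups = split_into_groups(links, 3)

# Each group is handled by its own independent subtask.
with ThreadPoolExecutor(max_workers=3) as pool:
    partial_results = list(pool.map(crawl_group, groups))

# Combine the results once every subtask has finished.
summary = {}
for partial in partial_results:
    summary.update(partial)

print(len(groups), "groups,", len(summary), "links crawled")
```

Threads are a reasonable choice here because a crawler spends most of its time waiting on the network. The same structure would also work with processes or coroutines.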

 

That covers the basics of these concepts. If anything is still unclear, you can look up further material on your own.

 
