Parallel Algorithms: How to use parallel processing to improve the efficiency of the algorithm

When an algorithm itself can no longer be optimized, how do we further improve the efficiency of its execution? The answer explored here is to transform the processing with the idea of parallel computing.

Parallel sorting

Suppose we need to sort 8 GB of data, and memory can hold all of it at once. The most commonly used sorting algorithms with O(n log n) time complexity are merge sort, quicksort, and heap sort. At the algorithmic level we cannot optimize further, but applying the idea of parallel processing can easily raise the efficiency of sorting this 8 GB of data several times over.

1. Parallelizing merge sort

Split the 8 GB of data into 16 small data sets of 500 MB each, start 16 threads, and sort the 16 chunks of 500 MB in parallel. Once all 16 small sets are sorted, merge the 16 ordered sets into the final result.
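A minimal in-memory sketch of this idea, using Python's `concurrent.futures` (the function name and the default k = 16 are illustrative, not from the original post):

```python
import heapq
from concurrent.futures import ThreadPoolExecutor

def parallel_merge_sort(data, k=16):
    """Split data into k chunks, sort the chunks concurrently,
    then do a k-way merge of the sorted runs."""
    chunk = max(1, (len(data) + k - 1) // k)
    parts = [data[i:i + chunk] for i in range(0, len(data), chunk)]
    # Threads are used here for portability; for CPU-bound sorting in
    # CPython, a ProcessPoolExecutor sidesteps the GIL and scales better.
    with ThreadPoolExecutor(max_workers=k) as pool:
        runs = list(pool.map(sorted, parts))
    # heapq.merge performs the final k-way merge of the sorted runs.
    return list(heapq.merge(*runs))
```

For real 500 MB chunks the sort and merge would stream from disk rather than hold everything in Python lists; this sketch only shows the split/sort/merge structure.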

2. Parallelizing quicksort

Scan the data to find its value range, divide that range into 16 sub-intervals from small to large, and distribute the 8 GB of data into the corresponding intervals. Then start 16 threads to sort the 16 small intervals in parallel. Once all 16 threads have finished, the data as a whole is ordered.
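A sketch of the range-partition approach (names and the even-width split are assumptions; production implementations usually pick splitters by sampling so the buckets are balanced):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_range_sort(data, k=16):
    """Partition values into k ordered ranges, sort each range in
    parallel, then simply concatenate -- no merge step is needed,
    because every value in bucket i is <= every value in bucket i+1."""
    if not data:
        return []
    lo, hi = min(data), max(data)
    width = (hi - lo) / k or 1          # avoid division by zero when hi == lo
    buckets = [[] for _ in range(k)]
    for x in data:
        idx = min(int((x - lo) / width), k - 1)
        buckets[idx].append(x)
    with ThreadPoolExecutor(max_workers=k) as pool:
        sorted_buckets = list(pool.map(sorted, buckets))
    out = []
    for b in sorted_buckets:
        out.extend(b)
    return out
```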

Both approaches use the idea of partitioning: slice the data, then process the slices in parallel. The difference is that the first slices the data arbitrarily and merges the sorted slices afterwards, while the second partitions the data into intervals by value before sorting, so after sorting no further merging is needed.

If the data to be sorted is 1 TB in size, the problem becomes the efficiency of reading the data: the sorting process reads and writes the disk frequently, so reducing the number of disk I/O operations and the total amount of data read and written becomes the focus of optimization.

Parallel lookup

Take the hash table as an example of a dynamic data structure used for lookup. As data is continuously added, the load factor of the hash table grows and it needs dynamic expansion. If we expand a 2 GB hash table to 1.5 times its original size, i.e. 3 GB, memory utilization is only about 60%. Instead, we can randomly partition the data into K parts, each holding only 1/K of the original data, and build a small hash table over each of the K small data sets. When the load factor of one small hash table grows too large, only that one table needs to be expanded; the other hash tables require no expansion, so this scheme handles growth far more efficiently than one large hash table. When we want to look up some data, we start K threads to search the K hash tables in parallel, which greatly improves performance. When adding data, we can insert it into the hash table with the smallest load factor, which also helps reduce hash collisions.
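A toy sketch of the sharding idea (class and method names are invented for illustration; note that Python's built-in `dict` already resizes itself, so this only demonstrates the structure, not real memory savings):

```python
class ShardedDict:
    """K independent small tables; each key hashes to exactly one shard,
    so growth/rehash work is confined to that shard, and K threads could
    probe the K shards in parallel for a batch of lookups."""

    def __init__(self, k=16):
        self.k = k
        self.shards = [dict() for _ in range(k)]

    def _shard(self, key):
        # Route the key to its shard by hash value.
        return self.shards[hash(key) % self.k]

    def put(self, key, value):
        self._shard(key)[key] = value

    def get(self, key, default=None):
        return self._shard(key).get(key, default)
```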

Parallel String Matching

Finding a keyword in text can be done with a string matching algorithm. When the text is not very long, KMP, BM, RK, and BF are all very efficient; but when handling extremely large text, how do we speed up the matching?

Split the large text into K small texts. Suppose K = 16: start 16 threads and search for the keyword in the 16 small texts in parallel, so performance increases by roughly 16 times. However, a keyword that occurs in the original large text may be cut in two when the text is divided into two small texts, so although the large text contains the keyword, it would not be found. This requires special handling at the boundaries of the 16 small texts: if the keyword has length M, take the last M characters at the end of each small text and the first M characters at the start of the next small text to form a string of length 2M, then search for the keyword again in each of these 2M-character boundary strings.
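A sketch of the split-with-boundary-strings scheme (the function name is made up, and Python's built-in `in` substring search stands in for KMP/BM to keep the example short):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_find(text, keyword, k=16):
    """Search k chunks in parallel; also search the 2M-character strings
    straddling each chunk boundary, so a keyword cut in two by the split
    is still found."""
    m = len(keyword)
    chunk = max(1, (len(text) + k - 1) // k)
    pieces = [text[i:i + chunk] for i in range(0, len(text), chunk)]
    # Boundary strings: last M chars of one piece + first M of the next.
    seams = [pieces[i][-m:] + pieces[i + 1][:m]
             for i in range(len(pieces) - 1)]
    with ThreadPoolExecutor(max_workers=k) as pool:
        hits = pool.map(lambda s: keyword in s, pieces + seams)
    return any(hits)
```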

Parallel search

Breadth-first search is a strategy that searches layer by layer. Based on the vertices of the current layer, we can start multiple threads to search the vertices of the next layer in parallel. The original BFS code uses a single queue to record vertices that have been visited but not yet expanded; the parallel version needs two queues to complete the expansion work.

Call the two queues A and B. Multiple threads process the vertices in queue A in parallel and store the newly expanded vertices in queue B. After all vertices in queue A have been expanded, clear queue A, then expand the vertices in queue B in parallel and store the resulting vertices in queue A. Repeating this cycle gives a parallel breadth-first search.
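A level-synchronous sketch of the two-queue scheme (names are illustrative; the graph is an adjacency-list dict). To keep the sketch lock-free, only the neighbor fetches run in parallel, while deduplication of new vertices stays in the coordinating thread:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_bfs(graph, start, k=4):
    """Queue A (the frontier) is expanded in parallel; newly discovered
    vertices are collected into queue B, then the queues swap roles."""
    visited = {start}
    order = [start]
    frontier = [start]                  # queue A
    with ThreadPoolExecutor(max_workers=k) as pool:
        while frontier:
            # Fetch the neighbor lists of the whole layer in parallel.
            neighbor_lists = list(pool.map(lambda v: graph.get(v, []),
                                           frontier))
            next_frontier = []          # queue B
            # Dedup serially, so `visited` needs no lock in this sketch.
            for nbrs in neighbor_lists:
                for w in nbrs:
                    if w not in visited:
                        visited.add(w)
                        next_frontier.append(w)
            order.extend(next_frontier)
            frontier = next_frontier    # B becomes the new A
    return order
```

If worker threads inserted into `visited` directly, that set (and queue B) would need synchronization; serial deduplication trades a little parallelism for simplicity.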

Summary and extension

Suppose we have N tasks and hope to execute them in parallel, but there are dependencies between the tasks. How do we determine, from the dependencies, which tasks can be executed in parallel?

Store the dependencies between tasks in a directed graph, then execute the tasks following the idea of topological sorting: each round, find the tasks with in-degree 0, put them into a queue, and start a thread pool to execute the queued tasks in parallel. When they are finished, again find the tasks whose in-degree has dropped to 0 and place them into the queue, until all tasks are finished.
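The rounds described above can be sketched as follows (function names and the `deps` format are assumptions; `deps` maps each task to the list of its prerequisites, and a cycle would simply leave some tasks unexecuted):

```python
from concurrent.futures import ThreadPoolExecutor

def run_tasks(deps, work, k=4):
    """Repeatedly run every task whose in-degree is 0 as one parallel
    batch, then decrement the in-degree of its dependents (Kahn-style
    topological scheduling). Returns tasks in completion order."""
    indegree = {t: len(p) for t, p in deps.items()}
    dependents = {t: [] for t in deps}
    for t, prereqs in deps.items():
        for p in prereqs:
            dependents[p].append(t)
    done = []
    ready = [t for t, d in indegree.items() if d == 0]
    with ThreadPoolExecutor(max_workers=k) as pool:
        while ready:
            # Run the whole ready batch in parallel and wait for all of it.
            list(pool.map(work, ready))
            done.extend(ready)
            next_ready = []
            for t in ready:
                for d in dependents[t]:
                    indegree[d] -= 1
                    if indegree[d] == 0:
                        next_ready.append(d)
            ready = next_ready
    return done
```

Waiting for the whole batch before releasing dependents is the simplest scheme; a finer-grained scheduler would release each dependent as soon as its individual prerequisites finish.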

Origin blog.csdn.net/ywangjiyl/article/details/104893040