Simulating MapReduce with Python multithreading

MapReduce is a programming model, and an associated implementation, for processing and generating large data sets. The user first writes a Map function that processes a set of key/value pairs and emits a set of intermediate key/value pairs, and then writes a Reduce function that merges all intermediate values sharing the same intermediate key.
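
To make the key/value flow concrete, here is a minimal single-process sketch using word counting. It is not taken from the original post: the names map_words, reduce_counts, and map_reduce and the sample documents are hypothetical, chosen only to illustrate how intermediate values are grouped by key before the reduce step.

```python
from collections import defaultdict


def map_words(doc_id, text):
    """Map step: emit one intermediate (word, 1) pair per word in the document."""
    return [(word.lower(), 1) for word in text.split()]


def reduce_counts(word, counts):
    """Reduce step: merge all intermediate values that share the same key."""
    return word, sum(counts)


def map_reduce(documents):
    # Group intermediate values by their intermediate key.
    grouped = defaultdict(list)
    for doc_id, text in documents.items():
        for key, value in map_words(doc_id, text):
            grouped[key].append(value)
    # Run the reduce function once per distinct key.
    return dict(reduce_counts(key, values) for key, values in grouped.items())


# Hypothetical sample input: document name -> document text.
documents = {
    "a.txt": "the quick brown fox",
    "b.txt": "the lazy dog and the fox",
}
word_counts = map_reduce(documents)
print(word_counts)
```

The grouping loop plays the role that a real MapReduce framework handles between the two phases; here it is just a dictionary of lists.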

Simulating a simple MapReduce program

# Implements a simple MapReduce-style program.
# The input is a list of numbers. Take each number modulo 7, then add up the remainders.
import time

mylist = [134, 43, 49, 34, 1, 34, 89, 133, 13434, 379, 134, 4343, 13434, 34454, 343, 134]


def surplus(myNum):
    a = myNum % 7
    print(a)
    # Sleep briefly so the effect is easy to observe
    time.sleep(0.1)
    return a


def plus_all(mylist):
    mySum = 0
    for onesurplus in map(surplus, mylist):
        mySum = mySum + onesurplus
    return mySum


if __name__ == '__main__':
    print(plus_all(mylist))

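The code above implements the simplest possible MapReduce programming model, but the map task still runs in a single thread. To parallelize it, replace the map call with a concurrent one. The version below calls map() with 4 concurrent workers. Note that futures.ProcessPoolExecutor() starts worker processes rather than threads; if no worker count is given, it defaults to the number of processors on the machine.
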
# Implements a simple MapReduce-style program.
# The input is a list of numbers. Take each number modulo 7, then add up the remainders.
import time
from concurrent import futures

mylist = [134, 43, 49, 34, 1, 34, 89, 133, 13434, 379, 134, 4343, 13434, 34454, 343, 134]


def surplus(myNum):
    a = myNum % 7
    # print(a)
    # Sleep briefly so the effect is easy to observe
    time.sleep(0.1)
    return a


def plus_all(mylist):
    mySum = 0
    # Run the map step on a pool of 4 worker processes
    with futures.ProcessPoolExecutor(4) as pool:
        for onesurplus in pool.map(surplus, mylist):
            mySum = mySum + onesurplus
    return mySum


if __name__ == '__main__':
    print(plus_all(mylist))
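
Since the title mentions multithreading while the code above uses a process pool, a thread-based variant may be a useful comparison. The sketch below is an assumption about how one might write it, not the original author's code: plus_all_threaded and its workers parameter are hypothetical names, and the sleep is shortened to 0.05 seconds. Threads suit this example because time.sleep releases the interpreter lock; for CPU-heavy map functions a process pool is usually the better fit.

```python
import time
from concurrent.futures import ThreadPoolExecutor

mylist = [134, 43, 49, 34, 1, 34, 89, 133, 13434, 379, 134, 4343, 13434, 34454, 343, 134]


def surplus(my_num):
    # Simulate a slow task so that the benefit of concurrency is visible.
    time.sleep(0.05)
    return my_num % 7


def plus_all_threaded(numbers, workers=4):
    """Map surplus over numbers on a thread pool, then reduce by summing."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in input order, like the built-in map.
        return sum(pool.map(surplus, numbers))


start = time.perf_counter()
total = plus_all_threaded(mylist)
elapsed = time.perf_counter() - start
print(total, round(elapsed, 2))
```

With 16 inputs and 4 workers, the run should take roughly a quarter of the time a sequential loop would need, although the exact figure depends on the machine.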

To sum up:

1. In the MapReduce programming model, first split the task into multiple parts that all follow the same processing flow. Write a map function whose return value has a fixed form.

2. Invoke the map tasks concurrently, then process all the results that the map calls return.
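
The two steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: split, map_chunk, and map_reduce are hypothetical helper names, the chunking strategy is one choice among many, and a thread pool is used for simplicity (a ProcessPoolExecutor could be substituted for CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from operator import add


def split(items, parts):
    """Split items into at most `parts` contiguous chunks of similar size."""
    size = max(1, -(-len(items) // parts))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def map_chunk(chunk):
    """Map step: every chunk follows the same flow and returns one partial sum."""
    return sum(n % 7 for n in chunk)


def map_reduce(items, parts=4):
    chunks = split(items, parts)
    # Step 2: run the map tasks concurrently and collect every result.
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partial_sums = list(pool.map(map_chunk, chunks))
    # Reduce step: combine the partial results into the final value.
    return reduce(add, partial_sums, 0)


numbers = [134, 43, 49, 34, 1, 34, 89, 133, 13434, 379, 134, 4343, 13434, 34454, 343, 134]
result = map_reduce(numbers)
print(result)
```

Because each map task returns the same kind of value (a single integer), the reduce step stays a plain fold over the partial results.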

Origin www.cnblogs.com/vansky/p/12484494.html