Python 2.7 multiprocessing shared dict: process-safety and thread-safety tests

During development I needed to combine multiple processes with multiple threads to build high-performance programs that saturate the CPU and the bandwidth, but along the way I ran into many problems with how shared variables are accessed and updated.

This post tests the process-safety and thread-safety of the shared dict in Python 2.7's multiprocessing module. The complete code comes first, followed by the analysis.

#!/usr/bin/python
# coding=utf-8

'''
Test the sharing behavior of dict in multiprocessing
'''


import json
import threading
import os
from multiprocessing import Process, Lock, Manager


# Test how elements can be appended to a regular list stored in the shared dict
def deal(lock, share_dict):
    share_dict[os.getpid()] = ["a"]
    lock.acquire()
    mydict = dict(share_dict)  # note: the shared dict cannot be json.dumps'd directly (it raises a TypeError); convert to a plain dict first
    json.dumps(mydict)
    print mydict
    lock.release()


# Test whether incrementing a counter in the shared dict is process-safe
def deal2(lock, share_dict):
    for i in range(10):
        # lock.acquire()
        share_dict["hello"] += 1
        # lock.release()
    lock.acquire()
    mydict = dict(share_dict)  # note: the shared dict cannot be json.dumps'd directly (it raises a TypeError); convert to a plain dict first
    json.dumps(mydict)
    print mydict
    lock.release()


# Worker for the thread test
def thread_test(share_dict):
    share_dict["index"] += 1


# Test the thread-safety of the shared variable under multiple threads
def deal3(share_dict):
    threads_num = 20
    for i in range(threads_num):
        t = threading.Thread(target=thread_test, args=(share_dict, ))
        t.daemon = True
        t.start()
    # note: the threads are never joined, so the process can exit and kill
    # its daemon threads before they all finish running


def test():
    p_num = 10
    process = list()
    lock = Lock()
    m = Manager()
    share_dict = m.dict()  # dict shared across processes
    for i in xrange(p_num):
        process.append(Process(target=deal, args=(lock, share_dict, )))
        # share_dict["hello"] = 0
        # process.append(Process(target=deal2, args=(lock, share_dict, )))
        # share_dict["index"] = 100
        # process.append(Process(target=deal3, args=(share_dict, )))
    for p in process:
        p.start()
    for p in process:
        p.join()
    print share_dict


if __name__ == '__main__':
    test()

This code runs three tests: deal() tests how elements can be appended to a regular list inside the shared dict, deal2() tests the process-safety of the shared dict, and deal3() tests its thread-safety.

deal(): appending to a list inside the shared dict

Running this test shows that for a list stored in the shared dict, append() cannot add a new element:

def deal(lock, share_dict):
    share_dict[os.getpid()] = ["a"]
    share_dict[os.getpid()].append("b")
    lock.acquire()
    mydict = dict(share_dict)  # note: the shared dict cannot be json.dumps'd directly (it raises a TypeError); convert to a plain dict first
    json.dumps(mydict)
    print mydict
    lock.release()

After calling append(), the output shows that "b" was never added to the shared dict. The reason is that share_dict[os.getpid()] returns a local copy of the list through the proxy, so mutating that copy in place never reaches the manager process. Replacing append() with +=, which performs a read followed by a write-back on the proxy, adds the element correctly:

def deal(lock, share_dict):
    share_dict[os.getpid()] = ["a"]
    # share_dict[os.getpid()].append("b")
    share_dict[os.getpid()] += ["b"]
    lock.acquire()
    mydict = dict(share_dict)  # note: the shared dict cannot be json.dumps'd directly (it raises a TypeError); convert to a plain dict first
    json.dumps(mydict)
    print mydict
    lock.release()

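An equivalent, more explicit workaround (a minimal sketch of my own, not from the original post) is to read the list out, mutate the local copy, and assign it back; the final assignment is what pushes the change to the manager process:

def deal_explicit(share_dict):
    key = os.getpid()
    share_dict[key] = ["a"]
    tmp = share_dict[key]   # the proxy returns a local copy of the list
    tmp.append("b")         # mutate the local copy
    share_dict[key] = tmp   # write back: only now does the manager see "b"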

Process-safety of the shared dict

Change the comments in the sample code as follows:

# process.append(Process(target=deal, args=(lock, share_dict, )))
share_dict["hello"] = 0
process.append(Process(target=deal2, args=(lock, share_dict, )))
# share_dict["index"] = 100
# process.append(Process(target=deal3, args=(share_dict, )))

This test has deal2() repeatedly increment the shared counter share_dict["hello"]. Without the locking in place, the output shows the problem: the program starts 10 processes, each incrementing the counter 10 times in a loop, so the counter should reach 100, yet it only reached 83. Some of the concurrent increments were lost.
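The reason is that share_dict["hello"] += 1 is not one atomic operation on the manager; it expands into a read followed by a write, and another process can update the value between the two. The expansion below is illustrative:

# share_dict["hello"] += 1 really makes two round-trips to the manager:
tmp = share_dict["hello"]        # 1) read the current value
share_dict["hello"] = tmp + 1    # 2) write it back; any increment made by
                                 #    another process in between is lost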


With the lock in place, the counter behaves correctly and reaches 100:

lock.acquire()
share_dict["hello"] += 1
lock.release()
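As a side note (my own addition, not from the post), multiprocessing.Lock also supports the with statement, which releases the lock even if the body raises:

with lock:
    share_dict["hello"] += 1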


Thread-safety of the shared dict

Change the comments in the sample code as follows:

# process.append(Process(target=deal, args=(lock, share_dict, )))
# share_dict["hello"] = 0
# process.append(Process(target=deal2, args=(lock, share_dict, )))
share_dict["index"] = 100
process.append(Process(target=deal3, args=(share_dict, )))

For the thread-safety test, the 10 processes each start 20 threads; by the expected count, the counter share_dict["index"] should reach 200. It only reached 77 in the end, so the threads ran into plenty of concurrency problems while running. Note also that deal3() never joins its daemon threads, so some increments are lost simply because a process exits and kills its threads before they run.
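For reference, here is a minimal sketch (my own; the names thread_test_locked and deal3_locked are hypothetical) of serializing the threads within each process: create one threading.Lock per process, share it among that process's threads, and join them before returning. This only serializes threads inside one process; increments from different processes can still interleave, which is what the combined scheme at the end of this post addresses.

def thread_test_locked(thread_lock, share_dict):
    thread_lock.acquire()
    share_dict["index"] += 1
    thread_lock.release()


def deal3_locked(share_dict):
    thread_lock = threading.Lock()  # one lock shared by all threads of this process
    threads = []
    for i in range(20):
        t = threading.Thread(target=thread_test_locked, args=(thread_lock, share_dict, ))
        threads.append(t)
        t.daemon = True
        t.start()
    for t in threads:
        t.join()  # wait, so the process does not kill its daemon threads early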

Next I tried using the process lock inside the thread worker, i.e.:

def thread_test(lock, share_dict):
    lock.acquire()
    share_dict["index"] += 1
    lock.release()

The concurrency problem got even worse; from the results, the process lock does not appear to help between threads. I then thought of passing a thread lock into the processes instead, but Python raises an error when a threading.Lock is passed to a Process.

Conclusions

  1. multiprocessing shared variables are neither process-safe nor thread-safe; you must do the locking yourself around every operation. Their deep-copy behavior also has some special properties that need further study before later use.
  2. When multiple processes each open multiple threads, I have not yet found a way in Python to directly control a shared variable; I will keep looking. A workaround is to have each worker save its own result into the shared dict (or another shared variable) and then aggregate those results in the main process, instead of incrementing a shared counter directly; see the sketch below.
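A minimal sketch of the aggregation idea from point 2 (my own illustration; count_worker is a hypothetical name): each process writes its result under a unique key, so no two writers ever touch the same entry, and the main process sums the values after joining.

def count_worker(share_dict):
    local_count = 0
    for i in range(1000):
        local_count += 1                   # all work stays process-local
    share_dict[os.getpid()] = local_count  # one write, to a unique key

# after joining the processes in the main process:
# total = sum(share_dict.values())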

I went to look at the multiprocessing source and found where the shared dict gets registered:

SyncManager.register('Queue', Queue.Queue)
SyncManager.register('JoinableQueue', Queue.Queue)
SyncManager.register('Event', threading.Event, EventProxy)
SyncManager.register('Lock', threading.Lock, AcquirerProxy)
SyncManager.register('RLock', threading.RLock, AcquirerProxy)
SyncManager.register('Semaphore', threading.Semaphore, AcquirerProxy)
SyncManager.register('BoundedSemaphore', threading.BoundedSemaphore,
                     AcquirerProxy)
SyncManager.register('Condition', threading.Condition, ConditionProxy)
SyncManager.register('Pool', Pool, PoolProxy)
SyncManager.register('list', list, ListProxy)
SyncManager.register('dict', dict, DictProxy)
SyncManager.register('Value', Value, ValueProxy)
SyncManager.register('Array', Array, ArrayProxy)
SyncManager.register('Namespace', Namespace, NamespaceProxy)

# types returned by methods of PoolProxy
SyncManager.register('Iterator', proxytype=IteratorProxy, create_method=False)
SyncManager.register('AsyncResult', create_method=False)

The register() function is shown below. It does not read well and I did not want to dig into it -_-; I did not find any Lock-like mechanism in it.

@classmethod
def register(cls, typeid, callable=None, proxytype=None, exposed=None,
             method_to_typeid=None, create_method=True):
    '''
    Register a typeid with the manager type
    '''
    if '_registry' not in cls.__dict__:
        cls._registry = cls._registry.copy()

    if proxytype is None:
        proxytype = AutoProxy

    exposed = exposed or getattr(proxytype, '_exposed_', None)

    method_to_typeid = method_to_typeid or \
                       getattr(proxytype, '_method_to_typeid_', None)

    if method_to_typeid:
        for key, value in method_to_typeid.items():
            assert type(key) is str, '%r is not a string' % key
            assert type(value) is str, '%r is not a string' % value

    cls._registry[typeid] = (
        callable, exposed, method_to_typeid, proxytype
        )

    if create_method:
        def temp(self, *args, **kwds):
            util.debug('requesting creation of a shared %r object', typeid)
            token, exp = self._create(typeid, *args, **kwds)
            proxy = proxytype(
                token, self._serializer, manager=self,
                authkey=self._authkey, exposed=exp
                )
            conn = self._Client(token.address, authkey=self._authkey)
            dispatch(conn, None, 'decref', (token.id,))
            return proxy
        temp.__name__ = typeid
        setattr(cls, typeid, temp)

Over the past two days of work I came up with a way to control access to a shared variable from multiple processes that each run multiple threads: when using the shared variable, first take a thread lock within the process, then take the process lock across processes, so that only one thread anywhere can touch the shared variable at a time.

# Thread test 3: runs 10 * 20 * 1000 increments in total
# without locks: Total time 1.55900001526   count=2576
# with locks:    Total time 1.45799994469   count=200000
def thread_test3(process_lock, thread_lock, count):
    for i in range(1000):
        thread_lock.acquire()
        process_lock.acquire()
        count.value += 1
        process_lock.release()
        thread_lock.release()


# Test whether process lock + thread lock keeps the variable safe
# when every process runs many threads
def deal5(process_lock, count):
    threads_num = 20
    threads = []
    thread_lock = threading.Lock()  # one lock per process, shared by all its threads
    for i in range(threads_num):
        t = threading.Thread(target=thread_test3, args=(process_lock, thread_lock, count, ))
        threads.append(t)
        t.daemon = True
        t.start()
    for thread in threads:
        thread.join()
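The post does not show the driver for deal5(); a minimal sketch of one (my own, assuming the same 10-process setup as the earlier tests) could look like this:

# Hypothetical driver: 10 processes, each running deal5's 20 threads,
# each thread incrementing the shared counter 1000 times.
from multiprocessing import Process, Lock, Value

def test_combined():
    process_lock = Lock()
    count = Value('i', 0)  # process-shared integer counter
    procs = [Process(target=deal5, args=(process_lock, count, )) for i in xrange(10)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print count.value  # expect 10 * 20 * 1000 = 200000 with the locks in place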
Origin: blog.csdn.net/m0_46232048/article/details/104115434