Several pitfalls with Python threads and connection pools

urlretrieve has no timeout parameter, so the timeout has to be set globally through the socket module:

socket.setdefaulttimeout(10)
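As a minimal sketch of how the global default takes effect (the URL and filename are placeholders, not from the original post):

import socket
from urllib.request import urlretrieve

# Every socket created after this call inherits a 10-second timeout,
# including the ones urlretrieve opens internally.
socket.setdefaulttimeout(10)

# Placeholder URL and filename; a stalled transfer now raises instead of hanging forever.
urlretrieve('http://example.com/big.file', 'big.file')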

It would also need a connection pool set up for it, so I switched to downloading files directly with requests:

def download_file(self, url, filename):
    # Stream the response so a large file is not read into memory all at once.
    with self.session.get(url, stream=True) as r:
        with open(filename, 'wb') as f:
            for chunk in r.iter_content(chunk_size=512):
                if chunk:  # skip keep-alive chunks
                    f.write(chunk)
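For context, a sketch of the object this method lives on, assuming self.session is a shared requests.Session so connections are pooled and reused (the class name, timeout value, and URL are my assumptions, not from the original post):

import requests

class Downloader:
    def __init__(self):
        # A single Session keeps a connection pool, so repeated downloads
        # from the same host reuse TCP connections.
        self.session = requests.Session()

    def download_file(self, url, filename):
        # Same method as above, plus a per-request timeout, which
        # requests supports directly (unlike urlretrieve).
        with self.session.get(url, stream=True, timeout=10) as r:
            with open(filename, 'wb') as f:
                for chunk in r.iter_content(chunk_size=512):
                    if chunk:
                        f.write(chunk)

Downloader().download_file('http://example.com/big.file', 'big.file')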

A crawler I wrote with raw threads ran into `can't start new thread`; it never showed up on my own machine, so the problem went unnoticed until it blew up on someone else's.

The reason is that the raw threads are not destroyed when their work finishes; they linger in a sleeping state, so the number of threads created eventually exceeds the maximum allowed. In fact, by modifying how the Thread is initialized, threads can be reused.
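As a rough illustration of the reuse idea (a generic worker pattern, not the original author's code): a long-lived thread pulls jobs from a queue, so the thread count stays bounded no matter how many tasks are submitted.

import threading
import queue

jobs = queue.Queue()

def worker():
    # One long-lived thread serves many jobs instead of one thread per job.
    while True:
        task = jobs.get()
        if task is None:      # sentinel tells the worker to exit
            break
        try:
            task()
        finally:
            jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

for i in range(5):
    jobs.put(lambda i=i: print('job', i))

jobs.join()     # wait for the queued jobs to finish
jobs.put(None)  # then shut the worker down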

Or, more simply, solve it with a thread pool:

from concurrent.futures import ThreadPoolExecutor

def thread_run(target, args_list, max_thread=12):
    # The executor keeps a fixed pool of worker threads and reuses them;
    # leaving the with-block waits for all submitted tasks to finish.
    with ThreadPoolExecutor(max_thread) as executor:
        for arg in args_list:
            executor.submit(target, arg)
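A usage sketch under my own assumptions (the fetch function and URLs are placeholders); each element of args_list is passed to target as a single positional argument:

import requests

def fetch(url):
    # Placeholder task: download one page and print its size.
    resp = requests.get(url, timeout=10)
    print(url, len(resp.content))

urls = ['http://example.com/a', 'http://example.com/b']
thread_run(fetch, urls, max_thread=4)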

Another problem is the warning `Connection pool is full, discarding connection` (it comes from urllib3, which requests uses underneath).

The pool can be configured on the transport adapter as follows:

# Mount on a scheme prefix; an empty prefix is shadowed by the Session's default http/https adapters.
session.mount('https://', HTTPAdapter(pool_connections=1, pool_maxsize=36, max_retries=1))

But the "pool is full" warning still showed up when running with multiple threads. After I set pool_maxsize slightly larger than the number of threads, the warning went away, though my code may still have hidden problems.
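A sketch of tying the two together, sizing pool_maxsize from the thread count so each worker thread can hold its own connection (the helper name and the "+ 4" headroom are my assumptions):

import requests
from requests.adapters import HTTPAdapter

def make_session(max_thread=12):
    session = requests.Session()
    # Keep the pool a bit larger than the number of worker threads so
    # connections are not discarded when every thread is busy.
    adapter = HTTPAdapter(pool_connections=1,
                          pool_maxsize=max_thread + 4,
                          max_retries=1)
    # Mount on both schemes so every request goes through this adapter.
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    return session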

It may also be related to the thread pool; I haven't looked at the thread pool's source code yet. If that's the case, the concurrent requests can be throttled with a semaphore:

import requests
from threading import Semaphore

class AA():
    # A class-level semaphore shared by every instance and thread:
    # at most 12 requests are in flight at the same time.
    sem = Semaphore(12)
    session = requests.Session()

    ...

    def getHtml(self, url):
        # The with-block releases the semaphore even if the request raises.
        with self.sem:
            return self.session.get(url)
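A usage sketch combining this with thread_run from above (the URLs are placeholders): even if the pool runs 36 threads, the class-level semaphore keeps at most 12 requests in flight.

crawler = AA()
urls = ['http://example.com/%d' % i for i in range(100)]
thread_run(crawler.getHtml, urls, max_thread=36)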

Origin www.cnblogs.com/wrnmb/p/11314660.html