curl_multi implements concurrency

Normal (sequential) requests

curl_normal.php

<?php
$start_time = microtime(true);

$chArr = [];
$result = [];

// Create the cURL handles and run the requests one after another
for ($i = 0; $i < 10; $i++) {
    $chArr[$i] = curl_init();
    curl_setopt($chArr[$i], CURLOPT_URL, "http://www.52fhy.com/test.json");
    curl_setopt($chArr[$i], CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($chArr[$i], CURLOPT_TIMEOUT, 1);
    $result[] = curl_exec($chArr[$i]); // blocks until this request finishes
    echo "running ";
}

// print_r($result);

$end_time = microtime(true);
echo sprintf("use time:%.3f s", $end_time - $start_time);

?>

use time:0.830 s

curl_multi concurrency

curl_multi.php

<?php
$start_time = microtime(true);

$chArr = [];
$result = [];

// Create multiple cURL handles
for ($i = 0; $i < 10; $i++) {
    $chArr[$i] = curl_init();
    curl_setopt($chArr[$i], CURLOPT_URL, "http://www.52fhy.com/test.json");
    curl_setopt($chArr[$i], CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($chArr[$i], CURLOPT_TIMEOUT, 1);
}

$mh = curl_multi_init(); // 1 Create the multi (batch) handle

foreach ($chArr as $k => $ch) {
    curl_multi_add_handle($mh, $ch); // 2 Add each handle to the batch
}

$active = null;

// Point to optimize:
// while $active > 0 and the batch has not finished, this loop calls
// curl_multi_exec($mh, $active) over and over without pausing.
do {
    echo "running ";
    curl_multi_exec($mh, $active); // 3 Execute the batch
} while ($active > 0); // 4 Repeat while handles are still running

foreach ($chArr as $k => $ch) {
    $result[$k] = curl_multi_getcontent($ch); // 5 Get the content returned by the handle
    curl_multi_remove_handle($mh, $ch); // 6 Remove the handle from $mh
}

curl_multi_close($mh); // 7 Close the multi handle

// print_r($result);

$end_time = microtime(true);
echo sprintf("use time:%.3f s", $end_time - $start_time);

?>

use time:0.259 s

curl_multi concurrency optimization: curl_multi_select

In the previous example, as long as $active > 0 (that is, while the batch has not finished), the loop calls curl_multi_exec($mh, $active) over and over without pausing. Spinning on curl_multi_exec() like this can easily lead to high CPU usage.

The fix is to use the curl_multi_select() function from the cURL library, whose prototype is as follows:

int curl_multi_select ( resource $mh [, float $timeout = 1.0 ] )

It blocks until there is activity on any of the connections in the cURL batch, or until the timeout expires. On success it returns the number of active descriptors in the descriptor sets, which may be 0 if nothing happened before the timeout. On failure it returns -1 (a select failure reported by the underlying select system call).
We use curl_multi_select() to block while no handle has data to read or write.
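As a minimal sketch of those return values, the snippet below wraps curl_multi_select() in a small helper. The name wait_for_activity() is hypothetical and not part of the original scripts, and the short sleep on -1 is an assumed workaround for environments where select has been reported to fail immediately, not something the article prescribes.

```php
<?php
// Hypothetical helper (not from the original article): wait for activity on
// the multi handle and back off briefly when select reports a failure.
function wait_for_activity($mh, float $timeout = 1.0): int
{
    $ready = curl_multi_select($mh, $timeout);
    if ($ready === -1) {
        // Assumed workaround: sleep 1 ms so that a caller looping on this
        // helper does not spin at full speed while select keeps failing.
        usleep(1000);
    }
    return $ready;
}

$mh = curl_multi_init();
// No handles have been added, so there is nothing to wait for.
$ready = wait_for_activity($mh, 0.1);
var_dump($ready);
curl_multi_close($mh);
```

A return value of 0 only means the timeout passed without activity, so the caller should still call curl_multi_exec() afterwards to let libcurl handle its own timeouts.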

Here is the code for the optimization part:

curl_multi_select.php

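$active = null;

do {
    echo "running ";
    $mrc = curl_multi_exec($mh, $active); // 3 Execute the batch
} while ($mrc == CURLM_CALL_MULTI_PERFORM); // 4

// This first loop makes the initial pass over the $ch handles in $mh and stores
// the number of still-running handles in $active. While the return value is
// CURLM_CALL_MULTI_PERFORM, data is still being written or read, so the loop
// repeats. Once the first write or read succeeds, the value becomes CURLM_OK,
// the loop exits, and execution enters the main loop below.

// $active is truthy: $mh still has $ch handles waiting to be processed.
// $mrc == CURLM_OK: the previous read or write on the $ch handles has finished.
while ($active && $mrc == CURLM_OK) {
    // $mh still has runnable $ch handles; when curl_multi_select($mh) != -1
    // the program leaves the blocked state.
    if (curl_multi_select($mh) != -1) {
        do {
            $mrc = curl_multi_exec($mh, $active); // Keep driving the $ch handles that need work
        } while ($mrc == CURLM_CALL_MULTI_PERFORM);
    }
}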

The advantage of this approach is that once the $ch handles in the $mh batch have finished reading or writing data ($mrc == CURLM_OK), the program blocks in curl_multi_select($mh), instead of calling curl_multi_exec on $mh continuously for the whole duration of the batch and wasting CPU resources.

Running result:
use time: 0.325 s

The elapsed time has not changed much. The gain is in CPU usage, because the loop now blocks instead of spinning.
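One rough way to see the difference is to count how many times each loop calls curl_multi_exec(). The sketch below is an illustration, not part of the original scripts: drive_multi() and build_batch() are hypothetical helper names, and the call count is only an assumed stand-in for CPU load. It needs network access to reach the article's test URL; without it the requests simply fail or time out.

```php
<?php
// Hypothetical helper (not in the original scripts): run a multi handle to
// completion and return how many times curl_multi_exec() was called.
function drive_multi($mh, bool $useSelect): int
{
    $calls = 0;
    do {
        $mrc = curl_multi_exec($mh, $active);
        $calls++;
        if ($useSelect && $active > 0 && $mrc === CURLM_OK) {
            // Block until a socket is ready instead of spinning.
            if (curl_multi_select($mh, 1.0) === -1) {
                usleep(1000); // back off briefly if select reports failure
            }
        }
    } while ($active > 0 && $mrc === CURLM_OK);
    return $calls;
}

// Hypothetical helper: build a batch of $n requests to the article's test URL.
function build_batch(int $n): array
{
    $mh = curl_multi_init();
    $handles = [];
    for ($i = 0; $i < $n; $i++) {
        $ch = curl_init("http://www.52fhy.com/test.json");
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_TIMEOUT, 1);
        curl_multi_add_handle($mh, $ch);
        $handles[] = $ch;
    }
    return [$mh, $handles];
}

$counts = [];
foreach (['busy loop' => false, 'with select' => true] as $label => $useSelect) {
    [$mh, $handles] = build_batch(10);
    $counts[$label] = drive_multi($mh, $useSelect);
    foreach ($handles as $ch) {
        curl_multi_remove_handle($mh, $ch);
    }
    curl_multi_close($mh);
    echo sprintf("%s: %d curl_multi_exec calls\n", $label, $counts[$label]);
}
```

The absolute numbers depend on the machine and the network, so treat them only as a relative comparison between the two loops.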

curl_multi concurrency optimization: rolling

The above example still has room for optimization. The idea is to process each URL as soon as its request completes, while the other URLs are still in flight, instead of waiting for the slowest interface to return before starting any processing. Working while waiting keeps the CPU from sitting idle.

Only the modified part is shown:

curl_multi_rolling.php


$active = null; 

do {
    while (($mrc = curl_multi_exec($mh, $active)) == CURLM_CALL_MULTI_PERFORM) ;

    if ($mrc != CURLM_OK) { break; }

    // a request was just completed -- find out which one
    while ($done = curl_multi_info_read($mh)) {

        // get the info and content returned on the request
        $info = curl_getinfo($done['handle']);
        $error = curl_error($done['handle']);
        $result[] = curl_multi_getcontent($done['handle']);
        // $responses[$map[(string) $done['handle']]] = compact('info', 'error', 'results');

        // remove the curl handle that just completed
        curl_multi_remove_handle($mh, $done['handle']);
        curl_close($done['handle']);
    }

    // Block for data in / output; error handling is done by curl_multi_exec
    if ($active > 0) {
        curl_multi_select($mh);
    }

} while ($active);

use time:0.267 s
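The snippet above appends results in completion order, so it no longer records which request produced which response. The sketch below is one possible way to keep that mapping, and it also caps the number of requests in flight. It is not the article's code: rolling_fetch(), its $window parameter and the callback signature are hypothetical, and tagging each handle through CURLOPT_PRIVATE is an assumed alternative to the commented-out $map lookup.

```php
<?php
// Hypothetical helper: fetch $urls (key => URL) with at most $window requests
// in flight, calling $onDone($key, $body, $error) as each request completes.
function rolling_fetch(array $urls, int $window, callable $onDone): void
{
    $mh = curl_multi_init();
    $keys = array_keys($urls);
    $next = 0;      // index of the next request to start
    $inFlight = 0;  // number of requests currently running

    // Start the next pending request, if any, and tag it with its key.
    $startNext = function () use ($urls, $keys, &$next, &$inFlight, $mh): void {
        if ($next >= count($keys)) {
            return;
        }
        $key = $keys[$next++];
        $ch = curl_init($urls[$key]);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
        curl_setopt($ch, CURLOPT_TIMEOUT, 1);
        curl_setopt($ch, CURLOPT_PRIVATE, (string) $key);
        curl_multi_add_handle($mh, $ch);
        $inFlight++;
    };

    for ($i = 0; $i < $window; $i++) {
        $startNext();
    }

    while ($inFlight > 0) {
        if (curl_multi_exec($mh, $active) !== CURLM_OK) {
            break;
        }
        $completed = 0;
        // Handle every transfer that has finished so far.
        while ($done = curl_multi_info_read($mh)) {
            $ch = $done['handle'];
            $key = curl_getinfo($ch, CURLINFO_PRIVATE);
            $onDone($key, curl_multi_getcontent($ch), curl_error($ch));
            curl_multi_remove_handle($mh, $ch);
            $inFlight--;
            $completed++;
            $startNext(); // refill the window
        }
        // Only block when nothing completed, so new handles start at once.
        if ($completed === 0 && curl_multi_select($mh, 1.0) === -1) {
            usleep(1000);
        }
    }
    curl_multi_close($mh);
}

$urls = [
    'a' => "http://www.52fhy.com/test.json",
    'b' => "http://www.52fhy.com/test.json",
    'c' => "http://www.52fhy.com/test.json",
];
$result = [];
rolling_fetch($urls, 2, function ($key, $body, $error) use (&$result) {
    $result[$key] = ['body' => $body, 'error' => $error];
});
print_r(array_keys($result));
```

Each result is stored under the key of the request that produced it, and a failed request still reaches the callback with a non-empty $error, so the caller can decide how to handle it.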

References

1. The fifth curl basic use and multi-threading optimization of PHP simulation POST request
http://www.cnblogs.com/zhenbianshu/p/4935679.html
2. Rolling cURL: PHP concurrency best practice
https://www.oschina.net/question/54100_58279
3. curl_multi_select solves the curl_multi web page suspended animation
http://www.webkaka.com/tutorial/php/2013/102844/
