PHP Spider

Note: Create the database table with columns matching the configured fields before crawling, and run the program in CLI mode.
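The table can be prepared ahead of time with a short script. The following is a minimal sketch, assuming the table name `cof` and the field names `title` and `content` used in the configuration in step 2. `buildCreateTableSql` is a hypothetical helper, not part of phpspider, and the `id` column and `MEDIUMTEXT` type are assumptions chosen for illustration.

```php
<?php
// Hypothetical helper (not part of phpspider): builds a CREATE TABLE statement
// with an auto-increment id plus one column per configured field name.
function buildCreateTableSql(string $table, array $fieldNames): string
{
    $columns = array('`id` INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY');
    foreach ($fieldNames as $name) {
        // MEDIUMTEXT is an assumed type, chosen to hold long article bodies.
        $columns[] = '`' . $name . '` MEDIUMTEXT';
    }

    return 'CREATE TABLE IF NOT EXISTS `' . $table . "` (\n  "
        . implode(",\n  ", $columns)
        . "\n) DEFAULT CHARSET = utf8mb4;";
}

// Print the statement for the table and fields used in this article.
echo buildCreateTableSql('cof', array('title', 'content')), "\n";
```

Save it under any file name, run it with `php` on the command line, and execute the printed statement in the `demo` database before starting the spider.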

1. Install phpspider through composer

composer require owner888/phpspider

2. The code

<?php
// Assumes this script sits in the project root, next to the vendor directory.
require __DIR__ . '/vendor/autoload.php';

use phpspider\core\phpspider;

// phpspider configuration
$configs = array(
    'name' => '简书', // Jianshu
    'log_show' => false,
    'tasknum' => 1,
    // Database configuration
    'db_config' => array(
        'host' => '127.0.0.1',
        'port' => 3306,
        'user' => 'root',
        'pass' => 'root',
        'name' => 'demo',
    ),
    'export' => array(
        'type' => 'db',
        'table' => 'cof', // table name
    ),
    // Domain names to be crawled
    'domains' => array(
        'jianshu.com',
        'www.jianshu.com',
    ),
    // Starting point of the crawl
    'scan_urls' => array(
        'https://www.jianshu.com/c/V2CqjW?utm_medium=index-collections&utm_source=desktop',
    ),
    // List page pattern. The variable part of the URL is matched with \w+,
    // because IDs such as V2CqjW contain letters and \d+ matches digits only.
    'list_url_regexes' => array(
        "https://www.jianshu.com/c/\w+",
    ),
    // Content page pattern
    'content_url_regexes' => array(
        "https://www.jianshu.com/p/\w+",
    ),
    'max_try' => 5,
    // Database fields
    'fields' => array(
        array(
            'name' => "title",
            'selector' => "//h1[@class='title']",
            'required' => true,
        ),
        array(
            'name' => "content",
            'selector' => "//div[@class='show-content-free']",
            'required' => true,
        ),
    ),
);

$spider = new phpspider($configs);
$spider->start();
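Before starting a crawl, it can help to check the URL patterns against sample URLs. The sketch below assumes each pattern is wrapped in `#` delimiters when it is matched, which is an assumption about how the patterns are applied rather than something stated here. `matchesAny` is a hypothetical helper, not part of phpspider. It shows that a digits-only pattern such as `\d+` does not match a collection ID like `V2CqjW`, while `\w+` does.

```php
<?php
// Hypothetical helper (not part of phpspider): returns true if $url matches
// at least one of the given patterns.
function matchesAny(array $patterns, string $url): bool
{
    foreach ($patterns as $pattern) {
        // Assumption: patterns are wrapped in '#' delimiters, so the '/'
        // characters in the URL need no escaping.
        if (preg_match('#' . $pattern . '#i', $url) === 1) {
            return true;
        }
    }

    return false;
}

$scanUrl = 'https://www.jianshu.com/c/V2CqjW';

// \d+ matches digits only, so the alphanumeric ID is not matched.
var_dump(matchesAny(array('https://www.jianshu.com/c/\d+'), $scanUrl)); // bool(false)

// \w+ matches letters, digits and underscores.
var_dump(matchesAny(array('https://www.jianshu.com/c/\w+'), $scanUrl)); // bool(true)
```

The same check can be run against a few content page URLs to confirm that `content_url_regexes` selects the pages you expect.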

Origin blog.csdn.net/nk90875/article/details/112981730