This article describes how to crawl Didi taxi coupons.
First, the data source
Quanmama ("Coupon Mom"), at http://www.quanmama.com.
Second, the scraping method
Use the simple_html_dom library to fetch the entire page, then parse its elements.
The code is as follows:
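<?php
header("Content-type: text/html; charset=utf-8");
require_once('simple_html_dom.php');

$index = 0; // rows saved successfully
$total = 0; // rows examined

// Fetch the coupon page and take the first table body.
$html = file_get_html('http://www.quanmama.com/quan/1718911.html');
$html_bj_content = $html ? $html->find('table tbody', 0) : null;
if (!$html_bj_content) {
    exit('Failed to load or parse the coupon page');
}
echo $html_bj_content; // debug output of the raw table

foreach ($html_bj_content->find('tr') as $item) {
    $total++;
    $cell = $item->find('td', 0);
    $link = $item->find('td a', 0);
    // Skip rows without a title cell or a link (for example, header rows).
    if (!$cell || !$link) {
        continue;
    }
    $title = $cell->plaintext;
    $source = $link->href;
    // echo $source;

    // Keep only Didi gift-package links.
    if (stripos($source, "gsactivity.diditaxi.com.cn/gulfstream/activity/v2/giftpackage") === false) {
        continue;
    }

    // Everything after "g_channel=" is treated as the channel ID.
    $channels = explode('g_channel=', $source);
    if (!isset($channels[1])) {
        continue;
    }

    try {
        $data = array(
            'title'   => $title,
            'source'  => "https://gsactivity.diditaxi.com.cn/gulfstream/activity/v2/giftpackage/index?g_channel=" . $channels[1],
            'channel' => $channels[1]
        );
        // var_dump($data);
        // M() is the ThinkPHP model helper, so this part is assumed to run inside a ThinkPHP application.
        $diditrip = M('diditrip', 'tp_');
        $isadd = $diditrip->add($data);
        if ($isadd) {
            $index++;
        }
    } catch (\Exception $e) {
        // $res = array("code" => "error", "message" => "Database error");
    }
}
// $this->success('Synced ' . $total . ' items, ' . $index . ' succeeded', 'index');
?>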
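The script above takes everything after "g_channel=" as the channel ID, so any query parameters that follow it end up in the stored value. As a minimal sketch of an alternative, assuming PHP 7.1 or later, the channel can be read with the built-in parse_url() and parse_str() functions. The helper name extractChannel and the sample channel value are made up for illustration and are not part of the original script.

```php
<?php
// Hypothetical helper (not in the original script): returns the g_channel
// query parameter of a coupon link, or null when it is absent.
function extractChannel(string $url): ?string
{
    // parse_url() yields null when there is no query string
    // and false when the URL is seriously malformed.
    $query = parse_url($url, PHP_URL_QUERY);
    if (!is_string($query)) {
        return null;
    }

    // Split the query string into an associative array.
    parse_str($query, $params);
    $channel = $params['g_channel'] ?? null;

    // Reject empty values and array-style parameters such as g_channel[]=x.
    return (is_string($channel) && $channel !== '') ? $channel : null;
}

// Made-up sample links; the channel value is a placeholder.
$base = 'https://gsactivity.diditaxi.com.cn/gulfstream/activity/v2/giftpackage/index';
var_dump(extractChannel($base . '?g_channel=abc123&from=list')); // string(6) "abc123"
var_dump(extractChannel($base));                                  // NULL
```

With the same first sample link, explode('g_channel=', $source)[1] would give "abc123&from=list", so this variant may be preferable if the links on the page carry extra parameters.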