Request Network Operations Extension
The Request extension can perform arbitrarily complex network requests, such as sending cookies, spoofing the referrer, or spoofing the browser.
Installation:
composer require jaeger/querylist-ext-request
Git repository:
https://github.com/jae-jae/QueryList-Ext-Request.git
Dependencies (skip this if you installed via Composer)
The Request extension depends on the Http class. Its Git repository is: https://github.com/jae-jae/Http.git
Manual plugin installation guide: http://doc.querylist.cc/site/index/doc/7
Usage 1
$ql = QueryList::run('Request',[
    'http' => [
        'target' => 'target page to scrape',
        'referrer' => 'referrer URL',
        'method' => 'request method: GET, POST, etc.',
        'params' => ['parameter name'=>'parameter value','key'=>'value'],
        // ...and any other HTTP-related parameters; see the Http class source for details
    ],
    'callback' => function($html,$args){
        // callback that processes the HTML
        return $html;
    },
    'args' => 'arguments passed to the callback'
]);
$data = $ql->setQuery(...)->data;
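The 'callback' option above receives the fetched HTML and must return the HTML that the scraping rules will run against. As a rough illustration of what such a callback might do, the sketch below removes whole elements before querying. The function name removeElements and the choice to pass an array of tag names as 'args' are assumptions made for this example; they are not part of the extension.

```php
<?php
// Hypothetical callback body: strips whole elements (e.g. <script>, <style>)
// from the fetched HTML. $args is assumed to be an array of tag names.
function removeElements(string $html, array $args): string
{
    foreach ($args as $tag) {
        $name = preg_quote($tag, '#');
        // Match the opening tag, its contents, and the closing tag (non-greedy).
        $pattern = '#<' . $name . '\b[^>]*>.*?</' . $name . '>#is';
        $html = preg_replace($pattern, '', $html);
    }
    return $html;
}

$sample = '<div><script>alert(1)</script><p>Hello</p></div>';
echo removeElements($sample, ['script', 'style']), PHP_EOL;
```

Under these assumptions it could be wired in as 'callback' => 'removeElements' with 'args' => ['script', 'style'], or wrapped in a closure with the same two parameters. A regular expression is a blunt tool for HTML, so nested elements of the same tag are not handled.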
Usage 2
$ql = QueryList::run('Request',[
    'target' => 'target page to scrape',
    'referrer' => 'referrer URL',
    'method' => 'request method: GET, POST, etc.',
    'params' => ['parameter name'=>'parameter value','key'=>'value'],
    // ...and any other HTTP-related parameters; see the Http class source for details
]);
$data = $ql->setQuery(...)->data;
The return value is a QueryList object whose html property has already been set; you should then call QueryList's setQuery method to define the scraping rules.
// HTTP operations extension
$urls = QueryList::run('Request',[
    'target' => 'http://cms.querylist.cc/news/list_2.html',
    'referrer'=>'http://cms.querylist.cc',
    'method' => 'GET',
    'params' => ['var1' => 'testvalue', 'var2' => 'somevalue'],
    'user_agent'=>'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.8; rv:21.0) Gecko/20100101 Firefox/21.0',
    'cookiePath' => './cookie.txt',
    'timeout' =>'30'
])->setQuery(['link' => ['h2>a','href','',function($content){
    // use a callback to complete relative links
    $baseUrl = 'http://cms.querylist.cc';
    return $baseUrl.$content;
}]],'.cate_list li')->getData(function($item){
    return $item['link'];
});
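The link-completion callback in the example simply prepends the base URL, which assumes every scraped href is relative and starts with a slash. A slightly more defensive variant is sketched below. The helper name completeLink is hypothetical, and the sketch only covers two cases: hrefs that are already absolute and hrefs relative to the site root. Protocol-relative links (//host/path) and ../ segments are not handled.

```php
<?php
// Hypothetical helper: turns a scraped href into an absolute URL.
function completeLink(string $baseUrl, string $href): string
{
    // Already absolute (has a scheme such as http://): leave it untouched.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $href)) {
        return $href;
    }
    // Otherwise join base and path with exactly one slash between them.
    return rtrim($baseUrl, '/') . '/' . ltrim($href, '/');
}

echo completeLink('http://cms.querylist.cc', '/news/1.html'), PHP_EOL;
echo completeLink('http://cms.querylist.cc/', 'news/2.html'), PHP_EOL;
echo completeLink('http://cms.querylist.cc', 'http://example.com/a.html'), PHP_EOL;
```

Inside the rule's callback this could replace the string concatenation, for example return completeLink('http://cms.querylist.cc', $content);.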