
Ruby: Unable to scrape Baidu search results
PHP中文网 2017-04-22 08:55:58

URL: http://www.baidu.com/s?wd=site:www.cnblogs.com
Code:

require 'net/http'
require 'uri'

# Fetch the body of the given URL with a plain GET and print it.
def get_html(url)
    uri = URI(url)
    p resp = Net::HTTP.get(uri)
end

But what I get back is the source of the Baidu home page, not the results of the search for site:www.cnblogs.com.


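Also, are there any good books on network programming in Ruby?
I have only just started with Ruby and don't know where to look for a lot of things (so far I have just been reading the documentation on the official site).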


A simple implementation in PHP:

<?php
set_time_limit(0);
function _rand()
{
    $length = 26;
    $chars = "0123456789abcdefghijklmnopqrstuvwxyz";
    $max = strlen($chars) - 1;
    mt_srand((double)microtime() * 1000000);
    $string = '';
    for ($i = 0; $i < $length; $i++) {
        $string.= $chars[mt_rand(0, $max) ];
    }
    return $string;
}
$HTTP_SESSION = _rand(); // random 26-character ID; not used by the request below
$HTTP_Server = "www.baidu.com";
$HTTP_URL = "/s?wd=site:www.cnblogs.com";
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, "http://" . $HTTP_Server . $HTTP_URL);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
curl_setopt($ch, CURLOPT_USERAGENT, "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)");
$res = curl_exec($ch);
curl_close($ch);
print_r($res);
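
For comparison, a rough Ruby counterpart of the PHP snippet above might look like the following. This is only a sketch: `default_user_agent`, `build_request` and `fetch` are hypothetical helper names, the redirect limit of 5 is an arbitrary assumption, and nothing in this thread confirms that Baidu will return search results for such a request.

```ruby
require 'net/http'
require 'uri'

# Same User-Agent string that the PHP example sends.
def default_user_agent
  'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; ' \
    '.NET CLR 1.1.4322; .NET CLR 2.0.50727)'
end

# Build a GET request that carries an explicit User-Agent header.
# (build_request is a hypothetical helper, not part of Net::HTTP.)
def build_request(uri)
  request = Net::HTTP::Get.new(uri)
  request['User-Agent'] = default_user_agent
  request
end

# Send the request and follow at most `limit` redirects.
def fetch(url, limit = 5)
  raise ArgumentError, 'too many redirects' if limit.zero?

  uri = URI(url)
  response = Net::HTTP.start(uri.host, uri.port,
                             use_ssl: uri.scheme == 'https') do |http|
    http.request(build_request(uri))
  end

  if response.is_a?(Net::HTTPRedirection)
    # Location may be relative, so resolve it against the current URL.
    fetch(URI.join(url, response['location']).to_s, limit - 1)
  else
    response.body
  end
end
```

Calling `fetch('http://www.baidu.com/s?wd=site:www.cnblogs.com')` would return the body of the final response. Unlike `Net::HTTP.get`, this sends a browser-like User-Agent and follows redirects, which are two of the differences between the Ruby and PHP code in the question.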

All replies (3)
小葫蘆

Whatever language you write the crawler in, Baidu's content can't be scraped this easily.
Baidu is not the Baidu it used to be. Without passing its various cookie checks, you won't get the results at all. You'd be better off checking whether there is an API. Baidu's front-end code is deliberately convoluted, precisely to stop people from scraping it.
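
To illustrate what carrying cookies between requests could look like in Ruby, here is a minimal sketch. It assumes that the cookies set by the home page are the ones that matter, which this thread does not confirm; `cookie_header` and `search_with_cookies` are hypothetical helper names.

```ruby
require 'net/http'
require 'uri'

# Turn the Set-Cookie header values of a response into one Cookie
# header value, keeping only the name=value part of each cookie.
# (cookie_header is a hypothetical helper name.)
def cookie_header(set_cookie_values)
  Array(set_cookie_values)
    .map { |value| value.split(';').first.strip }
    .join('; ')
end

# Request the home page first, then reuse its cookies for the search.
def search_with_cookies(query)
  Net::HTTP.start('www.baidu.com', 80) do |http|
    home = http.request(Net::HTTP::Get.new('/'))
    cookies = cookie_header(home.get_fields('Set-Cookie'))

    search = Net::HTTP::Get.new('/s?' + URI.encode_www_form(wd: query))
    search['Cookie'] = cookies unless cookies.empty?
    http.request(search).body
  end
end
```

`search_with_cookies('site:www.cnblogs.com')` would issue both requests over one connection. If the site also checks other headers or runs JavaScript-based verification, this alone will not be enough.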

迷茫

http://www.baidu.com/s?wd=www.cnblogs.com&rsv_bp=0&ch=&tn=19045005_5_pg&bar=&rsv_spt=3&ie=utf-8&rsv_n=2&rsv_sug3=1&rsv_sug4=57&rsv_sug2=0&inputT=635
OP, you probably only get results back if you send all of these parameters, right?
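
If extra parameters like these do turn out to be required, it is easier to build the URL from a hash than to concatenate strings by hand. A small sketch, where `search_url` is a hypothetical helper and the parameter names are simply copied from the URL above (their meaning is not explained in this thread):

```ruby
require 'uri'

# Build a Baidu search URL from a query plus optional extra parameters.
# (search_url is a hypothetical helper name.)
def search_url(query, extra = {})
  params = { wd: query }.merge(extra)
  URI::HTTP.build(host: 'www.baidu.com', path: '/s',
                  query: URI.encode_www_form(params)).to_s
end
```

For example, `search_url('www.cnblogs.com', ie: 'utf-8', rsv_bp: 0, rsv_spt: 3)` yields a URL with each value percent-encoded as needed, so a query such as `site:www.cnblogs.com` is escaped correctly.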

Ty80

OP, are you trying to scrape the POI data that Baidu has collected?
