亚洲国产日韩欧美一区二区三区,精品亚洲国产成人av在线,国产99视频精品免视看7,99国产精品久久久久久久成人热,欧美日韩亚洲国产综合乱

crawler - How to complete JavaScript function page turning in Python crawler?
typecho
typecho 2017-06-13 09:24:39
0
2
1569

When I crawled a web page, I noticed that its page turning was implemented by such a function. After turning the page, the page URL did not change:

<input class="buttonJump" name="goto2" onclick="dirGroupMblogToPage(document.getElementById('dirGroupMblogcp2').value)" type="button" value="Go"/>
     </input>
     
     
function dirGroupMblogToPage(currentPage){

    jQuery.post("dirGroupMblog.action", {"page.currentPage":currentPage,gid:MI.TalkBox.gid}, function(data){$("#talkMain").html(data);
        window.scrollTo(0, $css.getY(MI.talkList._body)-65);
    });

}

Written a function like this to try to achieve page turning:

def login_page(login_url, content_url, usr_name="******@126.com", passwd="******"):
    # 實(shí)現(xiàn)登錄, 返回Session對(duì)象和獲得的頁面
    post_data = {'r': 'on', 'u': usr_name, 'p': passwd}
    s = requests.Session()
    s.post(login_url, post_data)
    r = s.get(content_url)
    return s, r

def turn_page(s, next_page, content_url):
    post_url = "http://sns.icourses.cn/dirGroupMblog.action"
    post_headers = {"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36", 
               "X-Requested-With":"XMLHttpRequest"}
    post_data = {"page.currentPage": next_page, "gid": 2632}
    s.post(post_url, data=post_data, headers = post_headers)
    res = s.get(content_url)
    return res

But page turning failed after calling turn_page(). How should we solve this problem? Also, what kind of knowledge do we need to learn by ourselves to solve this kind of problem? Thank you!

typecho
typecho

Following the voice in heart.

reply all(2)
阿神
  • Recommended to use selenium

  • For example, if you need to click the next page button on the interface, or you need to enter the up, down, left, and right keys, the page can be turned, selenium webdriver can do it, and give a reference (I used to crawl the novels on Qidian Chinese website )

  • Selenium can interact with the page, click, double-click, enter, wait for the page to load (implicit wait, and explicit wait). . . .

from selenium import webdriver
# from selenium.webdriver.common.keys import Keys

#driver = webdriver.PhantomJS(executable_path="D:\phantomjs-2.1.1-windows\bin\phantomjs")
# 我的windows 已配置環(huán)境變量,不需指定 executable_path,使用 Chrome需要對(duì)應(yīng)的瀏覽器以及驅(qū)動(dòng)程序
driver = webdriver.Chrome()

# url 為你需要加載的頁面url
url = 'http://sns.icourses.cn/*****'

# 打開頁面
driver.get(url)

# 在你的例子中,是需要點(diǎn)擊 button ,通過class 屬性獲取到button,然后執(zhí)行單擊 .click()
# 如果需要準(zhǔn)確定位,可以自行搜索其他的 find_
driver.find_element_by_class_name("buttonJump").click()

# selenium webdriver 還有很多其它高級(jí)的用法,自行谷歌,你這個(gè)問題,搜索應(yīng)該是能得到答案的,
Ty80

There are several situations,
1. The page can be turned by sliding or clicking through the js effect;
2. The page can be turned by clicking on the hyperlink;

You can use the network analysis in Chrome's developer tools to get the result, whether it is an html page or a feedback json rendering.
json is easier to handle, just get the result directly. Ordinary html pages need to use regular matching to page breaks. Then put the link into the pool to be crawled.

/a/11...

Latest Downloads
More>
Web Effects
Website Source Code
Website Materials
Front End Template