?? ? ????? ???? ????? ? ? ????? ?????? ?? ????? ??? "? ??" ?? ??? ???????
??? ??? ????. ??? ?? ??????? ??? ??? ???? ?? ??? ???? ????? ? ???????? ??? ? ????.
? ????? ? ?? ??? ???? ?? ???? ????? ??? ???? ??? ?????. ?? ????? ??? ??? ????.
????? ?? ??? ??? ???.
- Set up Selenium for web scraping.
- Automate the 'Load More' button interaction.
- Extract product data such as names, prices, and links.
????!
Step 1: Prerequisites
???? ?? ?? ?? ??? ?????.
- Python ??: python.org?? ?? ? pip? ???? ?? Python ??? ?????? ?????.
- ?? ??: ? ???? ??, Python ?????, ??, BeautifulSoup, Selenium? ?? ????? ??? ?? ??
?? ?????:
- Requests: sends HTTP requests.
- BeautifulSoup: parses HTML content.
- Selenium: simulates user interactions in the browser, such as clicking buttons.
????? ?? ??? ???? ??? ?????? ??? ? ????.
pip install requests beautifulsoup4 selenium
Selenium? ???? ?? ????? ?? ? ????? ???? ???. ? ??????? Google Chrome? ChromeDriver? ?????. ??? Firefox? Edge? ?? ?? ??????? ??? ??? ?? ? ????.
? ???? ??
- ???? ??? ?????:
Google Chrome? ?? ??? > Chrome ??? 3? ???? Chrome ??? ?????.
ChromeDriver ????:
ChromeDriver ???? ???? ?????.
Chrome ??? ?? ???? ??? ???????.
??? ??? ChromeDriver? ?????:
????? ??? ???? /usr/local/bin (Mac/Linux) ?? C:\Windows\System32 (Windows)? ?? ????? ????.
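Before running any Selenium code, it can help to confirm that the driver executable is actually discoverable. The following is a minimal sketch using only the standard library; the helper name `find_driver` is hypothetical, and it assumes the executable is called `chromedriver`.

```python
import shutil


def find_driver(name="chromedriver"):
    """Return the full path of the driver executable if it is on PATH, else None."""
    return shutil.which(name)


driver_path = find_driver()
if driver_path:
    print(f"Found driver at: {driver_path}")
else:
    print("chromedriver was not found on PATH")
```

If the second message is printed, the directory that holds the driver is probably not on PATH yet.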
?? ??
???? ?????? Python ?? scraper.py? ????? ?? ?? ??? ???? ?? ?? ???? ?????? ??????.
from selenium import webdriver

driver = webdriver.Chrome()  # Ensure ChromeDriver is installed and in PATH
driver.get("https://www.scrapingcourse.com/button-click")
print(driver.title)
driver.quit()
????? ?? ??? ???? ? ?? ??? ??? ? ????.
? ??? ?? ?? ???? ???? ?????? ???? ??? ?? ?? ??? URL? ????.
Selenium? HTML? ???? ??? ??? ?????. ??? ?? ??? ????? -
?? Selenium? ??? ??? ???? ?????. ?? ?? ??? ???? ??? ??? ?? ?? ??? ???? ???? ? ????.
Step 2: Get access to the content
? ?? ??? ??? HTML? ?? ???? ???? ???? ?? ???? ???? ????. ?? ??? ???? ???? ????? ??? ???? ???? ? ??? ???.
Python? ?? ?????? ???? GET ??? ?? ??? URL? HTML ???? ?????. ??? ??? ????.
python scraper.py
? ??? ?? 12? ??? ?? ???? ??? ?? HTML? ?????.
HTML? ?? ????? ?? ??? ?????, ??? ???? ???? ??? ??? ? ????.
Step 3: Load more products
??? ??? ?????? ? ?? ??? ? ?? ??? ?? ??? ????? '? ??' ??? ????? ???? ???? ???. ? ?? ???? JavaScript? ????? Selenium? ???? ?? ??? ????????.
??? ???? ?? ???? ???? ??? ????.
- The "Load More" button selector (load-more-btn).
- The div that contains the product details (product-item).
? ?? ??? ???? ?? ??? ?? ? ???, ?? ??? ???? ? ? ?????? ?? ? ????.
Load More Button Challenge to Learn Web Scraping - ScrapingCourse.com
? ??? ????? ?? ???? ???? '? ??' ??? ???????. ?? ? ?? ?? ???? ??? ????? HTML? ?????.
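The click-until-done behaviour described above can be wrapped in a small helper with an upper bound on clicks, so the loop cannot run forever if the button never disappears. This is only a sketch: the function name and the `pause` and `max_clicks` parameters are hypothetical, and it passes the locator strategy as the plain string `"id"`, which is the value of Selenium's `By.ID`.

```python
import time


def load_all_products(driver, button_id="load-more-btn", pause=2.0, max_clicks=50):
    """Click the "Load More" button until it can no longer be clicked.

    `driver` is expected to be a Selenium WebDriver that already has the
    page open. Returns the page source after the last successful click.
    """
    for _ in range(max_clicks):
        try:
            # "id" is the locator strategy string behind selenium's By.ID
            driver.find_element("id", button_id).click()
        except Exception:
            # Button missing or not clickable: assume everything is loaded
            break
        # Give the page time to append the new products
        time.sleep(pause)
    return driver.page_source
```

A call such as `html_content = load_all_products(driver)` would then replace the bare `while True` loop. A fixed pause is the simplest option; an explicit wait on the number of product-item elements would be more robust.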
? ??? ??? ??? Selenium? ????? ?? ??? ??? ???? ???? ??? ?????. ???? ????? ?? ? ????? ?? ??? ??? ??? ??? ??? ?????(GUI)? ????.
??? ?? ChromeOptions ??? ???? ?? WebDriver Chrome ???? ???? Selenium?? Chrome? ???? ??? ???? ? ????.
import requests

# URL of the demo page with products
url = "https://www.scrapingcourse.com/button-click"

# Send a GET request to the URL
response = requests.get(url)

# Check if the request was successful
if response.status_code == 200:
    html_content = response.text
    print(html_content)  # Optional: Preview the HTML
else:
    print(f"Failed to retrieve content: {response.status_code}")
? ??? ???? Selenium? ???? Chrome ????? ????? ? ?? Chrome ?? ???? ????. ?? ???? ???? ????? ??? ? GUI?? ???? ???? ?? ?? ???? ??? ??????.
?? ?? HTML ???? ???? ? ??? ?? ???? ????? ??? ?????.
Step 4: Parse product information
? ????? BeautifulSoup? ???? HTML? ?? ???? ?? ??? ?????. ?? ?? ??, ??, ?? ? ? ??? ?? ????? ?????.
???? ??? ?? ??, ??? URL, ??, ?? ??? ??? ??? ?? ????? ???? ??? ?????.
? ??? ?? HTML ???? ???? ???? ???? ?? ??? ?? ?? ??? ?? ? ??? ? ?? ????.
Step 5: Export product information to CSV
?? ??? ???? CSV ??? ???? ???? ??? ? ??????. Python? CSV ??? ?? ?????.
? ??? ?? ?? ?? ????? ??? ? CSV ??? ?????.
??? ?? ?? ??? ??? ????.
? ??? ??? ?? products.csv? ?????.
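As a rough illustration of the CSV export step, the sketch below writes a list of product dictionaries to a file with the standard csv module. The helper name `save_products_to_csv` is hypothetical; the field names (name, price, link, image_url) are assumed to match the dictionaries built by the parsing code in this article.

```python
import csv


def save_products_to_csv(products, path="products.csv"):
    """Write a list of product dicts to a CSV file and return the row count."""
    fieldnames = ["name", "price", "link", "image_url"]
    # newline="" prevents blank lines between rows on Windows
    with open(path, "w", newline="", encoding="utf-8") as csv_file:
        writer = csv.DictWriter(csv_file, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(products)
    return len(products)
```

Calling `save_products_to_csv(products)` after the parsing step would create products.csv in the current directory, with one header row followed by one row per product.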
Step 6: Get extra data for top products
?? ?? ?? ?? 5? ??? ???? ?? ????? ?? ???(?: ?? ??, SKU ??)? ????? ??? ?????. ??? ?? ??? ???? ?? ??? ? ????.
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# Set up the WebDriver (make sure you have the appropriate driver installed, e.g., ChromeDriver)
driver = webdriver.Chrome()

# Open the page
driver.get("https://www.scrapingcourse.com/button-click")

# Loop to click the "Load More" button until there are no more products
while True:
    try:
        # Find the "Load more" button by its ID and click it
        load_more_button = driver.find_element(By.ID, "load-more-btn")
        load_more_button.click()

        # Wait for the content to load (adjust time as necessary)
        time.sleep(2)
    except Exception as e:
        # If no "Load More" button is found (end of products), break out of the loop
        print("No more products to load.")
        break

# Get the updated page content after all products are loaded
html_content = driver.page_source

# Close the browser window
driver.quit()
??? ?? ?? ??? ??? ????.
from selenium import webdriver
from selenium.webdriver.common.by import By
import time

# instantiate a Chrome options object
options = webdriver.ChromeOptions()

# set the options to use Chrome in headless mode
options.add_argument("--headless=new")

# initialize an instance of the Chrome driver (browser) in headless mode
driver = webdriver.Chrome(options=options)

...
? ??? ??? ?? ?????? ??? ?????. ?? ?? ?? ?? ?? 5? ??? ?? ????? ?? ?? ???? ?? BeautifulSoup? ???? ?? ?? ? SKU? ?????.
? ??? ??? ??? ????.
from bs4 import BeautifulSoup

# Parse the page content with BeautifulSoup
soup = BeautifulSoup(html_content, 'html.parser')

# Extract product details
products = []

# Find all product items in the grid
product_items = soup.find_all('div', class_='product-item')

for product in product_items:
    # Extract the product name
    name = product.find('span', class_='product-name').get_text(strip=True)

    # Extract the product price
    price = product.find('span', class_='product-price').get_text(strip=True)

    # Extract the product link
    link = product.find('a')['href']

    # Extract the image URL
    image_url = product.find('img')['src']

    # Create a dictionary with the product details
    products.append({
        'name': name,
        'price': price,
        'link': link,
        'image_url': image_url
    })

# Print the extracted product details
for product in products[:2]:
    print(f"Name: {product['name']}")
    print(f"Price: {product['price']}")
    print(f"Link: {product['link']}")
    print(f"Image URL: {product['image_url']}")
    print('-' * 30)
? ??? products.csv? ?????? ?? ??? ?? ??? ?? ???.
Name: Chaz Kangeroo Hoodie
Price:
Link: https://scrapingcourse.com/ecommerce/product/chaz-kangeroo-hoodie
Image URL: https://scrapingcourse.com/ecommerce/wp-content/uploads/2024/03/mh01-gray_main.jpg
------------------------------
Name: Teton Pullover Hoodie
Price:
Link: https://scrapingcourse.com/ecommerce/product/teton-pullover-hoodie
Image URL: https://scrapingcourse.com/ecommerce/wp-content/uploads/2024/03/mh02-black_main.jpg
------------------------------
…
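The ranking part of Step 6 (picking the five products to visit) might be sketched as below. It assumes that "top" means highest price and that each price is a string such as "$52.00"; both helper names are hypothetical, and a missing or empty price is treated as 0.0 so that it sorts last.

```python
import re


def parse_price(text):
    """Turn a price string such as "$1,052.00" into a float (0.0 if no number is found)."""
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    return float(match.group()) if match else 0.0


def top_priced(products, n=5):
    """Return the n products with the highest parsed price, highest first."""
    return sorted(products, key=lambda p: parse_price(p["price"]), reverse=True)[:n]
```

The list returned by `top_priced(products)` could then be looped over to open each product's link and extract the description and SKU from its page.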
Conclusion
?? ??? ?? "? ??" ??? ???? ???? ?????? ?? ??? ?? ? ??? Requests, Selenium ? BeautifulSoup? ?? ??? ???? ????? ??????.
? ??????? ?? ????? ?? ???? ?? ? ???? ??? ?? ???? ? ??? ???? ???? ???? ??? ???????.
???? ?? ?? ???? ?????.