Web scraping with Selenium

from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service

option = webdriver.ChromeOptions()
# I use the following options because my machine runs Windows Subsystem for Linux (WSL).
# Of the three, I recommend using at least the headless option.
option.add_argument('--headless')
option.add_argument('--no-sandbox')
option.add_argument('--disable-dev-shm-usage')
# Replace YOUR-PATH-TO-CHROMEDRIVER with your chromedriver location.
# Selenium 4 takes the driver path through a Service object rather than as a positional argument.
driver = webdriver.Chrome(service=Service('YOUR-PATH-TO-CHROMEDRIVER'), options=option)
 
driver.get('https://www.imdb.com/chart/top/') # Load the page in the browser so JavaScript-rendered content is included
soup = BeautifulSoup(driver.page_source, 'html.parser') # Parse the rendered HTML with BeautifulSoup. Note driver.page_source instead of page.content (as you would use with requests)
driver.quit() # Close the browser once the HTML has been captured

links = soup.select("table tbody tr td.titleColumn a") # Select all of the title anchors; adjust the selector if the page markup differs
first10 = links[:10] # Keep only the first 10 anchors
for anchor in first10:
    print(anchor.text) # Display the text of each anchor
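
The selection step above can be tried without launching a browser by running the same CSS selector against a small HTML string. This is a minimal sketch only: the function name first_titles and the SAMPLE_HTML markup are hypothetical, written to imitate the table structure the selector expects, and are not taken from the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup (an assumption for illustration) that mirrors
# the table > tbody > tr > td.titleColumn > a structure the selector targets.
SAMPLE_HTML = """
<table><tbody>
  <tr><td class="titleColumn"><a href="#">First Film</a></td></tr>
  <tr><td class="titleColumn"><a href="#">Second Film</a></td></tr>
  <tr><td class="titleColumn"><a href="#">Third Film</a></td></tr>
</tbody></table>
"""


def first_titles(html, limit=10):
    """Return the text of the first `limit` title anchors found in `html`."""
    soup = BeautifulSoup(html, 'html.parser')
    anchors = soup.select("table tbody tr td.titleColumn a")
    # Slicing is safe even when fewer than `limit` anchors are present.
    return [a.get_text(strip=True) for a in anchors[:limit]]


if __name__ == "__main__":
    for title in first_titles(SAMPLE_HTML, limit=2):
        print(title)
```

With a live driver, the same helper could be called as first_titles(driver.page_source). An empty list would suggest the selector no longer matches the page markup.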