PYTHON

web crawler using python

import requests
import lxml  # not called directly; BeautifulSoup's 'lxml' parser needs it installed
from bs4 import BeautifulSoup

url = "https://www.rottentomatoes.com/top/bestofrt/"
headers = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/63.0.3239.132 Safari/537.36 QIHU 360SE'
}

# Fetch the listing page and collect every link inside the ranking table
f = requests.get(url, headers=headers)
movies_lst = []
soup = BeautifulSoup(f.content, 'lxml')
movies = soup.find('table', {'class': 'table'}).find_all('a')

num = 0
for anchor in movies:
    # hrefs on the listing page are site-relative, so prepend the host
    urls = 'https://www.rottentomatoes.com' + anchor['href']
    movies_lst.append(urls)
    num += 1
    movie_url = urls
    # Fetch the individual movie page and locate its synopsis block
    movie_f = requests.get(movie_url, headers=headers)
    movie_soup = BeautifulSoup(movie_f.content, 'lxml')
    movie_content = movie_soup.find('div', {
        'class': 'movie_synopsis clamp clamp-6 js-clamp'
    })
    print(num, urls, '\n', 'Movie:' + anchor.string.strip())
    print('Movie info:' + movie_content.string.strip())
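
The link-collection step of the crawler above can be tried offline. The following is a minimal sketch, not the snippet author's method: it swaps BeautifulSoup for the standard-library `html.parser` so it runs without extra packages, and it parses a hard-coded `SAMPLE_HTML` string instead of the live page. `SAMPLE_HTML`, `TableLinkParser`, and `extract_movie_links` are hypothetical names introduced only for illustration, and the sample markup is an assumption about the page structure.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

BASE_URL = "https://www.rottentomatoes.com"

# Hypothetical markup standing in for the real listing page (assumption).
SAMPLE_HTML = """
<table class="table">
  <tr><td><a href="/m/example_one"> Example One </a></td></tr>
  <tr><td><a href="/m/example_two">Example Two</a></td></tr>
</table>
<a href="/about">outside the table</a>
"""


class TableLinkParser(HTMLParser):
    """Collect (absolute_url, link_text) pairs for anchors inside <table class="table">."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self._table_depth = 0   # > 0 while inside the target table
        self._href = None       # href of the anchor currently being read
        self._text = []         # text fragments of that anchor

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "table":
            classes = (attrs.get("class") or "").split()
            # Enter the target table, or track a table nested inside it
            if self._table_depth or "table" in classes:
                self._table_depth += 1
        elif tag == "a" and self._table_depth and attrs.get("href"):
            self._href = attrs["href"]
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # urljoin resolves the site-relative href against the host
            full_url = urljoin(self.base_url, self._href)
            self.links.append((full_url, "".join(self._text).strip()))
            self._href = None
        elif tag == "table" and self._table_depth:
            self._table_depth -= 1


def extract_movie_links(html, base_url=BASE_URL):
    """Return the list of (url, title) pairs found in the table."""
    parser = TableLinkParser(base_url)
    parser.feed(html)
    return parser.links


if __name__ == "__main__":
    for num, (link, title) in enumerate(extract_movie_links(SAMPLE_HTML), start=1):
        print(num, link, "Movie:", title)
```

Using `urljoin` rather than string concatenation also handles hrefs that are already absolute, which plain `'https://www.rottentomatoes.com' + anchor['href']` would not.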

python web crawler

import scrapy

class BlogSpider(scrapy.Spider):
    name = 'blogspider'
    start_urls = ['https://blog.scrapinghub.com']

    def parse(self, response):
        for title in response.css('.post-header>h2'):
            yield {'title': title.css('a ::text').get()}

        for next_page in response.css('a.next-posts-link'):
            yield response.follow(next_page, self.parse)
