PYTHON

Scraping tables in an HTML file with BeautifulSoup

import pandas as pd
from bs4 import BeautifulSoup

file = 'file.html'
# Each record spans 8 consecutive <td> cells, in this column order.
columns = ['Zone Entry', 'Date', 'Time', 'Zone Exit', 'Date', 'Time', 'Dwell', 'Fee']
table_ids = ['ctl00_contentHolderBody_Table0', 'ctl00_contentHolderBody_Table1', 'ctl00_contentHolderBody_Table2']
all_rows = []
with open(file) as f:
    soup = BeautifulSoup(f, 'html.parser')

for table_id in table_ids:
    # Look the table up by its id attribute.
    table = soup.find('table', id=table_id)
    if table is None:
        continue  # this table is not present in the file
    # Take the text of every cell, then group each run of 8 cells into one row.
    cells = [td.get_text(strip=True) for td in table.find_all('td')]
    for start in range(0, len(cells), len(columns)):
        all_rows.append(cells[start:start + len(columns)])

# Write the rows with a header line and without the numeric index column.
pd.DataFrame(all_rows, columns=columns).to_csv('output.csv', index=False)
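
The grouping step above only works cleanly when the number of cells is an exact multiple of the column count. As a rough sketch of that step in isolation, the example below uses only the standard library. The helper name chunk_cells and the sample data are made up for illustration and are not part of the original snippet; padding a short final row with empty strings is one possible choice, not the only one.

```python
import csv
import io


def chunk_cells(cells, width):
    """Group a flat list of cell strings into rows of `width` cells.

    Hypothetical helper, not part of the original snippet. A trailing
    partial row is padded with empty strings instead of raising IndexError.
    """
    rows = []
    for start in range(0, len(cells), width):
        row = cells[start:start + width]
        row += [''] * (width - len(row))  # pad a short final row
        rows.append(row)
    return rows


# Made-up sample data: 3 columns, with the last row one cell short.
header = ['Zone Entry', 'Date', 'Time']
cells = ['A', '2020-01-01', '08:00', 'B', '2020-01-02']

rows = chunk_cells(cells, len(header))

# Write the header and rows as CSV text (to a string here, rather than a file).
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator='\n')
writer.writerow(header)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)
```

With real data, cells would be the list of cell texts taken from one table and width would be 8, matching the eight columns used above.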