track ip addresses, phone numbers, etc

Parse HTML, populate DataFrame and export to CSV

Published on 21 December 2018

This is a simple Python script that reads an HTML table and parses the cells. It then populates a pandas DataFrame object with this 2D array and exports the values into a CSV file.

For our example, we will use this HTML table from American Hospital Directory which has hospital statistics by state.


Download this Python script as or anything and run it. It works on either Python 2.7 or 3.7.

import re
import requests
import pandas as pd

# Constants
user_agent = 'Mozilla/5.0 (compatible; NoBot/1.1)'
url = ''
regex = re.compile('<td><a href="states/hospital_.+">(.+?)</a></td>s*
<td align="right">(.+?)</td>s*<td align="right">(.+?)</td>s*<td align="right">(.+?)</td>s*
<td align="right">(.+?)</td>s*<td align="right">(.+?)</td>')
csv_filename = 'ahd.csv'

# Vars
states = []
number_hospitals = []
staffed_beds = []
total_discharges = []
patient_days = []
gross_patient_revenue = []

# Get page
page = requests.get(url, headers={'user-agent': user_agent})
html = page.text

# Parse HTML
html = re.sub(r's{2,}', ' ', html)
trs = re.findall(r'<tr>(.+?)</tr>', str(html))
for tr in trs:
    tds =
    if tds:

dictionary = {
    'State': states,
    'Number Hospitals': number_hospitals,
    'Staffed Beds': staffed_beds,
    'Total Discharges': total_discharges,
    'Patient Days': patient_days,
    'Gross Patient Revenue': gross_patient_revenue
columns = dictionary.keys()

# Create dataframe
ahd = pd.DataFrame(dictionary, columns=columns)

You will need the Python modules requests and pandas to run this. Let's install them,

pip install requests
pip install pandas

Now that the dependencies are installed, run the script:


The output will be a CSV file named ahd.csv.

Please let me know if you run into any issues. Thanks for reading this post.

Created on 21 December 2018

Affiliate Disclosure: Some of the links to products on this blog are affiliate links. It simply means, at no additional cost to you, we’ll earn a commission if you click through and buy any product.