Converting a scraped HTML table to a pandas DataFrame

I am having trouble converting an HTML table to a pandas DataFrame. I scraped the page with BeautifulSoup, and now I want to convert the table to a DataFrame with the read_html function, but I get an error.

import pandas as pd
from bs4 import BeautifulSoup
import requests


headers = {'User-Agent':'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.106 Safari/537.36'}

response = requests.get('https://en.wikipedia.org/wiki/Official_World_Golf_Ranking', headers = headers)
soup = BeautifulSoup(response.text, 'html.parser')


html_table = soup.find_all("table")[0]
print(html_table)
print(type(html_table))


df = pd.read_html(html_table)
print(df[0])

The error that I get is:

TypeError: 'NoneType' object is not callable

But html_table is <class 'bs4.element.Tag'>

Answer

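You are currently passing a bs4 Tag object to pandas, but read_html expects an HTML string, a URL, a path, or a file-like object. The odd error message most likely arises because a Tag treats unknown attribute access as a child-tag lookup: html_table.read returns None (there is no <read> tag), so pandas takes the Tag for a file-like object, tries to call read(), and ends up calling None.

Convert the Tag to a string first by replacing the read_html line with the following: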

df = pd.read_html(str(html_table))

This should work for you!
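
For reference, here is a minimal, self-contained sketch of the same fix that runs without a network request. The inline HTML snippet and the player names are made up for illustration and stand in for response.text from the question. Wrapping the string in io.StringIO is an addition not in the original answer: pandas 2.1 and later emit a deprecation warning when read_html is given literal HTML as a plain string, and a file-like object avoids it. It assumes an HTML parser that read_html can use (lxml, or bs4 with html5lib) is installed.

```python
from io import StringIO

import pandas as pd
from bs4 import BeautifulSoup

# Hypothetical stand-in for the downloaded page (response.text in the question).
html = """
<html><body>
<table>
  <tr><th>Rank</th><th>Player</th></tr>
  <tr><td>1</td><td>Player A</td></tr>
  <tr><td>2</td><td>Player B</td></tr>
</table>
</body></html>
"""

soup = BeautifulSoup(html, "html.parser")
html_table = soup.find_all("table")[0]  # a bs4.element.Tag, not a string

# str() serializes the Tag back to markup; StringIO makes it file-like,
# which avoids the literal-string deprecation warning in newer pandas.
tables = pd.read_html(StringIO(str(html_table)))  # read_html returns a list
df = tables[0]
print(df)
```

Note that read_html always returns a list of DataFrames, one per table found, which is why the result is indexed with [0] even when only a single table is passed in.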
