I am trying to practice web scraping with Python, but I am having trouble narrowing the results down to a reasonable size without Python failing to recognize what I am asking for. For example, here is my code:
import bs4
import requests
url = requests.get('https://ballotpedia.org/Alabama_Supreme_Court')
soup = bs4.BeautifulSoup(url.text, 'html.parser')
y = soup.find('table')
print(y)
I am trying to scrape the names of the judges of the Alabama State Supreme Court, but with this code I get far too much information. I have tried things such as (on line 6)
y = soup.find('table',{'class':'wikitable sortable'})
but I get a message saying that the search finds no results.
Here is an image of the inspection of the webpage. I am aiming to get the thead to work in my code but am failing!
How can I tell Python that I want only the names of the judges?
Thank you very much!
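The simplest way is to use pandas: read_html returns a list of the tables on the page, and here the third one (index 2) has a "Judge" column.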
import pandas as pd
df = pd.read_html("https://ballotpedia.org/Alabama_Supreme_Court")[2]["Judge"]
print(df.to_list())
Output:
['Brad Mendheim', 'Kelli Wise', 'Michael Bolin', 'William Sellers', 'Sarah Stewart', 'Greg Shaw', 'Tommy Bryan', 'Jay Mitchell', 'Tom Parker']
Now, back to the original issue, since I prefer to fix the actual problem rather than switch to an alternative solution.
There is a difference between find, which returns only the first matching element, and find_all, which returns a list of all matching elements. Check the documentation.
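To make the difference concrete, here is a minimal sketch on a made-up two-table snippet. The HTML and the id values are hypothetical and are not the real Ballotpedia markup.

```python
from bs4 import BeautifulSoup

# Hypothetical miniature page, not the real Ballotpedia markup.
html = """
<table id="first"><tr><td>Office</td></tr></table>
<table id="second"><tr><td>Judge</td></tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

first = soup.find("table")        # a single Tag, or None if nothing matches
tables = soup.find_all("table")   # a list-like ResultSet of every match

print(first["id"])                # first
print([t["id"] for t in tables])  # ['first', 'second']

# When nothing matches, find returns None instead of raising an error.
print(soup.find("table", {"class": "missing"}))  # None
```

So soup.find('table') on its own always hands back the first table on the page, whichever one that happens to be.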
Import the class directly with from bs4 import BeautifulSoup instead of import bs4, so that you do not have to repeat the bs4. prefix (the DRY principle).
Let bs4 handle decoding the content, as that is one of the tasks it performs in the background. So instead of r.text, use r.content.
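As a rough illustration of what that means, the sketch below parses the same made-up page once from bytes (standing in for r.content) and once from an already decoded string (standing in for r.text). The page and the name in it are hypothetical; no request is made.

```python
from bs4 import BeautifulSoup

# Hypothetical page: the bytes stand in for r.content, the str for r.text.
page = '<html><head><meta charset="utf-8"></head><body><td>José</td></body></html>'
raw_bytes = page.encode("utf-8")

soup_from_bytes = BeautifulSoup(raw_bytes, "html.parser")
soup_from_text = BeautifulSoup(page, "html.parser")

# Given bytes, bs4 works out the encoding itself and decodes the document.
print(soup_from_bytes.find("td").text)

# original_encoding records what bs4 settled on; it is None for str input,
# because there was nothing left to decode.
print(soup_from_bytes.original_encoding)
print(soup_from_text.original_encoding)
```

Both approaches usually give the same result; passing bytes simply leaves the encoding decision to bs4 rather than to requests.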
Now we will dig into the HTML and select the table:
from bs4 import BeautifulSoup
import requests
r = requests.get("https://ballotpedia.org/Alabama_Supreme_Court")
soup = BeautifulSoup(r.content, 'html.parser')
print([item.text for item in soup.select(
"table.wikitable.sortable.jquery-tablesorter a")])
To understand the selector, read about CSS selectors.
Output:
['Brad Mendheim', 'Kelli Wise', 'Michael Bolin', 'William Sellers', 'Sarah Stewart', 'Greg Shaw', 'Tommy Bryan', 'Jay Mitchell', 'Tom Parker']
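For readers new to CSS selectors, here is a small sketch of how a selector like the one above is read. It runs on a hypothetical table, not the real page: the class names and two of the judge names are borrowed from this thread, while the "infobox" table, the links, and the second column are placeholders.

```python
from bs4 import BeautifulSoup

# Hypothetical markup for illustration only.
html = """
<table class="infobox"><tr><td><a href="/x">Not a judge</a></td></tr></table>
<table class="wikitable sortable">
  <thead><tr><th>Judge</th><th>Other</th></tr></thead>
  <tbody>
    <tr><td><a href="/Tom_Parker">Tom Parker</a></td><td>placeholder</td></tr>
    <tr><td><a href="/Greg_Shaw">Greg Shaw</a></td><td>placeholder</td></tr>
  </tbody>
</table>
"""
soup = BeautifulSoup(html, "html.parser")

# "table.wikitable.sortable" matches a <table> carrying both classes;
# the space followed by "a" matches any <a> nested anywhere inside it.
names = [a.get_text(strip=True) for a in soup.select("table.wikitable.sortable a")]
print(names)  # ['Tom Parker', 'Greg Shaw']

# A narrower variant: only links in the first cell of each body row.
first_col = [a.text for a in soup.select("table.wikitable tbody tr td:nth-of-type(1) a")]
print(first_col)  # ['Tom Parker', 'Greg Shaw']
```

The narrower form is worth considering if other columns of the real table also contain links, since the broad selector would pick those up too.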