Narrowing down what I am scraping from a website using python

Anterthorp :

I am practicing web scraping with Python, but I am having trouble narrowing the result down to a reasonable size: whenever I make the search more specific, Python no longer finds what I am asking for. For example, here is my code:

import bs4
import requests

url = requests.get('https://ballotpedia.org/Alabama_Supreme_Court')
soup = bs4.BeautifulSoup(url.text, 'html.parser')
y = soup.find('table')
print(y)

I am trying to scrape the names of the judges of the Alabama State Supreme Court, but with this code, I get far too much information. I have tried things such as (in row 6)

y = soup.find('table', {'class': 'wikitable sortable'})

but I get a message saying that the search finds no results.

Here is an image of the inspection of the webpage. I am aiming to get the thead to work in my code but am failing!

How can I specify to python that I want only the names of the judges?

Thank you very much!

αԋɱҽԃ αмєяιcαη :

The simplest way is to do it like this:

import pandas as pd

df = pd.read_html("https://ballotpedia.org/Alabama_Supreme_Court")[2]["Judge"]

print(df.to_list())

Output:

['Brad Mendheim', 'Kelli Wise', 'Michael Bolin', 'William Sellers', 'Sarah Stewart', 'Greg Shaw', 'Tommy Bryan', 'Jay Mitchell', 'Tom Parker']

Now, back to the original issue, since I personally prefer to fix the real problem rather than switch to an alternative solution.

There is a difference between find, which returns only the first matching element, and find_all, which returns a list of all matching elements. Check the documentation.
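
To make the difference concrete, here is a minimal sketch. The HTML string is made up for illustration (it is not the Ballotpedia markup), so treat it as a toy example rather than what the real page returns:

```python
from bs4 import BeautifulSoup

# Hypothetical HTML with two tables; not taken from the real page.
html = """
<table id="first"><tr><td>A</td></tr></table>
<table id="second"><tr><td>B</td></tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# find() gives back a single element: the first match, or None if nothing matches.
first = soup.find("table")

# find_all() gives back a list-like collection with every match.
all_tables = soup.find_all("table")

print(first["id"])
print([t["id"] for t in all_tables])
```

So a bare soup.find('table') only ever hands you the first table on the page, which may not be the one you want.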

Import the class directly with from bs4 import BeautifulSoup instead of import bs4, so you don't have to repeat the bs4. prefix every time (the DRY principle).

Let bs4 handle decoding the content, as that is one of its tasks in the background. So instead of r.text, use r.content.
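
As a rough illustration of that point: r.content is raw bytes, and BeautifulSoup accepts bytes and works out the encoding on its own. The sketch below uses a hand-written byte string as a stand-in for r.content (an assumption, so no network request is needed):

```python
from bs4 import BeautifulSoup

# Stand-in for r.content: raw bytes in ISO-8859-1, where \xe9 is "é".
# This byte string is invented for the example.
raw = (
    b'<html><head><meta charset="iso-8859-1"></head>'
    b"<body><p>caf\xe9</p></body></html>"
)

# BeautifulSoup is given bytes, so it decodes them to text itself.
soup = BeautifulSoup(raw, "html.parser")

text = soup.p.get_text()
print(text)
print(soup.original_encoding)  # the encoding BeautifulSoup settled on
```

With r.text, requests has already decoded the bytes using its own guess before BeautifulSoup sees them.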

Now we will dig into the HTML to select it:

from bs4 import BeautifulSoup
import requests

r = requests.get("https://ballotpedia.org/Alabama_Supreme_Court")
soup = BeautifulSoup(r.content, 'html.parser')


print([item.text for item in soup.select(
    "table.wikitable.sortable.jquery-tablesorter a")])

To understand the selector used here, read up on CSS selectors.

Output:

['Brad Mendheim', 'Kelli Wise', 'Michael Bolin', 'William Sellers', 'Sarah Stewart', 'Greg Shaw', 'Tommy Bryan', 'Jay Mitchell', 'Tom Parker']
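
A note on why the CSS selector behaves differently from the find call in the question. The sketch below uses invented markup and placeholder names, and assumes the table carries an extra class beyond the two that were searched for. A class string containing spaces is compared against the whole class attribute, while a CSS selector only requires that each listed class is present:

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on a wiki-style table.
# The names and links are placeholders, not real data.
html = """
<table class="wikitable sortable jquery-tablesorter">
  <tr><th>Judge</th></tr>
  <tr><td><a href="/x">Judge One</a></td></tr>
  <tr><td><a href="/y">Judge Two</a></td></tr>
</table>
<table class="infobox"><tr><td><a href="/z">Other link</a></td></tr></table>
"""

soup = BeautifulSoup(html, "html.parser")

# A string with spaces must equal the full class attribute value,
# so listing only two of the three classes matches nothing.
exact = soup.find("table", {"class": "wikitable sortable"})

# The selector matches any table that has both classes (extras are fine),
# then takes every <a> inside it.
names = [a.get_text(strip=True) for a in soup.select("table.wikitable.sortable a")]

print(exact)
print(names)
```

If the real page serves its table with an additional class, that would explain why the search in the question found no results.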

