BeautifulSoup extract certain information after located text?

razodiac

New Member

Join Date: Aug 2021

Posts: 1
#1


Aug 8 '21, 07:29 AM

Given this HTML from a page:

Code:

<div class="peoples-info">
  <ul>
    <li><strong>Gender:</strong> F</li>
    <li><strong>Birthdate:</strong> 00/00/2000</li>
    <li><strong>Family Phone:</strong> 000-000-0000</li>
    <li><strong>Personal Phone:</strong> 000-000-0000</li>
  </ul>
</div>
</div>
<div>

I wanted to extract the values using BeautifulSoup's find_next function, but I could only get it to work for tables, such as:

Code:

for gender in soup.find('td', text='gender:'):
    print(gender.find_next("td").text)

This does not work with the div markup when I replace "td" with "li". Also, the label and the value are on the same line, with only the formatting differing (the label is bold). Is there a way to extract only the values, such as phone numbers and birthdates (e.g. "000-000-0000"), without their labels? Thanks!

SioSio

Contributor

Join Date: Dec 2019
Posts: 272

Aug 16 '21, 07:19 AM

Here is a brute-force way:

Code:

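# Assumes `soup` is a BeautifulSoup object built from the page's HTML.
peoplesinfo = soup.find('div', class_='peoples-info')
for element in peoplesinfo.find_all("li"):
    # The <strong> tag holds the label, e.g. "Gender:"
    label = element.find("strong")
    # Drop the label text once and trim the leftover whitespace
    print(element.text.replace(label.text, '', 1).strip())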
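
If you only need particular fields, another option is to find the <strong> label by its text and read the text node right after it. This is a rough sketch rather than a tested solution for your real page: the `html` string is just the snippet from the question, `value_after` is a helper name I made up, and it assumes the value is always a plain text node directly following the <strong> tag.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; a real page would be fetched or loaded instead.
html = """
<div class="peoples-info">
  <ul>
    <li><strong>Gender:</strong> F</li>
    <li><strong>Birthdate:</strong> 00/00/2000</li>
    <li><strong>Family Phone:</strong> 000-000-0000</li>
    <li><strong>Personal Phone:</strong> 000-000-0000</li>
  </ul>
</div>
"""

soup = BeautifulSoup(html, "html.parser")


def value_after(soup, label):
    """Hypothetical helper: return the text following <strong>label</strong>, or None."""
    tag = soup.find("strong", string=label)
    if tag is None or tag.next_sibling is None:
        return None
    # next_sibling is the text node after </strong>, e.g. " 00/00/2000"
    return str(tag.next_sibling).strip()


print(value_after(soup, "Birthdate:"))
print(value_after(soup, "Personal Phone:"))

# Collect every label/value pair in the block into a dict.
info = {
    li.strong.get_text(strip=True).rstrip(":"): str(li.strong.next_sibling).strip()
    for li in soup.select("div.peoples-info li")
    if li.strong is not None and li.strong.next_sibling is not None
}
print(info)
```

With the sample markup this prints the birthdate and the personal phone number without their labels. The label match is exact and case-sensitive, so "Birthdate:" has to include the colon just as it appears in the page.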
