Ovidiu Diaconu :
<div id="b_detalii_caracteristici" class="margin-boxes"> <h2 class="titlu-box special-caracteristici">Caracteristici</h2> <div class="row"> <div class="col-lg-6 col-md-6 col-sm-6"> <ul class="lista-tabelara"> <li>Nr. camere:<span>2</span></li> <li>Suprafaţă utilă:<span>44 mp</span></li> <li>Suprafaţă construită:<span>44 mp</span></li> <li>Compartimentare:<span>decomandat</span></li> <li>Confort:<span>lux</span></li> <li>Etaj:<span>Etaj 1 / 8</span></li> <li>Nr. bucătării:<span>1</span></li> <li>Nr. băi:<span>1</span></li> </ul> </div> <div class="col-lg-6 col-md-6 col-sm-6"> <ul class="lista-tabelara mobile-list"> <li>An construcţie:<span>2019</span></li> <li>Structură rezistenţă:<span>beton</span></li> <li>Tip imobil:<span>bloc de apartamente</span></li> <li>Regim înălţime:<span>P+8E</span></li> <li>Nr. balcoane:<span>1</span></li> </ul> </div> </div></div>
Given the above structure, I need a way to parse it and store each of the li values in a separate variable, i.e.:
if string == "Nr. camere:":
    var1 = 2
elif string == "Suprafata utila:":
    var2 = "44 mp"
and so on...
I have tried:
property_detail.find_all('div', id="b_detalii_caracteristici")[0].find_all('ul', class_='lista-tabelara')[0].find_all("li")[0]
This gives me the first li element; I would need to parse the remaining ones in a for loop, but I'm stuck here. Thanks for the support.
Ahmed Soliman :
There is a very useful attribute for that called .contents, which returns a list containing a tag’s children:
from bs4 import BeautifulSoup
html = '''<div id='b_detalii_caracteristici'>
<ul class="lista-tabelara">
<li>
"Nr. camere:"
<span>2</span>
</li>
<li>
"Suprafata utila:"
<span>44mp</span>
</li>
</ul>
</div>'''
soup = BeautifulSoup(html, 'html.parser')
lis = soup.select('#b_detalii_caracteristici ul.lista-tabelara li')
for li in lis:
    # .contents lists the li's direct children: the label text first, then the <span>
    li_content = li.contents
    li_text = li_content[0].strip()
    span_text = li_content[1].text
    print('li_content ==> ', li_content)
    print('li_text ==> ', li_text)
    print('span_text ==>', span_text)
Output:
li_content ==> ['\n "Nr. camere:"\n ', <span>2</span>, '\n']
li_text ==> "Nr. camere:"
span_text ==> 2
li_content ==> ['\n "Suprafata utila:"\n ', <span>44mp</span>, '\n']
li_text ==> "Suprafata utila:"
span_text ==> 44mp
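Since the question asks for one variable per label, a dictionary keyed by the label text may be more convenient than a chain of if/elif tests. The following is only a minimal sketch under a few assumptions: the markup is a shortened copy of the HTML from the question, and parse_features and features are hypothetical names chosen for illustration, not part of BeautifulSoup.

```python
from bs4 import BeautifulSoup

# Shortened copy of the markup from the question (assumption: the real page
# follows the same <li>label:<span>value</span></li> pattern throughout).
html = '''<div id="b_detalii_caracteristici" class="margin-boxes">
<ul class="lista-tabelara">
<li>Nr. camere:<span>2</span></li>
<li>Suprafaţă utilă:<span>44 mp</span></li>
</ul>
<ul class="lista-tabelara mobile-list">
<li>An construcţie:<span>2019</span></li>
</ul>
</div>'''


def parse_features(markup):
    """Hypothetical helper: map each li label to the text of its span."""
    soup = BeautifulSoup(markup, 'html.parser')
    features = {}
    for li in soup.select('#b_detalii_caracteristici ul.lista-tabelara li'):
        span = li.find('span')
        if span is None:
            # Skip items that do not follow the label/span pattern.
            continue
        # recursive=False keeps only the li's own text nodes, not the span's text.
        label = ''.join(li.find_all(string=True, recursive=False))
        # Drop surrounding whitespace and the trailing colon from the label.
        features[label.strip().rstrip(':')] = span.get_text(strip=True)
    return features


features = parse_features(html)
print(features)
```

Each value can then be looked up by its label, for example features['Nr. camere'], and both ul blocks are covered because the selector matches every list carrying the lista-tabelara class.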