Get only the text directly inside a span with BeautifulSoup

I have this structure:

<span class="_1p7iugi">
<span class="_krjbj">Price:</span>$39</span>

and I want to get only the $39, but when I run this code:

def getListingPrice2(listing):
    return listing.find("span", {"class":"_1p7iugi"}).text

it returns:

Price: $39

How can I get only the part I want?

Answer

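.text returns the text of every descendant, so the "Price:" label from the inner span is included. One way around this is to remove the inner span from the tree with extract() before reading the text: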

from bs4 import BeautifulSoup
mainSoup = BeautifulSoup("""
<html>
<span class="_1p7iugi">
<span class="_krjbj">Price:</span>$39</span>
</html>
""", "html.parser")  # name the parser explicitly to avoid bs4's "no parser specified" warning

external_span = mainSoup.find('span')
print("1 HTML:", external_span)
print("1 TEXT:", external_span.text.strip())

unwanted = external_span.find('span')
unwanted.extract()
print("2 HTML:", external_span)
print("2 TEXT:", external_span.text.strip())

This prints:

1 HTML: <span class="_1p7iugi">
<span class="_krjbj">Price:</span>$39</span>
1 TEXT: Price:$39
2 HTML: <span class="_1p7iugi">
$39</span>
2 TEXT: $39

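So the function becomes (note that extract() modifies the parsed tree, so the inner span is no longer in listing afterwards):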

def getListingPrice2(listing):
    outer = listing.find("span", {"class":"_1p7iugi"})
    unwanted = outer.find('span')
    unwanted.extract()
    return outer.text.strip()

which returns:

$39
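
If changing the parsed tree is not desirable, a non-destructive variant is to join only the strings that are direct children of the outer span and skip nested tags. This is a minimal sketch, not the only way to do it: the function name get_listing_price_direct and the wrapping div in the sample HTML are made up for illustration, and the class names are taken from the question.

```python
from bs4 import BeautifulSoup, NavigableString


def get_listing_price_direct(listing):
    # Hypothetical helper: locate the outer span by the class from the question.
    outer = listing.find("span", {"class": "_1p7iugi"})
    # Keep only strings that are direct children; nested tags such as the
    # inner "Price:" span are Tag objects and are skipped.
    direct_text = [
        child for child in outer.children if isinstance(child, NavigableString)
    ]
    return "".join(direct_text).strip()


# Sample markup assumed for illustration (same structure as in the question).
html = (
    '<div><span class="_1p7iugi">\n'
    '<span class="_krjbj">Price:</span>$39</span></div>'
)
soup = BeautifulSoup(html, "html.parser")
price = get_listing_price_direct(soup)
print(price)
```

Because nothing is extracted, the inner span is still available afterwards if the label is needed elsewhere.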

User contributions licensed under: CC BY-SA