Tag: beautifulsoup

Walmart Price Scraping with Python 3

I am very new to this concept, but I am trying to learn how to use Python to manipulate HTML data. I wrote a Python (ver. 3.4.1) script that fetches the URL and returns some information, which I parse using BeautifulSoup (ver. 4). In this example, I am attempting to obtain the price of the Xbox One. I chose this

Beautifulsoup sibling structure with br tags

I’m trying to parse an HTML document using the BeautifulSoup Python library, but the structure is getting distorted by <br> tags. Let me just give you an example. Input HTML: HTML that BeautifulSoup interprets: In the source, the spans could be considered siblings. After parsing (using the default parser), the spans are suddenly no longer siblings, as the br tags

Extracting XML Attributes

I have an XML file with several thousand records in it in the form of: How can I convert this into a CSV or tab-delimited file? I know I can hard-code it in Python using re.compile() statements, but there has to be something easier and more portable across different XML file layouts. I’ve found a couple threads here about attribs,
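
One possible way to approach the question above without regular expressions is to let an XML parser read the attributes and hand them to the csv module. The sketch below uses only the standard library (xml.etree.ElementTree and csv). The record layout is not shown in the excerpt, so the sample XML, the tag name "record", and the helper name attributes_to_csv are all assumptions made for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data: the real record layout is not shown in the excerpt.
SAMPLE_XML = """<records>
  <record id="1" name="alpha" price="9.99"/>
  <record id="2" name="beta" color="red"/>
</records>"""


def attributes_to_csv(xml_text, record_tag, delimiter=","):
    """Return delimited text with one row per <record_tag> element.

    Columns are the attribute names, collected in first-seen order, so
    records with differing attribute sets still fit in one table.
    """
    root = ET.fromstring(xml_text)
    rows = [dict(elem.attrib) for elem in root.iter(record_tag)]

    # Build the header from every attribute name that appears anywhere.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    out = io.StringIO()
    writer = csv.DictWriter(
        out,
        fieldnames=fieldnames,
        delimiter=delimiter,
        restval="",            # leave missing attributes empty
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


csv_text = attributes_to_csv(SAMPLE_XML, "record")
tsv_text = attributes_to_csv(SAMPLE_XML, "record", delimiter="\t")
print(csv_text)
```

Because the header is built from whatever attribute names appear, the same function should carry over to other layouts by changing only the tag name. For a file on disk, ET.parse(path).getroot() could replace ET.fromstring.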

Extracting an attribute value with beautifulsoup

I am trying to extract the content of a single “value” attribute in a specific “input” tag on a webpage. I use the following code: I get TypeError: list indices must be integers, not str. Even though, from the BeautifulSoup documentation, I understand that strings should not be a problem here… but I am no specialist, and I may have
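
The code from the question is not included in the excerpt, so the cause can only be guessed at. That TypeError is what Python raises when a list is indexed with a string, which would fit calling find_all() (it returns a list-like ResultSet) and then indexing the result with "value". A minimal sketch under that assumption, with made-up HTML and field names:

```python
from bs4 import BeautifulSoup

# Hypothetical markup: the real page from the question is not shown.
HTML = """
<form>
  <input name="q" value="xbox one">
  <input name="page" value="2">
</form>
"""

soup = BeautifulSoup(HTML, "html.parser")

# find_all() returns a list-like ResultSet, so indexing it with a string,
# e.g. soup.find_all("input")["value"], fails with a TypeError.
matches = soup.find_all("input", attrs={"name": "q"})

# find() returns a single Tag (or None), and a Tag can be indexed
# by attribute name.
tag = soup.find("input", attrs={"name": "q"})
value = tag["value"] if tag is not None else None

# To read the attribute from every match, loop over the ResultSet.
# Tag.get() returns None instead of raising when the attribute is absent.
all_values = [t.get("value") for t in soup.find_all("input")]

print(value)
print(all_values)
```

Indexing the first match, as in matches[0]["value"], would also work when at least one element is found.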
