Using "getElementsByTagName" to get tag in Python

My XML file is:

<list>
  <ProfileDefinition>
    <string name="ID">nCGhwaZNpy6</string>
    <string name="name">02.11.2013 Scott Mobile</string>
    <decimal name="AccountID">10954</decimal>
    <decimal name="TimeZoneID">-600</decimal>
  </ProfileDefinition>
  <ProfileDefinition>
    <string name="ID">9JsG57bRUu6</string>
    <string name="name">Huggies US-EN &amp; CA-EN Test Town Responsive - Prod</string>
    <decimal name="AccountID">10954</decimal>
    <decimal name="TimeZoneID">-600</decimal>
  </ProfileDefinition>
  <ProfileDefinition>
    <string name="ID">I3CJQ4gDkK6</string>
    <string name="name">Huggies US-EN Brand Desktop - Prod</string>
    <decimal name="AccountID">10954</decimal>
    <decimal name="TimeZoneID">-600</decimal>
  </ProfileDefinition>
</list>

My code is:

import urllib2
from xml.dom.minidom import parseString

theurl = 'https://ws.webtrends.com/v2/ReportService/profiles/?format=xml'
pagehandle = urllib2.urlopen(theurl)

file = pagehandle
data = file.read()
file.close()

dom = parseString(data)
xmlTag = dom.getElementsByTagName('string name="ID"')[0].toxml()
xmlData = xmlTag.replace('<string name="ID">', '').replace('</string>', '')

print xmlTag
print xmlData

I want to get the value of the element with tag name string name="ID", but I get this error:

Traceback (most recent call last):
  File "C:\Users\Vaibhav\Desktop\Webtrends\test.py", line 43, in <module>
    xmlTag = dom.getElementsByTagName('string name="ID"')[0].toxml()
IndexError: list index out of range

If I replace

dom.getElementsByTagName('string name="ID"')[0].toxml()

with

dom.getElementsByTagName('string')[0].toxml()

the output is

"nCGhwaZNpy6"

since that is the first element of the list. But the second element is

"02.11.2013 Scott Mobile"

which also ends up in the list, and I don't want that.

There are two kinds of string tags, one with name="ID" and one with name="name". How do I access only the string tags with name="ID"?

Answer

string name="ID" is not a tag name. Only string is the tag name; name="ID" is an attribute of that tag.
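
To make the cause of the IndexError concrete, here is a minimal sketch. It is written in Python 3 syntax, and the inline sample string is an assumption standing in for the downloaded XML. getElementsByTagName compares only the element name, so a string that also contains an attribute matches nothing, and indexing the empty result with [0] fails.

```python
from xml.dom.minidom import parseString

# Shortened sample standing in for the downloaded XML (an assumption for illustration).
sample = """<list>
  <ProfileDefinition>
    <string name="ID">nCGhwaZNpy6</string>
    <string name="name">02.11.2013 Scott Mobile</string>
  </ProfileDefinition>
</list>"""

dom = parseString(sample)

# No element is literally named 'string name="ID"', so the result is empty
# and [0] on it would raise IndexError.
no_match = dom.getElementsByTagName('string name="ID"')
print(len(no_match))

# The tag name alone matches every <string>, regardless of its attributes.
all_strings = dom.getElementsByTagName('string')
print(len(all_strings))
```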

You have to compare the value of the name attribute for each string tag.

....
dom = parseString(data)
for s in dom.getElementsByTagName('string'):
    if s.getAttribute('name') == 'ID':
        print s.childNodes[0].data
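
If you want the IDs collected in a list rather than printed, the same loop can be wrapped in a small function. This is only a sketch in Python 3 syntax: get_profile_ids is a hypothetical helper name, and the inline sample is an assumption standing in for the data read from the URL.

```python
from xml.dom.minidom import parseString


def get_profile_ids(xml_text):
    """Return the text of every <string> element whose name attribute is 'ID'."""
    dom = parseString(xml_text)
    ids = []
    for node in dom.getElementsByTagName('string'):
        # getAttribute returns '' when the attribute is missing, so the
        # comparison is safe; firstChild is None for an empty element.
        if node.getAttribute('name') == 'ID' and node.firstChild is not None:
            ids.append(node.firstChild.data)
    return ids


# Shortened sample standing in for the downloaded XML (an assumption for illustration).
sample = """<list>
  <ProfileDefinition>
    <string name="ID">nCGhwaZNpy6</string>
    <string name="name">02.11.2013 Scott Mobile</string>
  </ProfileDefinition>
  <ProfileDefinition>
    <string name="ID">9JsG57bRUu6</string>
    <string name="name">Huggies US-EN Brand Desktop - Prod</string>
  </ProfileDefinition>
</list>"""

print(get_profile_ids(sample))
```

The name values such as "02.11.2013 Scott Mobile" are skipped because their name attribute is not 'ID'.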

I recommend using lxml or BeautifulSoup.

The following is equivalent code using lxml (the cssselect method also needs the cssselect package installed):

import lxml.html
dom = lxml.html.fromstring(data)
for s in dom.cssselect('string[name=ID]'):
    print s.text
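
If installing a third-party package is not an option, a similar attribute filter can probably be done with the standard library's xml.etree.ElementTree. That is a different module from the ones used above, so treat this as a sketch rather than a drop-in replacement. It uses Python 3 syntax, and the inline sample is an assumption standing in for the downloaded data.

```python
import xml.etree.ElementTree as ET

# Shortened sample standing in for the downloaded XML (an assumption for illustration).
sample = """<list>
  <ProfileDefinition>
    <string name="ID">nCGhwaZNpy6</string>
    <string name="name">02.11.2013 Scott Mobile</string>
  </ProfileDefinition>
  <ProfileDefinition>
    <string name="ID">I3CJQ4gDkK6</string>
    <string name="name">Huggies US-EN Brand Desktop - Prod</string>
  </ProfileDefinition>
</list>"""

root = ET.fromstring(sample)

# XPath-style predicate: every <string> descendant whose name attribute equals 'ID'.
ids = [el.text for el in root.findall(".//string[@name='ID']")]
print(ids)
```

Here the attribute test lives in the path expression itself, so no explicit if is needed inside the loop.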
