I need to extract the value of an attribute in an XML document using Python.
For example, if I have an XML document like this:
    <xml>
      <child type = "smallHuman"/>
      <adult type = "largeHuman"/>
    </xml>
How can I get the text ‘smallHuman’ or ‘largeHuman’ and store it in a variable?
Edit: I’m very new to Python and may require a lot of assistance.
This is what I’ve tried so far:
    #! /usr/bin/python

    import xml.etree.ElementTree as ET


    def walkTree(node):
        print node.tag
        print node.keys()
        print node.attributes[]
        for cn in list(node):
            walkTree(cn)

    treeOne = ET.parse('tm1.xml')
    treeTwo = ET.parse('tm3.xml')

    walkTree(treeOne.getroot())
Due to the way this script will be used, I cannot hard-code the XML into the .py file.
Answer
With ElementTree, you can use the find method to locate an element and then read the attribute value from its attrib dictionary.
Example:
    import xml.etree.ElementTree as ET

    z = """<xml>
    <child type = "smallHuman"/>
    <adult type = "largeHuman"/>
    </xml>"""

    # fromstring() returns the root element (<xml>) directly
    treeOne = ET.fromstring(z)
    # find() returns the first matching child; attrib is a dict of its attributes
    print(treeOne.find('./child').attrib['type'])
    print(treeOne.find('./adult').attrib['type'])
Output:
    smallHuman
    largeHuman
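Since the question says the XML cannot be hard-coded, here is a minimal sketch of the same idea applied to a parsed file instead of a string. The helper name collect_attribute is hypothetical, and the in-memory BytesIO object only stands in for a real file such as 'tm1.xml' so the snippet runs on its own. It uses iter() to visit every element and get() to read the attribute, which returns None rather than raising KeyError when the attribute is missing.

```python
import io
import xml.etree.ElementTree as ET


def collect_attribute(source, name):
    """Return (tag, value) pairs for every element that has attribute `name`.

    `source` may be a filename or an open file object, as accepted by ET.parse.
    (Hypothetical helper, not part of ElementTree.)
    """
    root = ET.parse(source).getroot()
    pairs = []
    # iter() visits the root and all of its descendants in document order.
    for element in root.iter():
        value = element.get(name)  # None if the attribute is missing
        if value is not None:
            pairs.append((element.tag, value))
    return pairs


# Stand-in for a real file; a path string such as 'tm1.xml' works the same way.
sample = io.BytesIO(
    b'<xml><child type="smallHuman"/><adult type="largeHuman"/></xml>'
)
types = collect_attribute(sample, "type")
print(types)
```

Each value in the returned list is an ordinary string, so it can be assigned to a variable or compared directly.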