I have a program that reads an XML document from a socket. I have the XML document stored in a string which I would like to convert directly to a Python dictionary, the same way it is done in Django’s simplejson
library.
Take as an example:
str ="<?xml version="1.0" ?><person><name>john</name><age>20</age></person" dic_xml = convert_to_dic(str)
Then dic_xml
would look like {'person' : { 'name' : 'john', 'age' : 20 } }
Advertisement
Answer
This is a great module that someone created. I’ve used it several times. http://code.activestate.com/recipes/410469-xml-as-dictionary/
Here is the code from the website just in case the link goes bad.
from xml.etree import cElementTree as ElementTree class XmlListConfig(list): def __init__(self, aList): for element in aList: if element: # treat like dict if len(element) == 1 or element[0].tag != element[1].tag: self.append(XmlDictConfig(element)) # treat like list elif element[0].tag == element[1].tag: self.append(XmlListConfig(element)) elif element.text: text = element.text.strip() if text: self.append(text) class XmlDictConfig(dict): ''' Example usage: >>> tree = ElementTree.parse('your_file.xml') >>> root = tree.getroot() >>> xmldict = XmlDictConfig(root) Or, if you want to use an XML string: >>> root = ElementTree.XML(xml_string) >>> xmldict = XmlDictConfig(root) And then use xmldict for what it is... a dict. ''' def __init__(self, parent_element): if parent_element.items(): self.update(dict(parent_element.items())) for element in parent_element: if element: # treat like dict - we assume that if the first two tags # in a series are different, then they are all different. if len(element) == 1 or element[0].tag != element[1].tag: aDict = XmlDictConfig(element) # treat like list - we assume that if the first two tags # in a series are the same, then the rest are the same. else: # here, we put the list in dictionary; the key is the # tag name the list elements all share in common, and # the value is the list itself aDict = {element[0].tag: XmlListConfig(element)} # if the tag has attributes, add those to the dict if element.items(): aDict.update(dict(element.items())) self.update({element.tag: aDict}) # this assumes that if you've got an attribute in a tag, # you won't be having any text. This may or may not be a # good idea -- time will tell. It works for the way we are # currently doing XML configuration files... elif element.items(): self.update({element.tag: dict(element.items())}) # finally, if there are no child tags and no attributes, extract # the text else: self.update({element.tag: element.text})
Example usage:
tree = ElementTree.parse('your_file.xml') root = tree.getroot() xmldict = XmlDictConfig(root)
//Or, if you want to use an XML string:
root = ElementTree.XML(xml_string) xmldict = XmlDictConfig(root)