Objectify an XML string with dashes in tag and attribute names

I am using lxml to objectify an XML string that has dashes in its tag and attribute names.

For example:

from lxml import objectify
xml_string = """<root>
                   <foo-foo name="example" foo-description="description">
                       <bar doc-name="name" />
                       <test tag="test" />
                    </foo-foo>
                </root>"""
obj = objectify.fromstring(xml_string)

After this step, the element names still contain dashes. I can’t access foo-foo as obj.foo-foo, because a dash is not valid in a Python attribute name.

How can I remove the dashes from the tag names as well as from the attribute names?
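To illustrate the problem, here is a minimal sketch using the same XML. Python reads obj.foo-foo as a subtraction, so lxml ends up looking for a child named foo. The built-in getattr is shown only for comparison; it reaches the element but does not rename anything.

```python
from lxml import objectify

xml_string = """<root>
    <foo-foo name="example" foo-description="description">
        <bar doc-name="name" />
        <test tag="test" />
    </foo-foo>
</root>"""

obj = objectify.fromstring(xml_string)

try:
    # Python parses this as (obj.foo) - foo, and there is no <foo> child.
    obj.foo-foo
    failed = False
except AttributeError:
    failed = True

# getattr takes the name as a string, so the dash is not a problem here.
child = getattr(obj, 'foo-foo')
print(failed, child.tag, child.get('foo-description'))
```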

Answer

It’s hacky, but you could do something like this to transform the - in element names to a _:

from lxml import etree
from lxml import objectify

xml_string = """<root>
                   <foo-foo name="example" foo-description="description">
                       <bar doc-name="name" />
                       <test tag="test" />
                    </foo-foo>
                </root>"""

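# Parse with plain etree first so the tags can be rewritten before objectifying.
doc = etree.fromstring(xml_string)
for el in doc.iter():
    # Comments and processing instructions have a non-string tag; skip them.
    if isinstance(el.tag, str) and '-' in el.tag:
        el.tag = el.tag.replace('-', '_')

# Serialize the modified tree and reparse it with objectify.
obj = objectify.fromstring(etree.tostring(doc))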

Note that this only renames elements; attribute names such as foo-description are left untouched. I also think there is probably a better way to get from the parsed XML document in doc to the objectified version without serializing and reparsing the XML, but this is the best I could come up with on short notice.
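One possible way to avoid the serialize-and-reparse round trip, and to cover attribute names as well, is to objectify first and then rename in place. This is only a sketch: dashes_to_underscores is a hypothetical helper name, and it assumes the document uses no namespaces (a namespaced tag looks like {uri}name, so a blind replace would also touch dashes in the URI) and that the renamed names don't collide with existing ones.

```python
from lxml import objectify

xml_string = """<root>
    <foo-foo name="example" foo-description="description">
        <bar doc-name="name" />
        <test tag="test" />
    </foo-foo>
</root>"""


def dashes_to_underscores(root):
    """Rename dashed tags and attribute names in place (hypothetical helper)."""
    for el in root.iter():
        # Comments and processing instructions have a non-string tag; skip them.
        if not isinstance(el.tag, str):
            continue
        if '-' in el.tag:
            el.tag = el.tag.replace('-', '_')
        # Copy the names first, because the attributes change inside the loop.
        for name in list(el.attrib.keys()):
            if '-' in name:
                value = el.attrib[name]
                del el.attrib[name]
                el.set(name.replace('-', '_'), value)
    return root


obj = dashes_to_underscores(objectify.fromstring(xml_string))

print(obj.foo_foo.get('foo_description'))
print(obj.foo_foo.bar.get('doc_name'))
```

Renamed attributes are re-added at the end of each element's attribute list, so their order may differ from the source if the tree is serialized again.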

User contributions licensed under: CC BY-SA