I am using lxml to objectify an XML string that has dashes in its tag names.
For example:
    from lxml import objectify
    xml_string = """<root>
    <foo-foo name="example" foo-description="description">
    <bar doc-name="name" />
    <test tag="test" />
    </foo-foo>
    </root>"""
    obj = objectify.fromstring(xml_string)
After this step, the element names still contain dashes.
I can't access foo-foo with attribute syntax because of the dash in its name.
How can I remove the dashes from tag names as well as from attribute names?
Answer
It's hacky, but you could do something like this to transform the - in element names to a _:
    from lxml import etree
    from lxml import objectify

    xml_string = """<root>
    <foo-foo name="example" foo-description="description">
    <bar doc-name="name" />
    <test tag="test" />
    </foo-foo>
    </root>"""

    doc = etree.fromstring(xml_string)
    for tag in doc.iter():
        if '-' in tag.tag:
            tag.tag = tag.tag.replace('-', '_')

    obj = objectify.fromstring(etree.tostring(doc))
In particular, I think there is probably a better way to go from the parsed XML document in doc to the objectified version without dumping and reparsing the XML, but this is the best I could come up with on short notice.
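One possible way to skip the dump-and-reparse step, and to cover the attribute names the question also asks about, is to objectify first and then rename in place. The sketch below is only an illustration: the helper name objectify_without_dashes is hypothetical, and it assumes the document uses no XML namespaces (a plain replace would also touch dashes inside a namespace URI).

```python
from lxml import objectify

XML_STRING = """<root>
  <foo-foo name="example" foo-description="description">
    <bar doc-name="name" />
    <test tag="test" />
  </foo-foo>
</root>"""


def objectify_without_dashes(xml_text):
    """Objectify xml_text, then rename tags and attributes in place.

    Hypothetical helper; assumes the document has no XML namespaces.
    """
    root = objectify.fromstring(xml_text)
    for element in root.iter():
        # Comments and processing instructions have a non-string .tag.
        if not isinstance(element.tag, str):
            continue
        if "-" in element.tag:
            element.tag = element.tag.replace("-", "_")
        # Copy the names first, because the attributes change in the loop.
        for name in list(element.attrib):
            if "-" in name:
                value = element.attrib.pop(name)
                element.set(name.replace("-", "_"), value)
    return root


obj = objectify_without_dashes(XML_STRING)
print(obj.foo_foo.get("foo_description"))  # prints "description"
print(obj.foo_foo.bar.get("doc_name"))     # prints "name"
```

If the names do not actually need to change, getattr(obj, 'foo-foo') should also reach the element without any renaming.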