
Extracting enumeration values from an XML Schema with Python

Using the xmlschema package, I extracted an XsdEnumerationFacets object from an XML Schema, like the one below:

XsdEnumerationFacets(['OP1', 'OP2', 'OP3', 'OP3', 'OP4', ...])

How can I extract the possible values from it ('OP1', 'OP2', 'OP3', 'OP3', 'OP4' and so on in this case)?

One idea was to convert it to a string (str(enum)) and parse that, but if the list is long, the last elements are not included in the string.

(I have xmlschema==1.9.2 and Python 3.9.)

Example:

schema.xsd is

<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://www.java2s.com" xmlns="http://www.java2s.com"
  elementFormDefault="qualified">

    <xs:element name = "Tshirt">
       <xs:complexType>
          <xs:sequence>
             <xs:element name = "Color" type = "clothesColorType" />
             <xs:element name = "Size" type = "clothesSizeType" />
          </xs:sequence>
       </xs:complexType>
    </xs:element>

    <xs:simpleType name="clothesSizeType">
       <xs:restriction base="xs:string">
          <xs:enumeration value="S" />
          <xs:enumeration value="M" />
          <xs:enumeration value="L" />
          <xs:enumeration value="XL" />
       </xs:restriction>
    </xs:simpleType>

    <xs:simpleType name="clothesColorType">
       <xs:restriction base="xs:string">
          <xs:enumeration value="Black" />
          <xs:enumeration value="White" />
          <xs:enumeration value="Green" />
          <xs:enumeration value="Blue" />
       </xs:restriction>
    </xs:simpleType>
</xs:schema>

My code:

import xmlschema

schema = xmlschema.XMLSchema("schema.xsd")
tshirt = schema.elements["Tshirt"]

enumerate_values = {}
for c in tshirt.type.content:
    for comp in c.type.iter_components():
        if isinstance(comp, xmlschema.validators.XsdEnumerationFacets):
            enumerate_values[c.name.split("}")[1]] = str(comp)

print(enumerate_values)

That gives me this dictionary:

{'Color': "XsdEnumerationFacets(['Black', 'White', 'Green', 'Blue'])", 'Size': "XsdEnumerationFacets(['S', 'M', 'L', 'XL'])"}

Instead of "XsdEnumerationFacets(['Black', 'White', 'Green', 'Blue'])" as a value, I would like to have ['Black', 'White', 'Green', 'Blue'], and I don't want to parse the string to get it. As I mentioned, for longer value lists the last elements are replaced by an ellipsis (...), so parsing the string would give me a wrong or partial result.


Answer

    import xmlschema
    
    schema = xmlschema.XMLSchema("schema.xsd")
    tshirt = schema.elements["Tshirt"]
    
    enumerate_values = {}
    for c in tshirt.type.content:
        for comp in c.type.iter_components():
            if isinstance(comp, xmlschema.validators.XsdEnumerationFacets):
                enumerate_values[c.local_name] = [x.get("value") for x in comp]
    
    print(enumerate_values)

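The code above iterates over each XsdEnumerationFacets, which yields the underlying xs:enumeration elements, and reads the value attribute of each one, so nothing is truncated. It prints:

{'Color': ['Black', 'White', 'Green', 'Blue'], 'Size': ['S', 'M', 'L', 'XL']}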
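As a cross-check that does not depend on xmlschema at all, the same values can be read straight from the XSD with the standard library. This is only a minimal sketch of a different technique (plain xml.etree.ElementTree parsing, not the xmlschema API), and it assumes the simple layout of the example schema: named top-level simple types referenced through the type attribute, with no imports, anonymous types, or derived restrictions. The function name enumerations_by_element is made up for illustration.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"

# The example schema from the question, inlined so the sketch is self-contained.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
  targetNamespace="http://www.java2s.com" xmlns="http://www.java2s.com"
  elementFormDefault="qualified">
  <xs:element name="Tshirt">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Color" type="clothesColorType"/>
        <xs:element name="Size" type="clothesSizeType"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
  <xs:simpleType name="clothesSizeType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="S"/>
      <xs:enumeration value="M"/>
      <xs:enumeration value="L"/>
      <xs:enumeration value="XL"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="clothesColorType">
    <xs:restriction base="xs:string">
      <xs:enumeration value="Black"/>
      <xs:enumeration value="White"/>
      <xs:enumeration value="Green"/>
      <xs:enumeration value="Blue"/>
    </xs:restriction>
  </xs:simpleType>
</xs:schema>"""


def enumerations_by_element(schema_root, element_name):
    """Hypothetical helper: map child element names to their enumeration values."""
    # Collect the enumeration values of every named top-level simple type.
    type_values = {}
    for simple_type in schema_root.findall(XS + "simpleType"):
        values = [e.get("value") for e in simple_type.iter(XS + "enumeration")]
        if values:
            type_values[simple_type.get("name")] = values

    # Find the top-level element declaration with the requested name.
    parent = None
    for element in schema_root.findall(XS + "element"):
        if element.get("name") == element_name:
            parent = element
            break
    if parent is None:
        return {}

    # Resolve the type of each nested element declaration.
    result = {}
    for child in parent.iter(XS + "element"):
        if child is parent:
            continue
        # Drop a namespace prefix such as "tns:" if one is present.
        type_name = (child.get("type") or "").split(":")[-1]
        if type_name in type_values:
            result[child.get("name")] = type_values[type_name]
    return result


root = ET.fromstring(SCHEMA)
print(enumerations_by_element(root, "Tshirt"))
# {'Color': ['Black', 'White', 'Green', 'Blue'], 'Size': ['S', 'M', 'L', 'XL']}
```

For anything beyond a flat schema like this one, the xmlschema-based loop above is the safer choice, because the library resolves namespaces and type references for you.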

User contributions licensed under: CC BY-SA