I have a list of order details in a CSV, and want to join all items from the lines together on one order.
Example date is:
Order|Line|Item|Price
123456789|1|IK123456|199.99
654987321|1|MASGE12385|29.95
654987321|2|KLEAN458792|9.99
654987321|3|LP12489|1959.95
I want everything to be listed in an XML with the root as the Order Number, Child as the Line Number and Sub-Children as Item and Price.
I want the output to look like:
<Order number = "123456789">
<Line number = "1">
<Item>IK123456</Item>
<Price>199.99</Price>
</Line>
</Order>
<Order number = "654987321">
<Line = "1">
<Item>MASGE12385</Item>
<Price>29.95</Price>
</Line>
<Line = "2">
<Item>KLEAN458792</Item>
<Price>9.99</Price>
</Line>
<Line = "3">
<Item>LP12489</Item>
<Price>1959.95</Price>
</Line>
</Order>
Here is my code:
import csv
import xml.etree.ElementTree as ET
file = 'C:/github.txt'
with open (file, 'r') as f:
reader = csv.reader(f, delimiter = '|')
header = next(reader)
order_num = reader[0]
root = ET.Element("Order") #BUILD A ROOT FOR THE XML TREE
root.set('number', order_num) #ADD ATTRIBUTE
for row in reader: #ITERATE THROUGH EACH ROW AND POPULATE DATA IN BRANCHES OF XML TREE
line = ET.SubElement(root, 'line', number= reader[1])
item = ET.SubElement(line, 'item code')
item.text = reader[2]
price = ET.SubElement(line, 'price')
price.text = reader[3]
tree = ET.ElementTree(root)
tree.write('C:/github.xml', encoding = 'utf-8', xml_declaration = True)
(NOTE: I moved something and got an error, but not sure what happened)
Advertisement
Answer
During loop, consider keeping a tracker on Number
to conditionally decide to create an element and keep related underlying items together. Additionally, consider csv.DictReader
to iterate csv rows as a dictionary which takes first row headers as keys. Finally, use the built-in minidom
to pretty print output. Below will incorporate all XML items under the single <Orders>
root:
import csv
import xml.etree.ElementTree as ET
import xml.dom.minidom as mn
file = 'C:/github.txt'
curr_order = None
with open (file, 'r') as f:
reader = csv.DictReader(f, delimiter = '|')
# BUILD A ROOT FOR THE XML TREE
root = ET.Element("Orders")
# ITERATE THROUGH EACH ROW AS DICTIONARY
for d in reader:
# CONDITIONALLY BUILD ORDER ELEMENT
if curr_order != str(d['Order']):
orderElem = ET.SubElement(root, "Order")
curr_order = str(d['Order'])
# CREATE DESCENDANTS OF ORDER
orderElem.set('number', str(d['Order']))
line = ET.SubElement(orderElem, 'line', number = str(d['Line']))
ET.SubElement(line, 'item_code').text = str(d['Item'])
ET.SubElement(line, 'price').text = str(d['Price'])
# PRETTY PRINT OUTPUT
dom = mn.parseString(ET.tostring(root, encoding = 'utf-8'))
with open('C:/github.xml', 'wb') as f:
f.write(dom.toprettyxml(indent=" ", encoding = 'utf-8'))