Skip to content
Advertisement

How to read specific data and write to csv file

I have data in xml file and I am reading 3 columns : price , name , calories

xml data

<?xml version='1.0' encoding='utf-8'?>
<data>
  <row>
    <index>0</index>
    <price>$5.95</price>
    <name>Belgian Waffles</name>
    <desc>Two of our famous Belgian Waffles with plenty of real maple syrup</desc>
    <calories>650</calories>
  </row>
  <row>
    <index>1</index>
    <price>$7.95</price>
    <name>Strawberry Belgian Waffles</name>
    <desc>Light Belgian waffles covered with strawberries and whipped cream</desc>
    <calories>900</calories>
  </row>
  <row>
    <index>2</index>
    <price>$8.95</price>
    <name>Berry-Berry Belgian Waffles</name>
    <desc>Light Belgian waffles covered with an assortment of fresh berries and whipped cream</desc>
    <calories>900</calories>
  </row>
  <row>
    <index>3</index>
    <price>$4.50</price>
    <name>French Toast</name>
    <desc>Thick slices made from our homemade sourdough bread</desc>
    <calories>600</calories>
  </row>
  <row>
    <index>4</index>
    <price>$6.95</price>
    <name>Homestyle Breakfast</name>
    <desc>Two eggs, bacon or sausage, toast, and our ever-popular hash browns</desc>
    <calories>950</calories>
  </row>
</data>

Code :

import xml.etree.ElementTree as ET

parse_xml = ET.parse('/content/sample_data/xyz.xml')
get_root_element = parse_xml.getroot()


for data in get_root_element.findall('row'):

  prc = data.find('price')
  nm  = data.find('name')
  cal = data.find('calories')
  temp = prc.text + ',' +  nm.text + ',' + cal.text
  print(temp)

The above code gives me data but need to store this data to csv file

How do I need to write logic for this. Is it possible to do with pandas / csv

Need to add my headers as well to that csv file

Headers : price , name , calories

Advertisement

Answer

Solution of @kiric8494 is good enough, you can stay with it. You can also implement it using csv.DictWriter which will be quite shorter:

import xml.etree.ElementTree as ET
from csv import DictWriter

parse_xml = ET.parse(r"/content/sample_data/xyz.xml")
root = parse_xml.getroot()
with open(r"/content/sample_data/abc.csv", "w", newline="") as f:
    writer = DictWriter(f, fieldnames=("price", "name", "calories"), extrasaction="ignore")
    writer.writeheader()
    writer.writerows({e.tag: e.text for e in row} for row in root)

Basically we set up DictWriter to ignore all fields except price, name and calories and then pass generator to .writerows() which construct dictionary of all child nodes of <row> where key is tag and value is text.

Advertisement