
Reading a CSV from a particular line

I am writing a program that works with a weather station's data, and this is the CSV I get from my station: [image]

The issue is that pandas has trouble opening it. At first I got an error message, which I managed to bypass by writing:

JavaScript

Now the other issue is that the resulting pandas DataFrame only displays the first 4 lines: [image]

The CSV can be downloaded at: https://mesowest.utah.edu/cgi-bin/droman/download_api2_handler.cgi?output=csv&product=&stn=PAYA&unit=0&daycalendar=1&hours=1&day1=05&month1=01&year1=2020&time=LOCAL&hour1=0&var_0=air_temp&var_8=precip_accum_one_hour.

How can I read the file properly? I would like to split it into 4 columns: date, hour, precipitation, and temperature. My final goal is to automatically download the data for a given date window and stack the variables into one giant array.


Answer

According to the docs, when the skiprows argument to pandas.read_csv is passed a list, you are asking it to skip exactly the rows listed, not a range of rows.

If you want to skip the first 8 rows, as it appears you do, just pass skiprows=8.
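To illustrate the difference between the two forms of skiprows, here is a minimal sketch on a made-up in-memory file (the sample text and variable names are assumptions for illustration, not the station's actual data):

```python
import io

import pandas as pd

# Made-up sample: three preamble lines, one header line, two data rows.
sample = "note,a\nnote,b\nnote,c\nx,y\n1,2\n3,4\n"

# An integer skips that many lines from the top of the file,
# so the "x,y" line becomes the header.
by_count = pd.read_csv(io.StringIO(sample), skiprows=3)

# A list skips only the listed 0-based line numbers. Here only line 3
# (the real header) is dropped, so the first preamble line becomes the
# header and the rest of the preamble is read as data.
by_list = pd.read_csv(io.StringIO(sample), skiprows=[3])

print(by_count)
print(by_list)
```

With the integer form the frame has the intended x and y columns. With the list form the preamble leaks into the data, which is the kind of surprise described above.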

Update: I found the following worked best for this dataset:

JavaScript

This uses row 6 for the column names and skips row 7, which only lists the units. Using header=6 implicitly skips ahead to row 7 as the start of the data, which is why that row has to be skipped explicitly.

Result:

JavaScript
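
The header and skiprows combination described in this answer can be sketched as follows. This is not the answer's original snippet: the file contents, column names, and values are made up to mimic the layout (six metadata lines, a header on row 6, a units line on row 7), and splitting the timestamp into date and hour columns is one possible way to get the four columns the question asks for.

```python
import io

import pandas as pd

# Made-up stand-in for the downloaded file: rows 0-5 are metadata,
# row 6 holds the column names, row 7 holds the units.
sample = (
    "# meta 0\n# meta 1\n# meta 2\n# meta 3\n# meta 4\n# meta 5\n"
    "station,date_time,temp,precip\n"
    ",,degC,mm\n"
    "STN,2020-01-05 00:00,1.5,0.0\n"
    "STN,2020-01-05 01:00,1.0,0.2\n"
)

# header=6 takes the column names from row 6; skiprows=[7] drops only
# the units row, so the data starts at row 8.
df = pd.read_csv(io.StringIO(sample), header=6, skiprows=[7])

# Split the timestamp into separate date and hour columns.
stamp = pd.to_datetime(df["date_time"])
df["date"] = stamp.dt.date
df["hour"] = stamp.dt.hour

print(df[["date", "hour", "precip", "temp"]])
```

For the real file, the io.StringIO(sample) argument would be replaced by the path or URL of the downloaded CSV, and the column names would be whatever row 6 of that file contains.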
User contributions licensed under: CC BY-SA