
Why does read_csv skiprows value need to be lower than it should be in this case?

I have a log file (Text.TXT in this case):


To read this log file into pandas and ignore all the header info, I would use skiprows up to line 16, like so:


But this raises an EmptyDataError because it skips past where the data starts. To make it work, I've had to use skiprows up to line 11 instead:


My question is: if the data doesn't start until row 17 in this case, why do I need to set skiprows only up to row 11?


Answer

One workaround is to use the comment parameter of pd.read_csv.
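As a minimal sketch of that workaround: the actual contents of Text.TXT are not shown here, so the log text below is hypothetical, and it assumes every header line starts with '#'. Under that assumption, comment="#" makes the parser drop those lines without any skiprows counting.

```python
import io

import pandas as pd

# Hypothetical log contents; the real Text.TXT is not available.
# Assumption: every header line begins with '#'.
log_text = (
    "# Logger: demo\n"
    "# Started: 2020-01-01\n"
    "#\n"
    "time,value\n"
    "0,1.5\n"
    "1,2.5\n"
)

# Lines that are fully commented out are ignored, so the first
# remaining line ("time,value") is used as the header row.
df = pd.read_csv(io.StringIO(log_text), comment="#")
print(df)
```

If the header lines in the real file start with a different character, that character would go in comment instead. Note that comment only accepts a single character.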


From the docs, under the header parameter:

Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.
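The note above can be illustrated with a small sketch. The input text is made up for the example: it starts with two blank lines and a commented line, yet header=0 still picks the first line of data as the header, because blank and commented lines are not counted.

```python
import io

import pandas as pd

# Made-up input: two blank lines, one commented line, then the data.
text = "\n\n# a comment\ncol_a,col_b\n1,2\n3,4\n"

# skip_blank_lines=True is the default; it is spelled out here for clarity.
df_header = pd.read_csv(
    io.StringIO(text),
    comment="#",
    header=0,
    skip_blank_lines=True,
)
print(df_header)
```

Here header=0 refers to "col_a,col_b", which is the fourth physical line of the input.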

I'm not sure about skiprows's odd behaviour here.
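One way to investigate is to print the first physical lines of the file with their zero-based index and repr, which exposes hidden characters (for example unusual line endings or stray quote characters) that might affect how lines are counted. This is only a diagnostic sketch: numbered_lines is a hypothetical helper, and the sample text is made up, not taken from Text.TXT.

```python
def numbered_lines(text, limit=20):
    """Return the first `limit` physical lines as 'index: repr(line)' strings."""
    lines = text.splitlines(keepends=True)
    return [f"{i}: {line!r}" for i, line in enumerate(lines[:limit])]


# Made-up sample with an unmatched quote in the header and a blank line.
sample = 'Header "one\nstill header\n\na,b\n1,2\n'

for entry in numbered_lines(sample):
    print(entry)
```

For a real file, the text could be read with open("Text.TXT", newline="").read() so that the original line endings are preserved.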

User contributions licensed under: CC BY-SA