Why does read_csv skiprows value need to be lower than it should be in this case?

Question

I have a log file (Text.TXT in this case): To read this log file into pandas and ignore all the header info, I would use skiprows up to line 16, like so: But this produces an EmptyDataError, as it skips past where the data starts. To make this work I've had to use skiprows with line 11 instead: My

Accepted Answer

One workaround is to use the comment parameter of pd.read_csv:

```python
from io import StringIO

import pandas as pd

text = '''# 1: 5
# 3: x
# F: 5.
# ID: 001
# No.: 2
# No.: 4
# Time: 20191216T122109
# Value: ";"
# Time: 4
# Time: ""
# Time ms: ""
# Date: ""
# Time separator: "T"
# J: 1000000
# Silent: false
# mode: true
Timestamp;T;ID;P
16T122109957;0;6;0006'''

# Lines starting with '#' are treated as comments and skipped entirely
df = pd.read_csv(StringIO(text), comment='#', sep=';')
df
```

```
      Timestamp  T  ID  P
0  16T122109957  0   6  6
```

Or:

```python
df = pd.read_csv(StringIO(text), header=0, comment='#', sep=';')
```

From the docs, under the header parameter:

> Note that this parameter ignores commented lines and empty lines if skip_blank_lines=True, so header=0 denotes the first line of data rather than the first line of the file.

Not sure about skiprows's weird behaviour here.
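To see the documented header behaviour in isolation, here is a minimal sketch on a shortened, made-up sample (the `sample` string and the `# meta` lines are assumptions for illustration, not the original log). It checks that commented and blank lines are ignored, so `header=0` picks up the first real data line as the column names. The second call, with `dtype={"P": str}`, is an optional extra that should keep the leading zeros in the last column, which the default integer parsing drops.

```python
from io import StringIO

import pandas as pd

# Hypothetical shortened log: two comment lines, one blank line, header, one row
sample = "# meta: 1\n\n# meta: 2\nTimestamp;T;ID;P\n16T122109957;0;6;0006\n"

# header=0 refers to the first non-comment, non-blank line
df = pd.read_csv(StringIO(sample), header=0, comment="#", sep=";")
print(df.columns.tolist())
print(len(df))

# Optional: read column P as text so "0006" is not converted to the integer 6
df_str = pd.read_csv(StringIO(sample), comment="#", sep=";", dtype={"P": str})
print(df_str["P"].iloc[0])
```

This relies on every header line in the log starting with `#`; if some metadata lines lacked that prefix, the comment parameter alone would not skip them.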
