
Skipping variable number of C-style comment lines when using pandas read_table

The pandas read_table() function can read a *.tab file, and its skiprows parameter offers flexible ways to skip lines before the data. However, I run into trouble when I read *.tab files in a loop and the number of rows to skip varies from file to file. For example, the content to skip starts with /* and ends with */, such as:

/*
... 
The number of rows to skip varies
...
*/

So how do I find the line containing */ and then pass the right value to skiprows?

Answer

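Consume rows from the open file until the current row starts with '*/', then pass the same file object to read_table, which continues reading from the next line: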

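import pandas as pd

with open('data.txt') as fp:
    for row in fp:
        # The closing comment line has been consumed; data starts on the next line.
        if row.startswith('*/'):
            df = pd.read_table(fp)
            break  # read_table has already consumed the rest of the file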
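
Since the question asks specifically about skiprows, an alternative is to count the lines up to and including the one that starts with '*/' and pass that count to read_table. The following is only a minimal sketch and not part of the original answer: the helper name count_header_lines and its end_marker parameter are hypothetical, it assumes the closing */ sits at the start of its own line as in the example above, and returning 0 when no marker is found is likewise an assumption.

```python
def count_header_lines(path, end_marker="*/"):
    """Count the leading lines up to and including the first line that
    starts with end_marker. Hypothetical helper for illustration only."""
    with open(path) as fp:
        # Number lines from 1 so the returned value is a line count.
        for line_number, line in enumerate(fp, start=1):
            if line.startswith(end_marker):
                return line_number
    # Assumption: a file without a closing marker has nothing to skip.
    return 0
```

With pandas imported as pd, each file in the loop could then be read with pd.read_table(path, skiprows=count_header_lines(path)), since an integer skiprows skips that many lines from the start of the file. This reads the comment block twice (once to count, once inside pandas), which should be negligible for short headers.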
User contributions licensed under: CC BY-SA