Skip to content
Advertisement

converting a key value text file into a CSV file

I have a text file that needs to be converted into CSV file using pandas. A piece of it is presented in the following:

time 00:15 min
    cod,10,1=0,2=2,3=2,4=1,5=6,6=4,7=2,8=7,9=1,10=9,11=7
    cod,18,1=27,2=18,3=19,4=20,5=47,6=2,7=2,8=0,9=33,10=61,11=13,12=2,13=3,14=0,15=0

Rows are cod,10, and cod,18 and the columns are 1, 2, 3,…, 15. Any idea? Regards, Ali

Advertisement

Answer

I use pandas to deal with the conversion, but vanilla Python to deal with some of aspects of the data, I hope that is alright.

One issue we need to deal with is the fact that there are a different number of columns per row. So I just put NaN in columns that are missing for a row. For instance, row 1 is shorter than row 2, so the missing columns in row 1 are given values as “NaN”.

Here is my idea:

import pandas as pd

lines = []
with open('/path/to/test.txt', 'r') as infile:
    for line in infile:
        if "," not in line:
            continue
        else:
            lines.append(line.strip().split(","))

row_names = []
column_data = {}

max_length = max(*[len(line) for line in lines])

for line in lines:
    while(len(line) < max_length):
        line.append(f'{len(line)-1}=NaN')

for line in lines:
    row_names.append(" ".join(line[:2]))
    for info in line[2:]:
        (k,v) = info.split("=")
        if k in column_data:
            column_data[k].append(v)
        else:
            column_data[k] = [v]

df = pd.DataFrame(column_data)
df.index = row_names
print(df)

df.to_csv('/path/to/test.csv')

Output (the printed DataFrame):

         1   2   3   4   5  6  7  8   9  10  11   12   13   14   15
cod 10   0   2   2   1   6  4  2  7   1   9   7  NaN  NaN  NaN  NaN
cod 18  27  18  19  20  47  2  2  0  33  61  13    2    3    0    0

CSV File Output:

,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15
cod 10,0,2,2,1,6,4,2,7,1,9,7,NaN,NaN,NaN,NaN
cod 18,27,18,19,20,47,2,2,0,33,61,13,2,3,0,0
User contributions licensed under: CC BY-SA
5 People found this is helpful
Advertisement