A software I use outputs the results as text txt files in the following way Output Text File.
or like here for example:
------------------------------------ Joint 1 ------------------------------------ Time Parameter1 Parameter 2 16.000 1174 7.45 17.000 1174 7.87 18.000 1174 7.64 ------------------------------------ Joint 2 ------------------------------------ Time Parameter1 Parameter 2 16.000 1174 7.45 17.000 1154 7.87 18.000 1124 7.64 ------------------------------------ Joint 3 ------------------------------------ Time Parameter1 Parameter 2 16.000 1174 7.55 17.000 1174 7.67 18.000 1174 7.84
Now I want to analyse the results for each joint and dont know how to import the text file into pandas in a feasible way. Optimally I want something like this Wanted Format or a separate pandas dataframe or numpy array / list for each joint. Like here for example:
Joint 1 Joint 2 Joint 3 Time Parameter1 Parameter 2 Time Parameter1 Parameter 2 Time Parameter1 Parameter 2 16.000 1174 7.45 16.000 1174 7.45 16.000 1174 7.55 17.000 1174 7.87 17.000 1154 7.87 17.000 1174 7.67 18.000 1174 7.64 18.000 1124 7.64 18.000 1174 7.84
Would be great if someone can help with how to deal with the hyphen line breaks of the output text file.
Advertisement
Answer
With a slightly modified Output Text File
------------------------------------ Joint 1 ------------------------------------ Time Parameter1 Parameter2 16.000 1174 7.45 17.000 1174 7.87 18.000 1174 7.64 ...
pasted into a file named output.txt
this
import pandas as pd from io import StringIO dfs = [] with open('ouput.txt', 'r') as file: next(file) for line in file: joint = line.strip() next(file) df_str = '' for line in file: if line.startswith('--'): break df_str += line df = pd.read_csv(StringIO(df_str), delim_whitespace=True) df.columns = pd.MultiIndex.from_product([[joint], df.columns]) dfs.append(df) df = pd.concat(dfs, axis=1)
produces the following output (print(df)
)
Joint 1 ... Joint 3 Time Parameter1 Parameter2 ... Time Parameter1 Parameter2 0 16.0 1174 7.45 ... 16.0 1174 7.55 1 17.0 1174 7.87 ... 17.0 1174 7.67 2 18.0 1174 7.64 ... 18.0 1174 7.84 [3 rows x 9 columns]
at least for me. (This stuff is a bit fickle, the exact formatting of the file is rather important.) The list dfs
contains one DataFrames for each “joint”, if you rather want to deal with them individually.
If you are more interested in a vertical arrangement with Time
as part of the index then the replacement
... df = pd.read_csv(StringIO(df_str), delim_whitespace=True, index_col=0) df.index = pd.MultiIndex.from_product([[joint], df.index]) ... df = pd.concat(dfs, axis=0)
produces
Parameter1 Parameter2 Time Joint 1 16.0 1174 7.45 17.0 1174 7.87 18.0 1174 7.64 Joint 2 16.0 1174 7.45 17.0 1154 7.87 18.0 1124 7.64 Joint 3 16.0 1174 7.55 17.0 1174 7.67 18.0 1174 7.84
The difference in the file is that Parameter 2
is replaced with Parameter2
. So the question is: Is whitespace in the header names possible? If so, things get more complicated.
Taking up your idea in the comment this should also work:
# Extracting the column names (could also be done manually) with open('ouput.txt') as fin: for _ in range(3): next(fin) columns = ['Joint'] + next(fin).split() # Transforming file into a csv-file (ignoring the header lines) with open('ouput.txt', 'r') as fin, open('ouput-tidy.csv', 'w') as fout: next(fin) for line in fin: joint = line.strip() next(fin) next(fin) for line in fin: if line.startswith('--'): break fout.write(','.join([joint] + line.split()) + 'n') # Reading csv-file in DataFrame df = pd.read_csv('ouput-tidy.csv', header=None) df.columns = columns df.set_index(['Joint', 'Time'], drop=True, inplace=True)