I’m interested to know how to elegantly go about splitting a single-columned file of the following format into a more classic tabular layout using Pandas.
(The file is received as an output from an eye tracker.)
Current Format:
TimeStampGazePointXLeftGazePointYLeftGazePointXRightGazePointYRight
00000000.11111111111111.22222222222222.33333333333333.4444444444444
00000000.11111111111111.22222222222222.33333333333333.4444444444444
00000000.11111111111111.22222222222222.33333333333333.4444444444444
Desired Format:
TimeStamp GazePointXLeft GazePointYLeft GazePointXRight GazePointYRight
000000000 11111111111111 22222222222222 333333333333333 444444444444444
000000000 11111111111111 22222222222222 333333333333333 444444444444444
000000000 11111111111111 22222222222222 333333333333333 444444444444444
Where I’m stuck:
I imagine the solution will involve Pandas’ split method, but I’m having trouble figuring out how to get there. I imagine I’ll have to “manually” add the respective columns while somehow splitting the period-delimited rows of data…
df = pd.DataFrame('data.csv')

headers = ["TimeStamp", , "GazePointYRight"]

for header in headers:
    df[header] = df[1:].split(".")[headers.index(header)] <--- # Splitting rows by period and taking data based on header index in list
I’d really appreciate some direction. Thanks in advance.
Answer
pandas.read_... has several useful parameters to play with.
I believe you want something like this?
import pandas as pd

columns_names = [
    'TimeStamp',
    'GazePointXLeft',
    'GazePointYLeft',
    'GazePointXRight',
    'GazePointYRight',
]

df = pd.read_csv("lixo.csv", sep='.', skiprows=1, names=columns_names)
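For anyone who wants to try this without the original file, here is a minimal sketch that runs the same read_csv call against an in-memory copy of the sample from the question. The sample text, the helper names, and the dtype=str argument are my own assumptions, not part of the original answer. I added dtype=str because values such as 00000000 would otherwise be parsed as integers and lose their leading zeros. The second helper shows the str.split route the question was reaching for, in case the file has to be read as raw lines first.

```python
import io

import pandas as pd

# Hypothetical sample mirroring the layout shown in the question:
# one header line, then period-delimited data rows.
RAW = (
    "TimeStampGazePointXLeftGazePointYLeftGazePointXRightGazePointYRight\n"
    "00000000.11111111111111.22222222222222.33333333333333.4444444444444\n"
    "00000000.11111111111111.22222222222222.33333333333333.4444444444444\n"
)

COLUMN_NAMES = [
    "TimeStamp",
    "GazePointXLeft",
    "GazePointYLeft",
    "GazePointXRight",
    "GazePointYRight",
]


def load_with_read_csv(source):
    """Skip the original header line and split each row on '.'."""
    # dtype=str is an assumption: it keeps leading zeros intact.
    return pd.read_csv(
        source, sep=".", skiprows=1, names=COLUMN_NAMES, dtype=str
    )


def load_with_str_split(source):
    """Alternative: read raw lines, then split them with Series.str.split."""
    lines = pd.Series(source.read().splitlines()[1:])  # drop the header line
    parts = lines.str.split(".", expand=True)  # one column per field
    parts.columns = COLUMN_NAMES
    return parts


df = load_with_read_csv(io.StringIO(RAW))
alt = load_with_str_split(io.StringIO(RAW))

# Space-delimited text, similar to the desired format in the question.
space_delimited = df.to_csv(sep=" ", index=False)
print(space_delimited)
```

Both helpers should give the same table for this sample. If the columns are needed as numbers rather than text, dropping dtype=str (or converting afterwards with pd.to_numeric) is the usual next step.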