I have a three column of data as arranged below:
Input file:
JavaScript
x
13
13
1
>>>>>
2
1.0 2.0 3.0
3
2.0 2.0 4.0
4
3.0 4.5 8.0
5
>>>>>
6
1.0 2.5 6.8
7
2.0 3.5 6.8
8
3.0 1.2 1.9
9
>>>>>
10
1.0 1.2 1.3
11
2.0 2.7 1.8
12
3.0 4.5 8.5
13
In the above input file the first column values are repeated so I want to take only once that value and want to sum the third column values row wise and do not want to take any second column values.
I also want to append a third column with the fixed value 1.0
Finally want to save the result on another test file called output.txt.
Output:
JavaScript
1
4
1
1.0 11.1 1.0
2
2.0 12.6 1.0
3
3.0 18.4 1.0
4
In the output second column values resulted from is following:
JavaScript
1
4
1
3.0+6.8+1.3
2
4.0+6.8+1.8
3
8.0+1.9+8.5
4
I tried with numpy but getting error:
JavaScript
1
5
1
import numpy as np
2
import pandas as pd
3
import glob
4
data=np.loadtxt("input.txt")
5
Advertisement
Answer
You need to read your input file using pandas.read_csv
, you need to set the delimiter to " "
, specify no header and ">"
as comment lines.
Then perform the groupby
/sum
operation, and export without header using pandas.to_csv
JavaScript
1
11
11
1
import pandas as pd
2
3
# input
4
df = pd.read_csv('filename.csv', delimiter=' ', header=None, comment='>')
5
6
# output
7
(df.groupby(0)[[2]].sum()
8
.assign(col=1.0)
9
.to_csv('output.txt', header=False, sep=' ', float_format='%.2f')
10
)
11
output.txt:
JavaScript
1
5
1
1.00 11.10 1.00
2
2.00 12.60 1.00
3
3.00 18.40 1.00
4
5