summing the values row wise

I have a three column of data as arranged below:

Input file:

>>>>>
1.0 2.0 3.0
2.0 2.0 4.0
3.0 4.5 8.0
>>>>>
1.0 2.5 6.8
2.0 3.5 6.8
3.0 1.2 1.9
>>>>>
1.0 1.2 1.3
2.0 2.7 1.8
3.0 4.5 8.5

JavaScript
​x
 
>>>>>
1.0 2.0 3.0
2.0 2.0 4.0
3.0 4.5 8.0
>>>>>
1.0 2.5 6.8
2.0 3.5 6.8
3.0 1.2 1.9
>>>>>
1.0 1.2 1.3
2.0 2.7 1.8
3.0 4.5 8.5
​

In the above input file the first column values are repeated so I want to take only once that value and want to sum the third column values row wise and do not want to take any second column values.

I also want to append a third column with the fixed value 1.0

Finally want to save the result on another test file called output.txt.

Output:

1.0  11.1  1.0
2.0  12.6  1.0
3.0  18.4  1.0

JavaScript
 
1.0  11.1  1.0
2.0  12.6  1.0
3.0  18.4  1.0
​

In the output second column values resulted from is following:

3.0+6.8+1.3
4.0+6.8+1.8
8.0+1.9+8.5

JavaScript
 
3.0+6.8+1.3
4.0+6.8+1.8
8.0+1.9+8.5
​

I tried with numpy but getting error:

import numpy as np
import pandas as pd
import glob
data=np.loadtxt("input.txt")

JavaScript
 
import numpy as np
import pandas as pd
import glob
data=np.loadtxt("input.txt")
​

Answer

You need to read your input file using pandas.read_csv, you need to set the delimiter to " ", specify no header and ">" as comment lines.

Then perform the groupby/sum operation, and export without header using pandas.to_csv

import pandas as pd

# input
df = pd.read_csv('filename.csv', delimiter=' ', header=None, comment='>')

# output
(df.groupby(0)[[2]].sum()
   .assign(col=1.0)
   .to_csv('output.txt', header=False, sep=' ', float_format='%.2f')
)

JavaScript
 
import pandas as pd
​
# input
df = pd.read_csv('filename.csv', delimiter=' ', header=None, comment='>')
​
# output
(df.groupby(0)[[2]].sum()
   .assign(col=1.0)
   .to_csv('output.txt', header=False, sep=' ', float_format='%.2f')
)
​

output.txt:

1.00 11.10 1.00
2.00 12.60 1.00
3.00 18.40 1.00

JavaScript
 
1.00 11.10 1.00
2.00 12.60 1.00
3.00 18.40 1.00
​
​

Advertisement

Answer