Pandas DataFrame: dividing a column by each of its own elements in turn

I have a DataFrame from Pandas:

import pandas as pd

data1 = {"a":[1.,3.,5.,2.]}

df1 = pd.DataFrame(data1)

df1:

    a
0   1.0
1   3.0
2   5.0
3   2.0

Now I want to iterate over the rows: for each row, take its value as the denominator and divide every value in the same column by it.

That is, first divide all rows by the first element, then divide all rows by the second element, and so on.

For example: 1./1., 3./1., 5./1., 2./1.; then with the next element 1./3., 3./3., 5./3., 2./3.; then 1./5., 3./5., 5./5., 2./5.; and finally 1./2., 3./2., 5./2., 2./2.

The following Python code works well with a for loop:

def div(x):
    d = []
    # i selects the denominator row, j the numerator row
    for i in range(len(x)):
        for j in range(len(x)):
            di = x.iloc[j,:]/x.iloc[i,:]
            d.append(di)
    return d

div(df1)

If I have a huge dataset, say 15000 rows, I would like to do this with applymap() or map() instead:

result = df.applymap(div)

or

result = map(div,df1)

print(list(result))

Both attempts raise an error. I would appreciate any help optimizing my code.
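A likely reason both attempts fail, sketched under the assumption that div is the function defined above: iterating a DataFrame yields its column labels, so map(div, df1) ends up calling div("a"), while applymap() passes one scalar at a time, and a scalar has neither a length nor .iloc. The helper name probe below is hypothetical and only records what the element-wise call receives.

```python
import pandas as pd

df1 = pd.DataFrame({"a": [1., 3., 5., 2.]})

# Iterating a DataFrame yields column labels, not rows or values,
# so map(div, df1) would call div("a").
labels = list(df1)

# Element-wise application hands the function one scalar at a time.
# Newer pandas releases call this DataFrame.map; older ones only have applymap.
elementwise = df1.map if hasattr(df1, "map") else df1.applymap

has_len = []

def probe(v):
    # Hypothetical helper: record whether the received value supports len().
    has_len.append(hasattr(v, "__len__"))
    return v

elementwise(probe)

print(labels)        # the column labels that map() would iterate over
print(any(has_len))  # whether any received element supports len()
```

Neither call ever hands div the whole DataFrame, which is what its body expects.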

Thanks in advance

Answer

You can use NumPy to vectorize the division and avoid the Python loops:

>>> df1
     a
0  1.0
1  3.0
2  5.0
3  2.0

import numpy as np

a = np.hstack(df1.values)
m = np.repeat(a, len(a)).reshape((a.shape[0], -1))

df = pd.DataFrame(a / m, columns=df1.columns.repeat(len(df1)))
>>> df
          a    a         a         a
0  1.000000  3.0  5.000000  2.000000
1  0.333333  1.0  1.666667  0.666667
2  0.200000  0.6  1.000000  0.400000
3  0.500000  1.5  2.500000  1.000000

>>> a
array([1., 3., 5., 2.])

>>> m
array([[1., 1., 1., 1.],
       [3., 3., 3., 3.],
       [5., 5., 5., 5.],
       [2., 2., 2., 2.]])
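As an alternative to building m with np.repeat, the same table can probably be produced with NumPy broadcasting alone. This is only a sketch, assuming a single numeric column named "a" as in the question. The names ratios and result are made up for illustration, and labelling rows and columns with the original index is my own choice, not part of the answer above.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [1., 3., 5., 2.]})

a = df1["a"].to_numpy()

# a[None, :] has shape (1, n): numerators laid out along the columns.
# a[:, None] has shape (n, 1): denominators laid out along the rows.
# Broadcasting expands both to (n, n), so ratios[i, j] == a[j] / a[i].
ratios = a[None, :] / a[:, None]

# Row label = position of the denominator, column label = position of the numerator.
result = pd.DataFrame(ratios, index=df1.index, columns=df1.index)
print(result)
```

Either way the result has n × n entries. For 15000 rows that is 225 million float64 values, about 1.8 GB, so it may be worth computing the ratios in blocks of rows if memory is tight.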

User contributions licensed under: CC BY-SA