Pandas DataFrame: dividing a column by each of its own elements in turn

I have a DataFrame from Pandas:

import pandas as pd

data1 = {"a":[1.,3.,5.,2.]}

df1 = pd.DataFrame(data1)

df1:

     a
0  1.0
1  3.0
2  5.0
3  2.0

Now I want to divide the whole column by each of its elements in turn: first every row is divided by the first element, then every row by the second element, and so on.

For example: 1./1., 3./1., 5./1., 2./1.; then 1./3., 3./3., 5./3., 2./3.; then 1./5., 3./5., 5./5., 2./5.; and finally 1./2., 3./2., 5./2., 2./2.

This Python code with a for loop works well:

def div(x):
    d = []
    # i selects the denominator row, j the numerator row
    for i in range(len(x)):
        for j in range(len(x)):
            di = x.iloc[j, :] / x.iloc[i, :]
            d.append(di)
    return d


div(df1)

I have a huge dataset of about 15000 rows, so I tried to apply the function with applymap() or map() instead:

result = df1.applymap(div)

result = map(div,df1)

print(list(result))

Both give an error. I would appreciate any help optimizing my code.

Thanks in advance.

Answer

You can use NumPy to speed this up:

>>> df1
     a
0  1.0
1  3.0
2  5.0
3  2.0

import numpy as np

a = np.hstack(df1.values)
m = np.repeat(a, len(a)).reshape((a.shape[0], -1))

df = pd.DataFrame(a / m, columns=df1.columns.repeat(len(df1)))

>>> df
          a    a         a         a
0  1.000000  3.0  5.000000  2.000000
1  0.333333  1.0  1.666667  0.666667
2  0.200000  0.6  1.000000  0.400000
3  0.500000  1.5  2.500000  1.000000

>>> a
array([1., 3., 5., 2.])

>>> m
array([[1., 1., 1., 1.],
       [3., 3., 3., 3.],
       [5., 5., 5., 5.],
       [2., 2., 2., 2.]])
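
The same matrix can also be obtained with NumPy broadcasting, without building the repeated array `m` first. The following is a minimal sketch of that alternative, not part of the answer's original code; the names `ratios` and `flat` are chosen here for illustration only.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"a": [1., 3., 5., 2.]})

# 1-D array of the column values, shape (n,)
a = df1["a"].to_numpy()

# a[:, None] has shape (n, 1); dividing the (n,) array by it broadcasts
# to an (n, n) matrix where ratios[i, j] == a[j] / a[i].
ratios = a / a[:, None]

# Row-major flattening visits the ratios in the same order as the
# nested loop in the question (outer loop: denominator, inner: numerator).
flat = ratios.ravel()

print(ratios)
```

Row `i` of `ratios` holds every value divided by the `i`-th element, so it should match the `a / m` result shown above.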

Functions and methods used: np.hstack, np.repeat, and ndarray.reshape.
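
For the 15000-row case mentioned in the question, the full result has 15000 × 15000 entries, which is about 1.8 GB as float64. If that is too much to hold at once, one option is to compute the ratios a block of denominators at a time. The sketch below assumes each block can be processed or reduced independently; `ratio_blocks` and `block_size` are hypothetical names introduced here, not part of pandas or NumPy.

```python
import numpy as np

def ratio_blocks(values, block_size=1000):
    """Yield (start, block) pairs with block[i, j] == values[j] / values[start + i]."""
    values = np.asarray(values, dtype=float)
    for start in range(0, len(values), block_size):
        denominators = values[start:start + block_size]
        # (n,) / (k, 1) broadcasts to a (k, n) block of ratios
        yield start, values / denominators[:, None]

a = np.array([1., 3., 5., 2.])

# Example use: reduce each block (here, a row sum) instead of
# storing the whole n-by-n matrix.
row_sums = np.concatenate(
    [block.sum(axis=1) for _, block in ratio_blocks(a, block_size=2)]
)
```

Each yielded block is a contiguous slice of rows of the full ratio matrix, so stacking all blocks vertically would reproduce it.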
