What is the time complexity of this Python program in Big O notation?

I find it difficult to calculate the time complexity of this program because it relies on a lot of built-in methods. Could anyone please help? The task is to find the topper in each subject and the 3 best performers overall.

import re
import sys

import pandas as pd

df = pd.read_csv(sys.argv[1])
subjects = ['Maths', 'Biology', 'Physics', 'English', 'Chemistry', 'Hindi']
for column in subjects:
    a = df[column].max()  # maximum mark in this column
    b = df.loc[df[column] == a, ['Name']]  # rows that hold that maximum mark
    # strip the list brackets and quotes from the printed names
    print("Topper in " + column + " is " + re.sub(r"[\[\]']", "", str(b.values.tolist())))

df['total'] = df['Maths'] + df['Biology'] + df['Physics'] + df['Chemistry'] + df['Hindi'] + df['English']
df_v1 = df.sort_values(by=['total'], ascending=False)
print("Best students in this class are: ")
# head(3) also works when the file has fewer than three rows
for i, name in enumerate(df_v1['Name'].head(3), start=1):
    print(str(i) + "." + name)

Input csv file looks something like this:

Name   Physics  Chemistry  Biology  Maths  Hindi  English
Steve  99       1000       100      95     97     85
John   80       90         75       70     100    100

Output:

Topper in maths is X
Topper in physics is y
Overall best students are X,y,z

Answer

Your for loop runs once per subject column, and each iteration scans every row (once for max(), once for the == comparison) => O(row * col) complexity.
Calculating the totals adds up col values for every row => O(row * col).
sort_values sorts the rows by one column, and comparison sorts are O(n * log(n)), so this gives us O(row * log(row)).

All in all, we have O(row * col) + O(row * col) + O(row * log(row)) => O(row * col + row * log(row)).

So the answer is O(row * col), as long as col is at least on the order of log(row).
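
To make the terms above concrete, here is a minimal plain-Python sketch of the same three steps with the passes over the data written out. It is not how pandas works internally (pandas runs vectorised code), and the function name `analyze` and the list-of-dicts layout are assumptions made for illustration; only the amount of work per step is meant to match. The two rows are the sample rows from the question.

```python
def analyze(rows, subjects):
    """rows: list of dicts holding 'Name' plus one mark per subject (hypothetical layout)."""
    toppers = {}
    for subject in subjects:  # runs col times
        best = max(row[subject] for row in rows)  # one pass over all rows
        # second pass over all rows: collect everyone who has the best mark
        toppers[subject] = [row["Name"] for row in rows if row[subject] == best]

    # col additions for each of the rows => row * col steps
    totals = [(sum(row[s] for s in subjects), row["Name"]) for row in rows]

    # comparison sort on one key => row * log(row) steps
    ranked = sorted(totals, key=lambda pair: pair[0], reverse=True)
    top3 = [name for _, name in ranked[:3]]
    return toppers, top3


subjects = ["Maths", "Biology", "Physics", "English", "Chemistry", "Hindi"]
rows = [
    {"Name": "Steve", "Physics": 99, "Chemistry": 1000, "Biology": 100,
     "Maths": 95, "Hindi": 97, "English": 85},
    {"Name": "John", "Physics": 80, "Chemistry": 90, "Biology": 75,
     "Maths": 70, "Hindi": 100, "English": 100},
]

toppers, top3 = analyze(rows, subjects)
print(toppers)
print(top3)
```

The nested loops account for the O(row * col) terms and the single `sorted` call accounts for the O(row * log(row)) term.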

Edit

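If col << row (more precisely, if col grows more slowly than log(row)), the sort dominates and you get O(row * log(row)). So if the number of columns is fixed, as with the six subjects here, it is actually O(row * log(row)).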
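
A rough way to see why the sort term wins when the column count stays fixed is to compare the two step counts directly. The sketch below only evaluates the formulas row * col and row * log2(row); it ignores constant factors and is not a measurement of pandas itself. The helper name `sort_to_scan_ratio` is made up for this example.

```python
import math


def sort_to_scan_ratio(rows, cols):
    """Ratio of the sort term to the scan term (hypothetical helper, constants ignored)."""
    scan_steps = rows * cols             # loop over the columns, and the totals
    sort_steps = rows * math.log2(rows)  # sort_values on one column
    return sort_steps / scan_steps


# With 6 subject columns the ratio is log2(rows) / 6, so it keeps growing with rows.
for rows in (64, 4096, 2 ** 24):
    print(rows, sort_to_scan_ratio(rows, cols=6))
```

Because the ratio grows without bound as the number of rows increases, the O(row * log(row)) term is the one that survives for a fixed number of columns.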

Answer