How to eliminate duplicate rows in column A, keeping the maximum value in B, in Python

Question

I'm working with data from an Excel file like this. I'm using this line of code to eliminate the duplicates, keeping the maximum:

df_clean = df_raw.sort_values('A', ascending=False).drop_duplicates('B').sort_index()

but I'm obtaining this error: Index(['B'], dt…

Accepted Answer

If I can assume that your index is just a RangeIndex, then I think what you are looking for is:

df_clean = df_raw.sort_values('A', ascending=False).drop_duplicates('B', ignore_index=True)

That is, pass ignore_index=True to drop_duplicates() instead of calling sort_index() afterwards.
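To see what the line does, here is a minimal sketch on made-up data. The sample values are hypothetical, since the original Excel file is not shown; only the column names A and B come from the question. It runs the line from the answer, and then, assuming the goal is the one stated in the title (one row per value of A, keeping the largest B), it shows the same pattern with the two column names swapped, plus a groupby alternative that should give the same rows.

```python
import pandas as pd

# Hypothetical sample data standing in for the Excel sheet.
df_raw = pd.DataFrame({
    "A": [1, 1, 2, 2, 3],
    "B": [10, 30, 20, 5, 7],
})

# The line from the answer: sort by A (descending), keep the first row
# seen for each value of B, and renumber the index from 0.
df_clean = df_raw.sort_values("A", ascending=False).drop_duplicates(
    "B", ignore_index=True
)

# Assumption: the title's goal is one row per A with the maximum B.
# That is the same pattern with the column names swapped.
max_b_per_a = df_raw.sort_values("B", ascending=False).drop_duplicates(
    "A", ignore_index=True
)

# Alternative for the same goal: group by A and take the maximum of B.
max_b_groupby = df_raw.groupby("A", as_index=False)["B"].max()

print(max_b_per_a)
print(max_b_groupby)
```

Note that ignore_index=True is only available in pandas 1.0 and later. With it, the result gets a fresh 0..n-1 index in sorted order, whereas sort_index() would restore the original row order.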

Answer