Why do I have more rows after a left merge and drop_duplicates()?

iso_selected.shape gives the result: (257, 2)

gravity.shape gives the result: (4428288, 79)

and I merge them in the following way:

gravity1 = gravity.merge(iso_selected, how="left", left_on = "iso3_o", right_on="iso3").drop_duplicates()

gravity1.shape gives the result: (4571136, 81)

Why would I have more rows than the 4428288 in gravity?

Answer

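A left merge returns one row for every matching pair of keys, so each row of gravity is repeated once per matching row in iso_selected. The row count can only grow if some iso3 values occur more than once in iso_selected. Calling drop_duplicates() after the merge does not undo this when the repeated rows differ in the other iso_selected column. You need to remove the duplicate keys from iso_selected before DataFrame.merge, using its key column iso3:

gravity1 = gravity.merge(iso_selected.drop_duplicates(subset=["iso3"]),
                         how="left",
                         left_on="iso3_o",
                         right_on="iso3")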
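
To see the effect on a small scale, here is a minimal sketch. The two frames below are hypothetical stand-ins for gravity and iso_selected: the column names "trade" and "region" and all values are made up for illustration. It assumes the duplicated keys in iso_selected carry different values in the second column, which would explain why drop_duplicates() after the merge had no effect.

```python
import pandas as pd

# Hypothetical stand-ins for the real frames (illustrative names and values).
gravity = pd.DataFrame({
    "iso3_o": ["USA", "USA", "FRA", "DEU"],
    "trade": [1.0, 2.0, 3.0, 4.0],
})
iso_selected = pd.DataFrame({
    "iso3": ["USA", "FRA", "FRA"],          # "FRA" appears twice
    "region": ["Americas", "Europe", "EU"],
})

# Count repeated keys in the lookup frame before merging.
print(iso_selected["iso3"].duplicated().sum())

# Left merge as in the question: the single FRA row matches two lookup rows.
merged = gravity.merge(iso_selected, how="left",
                       left_on="iso3_o", right_on="iso3")
print(len(gravity), len(merged))

# Dropping duplicates afterwards does not help: the FRA rows differ in "region".
print(len(merged.drop_duplicates()))

# Deduplicate the key first. validate="many_to_one" makes pandas raise a
# MergeError if the right-hand key is still not unique.
lookup = iso_selected.drop_duplicates(subset=["iso3"])
fixed = gravity.merge(lookup, how="left",
                      left_on="iso3_o", right_on="iso3",
                      validate="many_to_one")
print(len(fixed))
```

Note that drop_duplicates(subset=["iso3"]) keeps the first row for each key by default, so it is worth checking which of the duplicated rows should survive before relying on it.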
User contributions licensed under: CC BY-SA