I have a dataframe as below:

| Country | Country code | Port Name |
|---|---|---|
| China | CN | Yantian |
| China | CN | Shekou |
| China | CN | Quanzhou |
| United Kingdom | UK | Plymouth |
| United Kingdom | UK | Cardiff |
| United Kingdom | UK | Bird port |

I want to blank out the duplicates in the first two columns, keeping only the first occurrence of each value:

| Country | Country code | Port Name |
|---|---|---|
| China | CN | Yantian |
|  |  | Shekou |
|  |  | Quanzhou |
| United Kingdom | UK | Plymouth |
|  |  | Cardiff |
|  |  | Bird port |

I have tried `df.drop_duplicates`, but it drops whole rows.
Answer
You could use the `pd.Series.duplicated` method:
import numpy as np
import pandas as pd

df = pd.DataFrame(
    [
        ['China', 'CN', 'Yantian'],
        ['China', 'CN', 'Shekou'],
        ['China', 'CN', 'Quanzhou'],
        ['United Kingdom', 'UK', 'Plymouth'],
        ['United Kingdom', 'UK', 'Cardiff'],
        ['United Kingdom', 'UK', 'Bird port']
    ],
    columns=['Country', 'Country code', 'Port Name']
)

# Blank out every repeat after the first occurrence in each column;
# .loc avoids the chained-assignment warning
for col in ['Country', 'Country code']:
    df.loc[df[col].duplicated(), col] = np.nan

print(df)
prints
| index | Country | Country code | Port Name | 
|---|---|---|---|
| 0 | China | CN | Yantian | 
| 1 | NaN | NaN | Shekou | 
| 2 | NaN | NaN | Quanzhou | 
| 3 | United Kingdom | UK | Plymouth | 
| 4 | NaN | NaN | Cardiff | 
| 5 | NaN | NaN | Bird port |
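If you prefer to avoid the loop, the same effect can be sketched in one vectorized step with `DataFrame.mask`, which replaces values with NaN wherever a boolean frame is True (column names follow the example above):

```python
import pandas as pd

df = pd.DataFrame(
    [
        ['China', 'CN', 'Yantian'],
        ['China', 'CN', 'Shekou'],
        ['China', 'CN', 'Quanzhou'],
        ['United Kingdom', 'UK', 'Plymouth'],
        ['United Kingdom', 'UK', 'Cardiff'],
        ['United Kingdom', 'UK', 'Bird port'],
    ],
    columns=['Country', 'Country code', 'Port Name'],
)

cols = ['Country', 'Country code']
# apply() runs duplicated() on each column, giving a boolean frame;
# mask() then blanks the flagged cells (NaN by default)
df[cols] = df[cols].mask(df[cols].apply(pd.Series.duplicated))
print(df)
```

Note that `duplicated` flags repeats anywhere in the column, not just consecutive ones; if the same country could reappear after a different one and you only want to blank consecutive repeats, compare each value with the previous row instead, e.g. `df[col].eq(df[col].shift())`.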