Remove duplicates from a list of lists based on duplicate first elements [closed]

My data is below:

data = [["'id'", "'state'", "'country'n"],
        ['44', "'WD'", "'India'n"],
        ['5', "'WD'", "'India'n"],
        ['44', "'WD'", "'Japan'n"],
        ['390', "'WD'", "'Japan'n"],
        ['17', "'WD'", "'Japan'n"],
        ['17', "'WD'", "'BEL'"]]

How can I remove the rows whose id (the first element) has already appeared in an earlier row?

Here the ids 44 and 17 are repeated.

Expected

[["'id'", "'state'", "'country'n"],
 ['44', "'WD'", "'India'n"],
 ['5', "'WD'", "'India'n"],
 ['390', "'WD'", "'Japan'n"],
 ['17', "'WD'", "'Japan'n"]]

Pseudo code

l = []
seen = set()  # ids that have already been kept

for row in data:
    if row[0] in seen:
        continue  # duplicate id: skip this row
    seen.add(row[0])
    l.append(row)

Answer

You can use a dict keyed on the first element of each row, storing only the first row seen for each id:

unique_data = {}

for sub_data in data:
    sub_data_id = sub_data[0]  # the id is the first element of the row

    # Store the row only the first time its id is seen
    if sub_data_id not in unique_data:
        unique_data[sub_data_id] = sub_data

The structure of unique_data will be like this:

{
    "'id'": ["'id'", "'state'", "'country'"],
    '44': ['44', "'WD'", "'India'"],
    '5': ['5', "'WD'", "'India'"],
    '390': ['390', "'WD'", "'Japan'"],
    '17': ['17', "'WD'", "'Japan'"]
}

To get the unique rows back as a list, use list(unique_data.values()). Dicts preserve insertion order (guaranteed since Python 3.7), so the rows come back in their original order:

[["'id'", "'state'", "'country'"], ['44', "'WD'", "'India'"], ['5', "'WD'", "'India'"], ['390', "'WD'", "'Japan'"], ['17', "'WD'", "'Japan'"]]
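
The same idea can be wrapped in a small reusable helper. This is only a sketch: the function name dedupe_by_first is made up for illustration, the sample rows are a simplified version of the question's data (without the extra quoting), and every row is assumed to be non-empty. It uses dict.setdefault, which stores a value only when the key is not already present, so the first row for each id wins.

```python
def dedupe_by_first(rows):
    """Return rows with duplicate first elements removed, keeping the first occurrence."""
    unique = {}
    for row in rows:
        # setdefault stores the row only if its key is not in the dict yet
        unique.setdefault(row[0], row)
    # dicts preserve insertion order (Python 3.7+), so the original order is kept
    return list(unique.values())


# Simplified sample data (an assumption, not the exact strings from the question)
data = [
    ["id", "state", "country"],
    ["44", "WD", "India"],
    ["5", "WD", "India"],
    ["44", "WD", "Japan"],
    ["390", "WD", "Japan"],
    ["17", "WD", "Japan"],
    ["17", "WD", "BEL"],
]

result = dedupe_by_first(data)
print(result)
```

Keeping the last occurrence instead would only require replacing the setdefault call with a plain assignment, unique[row[0]] = row; the row would then sit at the position where its id first appeared.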
User contributions licensed under: CC BY-SA