How do I remove duplicates from a list, while preserving order?

Question

How do I remove duplicates from a list, while preserving order? Using a set to remove duplicates destroys the original order. Is there a built-in or a Pythonic idiom?

Accepted Answer

Here you have some alternatives: http://www.peterbe.com/plog/uniqifiers-benchmark

Fastest one:

    def f7(seq):
        seen = set()
        seen_add = seen.add
        return [x for x in seq if not (x in seen or seen_add(x))]

Why assign seen.add to seen_add instead of just calling seen.add? Python is a dynamic language, and resolving seen.add each iteration is more costly than resolving a local variable. seen.add could have changed between iterations, and the runtime isn't smart enough to rule that out. To play it safe, it has to check the object each time.

If you plan on using this function a lot on the same dataset, perhaps you would be better off with an ordered set: http://code.activestate.com/recipes/528878/

O(1) insertion, deletion and member-check per operation.

(Small additional note: seen.add() always returns None, so the or above is there only as a way to attempt a set update, and not as an integral part of the logical test.)
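
To make the note about the `or` trick concrete, here is a small sketch that puts f7 next to an explicit-loop version doing the same thing. The name unique_in_order and the sample data are made up for illustration and are not part of the original answer; both functions assume the items are hashable.

```python
def f7(seq):
    seen = set()
    seen_add = seen.add  # bind the method once instead of looking it up per item
    # seen_add(x) returns None (falsy), so it runs only when x is not yet in seen
    return [x for x in seq if not (x in seen or seen_add(x))]


def unique_in_order(seq):
    """Explicit-loop equivalent of f7 (hypothetical name, for illustration)."""
    seen = set()
    result = []
    for x in seq:
        if x not in seen:
            seen.add(x)       # remember the item
            result.append(x)  # keep only its first occurrence
    return result


items = [3, 1, 3, 2, 1, 5, 2]
print(f7(items))               # [3, 1, 2, 5]
print(unique_in_order(items))  # [3, 1, 2, 5]
print(set().add(1))            # None
```

The loop version is longer but spells out what the comprehension does: the membership test decides whether an item is kept, and the add call is only a side effect.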

Answer