How can I compare files quicker in Python?

Is there any way to make this script faster? I compare one file against another and print the lines whose second column is equal in both.


Input example (for both files):


The command line below does the same thing in bash:


How can I improve this Python script?


Answer

If you store the lines of each file in a dictionary keyed by the column you are interested in, you can use Python's built-in set operations (which run at C speed in CPython) on the dictionary keys to find the matching lines. I tested a slightly modified version of this (filenames changed, and split('\t') changed to split() because of Stack Overflow formatting) and it seems to work fine:

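A minimal sketch of the dictionary-plus-set idea described above, not necessarily the code that was tested. It assumes tab-separated files, that the second column (index 1) is the key, and that printing the matching lines from the first file is what is wanted. The function names `index_by_column` and `matching_lines` are hypothetical, chosen only for this illustration.

```python
def index_by_column(path, col=1, sep="\t"):
    """Map the value in column `col` to its full line.

    Assumption: if a key occurs more than once, the last line wins.
    """
    index = {}
    with open(path) as fh:
        for line in fh:
            line = line.rstrip("\n")
            fields = line.split(sep)
            if len(fields) > col:  # skip lines that are too short
                index[fields[col]] = line
    return index


def matching_lines(path_a, path_b, col=1, sep="\t"):
    """Return lines of file A whose key column also appears in file B."""
    a = index_by_column(path_a, col, sep)
    b = index_by_column(path_b, col, sep)
    # dict key views support set operations; & is the intersection
    common = a.keys() & b.keys()
    return [a[key] for key in sorted(common)]
```

Calling `matching_lines("file1.txt", "file2.txt")` (placeholder filenames) returns the matches as a list, which can then be printed one per line. Each file is read once, so the work grows roughly with the total number of lines rather than with the product of the two file sizes.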
User contributions licensed under: CC BY-SA