Python – compare two strings by words using difflib and print only the differences

Python newbie here. I have the following code to compare two strings using the difflib library. The output is prefixed with '+' or '-' for words that differ. How can I print only the differences, without any prefix?

The expected output for the below code is

Not in first string: Nvdia

Not in first string: IBM

Not in second string: Microsoft

Not in second string: Google

Not in second string: Oracle

or just Nvdia, IBM, Microsoft, Google, Oracle

import difflib

original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"

# initiate the Differ object
d = difflib.Differ()

# calculate the difference between the two texts
diff = d.compare(original.split(), edited.split())

# output the result
print('\n'.join(diff))

Thanks!

Answer

If you don’t have to use difflib, you could use a set and string splitting!

>>> original = "Apple Microsoft Google Oracle"
>>> edited = "Apple Nvdia IBM"
>>> set(original.split()).symmetric_difference(set(edited.split()))
{'IBM', 'Google', 'Oracle', 'Microsoft', 'Nvdia'}

You can also get the shared members with .intersection()

>>> set(original.split()).intersection(set(edited.split()))
{'Apple'}

Wikipedia has a good section on basic set operations, with accompanying Venn diagrams:
https://en.wikipedia.org/wiki/Set_(mathematics)#Basic_operations
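
If you want the labelled output from the question ("Not in first string" / "Not in second string"), the symmetric difference alone is not enough, because it does not say which side a word came from. Here is a minimal sketch of one way to do it; the helper name word_differences is made up for illustration, and it assumes you only care about whether a word is present, not how often or where it appears.

```python
original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"


def word_differences(first, second):
    """Return (words only in second, words only in first), keeping word order.

    Hypothetical helper: membership is checked against sets, but the results
    are built from the split lists so the original word order survives.
    """
    first_words = first.split()
    second_words = second.split()
    first_set = set(first_words)
    second_set = set(second_words)
    not_in_first = [w for w in second_words if w not in first_set]
    not_in_second = [w for w in first_words if w not in second_set]
    return not_in_first, not_in_second


not_in_first, not_in_second = word_differences(original, edited)
for word in not_in_first:
    print("Not in first string:", word)
for word in not_in_second:
    print("Not in second string:", word)
```

Unlike a plain set, the lists keep the words in the order they appear in each string. Repeated words are kept as they occur, so deduplicate them if that matters for your input.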


However, if you have to use difflib (some strange environment or an assignment), you can also just find every member with a + or - prefix and slice off the prefixes

>>> diff = d.compare(original.split(), edited.split())
>>> list(a[2:] for a in diff if a.startswith(("+", "-")))
['Nvdia', 'IBM', 'Microsoft', 'Google', 'Oracle']
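
The same prefixes can also drive the labels from the question. This is only a sketch: the function name labelled_diff and the label wording are assumptions taken from the expected output above, and it relies on Differ marking entries unique to the first sequence with "- " and entries unique to the second with "+ ".

```python
import difflib


def labelled_diff(first, second):
    """Label each differing word by the string it is missing from.

    Hypothetical helper built on difflib.Differ; entries with any other
    prefix ("  " for common words, "? " for hint lines) are skipped.
    """
    labels = {
        "+ ": "Not in first string",   # only in the second sequence
        "- ": "Not in second string",  # only in the first sequence
    }
    lines = []
    for entry in difflib.Differ().compare(first.split(), second.split()):
        label = labels.get(entry[:2])
        if label is not None:
            lines.append(f"{label}: {entry[2:]}")
    return lines


original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"
print("\n".join(labelled_diff(original, edited)))
```

Note that difflib compares positions in the sequence, so a word that merely moved can show up as both removed and added, which the set approach would not report.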

All of these operations result in an iterable of strings, so you can .join() them together (or similar) to get a single result, as you do in your question

>>> result = set(original.split()).symmetric_difference(set(edited.split()))
>>> print("\n".join(result))
IBM
Google
Oracle
Microsoft
Nvdia
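
One caveat with the set version: sets are unordered, so the order of the words can change between runs. If you want the single-line form from the question with a stable order, a small sketch (assuming alphabetical order is acceptable) is to sort before joining:

```python
original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"

# symmetric_difference accepts any iterable, so the second set() is optional.
differences = set(original.split()).symmetric_difference(edited.split())

# Sets have no defined order, so sort for a reproducible result.
joined = ", ".join(sorted(differences))
print(joined)  # Google, IBM, Microsoft, Nvdia, Oracle
```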