Python newbie here. I have the following code to compare two strings using the difflib library. The output is prefixed with '+' or '-' for words which are different. How do I get only the differences printed, without any prefix?
The expected output for the code below is:
Not in first string: Nvdia
Not in first string: IBM
Not in second string: Microsoft
Not in second string: Google
Not in second string: Oracle
or just Nvdia, IBM, Microsoft, Google, Oracle
import difflib

original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"

# initiate the Differ object
d = difflib.Differ()

# calculate the difference between the two texts
diff = d.compare(original.split(), edited.split())

# output the result
print('\n'.join(diff))
Thanks!
Answer
If you don't have to use difflib, you could use a set and string splitting!
>>> original = "Apple Microsoft Google Oracle"
>>> edited = "Apple Nvdia IBM"
>>> set(original.split()).symmetric_difference(set(edited.split()))
{'IBM', 'Google', 'Oracle', 'Microsoft', 'Nvdia'}
You can also get the shared members with .intersection():
>>> set(original.split()).intersection(set(edited.split()))
{'Apple'}
Wikipedia has a good section on basic set operations with accompanying Venn diagrams:
https://en.wikipedia.org/wiki/Set_(mathematics)#Basic_operations
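If you want the labelled "Not in first string" / "Not in second string" lines from the question rather than one combined set, here is a minimal sketch of one way to do it. The variable names (original_words, not_in_first and so on) are just illustrative, and it assumes words are separated by whitespace. It walks the word lists so the original order is kept, and only uses the sets for membership tests.

```python
original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"

original_words = original.split()
edited_words = edited.split()

# Sets give fast membership tests; the lists keep the original word order.
original_set = set(original_words)
edited_set = set(edited_words)

# Words that appear only in the second (edited) string.
not_in_first = [w for w in edited_words if w not in original_set]
# Words that appear only in the first (original) string.
not_in_second = [w for w in original_words if w not in edited_set]

for word in not_in_first:
    print(f"Not in first string: {word}")
for word in not_in_second:
    print(f"Not in second string: {word}")
```

Note that this treats the strings as bags of distinct words: a word repeated in one string and present once in the other will not be reported.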
However, if you have to use difflib (some strange environment or assignment), you can also just find every member with a + or - prefix and slice off the prefixes:
>>> diff = d.compare(original.split(), edited.split())
>>> list(a[2:] for a in diff if a.startswith(("+", "-")))
['Nvdia', 'IBM', 'Microsoft', 'Google', 'Oracle']
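The same prefix check can also drive the labelled output asked for in the question. This is only a sketch: the labels dict and the report list are names made up for the example, and the mapping assumes "+ " means the word is only in the second string and "- " means it is only in the first, which is how Differ marks added and removed items.

```python
import difflib

original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"

# Map each Differ prefix to the label wanted in the output (illustrative names).
labels = {"+ ": "Not in first string", "- ": "Not in second string"}

report = []
for entry in difflib.Differ().compare(original.split(), edited.split()):
    # Every Differ entry starts with a two-character code followed by the item.
    prefix, word = entry[:2], entry[2:]
    # Unchanged items ("  ") and hint lines ("? ") are not in labels, so they are skipped.
    if prefix in labels:
        report.append(f"{labels[prefix]}: {word}")

print("\n".join(report))
```

Matching on the full two-character code ("+ " rather than just "+") also skips the "? " hint lines that Differ can emit when two items are similar but not identical.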
All of these operations result in an iterable of strings, so you can .join() 'em together (or similar) to get a single result, as you do in your question:
>>> print("\n".join(result))
IBM
Google
Oracle
Microsoft
Nvdia
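One thing to keep in mind with the set version is that sets are unordered, so the order of the joined words can change between runs. If you want the single-line "Nvdia, IBM, ..." style from the question with a stable order, a small sketch (assuming alphabetical order is acceptable; line is just an illustrative name) is to sort before joining:

```python
original = "Apple Microsoft Google Oracle"
edited = "Apple Nvdia IBM"

# ^ is the operator form of symmetric_difference() for sets.
result = set(original.split()) ^ set(edited.split())

# Sets are unordered, so sort before joining to get the same text on every run.
line = ", ".join(sorted(result))
print(line)
```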