Minimal working example:
In [3]: a = ('r1', 'r2', 'r11', 'r6', 'r1', 'r2', 'r7', 'r8')
In [4]: b = ('r1', 'r2', 'r1', 'r6', 'r1', 'r2', 'r7', 'r8')
In [5]: list(difflib.ndiff(a, b))
Out[5]: ['  r1', '  r2', '- r11', '?   -\n', '+ r1', '  r6', '  r1', '  r2', '  r7', '  r8']
Can someone please explain why there’s a newline character in the fourth element of the output list? What can I do to not get that element in the ndiff output, but only the rest of the list?
Answer
Because ndiff expects the lines you pass in to end with newline characters, like this:
a = ('r1\n', 'r2\n', 'r11\n', 'r6\n', 'r1\n', 'r2\n', 'r7\n', 'r8\n')
b = ('r1\n', 'r2\n', 'r1\n', 'r6\n', 'r1\n', 'r2\n', 'r7\n', 'r8\n')
In the docs for difflib.Differ.compare, which is what .ndiff() calls under the hood, we see this (emphasis mine):
compare(a, b)
Compare two sequences of lines, and generate the delta (a sequence of lines).
Each sequence must contain individual single-line strings ending with newlines. Such sequences can be obtained from the readlines() method of file-like objects. The delta generated also consists of newline-terminated strings, ready to be printed as-is via the writelines() method of a file-like object.
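To see that documented contract in action, here is a small sketch. The input strings and the use of str.splitlines(keepends=True) are illustrative choices, not something from the question; they are just one convenient way to get the newline-terminated lines that compare() asks for.

```python
import difflib

# Illustrative input: splitlines(keepends=True) keeps the trailing "\n"
# on every line, which is the shape Differ.compare() documents.
old = "r1\nr2\nr11\nr6\n".splitlines(keepends=True)
new = "r1\nr2\nr1\nr6\n".splitlines(keepends=True)

delta = list(difflib.Differ().compare(old, new))

# With newline-terminated input, every delta line (including any "? "
# guide line that difflib adds itself) is newline-terminated too.
all_terminated = all(line.endswith("\n") for line in delta)
print(all_terminated)
```

When the input lines lack newlines, only the guide lines that difflib generates itself carry one, which is the mismatch seen in the question.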
The output you’re getting makes sense: lines that start with ? are there to highlight what changed within a line. In this case it’s drawing a - under the second 1 in r11 to show you that it was deleted. difflib expects that you will use the output like this:
print(''.join(difflib.ndiff(a, b)))
so it needs to end any lines it adds with a newline.
You can add the newlines to your original values with a list comprehension:
a = [line + "\n" for line in a]
b = [line + "\n" for line in b]
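If the goal is a plain list without the guide line, one possible approach is sketched below. It assumes that dropping every line starting with "? " is acceptable for your use case, and the names a_lines, b_lines and clean are made up for illustration.

```python
import difflib

a = ('r1', 'r2', 'r11', 'r6', 'r1', 'r2', 'r7', 'r8')
b = ('r1', 'r2', 'r1', 'r6', 'r1', 'r2', 'r7', 'r8')

# Terminate every input line so the delta is uniformly newline-terminated.
a_lines = [line + "\n" for line in a]
b_lines = [line + "\n" for line in b]

delta = list(difflib.ndiff(a_lines, b_lines))

# Assumption: the "? " guide lines are not wanted in the final list.
# Drop them, then strip the trailing newline from the remaining lines.
clean = [line.rstrip("\n") for line in delta if not line.startswith("? ")]

print(clean)
# ['  r1', '  r2', '- r11', '+ r1', '  r6', '  r1', '  r2', '  r7', '  r8']
```

Filtering on the two-character "? " prefix relies on ndiff starting every delta line with a two-character code ("- ", "+ ", "  " or "? "), so ordinary content lines are not affected.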