Using Python 3.4.3 on Windows.
My script runs a small Java program in the console and should capture its output:
import subprocess
p1 = subprocess.Popen([ ... ], stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
out, err = p1.communicate(str.encode("utf-8"))
This leads to a normal
UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d in position 135: character maps to <undefined>
Now I want to ignore errors:
out, err = p1.communicate(str.encode(encoding="utf-8", errors="ignore"))
This leads to a more interesting error, for which I found no help on Google:
TypeError: descriptor 'encode' of 'str' object needs an argument
So it seems that Python no longer even knows what the arguments of str.encode(…) are. The same happens when you leave out the errors part.
Answer
universal_newlines=True enables text mode. Combined with stdout=PIPE, it forces decoding of the child process' output using locale.getpreferredencoding(False), which is not utf-8 on Windows. That is why you see UnicodeDecodeError.
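The failure can be reproduced without any child process. The sketch below assumes cp1252 as a stand-in for the Windows locale encoding (the actual value on a given machine may differ) and uses a right double quotation mark, whose UTF-8 encoding happens to end in byte 0x9d:

```python
import locale

# UTF-8 encoding of U+201D (right double quotation mark); the last byte is 0x9d.
raw = "\u201d".encode("utf-8")  # b'\xe2\x80\x9d'

# Assumption: cp1252 stands in for the locale encoding on a typical Windows setup.
# Byte 0x9d has no mapping in cp1252, so decoding fails.
try:
    raw.decode("cp1252")
    failed = False
except UnicodeDecodeError as exc:
    failed = True
    print(exc)

# Decoding with the encoding the bytes were actually written in works.
text = raw.decode("utf-8")

# The encoding that text mode would pick on this machine (platform dependent).
print(locale.getpreferredencoding(False))
```
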
To read the subprocess' output using utf-8 encoding, drop universal_newlines=True:
#!/usr/bin/env python3
from subprocess import Popen, PIPE

with Popen(r'C:\path\to\program.exe "arg 1" "arg 2"', stdout=PIPE, stderr=PIPE) as p:
    output, errors = p.communicate()
lines = output.decode('utf-8').splitlines()
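Since the question asks about ignoring undecodable bytes, here is a minimal runnable sketch of the same approach with an errors handler passed to .decode(). The child is a hypothetical Python one-liner (not the Java program from the question) that writes UTF-8 text followed by one stray byte:

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child process: writes UTF-8 bytes for "café " plus an invalid byte 0xff.
child_code = r"import sys; sys.stdout.buffer.write('caf\u00e9 '.encode('utf-8') + b'\xff')"

# No universal_newlines=True, so communicate() returns raw bytes.
with Popen([sys.executable, "-c", child_code], stdout=PIPE, stderr=PIPE) as p:
    output, errors = p.communicate()

# Decode explicitly; errors="ignore" drops bytes that are not valid UTF-8.
ignored = output.decode("utf-8", errors="ignore")

# errors="replace" keeps a visible marker (U+FFFD) instead of silently dropping data.
replaced = output.decode("utf-8", errors="replace")
```

Decoding after the fact keeps the choice of encoding and error handling in your own code rather than in the locale settings.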
str.encode("utf-8") is equivalent to "utf-8".encode(). There is no point in passing it to .communicate() unless you set stdin=PIPE and the child process expects the b'utf-8' bytestring as input.
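To illustrate what the argument to .communicate() actually does, the following sketch sets stdin=PIPE and sends the bytes b'utf-8' to a hypothetical child (a Python one-liner chosen for the example) that echoes its input in upper case:

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child process: reads all of stdin and writes it back upper-cased.
child_code = "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read().upper())"

with Popen([sys.executable, "-c", child_code], stdin=PIPE, stdout=PIPE) as p:
    # "utf-8".encode() is the bytestring b'utf-8'; it is sent to the child's stdin.
    output, _ = p.communicate(input="utf-8".encode())
```

Here the child receives the literal five bytes utf-8 as data; the argument never selects an encoding for the output.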
str.encode(encoding="utf-8", errors="ignore") has the form klass.method(**kwargs). The .encode() method expects self (a string object) as its first argument, and none is passed here; that is why you see TypeError.
>>> str.encode("abc", encoding="utf-8", errors="ignore") #XXX don't do it
b'abc'
>>> "abc".encode(encoding="utf-8", errors="ignore")
b'abc'
Do not use klass.method(obj) instead of obj.method() without a good reason.
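As a short sketch, the failing call from the question can be caught explicitly. The final line shows one case where the class form is arguably reasonable, passing the method as a callable; that case is my own illustration, not something taken from the question:

```python
text = "abc"

# Calling the method on the class without an instance: no self is supplied.
try:
    str.encode(encoding="utf-8", errors="ignore")
    raised = False
except TypeError:
    raised = True

# The idiomatic form: call the method on the string itself.
bound = text.encode(encoding="utf-8", errors="ignore")

# Illustrative case where klass.method is handy: using it as a function for map().
encoded = list(map(str.encode, ["a", "b"]))
```
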