
Directory parameter on Windows has trailing backslash replaced with double quote when passed to Python

I am seeing an annoying trailing double quote when passing in a quoted directory in Windows to Python.

This is my Python test program, print_args.py:

import sys
print(sys.argv)

This is what I get when I run it from the command line. Note that the quoted directory is the standard format generated by tab completion in the Windows shell. The double quotes are needed because of the spaces in the path.

>py print_args.py -test "C:\Documents and Settings\"
['print_args.py', '-test', 'C:\\Documents and Settings"']

The trailing backslash has been replaced with a double quote, presumably because the backslash-quote pair is being read as an escaped double quote rather than the quote being matched to the leading one.

If instead of passing the parameter to Python, I pass it to a batch script which just echoes it, then I get the trailing backslash as expected.

So somewhere between the CMD shell and Python seeing sys.argv there has been some parsing which has affected backslashes and double quotes.

Can anyone illuminate?

Edited to add:

Further reading suggests to me that Python is doing some parsing of the Windows command line arguments to construct sys.argv. I think Windows passes the entire command line string, in this case mostly unchanged, to Python, and Python uses its own internal logic to break it into the strings in the sys.argv list. This processing must allow escaped double quotes as a special case. I would be pleased to see some documentation or the code…


Answer

Command-line processing on Windows is not completely standardised, but Python, like many other programs, relies on the Microsoft C runtime's behaviour. This is specified in Microsoft's documentation on parsing C command-line arguments, which says:

A string surrounded by double quote marks is interpreted as a single argument, which may contain white-space characters. […] If the command line ends before a closing double quote mark is found, then all the characters read so far are output as the last argument.

A double quote mark preceded by a backslash (\") is interpreted as a literal double quote mark (").

The second of these two rules prevents the second double quote being read as terminating the argument – instead a double quote is appended. Then the first rule allows the argument to end without a terminating double quote.
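
The two rules can be sketched in Python. This is a simplified illustration, not the C runtime's actual code: the function name `split_windows_args` is hypothetical, it only handles the part of the command line after the program name, and it ignores other documented cases such as a doubled double quote inside a quoted string.

```python
def split_windows_args(cmdline):
    """Split a command-line tail roughly the way the Microsoft C runtime does.

    Simplified sketch: covers quoting and backslash-before-quote handling only.
    """
    args = []
    current = []
    in_quotes = False
    have_arg = False  # True once the current argument has started
    i = 0
    n = len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == "\\":
            # Count the whole run of backslashes.
            start = i
            while i < n and cmdline[i] == "\\":
                i += 1
            count = i - start
            if i < n and cmdline[i] == '"':
                # Before a quote, each pair of backslashes yields one backslash.
                current.append("\\" * (count // 2))
                if count % 2:
                    # Odd count: the quote is escaped and kept literally.
                    current.append('"')
                    i += 1
                # Even count: the quote is handled as a delimiter next iteration.
            else:
                # Backslashes not followed by a quote are literal.
                current.append("\\" * count)
            have_arg = True
        elif ch == '"':
            in_quotes = not in_quotes
            have_arg = True
            i += 1
        elif ch in " \t" and not in_quotes:
            # Unquoted whitespace ends the current argument.
            if have_arg:
                args.append("".join(current))
                current = []
                have_arg = False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    # If the line ends inside quotes, everything read so far is the last argument.
    if have_arg:
        args.append("".join(current))
    return args


print(split_windows_args(r'-test "C:\Documents and Settings\"'))
# ['-test', 'C:\\Documents and Settings"']
print(split_windows_args(r'-test "C:\Documents and Settings\\"'))
# ['-test', 'C:\\Documents and Settings\\']
```

The first call reproduces the output from the question; the second shows that doubling the trailing backslash lets the closing quote terminate the argument.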

Note that this section also says

The command line parsing rules used by Microsoft C/C++ code are Microsoft-specific.

This is even more confusing when using PowerShell.

PS> py print_args.py -test 'C:\Documents and Settings\'
['print_args.py', '-test', 'C:\\Documents and Settings"']

Here PowerShell's parsing preserves the final backslash and drops the single quotes. It then adds double quotes (because of the spaces in the path) before passing the command line to the C runtime, which parses it according to the rules above and treats the final backslash as escaping the closing double quote that PowerShell added.

However, this all does conform to the documented behaviour and is not a bug.
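
Since the behaviour is by design, one way to avoid the stray quote is to double the trailing backslash inside the quotes (or leave it off). As a hedged illustration, Python's own `subprocess.list2cmdline` helper applies the same quoting rules in the other direction when building a Windows command line. It is not part of the module's advertised public API, so treat its use here as an assumption rather than a recommendation.

```python
import subprocess

# The argument we actually want the child process to receive.
args = ["-test", "C:\\Documents and Settings\\"]

# Build a command-line string using the Microsoft C runtime quoting rules.
cmdline = subprocess.list2cmdline(args)
print(cmdline)
# -test "C:\Documents and Settings\\"
```

The trailing backslash comes out doubled, so the closing double quote is read as the end of the argument rather than as an escaped quote.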

User contributions licensed under: CC BY-SA