I use forward and backward slashes on Windows all the time. I think that's less of a problem than spaces in filenames, which make a lot of pipe-related stuff harder to do (see find -print0 and xargs -0).

Spaces in filenames on the Windows command line are a horror, in part because there is no kernel-level equivalent of argv. In the eyes of the Windows kernel, the command line is an unparsed UTF-16 string, and it's up to the application or C library to tokenize it. This means there is no standard way to tokenize it. Ultimately, this means there is no standard way to escape a shell command's arguments, as you would need to do for spaces. Most people just put the argument in double quotes, but this is pretty much the honor system.

I'm not entirely sure what you're saying is different here.

I picked the first MSFT page on argv, https://learn.microsoft.com/en-us/cpp/c-language/parsing-c-c... and compiled and ran that program with various command lines, and it does exactly what I expected (argv array members are tokenized according to my quoting).

Could you be clearer, perhaps by posting a simple C example and a few command lines demonstrating what you mean?

Here's an example of someone citing a disagreement between CRT and shell32:

https://github.com/rust-lang/rust/issues/44650

This is in addition to the Rust CVE mentioned elsewhere in the thread, which was rooted in this issue:

https://blog.rust-lang.org/2024/04/09/cve-2024-24576.html

Here are some quick programs to test contrasting approaches. I don't have examples of inputs where they parse differently on hand right now, but I know they exist. This was also a problem that was frequently discussed internally when I worked at MSFT.

    /* CRT version: argv is tokenized by the C runtime's startup code. */
    #include <stdio.h>
    int main(int argc, char **argv)
    {
       for (int i=0; i<argc; ++i)
          puts(argv[i]);
       return 0;
    }

    /* shell32 version: fetch the raw command line and let CommandLineToArgvW tokenize it. */
    #pragma comment(lib, "shell32.lib")
    #include <windows.h>
    #include <stdio.h>
    int main(void)
    {
       PCWSTR cmdLine = GetCommandLineW();
       if (cmdLine)
       {
          int argc = 0;
          PWSTR *argv = CommandLineToArgvW(cmdLine, &argc);
          if (argv)
          {
             for (int i=0; i<argc; ++i)
                printf("%ls\n", argv[i]);
             LocalFree(argv);   /* the whole array is freed with a single call */
             return 0;
          }
       }
       return 1;
    }

> it's up to the application or C library to tokenize it

Technically yes, but practically the only correct way to tokenize these command line strings is the CommandLineToArgvW WinAPI function implemented by Shell32.dll.

This is fine for parsing command line arguments in your own programs, but unhelpful when attempting to provide a mechanism to reliably pass a user-supplied argument list to an arbitrary external program that is free to interpret the string returned by GetCommandLine arbitrarily (so the notion of "passing a list of arguments" isn't even generally applicable).

cmd.exe is an important and terrible example:

https://learn.microsoft.com/en-us/windows-server/administrat...

Within Microsoft when I was there, it was considered taboo to introduce a dependency on shell32 where one did not already exist. I tried to add a call to CommandLineToArgvW once and it got flagged.

In the part of Microsoft that was focused on cleaning up the Windows code base in that era, the shell was not considered a core component. All of the utility functions that it provides, like this one, were considered baggage, improper layering, poor design.

Just use PowerShell. It has no issues with spaces anywhere, and uses explicit quoting rules similar to C# and other “proper” languages.

If PowerShell is dispatching to other cmdlets, then fine. If it needs to call CreateProcess on a Win32 program, then the issue I'm talking about definitely exists.

(I don't personally like PowerShell, but I don't think my gripes with it are really relevant to this thread.)

I don't think anybody else here is talking about CreateProcess. We're talking about the shell.

This is how a shell works. It works by creating processes.

On Unix, you create a process with fork() and execve(). Execve takes a list of arguments, already parsed and tokenized. A shell is free to interpret the command line however it likes to build this, but at the syscall level it is completely unambiguous what the parsed array looks like.

On Windows, you call CreateProcess. This takes the command line as an unparsed string. It is the child process's responsibility to chop it up into argv. This is the source of ambiguity that I'm talking about.

The comment I wrote starting this thread wasn't about command line parsing; it was about pipes, and the need to use a non-space delimiter (NUL) to pass lists of files whose names contain spaces.

What you're talking about is related, but different. You're saying that Windows forces an unnecessary join and split (with all the complex rules required to properly parse a quoted string) when using the Windows equivalent of exec(), which is inelegant but unrelated to what I'm describing above. What I describe differs because of how newlines and spaces are treated as delimiters (IFS).

Further, from the testing I did, there is no noticeable impact on my C application which reads from argv- the C runtime seems to handle this transparently.

> Further, from the testing I did, there is no noticeable impact on my C application which reads from argv- the C runtime seems to handle this transparently.

I disagree with your assessment here.

First, the C runtime doesn't have the "part of the system" feel that it does in the Unix world. There is a CRT in the OS, but most programs link to the CRT that ships with Visual Studio, so it could theoretically change from version to version and compiler to compiler.

Apart from that, a Win32 application could choose to bypass the CRT's entry point and use WinMain(). Many do. Even if it does use CRT main, there is GetCommandLine() to reach the raw command line underneath. Shell32 even has a function, CommandLineToArgvW, which behaves differently from the CRT. Both differ from the parsing done by cmd.exe. My point is that there is a lot of ambiguity.