> The Win32 API. E.g. using WriteFile to write files (https://learn.microsoft.com/en-us/windows/win32/api/fileapi/...)
Which is called from what, if not C? Does windows really offer no API for writing text (rather than bytes) to files? Or does it rely on the application developer to manage line endings in their own code? Neither of those sounds very developer-friendly.
Calling it from C does not mean you need a full C standard library to exist. For example, much of the C standard library is itself written in C. But it's a "freestanding" C which assumes only a minimal set of library functions exist (e.g. functions for copying memory from one place to another, filling memory with zeroes, etc).
And you can of course use non-C languages to call the Win32 API. Or even directly using assembly code.
> you can of course use non-C languages to call the Win32 API. Or even directly using assembly code.
Is that a supported/official API though? On Linux you "can" put your arguments in registers and trigger the system call interrupt directly, and I think Go programs even do this, but it's not the official interface and they reserve the right to break your program in future updates, at least in theory.
This is incorrect. The syscall ABI is the supported stable ABI for Linux, not the libc API - there's no single supported C library for Linux, and libc often lags behind the kernel in terms of providing syscall wrappers, so punting it to that level wouldn't work. This is in contrast to the BSDs that have libc tightly coupled to the kernel.
Of course, the Linux solution results in some weirdness, especially because specs like POSIX cover the C API, not the syscall ABI. setuid() at the libc layer is specced as changing the UID for all threads in a process. The Linux setuid() syscall only changes the current thread[1], and it's up to the C library to do some absolute magic to then propagate that to all other threads. Which made things difficult for things not using the C library, like Go (https://github.com/golang/go/issues/1435). But that's still not an argument that the supported interface is the C library - the kernel advertises the interface it exposes via the syscall ABI, and will retain that functionality, and if you want POSIX compatibility then you get it from somewhere else.
[1] In Linux, a thread is just a very slightly special case of a process
Sure. C has never been the only language supported on Windows.
For instance, Delphi had a period of popularity for Windows application development, and AFAIK it has always used its own runtime library which is completely independent of the C runtime.
Go does not trigger low-level system call interrupts on Windows. (It does that on Linux, but Windows syscall numbers are not stable even across minor Windows updates, so if Go did that, its Windows binaries would be incredibly fragile.)
On Windows NT, Go uses the userspace wrappers provided in Windows system libraries such as NTDLL.DLL and KERNEL32.DLL. But those too are entirely separate from the C runtime.
Don't forget the days when multiple C/C++ implementations from multiple vendors all came with their own runtime library DLLs, too.
Calling win32 from other languages is supported, and calling it from assembly is supported (as long as you use the calling convention properly, obviously); using ntdll to bypass the win32 API is not supported.
Basically on Linux the syscalls are the equivalent of Win32 except much narrower in scope.
> using ntdll to bypass the win32 API is not supported
But it is sometimes required to do things properly.
> Is that a supported/official API though?
The Win32 API doesn't even use the "C" calling convention. C is just another language to Windows and the standard C library is a cross-platform library for C. You could also write C code on classic Mac OS, and it had its own API as well, though styled more for Pascal.
The OS and C being closely related is not universal across all operating systems, it's just a Unix thing.
> Which is called from what, if not C?
A prominent example is Delphi[1]. At work our primary application is a 20 year old Delphi Win32 application, which we ship new features in weekly.
Delphi does not rely on the C runtime, instead having its own system library which interfaces with the Win32 API that gets compiled in.
[1]: https://en.wikipedia.org/wiki/Delphi_(software)
From literally any language. The WriteFile function comes from the kernel32.dll shared library, and follows a certain calling convention. You don't need to use this calling convention inside your own binary (and indeed, MinGW and MSYS use SysV ABI for everything except when calling Win32 API), or ask a random C runtime coming from God knows where to do this for you if you write something other than C.
In the UNIX world there is this strange notion that C language is somehow special and that the OS itself should provide its runtime (a single global version of it) for every program, even those written in other languages, to interact with the OS but... it's just silly.
> Does windows really offer no API for writing text (rather than bytes) to files? Or does it rely on the application developer to manage line endings in their own code? Neither of those sounds very developer-friendly.
No, it doesn't. That logic belongs in the OS-specific layer in the runtimes/standard libraries of the implementations of the different programming languages. They may decide to re-use each other's libraries, of course, or they may decide not to.
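To make that concrete, here is a minimal sketch of a language runtime doing the newline translation itself, on top of an interface that only knows about bytes. It uses Python's `io.TextIOWrapper` over an in-memory byte buffer; the helper name `write_text_as_bytes` is made up for illustration, and a real runtime would sit on top of a file handle rather than `io.BytesIO`.

```python
import io


def write_text_as_bytes(text: str, newline: str = "\r\n") -> bytes:
    """Return the bytes a text layer hands to a byte-only sink.

    The sink (io.BytesIO here, standing in for a byte-oriented OS call)
    never sees "text"; the wrapper translates "\n" before the bytes arrive.
    """
    raw = io.BytesIO()
    wrapper = io.TextIOWrapper(raw, encoding="utf-8", newline=newline)
    wrapper.write(text)
    wrapper.flush()          # push the translated bytes down to the sink
    data = raw.getvalue()
    wrapper.detach()         # leave the sink alone when the wrapper goes away
    return data


windows_style = write_text_as_bytes("a\nb\n")             # translated to CRLF
unix_style = write_text_as_bytes("a\nb\n", newline="\n")  # left as-is
```

The byte sink is identical in both calls; only the runtime's policy differs.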
> You don't need to use this calling convention inside your own binary (and indeed, MinGW and MSYS use SysV ABI for everything except when calling Win32 API), or ask a random C runtime coming from God knows where to do this for you if you write something other than C.
Well sure but you have to define it somewhere. At some point there's an interface where something that's part of the application asks something that's part of the OS to do something, and that interface had better be stable and well-specified. If you really want you can use a different interface from your C ABI, sure, but given that, like it or not, most of windows is written in C (or in C++ but using C linkage between component boundaries), what do you gain?
Even so, most of Windows historically did not use the C ABI but rather stdcall, so a call from your C library into the Windows C library couldn’t be expressed in a purely standards-compliant C compiler (which doesn’t have calling convention modifiers). That's a slightly pedantic quirk of the C spec design.
> At some point there's an interface where something that's part of the application asks something that's part of the OS to do something, and that interface had better be stable and well-specified.
It's defined, and well-specified.
> your C ABI
Which is a C ABI. Borland's Turbo C and C++Builder used different ABI than Microsoft C compiler did. GCC for Windows used to use a third, entirely different ABI as well. The ABI is not part of the language definition, you see.
> most of windows is written in C
And compiled with a very specific C compiler that used a particular ABI. That only means that you need to follow it when you call into the OS, sure, but not that you have to stick to it anywhere else — and indeed, most implementations of many programming languages on Windows didn't; they invented and used their own ABIs.
> Which is a C ABI. Borland's Turbo C and C++Builder used different ABI than Microsoft C compiler did. GCC for Windows used to use a third, entirely different ABI as well.
Sure, you can do that. Userspace code can use any ABI it wants, or none. But again, why, what do you gain?
And regardless of whether it's "the" ABI or merely "a" ABI, that ABI presumably has a representation for strings and allows passing them around - and while you certainly could use a different representation in your program (or in the OS internals) and transform strings back and forth when calling the OS (or when receiving calls from userspace), you probably don't want to. At which point we're back at needing a way to write strings in an in-memory format to OS-standard files in the filesystem.
> But again, why, what do you gain?
Performance? Codegen simplicity? Why, again, must one use the syscall ABI for anything that is not a syscall?
> that ABI presumably has a representation for strings and allows passing them around
In this particular case, the API operates with binary buffers, not text strings. Sure, you can go the VMS way, or even IBM way, and turn files from binary blobs into arrays of fixed-length records (that's why C's fwrite/fread have both num and sz arguments: some OSes literally can't write data any other way).
> At which point we're back at needing a way to write strings in an in-memory format to OS-standard files in the filesystem.
Yes? Some text editors converted LFs to NULs to work on the text in memory, and then they'd convert NULs back into LFs on writing to the disk (IIRC). Both emacs and vi don't store text in memory the way it's laid out in the file; they translate it when writing to the disk.
Again, why do you want the OS to get involved into any of this? It's not the OS's job, period, stop trying to make the world an even worse place.
> Performance? Codegen simplicity? Why, again, must one use the syscall ABI for anything that is not a syscall?
If you can figure out an ABI that gives you significant advantages, sure, knock yourself out. But given that you're going to have to implement the syscall one anyway, if there's no compelling reason to use a different one then why make things more complicated?
> Again, why do you want the OS to get involved into any of this? It's not the OS's job, period
Again, why have a filesystem if you're not going to have any standardised structure for how to use it? Why have an OS at all if you're not going to give programs ways to interact with each other? The OS owns the filesystem, it should also define how it's used.
> stop trying to make the world an even worse place
Right back at you.
The whole issue is specific to C and languages that copied C or use its runtime underneath in implementations (like Python)
For reference, Unix has no API other than bytes either.
> The whole issue is specific to C and languages that copied C or use its runtime underneath in implementations (like Python)
So it's "specific to" almost all programming languages in actual use. That's a rather esoteric point.
> For reference, Unix has no API other than bytes either.
Unix does offer an API for writing C-standard in-memory text strings to Unix-standard on-disk text files, it just happens to be the same one as the API for writing in-memory binary strings to on-disk binary files.
> Unix does offer an API for writing C-standard in-memory text strings
Why on bloody Earth should a presumably generic-purpose OS provide a special API for dealing with internal representation of some data structure in a (particular) implementation of a (particular) programming language?
Besides, it doesn't offer such an API anyhow; you need to take care to manually pass the result of a strlen() call instead of sizeof()'s as the value for the len parameter of a write() call, otherwise a NUL-terminator will get written into the file as well.
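A rough illustration of the strlen()/sizeof() point, done in Python with `ctypes` so it runs without a C compiler. `ctypes.create_string_buffer` gives a NUL-terminated char array, `ctypes.sizeof` plays the role of sizeof(), and `len(buf.value)` plays the role of strlen(). The helper `write_c_string` and the use of a pipe instead of a real file are assumptions made for the sketch.

```python
import ctypes
import os


def write_c_string(buf, count: int) -> bytes:
    """Write the first `count` bytes of a C char array and read them back."""
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, buf.raw[:count])   # byte-oriented write, no text logic
    finally:
        os.close(write_fd)
    try:
        return os.read(read_fd, 1024)
    finally:
        os.close(read_fd)


buf = ctypes.create_string_buffer(b"hello\n")         # 6 characters plus a NUL

with_sizeof = write_c_string(buf, ctypes.sizeof(buf))  # counts the terminator
with_strlen = write_c_string(buf, len(buf.value))      # stops before the NUL
```

With the sizeof-style length the terminator ends up in the output; the write call itself has no idea the buffer was meant to be a string.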
And C says nothing about what constitutes a line break, by the way. Nor does it have any concept of a "line", or any utilities for working with lines specifically, it only knows of strings, and that's all. The concept of "text line" is POSIX.
> Why on bloody Earth should a presumably generic-purpose OS provide a special API for dealing with internal representation of some data structure in a (particular) implementation of a (particular) programming language?
Because the purpose of the OS is to facilitate applications (and, on the other end, facilitate hardware), and those applications tend to have a need to process text in-memory and then store it on the filesystem?
All you need for that is the ability to read and write binary blobs to and from files, which Windows gives you, and to know what "text files" means for the other programs on that platform. Windows itself doesn't care for text much; but the other programs have a shared convention that ASCII text files have CRLF-separated variable-length lines of text, and Unicode text files store text in UTF-16LE (including the CRLF pairs, so those look like "\x0D\x00\x0A\x00" as raw bytes).
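For anyone who wants to check the raw bytes, a small sketch in Python. The function name `line_ending_bytes` is only for illustration; the encodings are the standard codec names.

```python
def line_ending_bytes(encoding: str) -> bytes:
    """Encode a CRLF pair the way a text file in `encoding` would store it."""
    return "\r\n".encode(encoding)


ascii_crlf = line_ending_bytes("ascii")        # two bytes
utf16_crlf = line_ending_bytes("utf-16-le")    # four bytes, low byte first
```

Either way it is user-space code deciding what goes into the buffer before the byte-level write happens.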
All of this is left to the user space to sort out, just as it is on Linux, so I am not entirely sure why you demand Windows to do more for you than Linux does.
The OS is the one providing the filesystem, it should define and support how it's used (including providing standard utilities for manipulating it, both from programs or by the operator) rather than leaving the programs to figure it out between themselves. (After all, if the text storage format didn't matter to the OS, why would we bother using the CRLF format on windows at all? I submit that third-party programs did not spontaneously come up with an arbitrary convention that everyone would use a different text format on Windows; rather programs use CRLF when running on Windows precisely because the standard utilities that ship as part of DOS/Windows expect that format)
As already stated multiple times here, CRLF is actually the "correct" way (at least in the telex days, when CR and LF had the actual meanings of "Return Carriage to home" and "Feed a new Line"), while the LF-only one is a Unix "hack"/abstraction (which was actually converted back into CRLF if fed to a telex or a terminal). It is not really a surprise that DOS, which was inspired by CP/M, simply copied what was supposed to be a physical signal. This is the reason the ASCII/ANSI code has a BEL indicator for ringing a bell. In short, CRLF was the way to handle newlines at the time that DOS was designed. You would expect CRLF to be the ending because that's how terminals work (unlike with the magicking Unix which smooshes two differing things into one character).
If you are writing a developer suite, whether you're Delphi developing for MS-DOS or Microsoft developing for Apple II, you kinda have an idea of how things should work (because you have the reference book for the platform, not the compiler/language). There was no assumption that the OS provides an abstraction for text. In those days, everyone just implemented it from scratch, really ("code page" came from literal code pages, where each character has a well-defined byte). This is manifested in command-line handling on Windows: the platform convention is that it is just a flat string, and the C runtime determines how to chop that up (MSVC and Intel C have historically disagreed heavily here). The aberration of only Windows having CRLF exists because Unix-based designs took over the world: macOS is Unix, Linux was inspired by Unix, *BSD was Unix-derived.
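A small sketch of the "flat string" convention from the other side: a runtime that holds an argument list has to flatten it into one command line before creating a process. Python's `subprocess` module ships a helper, `list2cmdline`, that does this using the quoting rules the Microsoft C runtime expects. As far as I know it is not part of the module's documented public API, so treat this as an illustration only; the wrapper name `to_flat_command_line` is made up.

```python
import subprocess


def to_flat_command_line(argv: list) -> str:
    """Join an argument list into the single string a Windows process receives.

    Quoting follows the MSVC runtime's parsing rules; a program built with a
    runtime that splits the string differently may see different arguments.
    """
    return subprocess.list2cmdline(argv)


flat = to_flat_command_line(["prog.exe", "hello world", "plain"])
```

The quoting exists only because the receiving runtime is expected to undo it; the OS just passes the string along.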
It still shows up in IETF-style textual network protocols, which evolved on non-Unix systems (HTTP, SMTP, etc.)
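For example, an HTTP/1.1 request terminates every header line with CRLF and ends the header block with an empty line, regardless of the OS on either end. A minimal sketch (the function name `build_http_request` and the choice of headers are just for illustration):

```python
def build_http_request(host: str, path: str = "/") -> bytes:
    """Build a bare-bones HTTP/1.1 GET request as raw bytes."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
    ]
    # Each line ends in CRLF; an extra CRLF marks the end of the headers.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


request = build_http_request("example.com")
```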
MTA-STS, a very recent standard (RFC 8461), only allows CRLF as the line terminator (to the chagrin of *nix lovers, and despite the fact that a majority of mail systems are operated on *nix systems)
Peril of writing protocols with an eye to debugging them using the cheapest terminal you can find on campus and a grad student paid in coffee
This discussion is rather weird because hardly anyone writes raw C programs in Windows any more, and especially not Microsoft. Generally they expect developers to have moved to C++ or C#, both of which offer richer APIs.
The real fragmentation is not CRLF but the transition to system level UTF-16 support, involving all sorts of macros and duplicating almost every OS API function into FooW() and FooA() variants.
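To illustrate why the two variants can't share one entry point, here is a sketch of how the same file name would be laid out in memory for each. It assumes the "ANSI" code page is Windows-1252 (`cp1252`); on a real system that depends on the configured locale. The helper names are hypothetical and no Win32 function is actually called.

```python
def encode_for_a_variant(text: str, ansi_code_page: str = "cp1252") -> bytes:
    """Bytes for a FooA()-style call: code-page text plus a one-byte NUL."""
    return text.encode(ansi_code_page) + b"\x00"


def encode_for_w_variant(text: str) -> bytes:
    """Bytes for a FooW()-style call: UTF-16LE text plus a two-byte NUL."""
    return text.encode("utf-16-le") + b"\x00\x00"


name = "na\u00efve.txt"             # "naïve.txt", 9 characters
ansi = encode_for_a_variant(name)   # 1 byte per character here, plus NUL
wide = encode_for_w_variant(name)   # 2 bytes per character here, plus NUL
```

The two buffers differ in both length and terminator, which is why the choice has to be made at compile time (the macros) or by calling the suffixed name directly.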
From whatever programming language you feel like using.