Where do you think those bug reports for gcc and others come from? Some people do look at the assembly coming out of the compilers.
The OpenBSD ports mailing list is currently going through a clang update, and one of the main points is looking at all the packages that failed to build. I even took a long look at the USB stack and the audio subsystem of OpenBSD because of an issue I was having with my DAC.
I literally do packaging for a living and you are misunderstanding my point. Most people just take a binary and run it. There's no analysis of the assembly code. You might profile it and bench it after the fact, but no one is sitting there looking at the assembly line by line unless there's a very, very good reason, and frankly LLMs are better at that type of investigative work. I know because I've been investigating some curious 1 in 100,000 segfaults recently, and guess what? It took an LLM to build a tool to let us even hit that bug, because it was basically impossible to do by hand. No one in the before times would have sat down to write the tool, because we would not have had the time, so we would have just accepted that 1 in 100,000 requests are segfaulting. At least now I can actually fix the problem.
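For a sense of scale on a 1 in 100,000 failure, here is a rough back-of-the-envelope sketch. It assumes every request fails independently with the same probability, which is an assumption for illustration and not something established about the actual bug. `runs_needed` is a hypothetical helper name, not part of the tool mentioned above.

```python
import math


def runs_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Return how many independent runs are needed to observe at least
    one failure with the given confidence.

    Assumes each run fails independently with probability `failure_rate`
    (an illustrative assumption, not a property of any real system).
    """
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(no failure in n runs) = (1 - p)^n; solve (1 - p)^n <= 1 - confidence.
    # log1p keeps precision when failure_rate is very small.
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-failure_rate))


if __name__ == "__main__":
    print(runs_needed(1e-5))
```

Under that assumption, a 1 in 100,000 rate works out to roughly 300,000 requests for a 95% chance of seeing even one crash, which is why reproducing it by hand is impractical and an automated harness pays off.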
What's the reliability of compilers these days? How likely is it for a bug to be in your code and not in the compiler? I think it's close to 99.99...
So when you have a bug and a core dump, you can quickly load it in a debugger, look at the stack frames, and then theorize a model for how the bug happens. Only after verifying the source and being completely confident that it's good do you start looking at the assembly, most likely while single-stepping with the debugger. But you rarely get to that point, because 99.99... of the time it's your code.
That reliability is what AI tooling is lacking. It's exhausting monitoring the output because errors can be as simple as a minus character or the wrong comparison operator.
I'm usually compiling other people's code. Hitting that 1 in 100,000 issue at run time, then having to come up with a patch, and then having to make sure it's okay on arm and amd64. The bug I'm thinking of is decidedly human output, and the LLM is cleaning up the slop.