In a large codebase there will still be bugs in how these components interoperate: bugs that involve complex chaining of API logic or a temporal element. These are the kinds of bugs fuzzers generally struggle to find. I would be a little freaked out if LLMs started to get good at finding them. Every LLM find I've seen so far looks similar to a fuzzer find.
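
To make the "chaining plus temporal element" idea concrete, here is a toy sketch of that kind of bug. Everything in it (the `Client` class, its methods, the token format) is hypothetical and invented for illustration, not taken from any real codebase. Each call behaves sensibly on its own; the problem only shows up for one particular ordering of calls.

```python
class Client:
    """Hypothetical API client, made up purely for illustration."""

    def __init__(self):
        self._token = None
        self._cache = {}

    def login(self, user):
        self._token = f"token-for-{user}"

    def fetch(self, key):
        # Authenticated read that also populates a local cache.
        if self._token is None:
            raise PermissionError("not logged in")
        self._cache[key] = f"secret:{key}"
        return self._cache[key]

    def logout(self):
        # Bug: the token is dropped but the cache is left populated.
        self._token = None

    def fetch_cached(self, key):
        # Bug: no auth check, on the assumption the cache is only
        # ever filled (and read) while logged in.
        return self._cache.get(key)


def leaks_after_logout():
    """Return True if data is still readable after logout."""
    c = Client()
    c.login("alice")
    c.fetch("k")
    c.logout()
    return c.fetch_cached("k") is not None
```

Nothing crashes here, so there is no obvious signal for a fuzzer to latch onto. Spotting it means knowing that `login`, `fetch`, `logout`, `fetch_cached` in that order violates what the API is supposed to guarantee.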