Sorry, I don’t understand the point you’re making. If CodeQL reports that you have a XSS vulnerability in your code, and its report includes the complete and specific code path that creates that vulnerability, how is Rice’s theorem applicable here? We’re not talking about decidability of some semantic property in the general case; we’re talking about a specific claim about specific code that is demonstrably true.
Rice’s theorem applies to any non-trivial semantic property.
Looking at the docs, I’m not really sure CodeQL is semantic in the same sense as Rices theorem. It looks syntactic more than semantic.
Eg breaking Rices theorem would require it to detect that an application isn’t vulnerable if it contains the vulnerability but only in paths that are unreachable. Like
I’m not at a PC right now, but I’d be curious if CodeQL thinks that’s vulnerable or not.It’s probably demonstrably true that there is syntactically a path to the vulnerability, I’m a little dubious that it’s demonstrably true the code path is actually reachable without executing the code.
> We’re not talking about decidability of some semantic property in the general case; we’re talking about a specific claim about specific code
Is CodeQL special cased for your code? I very much doubt that. Then it must work in the general case. At that point decidability is impossible and at best either false positives or false negatives can be guaranteed to be absent, but not both (possibly neither of them!)
I don't doubt CodeQL claims can be demonstrably true, that's still coherent with Rice's theorem. However it does mean you'll have false negatives, that is cases where CodeQL reports no provable claim while your code is vulnerable to some issues.
OK, but all I said before was that CodeQL’s approach where it supplies a specific example to support a specific problem report is inherently resistant to false positives.
Clearly it is still possible to generate a false positive if, for example, CodeQL’s algorithm thinks it has found a path through the code where unsanitised user data can be used dangerously, but in fact there was a sanitisation step along the way that it didn’t recognise. This is the kind of situation where the theoretical result about not being able to determine whether a semantic property holds in all cases is felt in practical terms.
It still seems much less likely that an algorithm that needs to produce a specific demonstration of the problem it claims to have found will result in a false positive than the kind of naïve algorithms we were discussing before that are based on a generic look-up table of software+version=vulnerability without any attempt to determine whether there is actually a path to exploit that vulnerability in the real code.