Are current compiler optimisations limited by algorithms or by compute?

Would a 1000x compute cluster provide a meaningful performance boost (of the generated binaries)?

Damn, I am being nerd-sniped here :)

One thing is that you can think of static analysis as building up facts about the program. You can, for example, start by assuming nothing and then add facts, iteratively propagating them from one line of code to the next. Or you can start by assuming the universe of all facts and remove facts from that universe.
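As a rough sketch of what that iterative propagation looks like (the CFG and fact representation here are invented for illustration, not any real compiler's internals):

```python
# Worklist fixpoint over a tiny CFG where "facts" are plain sets per node.
# Starting from the empty set and unioning at joins models the
# "assume nothing, add facts" direction; starting from the full universe
# and intersecting models the "assume everything, remove facts" direction.

def fixpoint(cfg, gen, kill, universe, start_from_top):
    # cfg: {node: [predecessor nodes]}, gen/kill: {node: set of facts}
    init = set(universe) if start_from_top else set()
    facts = {n: set(init) for n in cfg}
    worklist = list(cfg)
    while worklist:
        node = worklist.pop()
        preds = [facts[p] for p in cfg[node]]
        if not preds:                       # entry node: nothing flows in
            joined = set()
        elif start_from_top:
            joined = set.intersection(*preds)
        else:
            joined = set.union(*preds)
        new = (joined - kill[node]) | gen[node]
        if new != facts[node]:
            facts[node] = new
            # re-queue successors (nodes that list `node` as a predecessor)
            worklist.extend(s for s, ps in cfg.items() if node in ps)
    return facts
```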

Some classes of program analysis are safe to stop early. For example, if I have a static analysis that tries to find the target of virtual calls (also known as devirtualization), I can stop after a timeout. Not finding the target just means a missed optimization.
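A hedged sketch of that (the hierarchy representation and function name are made up here): a toy devirtualizer with a budget, where running out of budget only costs us an optimization, never correctness.

```python
def find_unique_target(hierarchy, base_class, method, budget=10_000):
    # hierarchy: {class_name: {"methods": set_of_method_names,
    #                          "subclasses": [class_names]}}
    # returns the single implementing class, or None meaning "unknown".
    implementers = set()
    stack = [base_class]
    while stack:
        if budget <= 0:
            return None      # timed out: keep the virtual call (missed optimization)
        budget -= 1
        cls = stack.pop()
        info = hierarchy[cls]
        if method in info["methods"]:
            implementers.add(cls)
        stack.extend(info["subclasses"])
    return implementers.pop() if len(implementers) == 1 else None
```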

There are other classes of program analysis whose results are not safe to use until the algorithm finishes. For example, if you have to prove that two variables do not alias each other, you cannot stop until you have computed all possible points-to sets and verified that the points-to sets of those two variables do not overlap.
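A rough sketch of why, assuming a tiny Andersen-style formulation with an invented constraint format: the alias query is only sound once the fixpoint loop has fully converged.

```python
def points_to(constraints):
    # constraints: list of ("addr", x, o)   meaning x = &o
    #              or      ("copy", x, y)   meaning x = y
    pts = {}
    changed = True
    while changed:                  # must reach a fixpoint before querying
        changed = False
        for kind, dst, src in constraints:
            new = {src} if kind == "addr" else pts.get(src, set())
            if not new <= pts.setdefault(dst, set()):
                pts[dst] |= new
                changed = True
    return pts

def may_alias(pts, a, b):
    # only meaningful once points_to() has fully converged
    return bool(pts.get(a, set()) & pts.get(b, set()))
```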

So, given the above restriction, the first class (early termination) is perhaps more desirable, and throwing more compute at it would yield a better approximation. For the second one, it wouldn't.

Another thing to keep in mind is that most of these data-flow frameworks are not easily parallelized. The only paper I've read (though I haven't kept up with this avenue of research) that implemented a control-flow analysis on the GPU is the following:

* Prabhu, Tarun, et al. "EigenCFA: Accelerating flow analysis with GPUs." ACM SIGPLAN Notices 46.1 (2011): 511-522.

I'm sure people are working on it. (I should mention that some program analyses are written in Datalog, and Datalog can be parallelized, but I believe that is CPU-based parallelization rather than GPU-based.)

The third thing: when you ask whether we are limited by algorithms or by compute, it is important to note that it is impossible to find all possible facts about a program *precisely* without running it. There is a relation between static program analysis and the halting problem: we want to guarantee that our static analysis terminates, and some facts are simply unobtainable without running the program.

However, there are not just static program analyses but also dynamic program analyses, which analyze a program as it runs. An example is value profiling. Imagine you have a conditional that is false 99% of the time. With a virtual machine, you can add instrumentation to measure the probability distribution of this conditional, then generate code specialized for the common case, optimize it, and only when the condition turns out to be true fall back to a less optimized version of the code, paying an additional penalty. Some virtual machines already do this for types and values (type profiling and value profiling).
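A toy sketch of that idea (the tiering logic here is invented for illustration, real VMs are far more sophisticated): instrument the branch, then specialize for the common case behind a guard that falls back to the generic code.

```python
profile = {"taken": 0, "not_taken": 0}

def expensive_rare_path(x):
    return sum(range(x))

def generic_version(x, rare_flag):
    profile["taken" if rare_flag else "not_taken"] += 1   # instrumentation
    return expensive_rare_path(x) if rare_flag else x * 2

def specialized_version(x, rare_flag):
    if rare_flag:                 # guard: the rare case "deoptimizes"
        return generic_version(x, rare_flag)
    return x * 2                  # compiled assuming the flag is false

# warm up with the instrumented version, then "tier up" if the profile is lopsided
for i in range(1000):
    generic_version(10, i % 100 == 0)

total = profile["taken"] + profile["not_taken"]
hot = specialized_version if profile["not_taken"] > 0.95 * total else generic_version
```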

One last thing: when you say a meaningful performance boost, it depends on your code. If your code can be folded away completely at compile time, then yes, we could just compute the result at compile time and that's it. But if it cannot, or parts of it cannot be folded away and the facts cannot be used to optimize the code, then no matter how much you search, you cannot optimize it statically.
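A trivial illustration of that point:

```python
def foldable():
    # everything here is known at compile time, so a sufficiently determined
    # compiler could reduce this whole function to the constant 328350
    return sum(i * i for i in range(100))

def not_foldable(user_input: int):
    # no amount of compile-time search helps: the answer depends on runtime input
    return sum(i * i for i in range(user_input))
```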

Compilers are awesome :)

As an addendum, it might be desirable in the future to have a repository of analyzed code. Compilers right now re-analyze code on every single compile and do not share their results across the web. It is a fantasy of mine to have a repository that maps code to equivalent representations, so that every time someone does a local compile it explores a new area of the optimization space and adds it to the repository. Essentially, each time you compile the code, it explores new potential optimizations and all of them get stored online.

> repository of analyzed code

yes, this is obviously what you want. there shouldn't be a notion of 'running' the compiler; it should simply be a persistent process. there are two orthogonal but equally critical issues here: iterativity (produce an acceptable result immediately, but then keep refining it) and incrementality (when making a small change to the code, take advantage of information learned about the previous version).
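as a minimal sketch of the incrementality half (analyze_function here is a made-up stand-in for whatever expensive per-function analysis you run): cache results keyed by a content hash, so a small edit only re-analyzes what actually changed. a shared, persistent version of this cache is basically the "repository of analyzed code" above.

```python
import hashlib

cache = {}   # content hash -> analysis result (could live on disk or in a service)

def analyze_program(functions):
    # functions: {name: source_text}
    results = {}
    for name, source in functions.items():
        key = hashlib.sha256(source.encode()).hexdigest()
        if key not in cache:
            cache[key] = analyze_function(source)   # recomputed only when the source changed
        results[name] = cache[key]
    return results

def analyze_function(source):
    # placeholder for the expensive per-function analysis
    return {"length": len(source)}
```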

there is a recent paper on incrementality, https://arxiv.org/pdf/2104.01270.pdf, though it has some caveats (e.g. it does not handle iterativity at all, so there is only so far it can go; one could do much better). it is also interesting non-technically, because the authors work for amazon, so perhaps some of this is headed for industry.

> it might be desirable in the future to have a repository of analyzed code.

We built a code verifier back end for a small app store. The team was mostly MScs, plus some PhD consulting.

The world is indeed waiting for a CloudOptimizingCompiler (CoCo?!) - there are so many possibilities for shared compute, library caching, and machine learning.

I will PM you, just to keep the dream alive.