I'm a bit confused about what makes this course "advanced." Most of the topics (dead code elimination, data flow, dominator analysis, SSA form) seem like they belong in a first course on compilers.
I'm a bit confused about what makes this course "advanced." Most of the topics (dead code elimination, data flow, dominator analysis, SSA form) seem like they belong in a first course on compilers.
Well, course numbers are regular enough that you can look up what the "intro compilers" course is: https://www.cs.cornell.edu/courses/cs4120/2026sp/?schedule
The short answer is that compilers is basically broken up into two courses, with the first course largely being the minimum necessary to build a compiler (lexing, parsing, codegen, register allocation), and the second course being how to build an optimizing compiler.
I am actively opposed to this design for a first compiler. There is no need for a lexer with a recursive descent parser. Register allocation is also an unnecessary distraction. It is better for a first compiler to compile to a higher level language in which neither register assignment nor memory management are necessary.
Optimizing compilers are suboptimal in that they waste enormous amount of time optimizing code that can't or needn't be optimized and even where the optimizations are helpful, they are opaque and at risk of unexpectedly regressing both due to small changes at the source code level or changes in the compiler optimizer, both of which are quite insidious.
If instead of optimizing compilers, we had languages that allowed for seamless interop between low level and high level functions, then perhaps an llm becomes the optimizer (or you can invoke the compiler to optimize a specific function by source level rewrite). The benefit of this compared to a traditional optimizing compiler is that the optimization is done once per function and never repeated (until prompted) and the implementation is human readable and not buried in a binary. Moreover, and perhaps even more importantly, by not doing optimizations in the compiler, compilation times can be much faster, easily 100-1000x than state of the art optimizing compiler, while generating equivalent or even better runtime performance. As it has been said: premature optimization is the root of all evil.
The academic literature on register allocation is scary.
First is presented a linear time optimal algorithm for graph coloring then it is claimed better can be done by a O(N^2) algorithm that uses a heuristic.
I do believe the dragon book got caught with the emperor's new register allocator and the literature hasn't really recovered yet.
Looks like there is quite a bit of overlap with regards to the optimizing parts between these two courses. I guess it's switching from the dragon book to academic papers that makes it advanced.
I have read TONS of material about it*, and none of that is part of the majority of that!
In fact, the "backend" be compiler or interpreter is nearly always left as "exercise to reader".
You can't imagine how much is left to be discovered, from how make a closure, track environment, do pattern matching, memory representation, etc.
EVERYTHING interesting is something you need to look for.
P.D: This only one of the years:https://gist.githubusercontent.com/mamcx/e1743571b9a1ea163a7...
I think a lot of the non-professionals start with parsing and do not get exposed to backend. I have read two books about interpreters/compilers and they don't touch the backend very much.
Maybe this is introductory for backend?
That's part of it. I think another part is that it seems like the students are asked to read the papers behind a lot of the concepts, and academic literature is not generally very accessible to undergrads. (Not that they can't read it, but without someone guiding you through at least the first few papers, it can be a frustrating experience for many.)
What is advanced then? Good coverage of dce, data flow, ssa, intruction selection and reg alloc is actually like 98% of the backend.
Perhaps polyhedral optimization, tiling, scalar evolution, vectorization...
I guess garbage collection is pretty advanced in the syllabus.
Well there is advanced and there is advanced.
This course is just a second course on compilers for people who had an introduction. And a great one at that.
A good modern, practical and decently optimizing compiler can do just fine without all the things you've mentioned, including vectorization.
Besides, most programming language implementations never go beyond the basic SSA toolkit.