Can someone smarter than me explain what they mean by "reified generics", "erased generics", and a use case for when to use one over the other?
With reified generics, the compiled code keeps the type arguments; with erased generics, it does not.
For example, Java uses erased generics. Once the code is compiled, the generics information is no longer in the bytecode: List<String> becomes just List. This is called type erasure.
C# uses reified generics, where this information is preserved: List<String> is still List<String> after compilation.
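On the Java side, the erasure described above can be observed directly. This is only a minimal sketch; the class name `ErasureDemo` and its helper method are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

class ErasureDemo {
    // Both lists share one runtime class, because the type argument is erased.
    static boolean sameRuntimeClass() {
        List<String> strings = new ArrayList<>();
        List<Integer> numbers = new ArrayList<>();
        return strings.getClass() == numbers.getClass();
    }

    public static void main(String[] args) {
        System.out.println(sameRuntimeClass()); // true
        System.out.println(new ArrayList<String>().getClass().getName()); // java.util.ArrayList
    }
}
```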
And as a consequence, C# can pack the value types directly in the generic data structure, instead of holding references to heap-allocated objects.
This is very important both for cache locality and for minimizing garbage collector pressure.
> And as a consequence, C# can pack the value types directly in the generic data structure, instead of holding references to heap-allocated objects.
> This is very important both for cache locality and for minimizing garbage collector pressure.
How is C# not just straight-up faster than Java then, instead of both languages punching around the same weight on benchmarks? Doesn't cache locality have, like, a huge effect on performance?
There’s more to the speed of a language than this one thing.
In many aspects C# is. I remember listening to a talk from Microsoft (admittedly) where C# using 100% of the latest features was on average faster than Java.
I have no answer except that Java's HotSpot JIT seems to be almost too good to be true. I guess it's my way of saying I would also like to know why C# isn't just plain faster than Java.
With reified generics, you can also do "new T[]" because the type is known at runtime. With type erasure, you can't do that.
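For what it's worth, the usual Java workaround is to pass the element type in by hand. A rough sketch, assuming a hypothetical helper `newArray` in a made-up class `GenericArrays`; `Array.newInstance` itself is the standard `java.lang.reflect` call.

```java
import java.lang.reflect.Array;

class GenericArrays {
    // `new T[length]` is rejected by javac ("generic array creation"),
    // so the caller supplies the element type as a Class<T> token.
    // Intended for reference types; a primitive token like int.class would
    // produce an int[] and fail the cast at the call site.
    @SuppressWarnings("unchecked")
    static <T> T[] newArray(Class<T> elementType, int length) {
        return (T[]) Array.newInstance(elementType, length);
    }

    public static void main(String[] args) {
        String[] names = newArray(String.class, 3);
        names[0] = "reified by hand";
        System.out.println(names.getClass().getSimpleName() + " of length " + names.length);
    }
}
```

The cast is unchecked, which is exactly the information erasure throws away: the runtime array type comes from the `Class` token, not from `T`.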
And Java has been working on Project Valhalla for ~20 years to retrofit the ability to do this to the existing Java language...
The goal for Valhalla is value types; reified generics, if they ever happen, are still open.
The project was announced in July 2014, hardly 20 years.
Also, the reason they are still at it is the question of how to run old JARs without breaking semantics in a Valhalla-enabled JVM.
Had Oracle wanted to do a Python 3, Valhalla would have been done by now, however we all know how it went down, and Java 9 was already impactful enough to the ecosystem.
> The goal for Valhalla is value types, reified generics if they ever happen is still open.
But if they want the List<int> use case to be fast they basically have to keep this information at runtime and will have to make changes to how objects are laid out in memory. I'm not sure there's a good way around that if you want List<int> to be backed by an int[] and `get` returning an int instead of an Object. This may or may not be available to developers and remain internal to the JVM in the beginning, but I think it's necessary to enable the desired performance gains.
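To make the List<int> point concrete, here is roughly what such a list looks like when written out by hand today. This is a hypothetical `IntList`, not anything from the Valhalla design; it only illustrates an int[] backing store and a `get` that returns a primitive.

```java
import java.util.Arrays;

// Hypothetical hand-specialized list: a stand-in for what List<int> might behave like.
class IntList {
    private int[] items = new int[8];
    private int size = 0;

    void add(int value) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2); // grow the flat backing array
        }
        items[size++] = value;
    }

    int get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index + ", size " + size);
        }
        return items[index]; // a primitive int, no boxed Integer involved
    }

    int size() {
        return size;
    }

    public static void main(String[] args) {
        IntList list = new IntList();
        for (int i = 0; i < 20; i++) {
            list.add(i * i);
        }
        System.out.println(list.get(19)); // 361
    }
}
```

With erased generics, a List<Integer> has to hold references to boxed Integer objects instead, which is the layout difference being discussed.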
They also state on the website: »Supplementary changes to Java’s generics will carry these performance gains into generic APIs.«
Haskell and OCaml are two languages whose implementations do just well enough with type erasure, given how their polymorphic types get implemented.
Probably MLton is the only implementation that actually does it the C++ and Rust way.
So let's see how far they go.
I always considered it a mistake for Java to ignore what GC-enabled languages were doing at the time (Eiffel, Modula-3, Oberon and friends), which they naturally looked into given their influences, but it wasn't deemed necessary for the original Java purposes of being a set-top box and applets language.
Now we have a good case of what happens when we try to retrofit such critical features after decades of field usage, a lesson that the Go folks apparently failed to learn as well.
Reified generics don't seem to be a goal mentioned on the project website. Am I missing something?
https://openjdk.org/projects/valhalla/
There is an interesting article which mentions reification, but that's all I could locate.
How We Got the Generics We Have (Or, how I learned to stop worrying and love erasure)
https://openjdk.org/projects/valhalla/design-notes/in-defens...
Reified generics aren't on the board, but a solution to:
> And as a consequence, C# can pack the value types directly in the generic data structure, instead of holding references to heap-allocated objects.
is what Project Valhalla is all about. (Java doesn't have a good reason for being able to do `new T` at the moment, but being able to treat a generic container as optimizable-over-structs is an explicit goal).
Incidentally if you do what they're proposing for PHP in Java (where you define a non-generic subclass of a generic type), the actual generic type parameters actually are in the bytecode, and depending on the static type you use to reference it, may or may not be enforced...
That prints out:
I'm not smarter than you but.
I believe the terms reified generics and erased generics are the type of sweaty donkey ball terminology you get from professional CS academics.
Sticking my neck out further.
Reified generics means the type is available at run time. In C# you can write if(obj.GetType() == typeof(typename))
Erased generics means the type information is not available at run time. That's the way Java does it and it kinda sucks.
Academics invent short names for concepts that are common in their field not because they're 'sweaty' but because of how often those concepts come up. If something you mention in every second paragraph takes a full sentence to explain, you're going to A. get really annoyed at having to type it out all the time and B. probably explain it slightly differently every time and confuse people.
Academic jargon isn't invented to be elitist, it's invented to improve communication.
(of course there's a good chance you understand this already, and you're just making a dumb joke, but I figured I'd explain this anyway for the benefit of everyone reading)
I don't take issue with the naming but with the names that feel a bit beyond my ken. "Erased" makes sense when explained but not before. "Reified" is a word I simply do not use so it feels like academia run amok.
Regardless, I recognize myself as the point of failure, but those names do strike me as academia speak, though better than some/many. <shrug>
Another shrug, but part of it is that the PL community (programming language community) is pretty deep into its own jargon that doesn’t have as much overlap as you might think, with other subfields of computer science.
People describe a type system as “not well-founded” or “unsound” and those are specific jabs at the axioms, and people talk about “system F” or “type erasure” or “reification”. Polymorphism can be “ad-hoc” or “parametric”, and type parameters can be invariant, covariant, and contravariant. It’s just a lot of jargon and I think the main reason it’s not intuitive to people outside the right fields is that the actual concepts are mostly unfamiliar.
> Another shrug, but part of it is that the PL community (programming language community) is pretty deep into its own jargon that doesn’t have as much overlap as you might think, with other subfields of computer science.
The word reified dates back to the 1800s. It isn't the most common word, but it also definitely wasn't invented by the programming language community.
It was (and is) used a lot by philosophers and there was a large overlap between a certain class of philosophers and a certain class of mathematicians who developed early type theory. Any type theorist who knows his literature will run into reification very early on.
> Erased generics means the type information is not available at run time. That's the way Java does it and it kinda sucks.
In a good statically-typed language you don't need runtime type information. It could be a Void in the bytecode for all I care, as long as it behaves correctly.
> obj.GetType() == typeof(typename)
In a statically-typed language, this can be optimised away to a bool at compile time.
Oh absolutely not true AT ALL.
Occasionally it is really useful to be able to do something like `new T()`. Without reified generics that is not possible.
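In Java, the closest equivalent I know of is to make the caller hand over a factory. A small sketch, with a made-up `Factories.create` helper; `java.util.function.Supplier` is the standard interface.

```java
import java.util.function.Supplier;

class Factories {
    // `new T()` does not compile in Java, so the caller passes a factory instead.
    static <T> T create(Supplier<T> factory) {
        return factory.get();
    }

    public static void main(String[] args) {
        StringBuilder sb = create(StringBuilder::new);
        sb.append("built");
        System.out.println(sb); // built
    }
}
```

It works, but the knowledge of how to construct a `T` has to be threaded through every call site rather than coming from the type parameter itself.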
new T? What's wrong with the old one?
Jokes aside, what's the use case for not knowing what T is until runtime?
Pretty much all polymorphism works by not knowing the concrete type until runtime. If you have an Animal reference to a Dog instance, any method you call on it is resolved at runtime, because the object knows its own type. Reified generics do the same for type parameters, whereas erased types are only used for type checking at compile time.
> Erased generics means the type information is not available at run time. That's the way Java does it and it kinda sucks.
To be more precise: in Java, generics on class/method/field declarations are available at runtime via reflection. The issue is that they aren't available for instances. So a java.util.ArrayList<java.lang.String> instance is indistinguishable at runtime from a java.util.ArrayList<java.lang.Object> instance.
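A quick sketch of that distinction, using only standard `java.lang.reflect` calls. The class `DeclaredGenerics`, the nested `StringList`, and the `numbers` field are hypothetical names invented for the example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.ArrayList;
import java.util.List;

class DeclaredGenerics {
    // Hypothetical non-generic subclass of a generic type.
    static class StringList extends ArrayList<String> {}

    // Hypothetical field whose declaration carries a type argument.
    static List<Integer> numbers = new ArrayList<>();

    // Reads the type argument recorded on the `extends` clause.
    static Type superclassArgument() {
        ParameterizedType parent = (ParameterizedType) StringList.class.getGenericSuperclass();
        return parent.getActualTypeArguments()[0];
    }

    // Reads the type argument recorded on the field declaration.
    static Type fieldArgument() {
        try {
            Field field = DeclaredGenerics.class.getDeclaredField("numbers");
            ParameterizedType type = (ParameterizedType) field.getGenericType();
            return type.getActualTypeArguments()[0];
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(superclassArgument()); // class java.lang.String
        System.out.println(fieldArgument());      // class java.lang.Integer
        // The instance only knows its class, ArrayList, and that class's
        // declared type variable; the Integer argument is not recoverable from it.
        System.out.println(numbers.getClass().getTypeParameters()[0]);
    }
}
```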