Not necessarily. Java can be insanely performant, far more than I ever gave it credit for in the first decade of its existence. There has been a ton of optimization and you can now saturate your links even if you do fairly heavy processing. I'm still not a fan of the language but performance issues seem to be 'mostly solved'.
You can achieve optimized C/C++ speeds, you just can't program the same way you always have. Step 1, switch your data layout from Array of Structures to Structure of Arrays. Step 2, after initial startup switch to (near) zero object creation. It's a very different way to program Java.
You have to optimize your memory usage patterns to fit in CPU cache as much as possible which is something typical Java develops don't consider. I have a background in assembly and C.
I'd say it's slightly harder since there is a little bit of abstraction but most of the time the JIT will produce code as good as C compilers. It's also an niche that often considers any application running on a general purpose CPU to be slow. If you want industry leading speed you start building custom FPGAs.
Java has significant overhead, that most/every object is allocated on heap, synchronized and has extra overhead of memory and performance to be GC controlled. Its very hard/not possible to tune this part.
You program differently for this niche in any language. The hot path (number crunching) thread doesn't share objects with gateway (IO) threads. Passing data between them is off heap, you avoid object creation after warm up. There is no synchronization, even volatile is something you avoid.
how exactly you are passing data? You can pass some primitives without allocating them on heap. You can use some tiny subset of Java+standard library to write high performance code, but why would you do this instead of using Rust or C++?
Strangely this is one of the areas where I want to use project panama so I might re-implement some of the ring buffers constructs.
You allocate off heap memory and dump data into it. With modern Java classes like Arena, MemoryLayout, and VarHandle it's honestly a lot like C structs.
Depends. Many reasons, but one is that Java has a much richer set of 3rd party libraries to do things versus rolling your own. And often (not always) third party libraries that have been extensively optimized, real world proven, etc.
Then things like the jit, by default, doing run time profiling and adaptation.
There are actually cases when Java (the HotSpot JVM) runs faster than the same logic written in C/C++ because the JVM is doing dynamic analysis and selective JIT compilation to machine code.
I personally know of an HFT firm that used Java approximately a decade ago. My guess would be they're still using it today given Java performance has only improved since then.
Optimal in what sense? In the java shops I've worked at it's usually viewed as a pretty optimal situation to have everything in one language. This makes code reuse, packaging, deployment, etc much simpler.
In terms of speed, memory usage, runtime characteristics... sure there are better options. But if java is good enough, or can be made good enough by writing the code correctly, why add another toolchain?
> But if java is good enough, or can be made good enough by writing the code correctly,
"writing code correctly" here means stripping 95% of lang capabilities, and writing in some other language which looks like C without structs (because they will be heap allocated with cross thread synchronization and GC overhead) and standard lib.
Its good enough for some tiny algo, but not good enough for anything serious.
It's good enough for the folks who choose to do it that way. Many of them do things that are quite "serious"... Databases, kafka, the lmax disruptor, and reams of performance critical proprietary code have been and continue to be written in java. It's not low effort, you have to be careful, get intimate with the garbage collector, and spend a lot of time profiling. It's a totally reasonable choice to make if your team has that expertise, you're already a java shop, etc. I no longer make the choice to use java for new code. I prefer rust. But neither choice is correct or incorrect.
> Databases, kafka, the lmax disruptor, and reams of performance critical proprietary code have been and continue to be written in java.
those have low bar of performance, also they mostly became popular because of investments from Java hype, and rust didn't exist or had weak ecosystem at that time.
Not necessarily. Java can be insanely performant, far more than I ever gave it credit for in the first decade of its existence. There has been a ton of optimization and you can now saturate your links even if you do fairly heavy processing. I'm still not a fan of the language but performance issues seem to be 'mostly solved'.
"Saturating your links" is rarely the goal in HFT.
You want low deterministic latency with sharp tails.
If all you care about is throughput then deep pipelines + lots of threads will get you there at the cost of latency.
You can achieve optimized C/C++ speeds, you just can't program the same way you always have. Step 1, switch your data layout from Array of Structures to Structure of Arrays. Step 2, after initial startup switch to (near) zero object creation. It's a very different way to program Java.
You have to optimize your memory usage patterns to fit in CPU cache as much as possible which is something typical Java develops don't consider. I have a background in assembly and C.
I'd say it's slightly harder since there is a little bit of abstraction but most of the time the JIT will produce code as good as C compilers. It's also an niche that often considers any application running on a general purpose CPU to be slow. If you want industry leading speed you start building custom FPGAs.
As long as you tune the JVM right it can be faster. But its a big if with the tune, and you need to write performant code
Java has significant overhead, that most/every object is allocated on heap, synchronized and has extra overhead of memory and performance to be GC controlled. Its very hard/not possible to tune this part.
You program differently for this niche in any language. The hot path (number crunching) thread doesn't share objects with gateway (IO) threads. Passing data between them is off heap, you avoid object creation after warm up. There is no synchronization, even volatile is something you avoid.
> Passing data between them is off heap
how exactly you are passing data? You can pass some primitives without allocating them on heap. You can use some tiny subset of Java+standard library to write high performance code, but why would you do this instead of using Rust or C++?
In some places I'm using https://github.com/aeron-io/agrona
Strangely this is one of the areas where I want to use project panama so I might re-implement some of the ring buffers constructs.
You allocate off heap memory and dump data into it. With modern Java classes like Arena, MemoryLayout, and VarHandle it's honestly a lot like C structs.
I answered "why" in another post in this thread.
Depends. Many reasons, but one is that Java has a much richer set of 3rd party libraries to do things versus rolling your own. And often (not always) third party libraries that have been extensively optimized, real world proven, etc.
Then things like the jit, by default, doing run time profiling and adaptation.
Java has huge ecosystem in enterprise dev, but very unlikely it has ecosystem edge in high performance/real time compute.
There are actually cases when Java (the HotSpot JVM) runs faster than the same logic written in C/C++ because the JVM is doing dynamic analysis and selective JIT compilation to machine code.
I personally know of an HFT firm that used Java approximately a decade ago. My guess would be they're still using it today given Java performance has only improved since then.
it doesn't mean Java is optimal or close to optimal choice. Amount of extra effort they do to achieve goals could be significant.
Optimal in what sense? In the java shops I've worked at it's usually viewed as a pretty optimal situation to have everything in one language. This makes code reuse, packaging, deployment, etc much simpler.
In terms of speed, memory usage, runtime characteristics... sure there are better options. But if java is good enough, or can be made good enough by writing the code correctly, why add another toolchain?
> But if java is good enough, or can be made good enough by writing the code correctly,
"writing code correctly" here means stripping 95% of lang capabilities, and writing in some other language which looks like C without structs (because they will be heap allocated with cross thread synchronization and GC overhead) and standard lib.
Its good enough for some tiny algo, but not good enough for anything serious.
It's good enough for the folks who choose to do it that way. Many of them do things that are quite "serious"... Databases, kafka, the lmax disruptor, and reams of performance critical proprietary code have been and continue to be written in java. It's not low effort, you have to be careful, get intimate with the garbage collector, and spend a lot of time profiling. It's a totally reasonable choice to make if your team has that expertise, you're already a java shop, etc. I no longer make the choice to use java for new code. I prefer rust. But neither choice is correct or incorrect.
> Databases, kafka, the lmax disruptor, and reams of performance critical proprietary code have been and continue to be written in java.
those have low bar of performance, also they mostly became popular because of investments from Java hype, and rust didn't exist or had weak ecosystem at that time.