> Me having a device with SmartNIC and full driver support to play with

Same. I have a Pi Pico with PIO, though.

> but the current "Golden Thread" correlation architecture fundamentally requires userspace + kernel cooperation that can't be fully offloaded.

Hard limit, I guess.

(If you indent all lines of a block of text with two spaces (including blank newlines), HN will format it as monospace text and preserve line breaks.)

I've updated the Architecture diagrams to include everything: https://github.com/NoFear0411/spliff/blob/main/README.md#arc...

Thanks for the format tip.

So I went looking for TLS accelerator cards again:

/? TLS accelerators open: https://www.google.com/search?q=TLS+accelerators+open :

- "AsyncGBP+: Bridging SSL/TLS and Heterogeneous Computing with GPU-Based Providers" https://ieeexplore.ieee.org/document/10713226 .. https://news.ycombinator.com/item?id=46664295

/? XDP hardware offload to GPU: https://www.google.com/search?q=XDP+hardware+offload+to+a+GP... :

- eunomia-bpf/XDP-on-GPU: https://github.com/eunomia-bpf/XDP-on-GPU

Perhaps AsyncGBP+ combined with XDP-on-GPU would solve this.

The AsyncGBP+ article mentions support for PQ on GPU.

But then there's the problem of process isolation on GPUs. And they removed support for vGPU unlock.

That is a rabbit hole I don't wanna go down again.