club-3090 with llamacpp did it. Full 262k context, usable in oh-my-pi. Still testing it, but initial results are promising.
I had to make a couple of adjustments though. After downloading the model with hf, I needed to move the mmproj-F16.gguf to the parent folder:
tree /media/fast-storage/club-3090-models/qwen3.6-27b/
/media/fast-storage/club-3090-models/qwen3.6-27b/
├── mmproj-F16.gguf
└── unsloth-q3kxl
└── Qwen3.6-27B-UD-Q3_K_XL.gguf
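For reference, the move step looks roughly like this (sketch using a scratch dir so it's reproducible; the real base dir is the one in the tree above, and I'm leaving out the hf download invocation itself):

```shell
# Recreate the post-download layout in a scratch dir (real base:
# /media/fast-storage/club-3090-models/qwen3.6-27b/). hf had left the
# projector next to the quant inside unsloth-q3kxl/.
base="$(mktemp -d)/qwen3.6-27b"
mkdir -p "$base/unsloth-q3kxl"
touch "$base/unsloth-q3kxl/Qwen3.6-27B-UD-Q3_K_XL.gguf"
touch "$base/unsloth-q3kxl/mmproj-F16.gguf"

# The fix: hoist the mmproj up to the parent folder
mv "$base/unsloth-q3kxl/mmproj-F16.gguf" "$base/"
```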
Then, on starting the server, the container complained that llama-server wasn't a known binary, so I had to add PATH="/app:$PATH" to the entrypoint of the llama service.

The only thing still missing is for llama to emit thinking blocks that oh-my-pi can parse, but that's mostly cosmetic; otherwise it's running alright.
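The entrypoint I ended up with looks roughly like this (a sketch of my setup, not a drop-in script: /app is where this image keeps the llama.cpp binaries judging by the error, and the exact llama-server flags you want may differ):

```shell
#!/bin/sh
# Wrapper entrypoint for the llama service: prepend /app so the
# llama-server binary resolves, then hand off to it.
PATH="/app:$PATH"
export PATH

# Flags assumed from my run: quant + projector from the tree above,
# full 262k (262144-token) context.
exec llama-server \
  --model /media/fast-storage/club-3090-models/qwen3.6-27b/unsloth-q3kxl/Qwen3.6-27B-UD-Q3_K_XL.gguf \
  --mmproj /media/fast-storage/club-3090-models/qwen3.6-27b/mmproj-F16.gguf \
  --ctx-size 262144
```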