ollama launch claude --model gemma4:26b

You need to increase the context window size or the tool-calling feature won't work.

For those wondering how to do this:

  OLLAMA_CONTEXT_LENGTH=64000 ollama serve
Or, if you're using the app, open the Ollama app's Settings dialog and adjust it there.
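If you'd rather not set the env var every time, Ollama's Modelfile mechanism can bake a larger window into a derived model tag. A minimal sketch, using the tag from this thread (the `gemma4-64k` name below is just one I made up):

```
# Modelfile: derive a tag that always runs with a 64k context window
FROM gemma4:26b
PARAMETER num_ctx 64000
```

Then `ollama create gemma4-64k -f Modelfile` and pass `--model gemma4-64k` instead.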

Codex also works:

  ollama launch codex --model gemma4:26b

It's amazing how simple this is, and it just works if you have ollama and claude installed!

For some reason that doesn't work for me: claude never returns from what looks like an infinite loop. Nemotron, GLM, and Qwen 3.5 work just fine; gemma doesn't.

Since that defaults to the q4 variant, try the q8 one:

  ollama launch claude --model gemma4:26b-a4b-it-q8_0

Even tried gemma4:31b, including with 128k context (I have 72GiB VRAM). Nothing. I'm cursed, I guess. That's ollama-rocm, if that matters (I had weird bugs on Vulkan; maybe gemma misbehaves on Radeons somehow?).

UPD: tried ollama-vulkan. It works: gemma4:31b-it-q8_0 with 64k context!

The default context is 128k for the smaller Gemma 4s and 256k for the bigger ones, so you're cutting off context and it doesn't know how to continue.

Bump it to native (or -c 0 may work too).

In that case the model descriptor on ollama.com is incorrect, because it defaults to 16k, so I have to manually change that to 64/128k. I think you're talking about the maximum context size.

No, the default context in Ollama varies by the memory available: https://docs.ollama.com/context-length
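To see which of those numbers actually applies on your machine, `ollama show` prints the model's trained ("native") context length, which you can compare against whatever you're serving with. A quick sketch of the check, using the tag from this thread (requires a local Ollama with that model pulled):

```shell
# Print model details; the Model section includes "context length"
ollama show gemma4:26b

# Then serve with the native value explicitly, e.g.:
OLLAMA_CONTEXT_LENGTH=128000 ollama serve
```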