ollama launch claude --model gemma4:26b

You need to increase the context window size or the tool-calling feature won't work.

For those wondering how to do this:

  OLLAMA_CONTEXT_LENGTH=64000 ollama serve
Or, if you're using the app, open the Ollama app's Settings dialog and adjust it there.
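If you'd rather not set the env var every time, Ollama's Modelfile mechanism can bake a larger window into a derived model tag. A minimal sketch, using the tag from this thread (the `gemma4-64k` name below is just one I made up):

```
# Modelfile: derive a tag that always runs with a 64k context window
FROM gemma4:26b
PARAMETER num_ctx 64000
```

Then `ollama create gemma4-64k -f Modelfile` and pass `--model gemma4-64k` instead.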

Codex also works:

  ollama launch codex --model gemma4:26b

It's amazing how simple this is, and it just works if you have ollama and claude installed!

For some reason that doesn't work for me: claude never returns from what looks like an infinite loop. Nemotron, GLM, and Qwen 3.5 work just fine; gemma doesn't.

Since that defaults to the q4 variant, try the q8 one:

  ollama launch claude --model gemma4:26b-a4b-it-q8_0

Even tried gemma4:31b, including with 128k context (I have 72GiB VRAM). Nothing. I'm cursed, I guess. That's ollama-rocm, if that matters (I had weird bugs on Vulkan; maybe gemma misbehaves on Radeons somehow?).

UPD: tried ollama-vulkan. It works: gemma4:31b-it-q8_0 with 64k context!

The default context is 128k for the smaller Gemma 4s and 256k for the bigger ones, so you're cutting off context and it doesn't know how to continue.

Bump it to native (or -c 0 may work too).

In that case the model descriptor on ollama.com is incorrect, because it defaults to 16k, so I have to manually change that to 64/128k. I think you're talking about the maximum context size.

No, the default context in Ollama varies by the memory available: https://docs.ollama.com/context-length
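To see which of those numbers actually applies on your machine, `ollama show` prints the model's trained ("native") context length, which you can compare against whatever you're serving with. A quick sketch of the check, using the tag from this thread (requires a local Ollama with that model pulled):

```shell
# Print model details; the Model section includes "context length"
ollama show gemma4:26b

# Then serve with the native value explicitly, e.g.:
OLLAMA_CONTEXT_LENGTH=128000 ollama serve
```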