The whole time I was reading this, I was thinking about how a small orchestrator model might help with somewhat-known workflows. Programmatic orchestration is ideal, but it's impractical for every case. In the interest of reducing context pollution, improving speed, and providing a better experience, I'd think the ideal hierarchy for orchestration would be programmatic > tiny local LLM > frontier LLM. The tiny model doesn't strictly need to be local, since machines have varying resources.
I'd expect there are some tasks a tiny model could manage competently, and faster. Its context could be cleared regularly, with only the relevant outputs forwarded to the larger model's context.
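Roughly what I have in mind, as a hypothetical sketch (none of these names are a real API, and the "is this task simple enough" check is just a placeholder):

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class TinyOrchestrator:
    """Small model whose context is cleared after every task."""
    context: list[str] = field(default_factory=list)

    def run(self, task: str) -> Optional[str]:
        if len(task) > 200:           # crude stand-in for "too complex, escalate"
            return None
        self.context.append(task)
        result = f"[tiny-model result for: {task}]"   # placeholder for a real inference call
        self.context.clear()          # clear regularly; only the result moves up
        return result


def dispatch(task: str,
             programmatic: dict[str, Callable[[str], str]],
             tiny: TinyOrchestrator,
             frontier: Callable[[str], str]) -> str:
    # 1. Programmatic handling for fully known workflows.
    if task in programmatic:
        return programmatic[task](task)
    # 2. Tiny model for somewhat-known workflows it can handle competently.
    result = tiny.run(task)
    if result is not None:
        return result
    # 3. Frontier model only sees the task itself, keeping its context clean.
    return frontier(task)
```

The point is just the escalation order and that the frontier model's context only ever receives distilled outputs, never the tiny model's working scratchpad.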