Hacker News

arthurcolle 14 hours ago

I found that keeping context utilization at 18% of the total context length was best for minimizing spend, across all models with a context length of 400k or more.
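To make the arithmetic concrete, here is a minimal sketch of what an 18% target works out to. The function names (`context_budget`, `should_compact`) and the idea of trimming or compacting once you go over the budget are assumptions for illustration; the comment only states the 18% figure, not how it is enforced.

```python
def context_budget(context_length: int, target_percent: int = 18) -> int:
    """Return the context size that corresponds to the target utilization.

    Integer math is used so the result is exact for whole-number percentages.
    """
    return context_length * target_percent // 100


def should_compact(used: int, context_length: int, target_percent: int = 18) -> bool:
    """Hypothetical policy: compact or trim once usage exceeds the budget."""
    return used > context_budget(context_length, target_percent)


print(context_budget(400_000))          # 72000
print(context_budget(1_000_000))        # 180000
print(should_compact(80_000, 400_000))  # True
print(should_compact(50_000, 400_000))  # False
```

So for a 400k model the target is roughly 72k in context at any time, and for a 1M model roughly 180k.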