I found that keeping context utilization at 18% of the total context length was best for minimizing spend, across all models with a context length of 400k tokens or more.
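To make the 18% figure concrete, here is a minimal sketch of the arithmetic in Python. It assumes context length is measured in tokens; the function names `context_budget` and `is_over_budget` and the `target_percent` parameter are hypothetical, made up for illustration, and not part of any library.

```python
def context_budget(total_context_tokens: int, target_percent: int = 18) -> int:
    """Return how many tokens correspond to the target utilization.

    Uses integer arithmetic so the result is exact (no float rounding).
    """
    if total_context_tokens <= 0:
        raise ValueError("total_context_tokens must be positive")
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in the range 1..100")
    return total_context_tokens * target_percent // 100


def is_over_budget(used_tokens: int, total_context_tokens: int,
                   target_percent: int = 18) -> bool:
    """Return True when current usage exceeds the target utilization."""
    return used_tokens > context_budget(total_context_tokens, target_percent)


if __name__ == "__main__":
    # Illustrative context lengths only, not tied to any specific model.
    for total in (400_000, 1_000_000):
        print(total, context_budget(total))
```

For a 400k-token context this works out to a 72,000-token target, and for a 1M-token context to 180,000 tokens.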