Tools that remove the fat seem like a good idea, but I’m highly suspicious of their effect on the LLM’s reasoning.
LLMs were trained on the typical full-fat output found everywhere on the internet, and all of a sudden they get a slightly different response that may look like nothing they have seen before.
Does that really save tokens in the long run?
I have only been using it for 2 months, so... lmao. Might need a year and more users to see how it goes.