Funny, we are working to implement this same logic in our in-house financial categorization agent. When we have a repeat prompt it goes to a json that stores answers and only goes to AI for edge cases.
It’s a good idea
Funny, we are working to implement this same logic in our in-house financial categorization agent. When we have a repeat prompt it goes to a json that stores answers and only goes to AI for edge cases.
It’s a good idea
Awesome to hear you’ve done similar. JSON artifacts from runs seem to be a common approach for building this in house, similar to what we did with the muscle mem. Detecting cache misses is a bit hard without seeing what the model sees, part of what inspired this proxy direction.
Thanks for the nice words!