Interesting setup

do you have any benchmarks on: - token usage over time - failures/retry rates

would be great to see how it behaves in production