I think the thing you mentioned is more about reverse-engineering web-search tool calls to understand how models formulate their responses.

The tool I haven’t seen yet is “custdev for agents”: something that lets us simulate the choosing process for them across thousands of different scenarios, and then compare how appealing a product looks to Claude, Gemini, or any other LLM.
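A rough sketch of the idea, just to make it concrete. Everything here is made up for illustration: `fake_agent` is a hypothetical stand-in for a real LLM call (it picks at random with a bias, no API involved), and the product and agent names are placeholders.

```python
import random
from collections import Counter


def fake_agent(bias, seed):
    """Hypothetical stand-in for an LLM: picks a product at random, weighted by `bias`."""
    rng = random.Random(seed)

    def choose(scenario):
        products = scenario["products"]
        # Products not listed in `bias` get a neutral weight of 1.0.
        weights = [bias.get(p, 1.0) for p in products]
        return rng.choices(products, weights=weights, k=1)[0]

    return choose


def pick_rates(agents, scenarios, product):
    """Share of scenarios in which each agent chose `product`."""
    rates = {}
    for name, choose in agents.items():
        picks = Counter(choose(s) for s in scenarios)
        rates[name] = picks[product] / len(scenarios)
    return rates


# Placeholder scenarios; a real tool would vary the task, budget, wording, etc.
scenarios = [
    {"task": f"scenario {i}", "products": ["ours", "rival_a", "rival_b"]}
    for i in range(1000)
]

# Two simulated agents with different (assumed) preferences.
agents = {
    "agent_a": fake_agent({"ours": 3.0}, seed=1),
    "agent_b": fake_agent({"rival_a": 3.0}, seed=2),
}

rates = pick_rates(agents, scenarios, "ours")
print(rates)
```

In a real version each `choose` would wrap an actual model call with the scenario as the prompt, and the per-agent pick rate would be the "how appealing does it look" number to compare.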

Correct me if I’m wrong :)