but is it still terrible at tool calls in actual agentic flows?