>> how the runbooks can self heal if results from some steps in the middle are not expected.

Yeah, this is a very interesting angle. Our primary mechanism here today is agent-created auto-memories. The agent keeps track of the most useful steps and, more importantly, the dead-end steps as it executes runbooks. We think this offers a great bridge to suggest runbook updates and keep them current.
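To make that concrete, here's a minimal sketch of the idea (all names and structures are hypothetical, not our actual implementation): record per-step outcomes as memories while a runbook runs, then surface dead-end steps as update candidates.

```python
from dataclasses import dataclass, field

@dataclass
class StepMemory:
    """One auto-memory recorded while executing a runbook step (hypothetical)."""
    step: str
    useful: bool          # did this step contribute to the diagnosis?
    note: str = ""

@dataclass
class RunbookMemory:
    """Accumulates step memories across runs and surfaces update suggestions."""
    runbook: str
    memories: list[StepMemory] = field(default_factory=list)

    def record(self, step: str, useful: bool, note: str = "") -> None:
        self.memories.append(StepMemory(step, useful, note))

    def suggest_updates(self) -> list[str]:
        # Dead-end steps become candidates for removal or rewording
        # the next time the runbook is reviewed.
        return [
            f"Consider revising dead-end step: {m.step!r} ({m.note})"
            for m in self.memories
            if not m.useful
        ]

mem = RunbookMemory("frontend-latency")
mem.record("check logs for service frontend", useful=True)
mem.record("restart cache tier", useful=False, note="no effect on latency")
print(mem.suggest_updates())
```

In practice the "useful/dead-end" judgment comes from the agent itself at the end of an investigation, which is what makes this a bridge between execution traces and runbook maintenance.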

>> Curious how much savings do you observe from using runbook versus purely let Claude do the planning at first.

It really depends on runbook quality, so I don't have a straightforward answer. Of course, it's faster and cheaper if you have well-defined steps in your runbooks. As an example: `check logs for service frontend, faceted by host_name` vs. `check logs`. The agent does more exploration in the latter case.
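A toy cost model illustrates the difference (hypothetical function and heuristics, just for intuition): a well-specified step maps to one targeted query, while a vague step forces the agent to enumerate candidates first.

```python
# Hypothetical sketch: estimate how many tool calls a runbook step expands into.
def plan_queries(step: str, known_services: list[str]) -> list[str]:
    if "service" in step and "faceted by" in step:
        # Specific step: the agent can issue a single targeted query.
        return [step]
    # Vague step: the agent explores each service before narrowing down.
    return [f"check logs for service {s}" for s in known_services]

services = ["frontend", "checkout", "auth"]
print(len(plan_queries("check logs for service frontend, faceted by host_name", services)))  # 1
print(len(plan_queries("check logs", services)))  # 3, one exploratory query per service
```

The gap widens with environment size: exploration cost scales with the number of services and log sources, while a well-specified step stays constant.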

We wrote about the LLM costs of investigating production alerts more generally here, in case it's helpful: https://relvy.ai/blog/llm-cost-of-ai-sre-investigating-produ...