I’ve been wanting to implement a more “overzealous” approach to cleaning up orphaned pods from analytical workflows (Prefect) that hang on to expensive compute resources; sometimes it feels frustratingly out of control. It’s really difficult to separate signal from noise on whether a pod is actually orphaned (due to the things you’ve mentioned), and killing a workload that isn’t actually orphaned can be very costly due to re-runs. Commenting out of solidarity here, but also curious to see others chime in with their approach.