> wedged the machine so bad the out of band management also went down!
Now that's living the dream of a shared cluster!
This is hazy now, but I do remember a massive outage of a Lustre cluster, which I think happened because a dodgy node was injecting crap into everyone's memory space via the old Lustre fast filesystem kernel driver. I think they switched to NFS export nodes after that (for the render farm and desktops at least).