I was part of an on-call rotation. We had ~800 microsites with decent traffic on this one box, because that's a good idea...

One day the box started hitting kernel panics and producing core dumps. After exhausting every other possibility, I decided it was time to restart the box:

sudo shutdown now

I missed the -r, so the box went down instead of rebooting and was no longer accessible remotely. Had to wait for someone at the data center to terminal in.

Downtime was ~2 hours.

This was caused by a crontab that automatically ran apt-get update & apt-get upgrade... Also made by me... None of this should have worked or been allowed!
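
For reference, the command I meant to type was `sudo shutdown -r now`. Below is a rough sketch of a guard that could catch the missing flag on a remote box. The function name `safe_shutdown` and the `DRY_RUN` variable are made up for illustration, not part of any real tool, and it only checks for a literal `-r` argument:

```shell
# Hypothetical wrapper: refuse to run shutdown unless -r (reboot) is given.
# DRY_RUN=1 (an assumed convention) prints the command instead of running it.
safe_shutdown() {
    for arg in "$@"; do
        if [ "$arg" = "-r" ]; then
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "sudo shutdown $*"
            else
                sudo shutdown "$@"
            fi
            return
        fi
    done
    # No -r found: bail out rather than take down a box nobody can reach.
    echo "refusing: no -r flag, the box would not come back up" >&2
    return 1
}
```

With this in a shell profile, `safe_shutdown now` fails with an error, while `safe_shutdown -r now` reboots as intended.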
