Today at work we had a server that wasn’t coming back on the network after a reboot. Normally this wouldn’t be a big deal, but the server was across the ocean in a vastly different timezone, so troubleshooting usually has to be done in a 2-hour window.
It turned out that the sk98lin module we were using for our NIC has been superseded/deprecated by skge. I wanted to test the new module on the new kernel version, but didn’t have anyone on the other end to reboot the server and type things in on the console should something go wrong. I needed a way to make the server roll back the changes and reboot if it did not see me log back in within 10 minutes. I didn’t have a handy script to do this, so I wrote one.
fixmyself.pl checks a condition that you specify in the subroutine test_condition(). If the test fails (for example, no processes are found running on pts/0 through pts/9, meaning nobody has logged in remotely), it executes the response_action() subroutine. In this case, that subroutine finds the changes I made, changes them back, and reboots the server.
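For anyone who just wants the shape of the idea, here is a rough sketch in Python. This is not the Perl script itself, just the same pattern as I'd guess at it: poll a test for a grace period and run a rollback action if it never passes. The names has_pts_session() and watch(), the use of `who` to spot a login, and the 30-second polling interval are all assumptions made for illustration, and response_action() is deliberately a placeholder that only prints.

```python
import re
import subprocess
import time

# Matches terminal names pts/0 through pts/9 only (assumed range, as in the post).
PTS_RE = re.compile(r"^pts/[0-9]$")


def has_pts_session(who_output):
    """Return True if any line of `who` output shows a login on pts/0..pts/9."""
    for line in who_output.splitlines():
        fields = line.split()
        # In `who` output the second column is the terminal name.
        if len(fields) >= 2 and PTS_RE.match(fields[1]):
            return True
    return False


def test_condition():
    """Assumption: a remote login shows up as a pts entry in `who`."""
    result = subprocess.run(["who"], capture_output=True, text=True, check=False)
    return has_pts_session(result.stdout)


def response_action():
    """Placeholder: a real version would restore the old config and reboot."""
    print("no login seen: would roll back changes and reboot here")


def watch(condition, action, timeout=600, interval=30,
          sleep=time.sleep, clock=time.monotonic):
    """Poll condition() until it passes or timeout seconds elapse.

    Returns True if the condition passed, otherwise runs action() once
    and returns False. sleep and clock are injectable to ease testing.
    """
    deadline = clock() + timeout
    while clock() < deadline:
        if condition():
            return True
        sleep(interval)
    action()
    return False
```

Calling watch(test_condition, response_action) right after making the risky change would give you the 10-minute window: log in before it expires and nothing happens, otherwise the rollback runs.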
I hope you find it useful: fixmyself.pl