Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "$answer = reboot"
-
What an awful day :(
The server where I host my 4 clients websites crashed.
Unable to reboot from the console.
I contact the support. 15 minutes later: "we'll look at this"
No news for 1 week despite my messages.
Then... 1st ticket escalation... 2nd ticket escalation... 3rd ticket escalation...
Answer: "Sorry, your server is down and cannot be repaired."
Fuck.
I ask "is there any way to get my data back?". Answer: "No, because we would shutdown the whole bay and all our clients would be impacted".
Fuck.
I subscribe to another server, at another provider.
I look at my backups... shit, the last one is 4 month ago!!
I restore the first website: OK
I restore the second website: OK
I restore the third website: My new server is "too recent" and not compatible. with this old Wordpress. Fuck! I'll look at this later...
I restore the fourth website: database is empty!! What??? I look at the SQL backup for this site... it failed...
I lost ALL my 4th client data!!!
I'm sooooo piece of crap!14 -
I love Linux, but its community can be so full of incompetent assholes..
Just now I asked in Freenode ##linux how to get the process ID of my current running process in bash. I got my answer - it's a shell built-in called "$$".
Then people start to nitpick some more - why do you need it? How is that different from an exit? - to which my response was.. well I know the whole idea behind exit codes, and I'd use it whenever possible, in all defined behavior that allows my program to terminate itself whenever it can. This pidfile however would be used to exit itself and provide diagnostic information whenever the program enters undefined behavior - a segfault in C language. Scenarios in which I don't have full control over the script's behavior anymore, such as the system entering an unworkable state where the system stalled, still got some binaries in RAM but the rootfs got unwritable, such as now - very helpfully, thanks HP! - when my laptop likely overheated and shat itself. I issued sudo reboot into it, but even that wouldn't issue properly anymore due to the /sbin/poweroff binary becoming inaccessible too. I had to issue a hard power cycle.. one of the few times in which I'm thankful to HP for actually causing shit like this, lol.
Point is, that undefined behavior is what I'm trying to mitigate against. I certainly can't let any files other than diagnostics remain in nonvolatile storage like that, especially when their state should be predictable in order to ensure good operation (like files expressing whether the script is already running or not, i.e. lock files).
Back to that IRC chat. Aside from the answer, I got ridicule from people who probably don't even know how to properly compile a kernel. Ubuntu users, overconfident scum. Sometimes I feel like I should ask questions in channels like #archlinux only, where such incompetency is ridiculed on its own.13 -
The website for our biggest client went down and the server went haywire. Though for this client we don’t provide any infrastructure, so we called their it partner to start figuring this out.
They started blaming us, asking is if we had upgraded the website or changed any PHP settings, which all were a firm no from us. So they told us they had competent people working on the matter.
TL;DR their people isn’t competent and I ended up fixing the issue.
Hours go by, nothing happens, client calls us and we call the it partner, nothing, they don’t understand anything. Told us they can’t find any logs etc.
So we setup a conference call with our CXO, me, another dev and a few people from the it partner.
At this point I’m just asking them if they’ve looked at this and this, no good answer, I fetch a long ethernet cable from my desk, pull it to the CXO’s office and hook up my laptop to start looking into things myself.
IT partner still can’t find anything wrong. I tail the httpd error log and see thousands upon thousands of warning messages about mysql being loaded twice, but that’s not the issue here.
Check top and see there’s 257 instances of httpd, whereas 256 is spawned by httpd, mysql is using 600% cpu and whenever I try to connect to mysql through cli it throws me a too many connections error.
I heard the IT partner talking about a ddos attack, so I asked them to pull it off the public network and only give us access through our vpn. They do that, reboot server, same problems.
Finally we get the it partner to rollback the vm to earlier last night. Everything works great, 30 min later, it crashes again. At this point I’m getting tired and frustrated, this isn’t my job, I thought they had competent people working on this.
I noticed that the db had a few corrupted tables, and ask the it partner to get a dba to look at it. No prevail.
5’o’clock is here, we decide to give the vm rollback another try, but first we go home, get some dinner and resume at 6pm. I had told them I wanted to be in on this call, and said let me try this time.
They spend ages doing the rollback, and then for some reason they have to reconfigure the network and shit. Once it booted, I told their tech to stop mysqld and httpd immediately and prevent it from start at boot.
I can now look at the logs that is leading to this issue. I noticed our debug flag was on and had generated a 30gb log file. Tail it and see it’s what I’d expect, warmings and warnings, And all other logs for mysql and apache is huge, so the drive is full. Just gotta delete it.
I quietly start apache and mysql, see the website is working fine, shut it down and just take a copy of the var/lib/mysql directory and etc directory just go have backups.
Starting to connect a few dots, but I wasn’t exactly sure if it was right. Had the full drive caused mysql to corrupt itself? Only one way to find out. Start apache and mysql back up, and just wait and see. Meanwhile I fixed that mysql being loaded twice. Some genius had put load mysql.so at the top and bottom of php ini.
While waiting on the server to crash again, I’m talking to the it support guy, who told me they haven’t updated anything on the server except security patches now and then, and they didn’t have anyone familiar with this setup. No shit, it’s running php 5.3 -.-
Website up and running 1.5 later, mission accomplished.6 -
I really wanna share this with you guys.
We have a couple of physical servers (yeah, I know) provided by a company owned by a friend of my boss. One of them, which I'll refer to as S1, hosted a couple of websites based on Drupal 7... Long story short, every php file got compromised after someone used a vulnerability within D7's core to inject malicious code. Whatver, wasn't a project of mine, and no one bothered to do anything about it... The client was even happy about not doing anything about it. We did stop making backups of such websites however, to avoid spreading the damage (right?). So, no one cared about this for months!
But last monday? The physical server was offline. I powered it on again via its web management interface... Dead after less than an hour. No backups. Oh well, I guess I couls keep powering it on to check what's wrong with it and attempt to fix it...
That's when I've learned how the web management interface works: power on/reboot requests prompted actual workers to reach the physical server and press the power on/reboot buttons.
That took a while to sink in. I mean, ok, theu are physical servers... But aren't they managed anyhow? They are just... Whatever. Rebooting over and over wasn't the solution, so I asked if they could move the HDD to another of our servers... The answer was it required to buy a "server installation" package. In short, we'd have had to buy a new physical server, or renew the subscription of one we already owned for 6 months.
So... I've literally spent the rest of the day bothering their emoloyeea to reboot S1, until I've reached the "daily reboot reauests limit" (which amounts to 3 reauests. seriously), whicj magically opened a support ticket where a random guy advised to stop using VNC as "the server was responsive" and offeres to help me with the command line.
Fiiine, I sort of appreciate it. My next message has been a kernel log which shows how the OS dying out was due to physical components becoming unavailable after a while, and how S1 lacked a VNC server, being accessible only via ssh. So, the daily reboot limit was removes for S1. Yay.
...What to do though? S1 was down, we had no backups, and asking for manual rebooting every time was slow as Hell. ....Then I went insane. I asked for 1 more reboot. su. crontab -e. */15 * * * * /sbin/shutdown -r +5. while true; do; rsync --timeout=20 --append S1:/stuff .; sleep 60; done.
It worked. We have now again access to 4 hacked, shitty Drupal 7 websites. My boss stopped shouting. I can get back to my own projects.
Apparently, those D7 websites got back online too, still with malicious php code within them. Well, not my problem (for now).
Meanwhile, S1 is still rebooting.3 -
Recently I have updated my lubuntu to 18.04.
I don't use it regularly but I like to have it on the side of my window 10.
Anyway today I boot and decide to use it and get this error.
[0.000000] [Firmware Bug]: TSC_DEADLINE disabled du to errata; please update microcode to version 0x22 (or later)
and two MMIO read fault.
At first it sounds really dramatic and I was thinking, "Nice ... I never get a problem with Windows Update and when its Linux it doesn't work ..."
But lubuntu boots normally after so it's not a blocking problem.
So I do what most of us do in case like this, go to Google and search to know what the hell is going on.
And the answer is simple, my CPU microcode isn't up to date to prevent Spectre, one apt get install and a reboot later my 4700HQ is patched in 0x24 version and protected for Spectre where my windows didn't patch anything and worst disable the KB that I have installed manually before the last big update.
So thanks Linux, you scared me with your error but it was a good job to throw it :)1 -
Started up KiTTY to connect to my virtual test server per usual when I couldn't establish a connection.
Nothing too unusual so I do the typical troubleshooting I make sure host, port and authentication is all correct and it is. So now I open the display for the virtual server and start looking at ip info, host info, checking ports and everything is completely fine.
Now I'm getting frustrated so I start running things like configtest in apache, using systemctl to check the services status, even restarting virtualbox in my windows 10 devpc. Still cannot connect!
I start feeling hopeless and just shut everything down, the whole operating system.
*takes breath*
Computer boots up and I start my usual thing of creating workspaces, opening editors, starting servers, etc.
I open KiTTY again and launch my virtual test server..
konicm8ker@VM-UBUNTUSERVER:~$ _
Somethings you just can't fix without a reboot. -
My last week of vacations. A brake on bussiness programing... lol
Monday:
Receive a phone call from a colegue:
Hi the equipment it not working.
Me: ( upset with the acuracy) reboot that shit!
Colegue: Its working. Thank you.
Me: 😲😨😵😱
Today (Thursday):
Collegue: The printer is not working!
Me: 😡 Im on vacation. Check the cable or try to reinstall the printer...
Colegue: Its working. Thank you.
Me: 😱😱😱😱😱😱😱😱
2 fucking hours later:
Collegue try to call
Me: Did not answer... 😡 Fuch this shit.
Colegue send text message saying that they had a problem on the video projector but its ok now..
Me: 😠😡😢😢😢
I'M ON V A C A T I O N3 -
So simple but so hard.
Having a bad cold I'v been home for a few days. Finally I could bend down without my head exploding so I could replace a harddrive in my ceph system.
I took everything off line, installed the new drive and did all the right things,
but afterwards it didnt come up.
It didn't make sense so I googled for hours while my fever were getting stronger without finding the answer.
So I gave up and reverted my changes and plugged in the old harddrive...
It still didn't work... a bit of panic. I mean... its all my files!
After a lot of sweating (no caused by fever) I realised I moved two ceph-mon processes a few weeks ago but I never rebooted the system afterwards, to fix it all I had to do:
systemctl enable ceph-mon
on two machines.
Summary: make sure things work after reboot and don't do challaging stuff while your brain is all scrambled. -
2X Tickets solved by rebooting the server after .Net Installation.
Client Default answer to: "Did you turn it off and on again?"
Always silence until you force the reboot after an server check and analysis.2