Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "hardware failure"
-
Because of hardware failure we had to move some vpns from one datacenter to another.
The team of highly untrained monkeys at my hosting provider were hired to do this. First they ran backups of all the systems. Then they started the moving process. A few hours later they were done. We got an email everything was back online.
So we restarted all our processes and no data was coming in from our Raspberry's around the country. So we start a little investigation. What did these buffons do, they changed our rsa keys.
So we kindly ask them to put the old keys back so we do not have to fix 200 changed key warnings on systems that are not remotely accesible.
Apperently something that can't be done because their back up process is automated and always makes new keys.
Holy fucking fuck, whats the point in having a backup its not an exact copy. Is this fucking normal?
Now I will be spending the next few weeks literally standing in cow shit reconnecting Raspberry's.
Thanks a fucking lot. Not!4 -
I closed the lid of my laptop and went for a lunch. When I came back the damn machine wouldn't start and I realised I didn't push anything to github today so wish me luck recovering data from the drive.
(At least lunch was good.)6 -
5 stages of failing WIFI connectivity on Linux
This morning I woke up my laptop to start my work day. I have 2 very important meetings today, so I better get all prepared.
"Wifi connection failed"
Syslog says:
- wpa_supplicant: wlp9s0: SME: Trying to authenticate with <MAC>
- kernel: wlp9s0: authenticate with <MAC>
- kernel: wl9s0: send auth to <MAC> (try 1/3)
- kernel: wl9s0: send auth to <MAC> (try 2/3)
- kernel: iwlwifi: Not associated and the session protection is over already...
- kernel: wl9s0: send auth to <MAC> (try 3/3)
- kernel: wl9s0: authentication with <MAC> timed out
#### DENIAL #####
No biggie, let's try another AP (I have 3). All 3 failed to connect. Fine, let's try my phone's hotspot! FAILED!!!!!
w00t.... okay, let's restart the router... but failing to connect to a phone hotspot is already a worrying sign.
Wifi connection failed
wtf.. disable and re-enable wifi
Wifi connection failed
#### ANGER #####
the fuuuuuuck. Maybe my router is dead. But my phone connects to it, no fuss. My personal lappy also connects there easily.
wtf... Does that mean I'm about to lose my uptime?? Come one!! It's Linux - there MUST be something I could do! I don't see processes hanging in D state so the radio must be fine - it's gotta be a software issue!
ChatGPT – type all the log entries manually, via phone (that took a while...). Nothing useful there: update firmware, restart NetworkManager, etc.
#### BARGAINING #####
Alright... How about a USB dongle? Plug it in and wifi connects immediately! Yayyy!!! But that's only b/g/n and I'd very much like to have ac. It works well as a limping backup, but not something I'd use for the meetings.
rfkill block/unblock all the radios. No change. USB dongle connects right away but the PCIe adapter keeps throwing notifications at me with failure messages. It's annoying, to say the least.
So I've already tried
- restarting the router(s)
- disabling/reenabling the radios
- multiple APs
- suspending/waking again several times
- praying
#### DEPRESSION #####
The only thing I haven't tried yet is the most cruel one - restarting the laptop. But that's unfair... It's LINUX! How could it disappoint me. I have so many tmux sessions open, so many unsaved leafpad notes, terminal histories with oh so comfy ^r and ! retriggers all ready and waiting to be executed...
#### ACCEPTANCE #####
But I can't miss the meeting. So I slowly start closing off apps, starting with the least important ones, trying to preserve as much history and recent commands as I can. I'm gonna lose my uptime, that's the inevitable obvious truth... Linux has failed me. Or maybe it's a hardware issue... I can't be sure until I restart.
I must reboot.
#### A NEW HOPE #####
Hold on.. What if... What if before restarting I try to reload the Intel wifi kernel module? Just for the giggles. I've got nothing to lose anyway...
rmmod iwlmvm
rmmod iwlwifi
modprobe iwlwifi
modprobe iwlmvm
*WiFi Connected*
YESSSS!!!!!!!!! My uptime is saved!
403 days and counting! YEAH BABY!!!
Linux is the best!rant sysadmin 5 stages of grief wifi reboot or not reboot reboot uptime network-manager wpa_supplicant linux8 -
Due to hardware failure build server was wiped out.
So all configurations are gone.
- There is no code to recreate builds
- No backups :D
The bright side, we can finally do it right ;)2 -
I finally fucking made it!
Or well, I had a thorough kick in my behind and things kinda fell into place in the end :-D
I dropped out of my non-tech education way too late and almost a decade ago. While I was busy nagging myself about shit, a friend of mine got me an interview for a tech support position and I nailed it, I've been messing with computers since '95 so it comes easy.
For a while I just went with it, started feeling better about myself, moved up from part time to semi to full time, started getting responsibilities. During my time I have had responsibility for every piece of hardware or software we had to deal with. I brushed up documentation, streamlined processes, handled big projects and then passed it on to 'juniors' - people pass through support departments fast I guess.
Anyway, I picked up rexx, PowerShell and brushed up on bash and windows shell scripting so when it felt like there wasn't much left I wanted to optimize that I could easily do with scripting I asked my boss for a programming course and free hands to use it to optimize workflows.
So after talking to programmer friends, you guys and doing some research I settled on C# for it's broad application spectrum and ease of entry.
Some years have passed since. A colleague and I built an application to act as portal for optimizations and went on to automate AD management, varius ssh/ftp jobs and backend jobs with high manual failure rate, hell, towards the end I turned in a hobby project that earned myself in 10 times in saved hours across the organization. I felt pretty good about my skills and decided I'd start looking for something with some more challenge.
A year passed with not much action, in part because I got comfy and didn't send out many applications. Then budget cuts happened half a year ago and our Branch's IT got cut bad - myself included.
I got an outplacement thing with some consultant firm as part of the goodbye package and that was just hold - got control of my CV, hit LinkedIn and got absolutely swarmed by recruiters and companies looking for developers!
So here I am today, working on an AspX webapp with C# backend, living the hell of a codebase left behind by someone with no wish to document or follow any kind of coding standards and you know what? I absolutely fucking love it!
So if you're out there and in doubt, do some competence mapping, find a nice CV template, update your LinkedIn - lots of sources for that available and go search, the truth is out there! -
We started a project in January for which I was the sole developer, to automate tedious interaction with a vendor's ticketing system. We have a storage environment with about 400,000 commodity disks attached(for this vendor-- there are other vendors too), in sites around the US and Canada. With a weekly failure rate of about 0.0005%, that means about 200 disks a week need to be replaced.
This work-- hardware investigation through storage appliance frontends, internal ticket creation, external ticket creation, watching the external ticket for updates to include in our internal ticket --was all manual, and for around 200 issues a week, it was done by one guy for two years. He was hopelessly behind. This is all automated now, and this morning, I pushed this automation from dev/test to production.
It feels great to see your work helping people around you.8 -
This is the craziest shit... MY FUCKING SERVER JUST SET ON FIRE!!!
Like seriously its hot news (can't resist the puns), it's actually really bad news and I'm just in shock (it's not everyday you find out your running the hottest stack in the country :-P)... I thought it slow as fuck this morning but the office internet was also on the fritz so I carried on with my life until EVERYTHING went down (completely down - poof gone) and within 2 minutes I had a technician from the data centre telling me that something to do with fans had failed and they caught fire, melted and have become one with the hardware. WTF? The last time I went to the data centre it was so cold I pissed sitting down for 2 days because my dick vanished.
I'm just so fucking torn right now because initially I was absolutely fucking ecstatic - 1 week ago after a year of doomsday bitching about having a single point of failure and me not being a sysadmin only to have them look at me like I'm some kind of techie flat earther I finally got approval to spend around 5x more per month and migrate all our software to containerized micro services.
I'll admit this is a bit worse than I expected but thanks to last week at least I have recent off site images of the drives - because big surprise I have to set this monolithic beast back up (No small feat - its gonna be a long night) on a fresh VPS, I also have to do it on premises or the data will only finish uploading sometime next week.
Pro Tip: If your also pleading for more resources/better production environment only to be stone walled the second you mention there's a cost attached be like me - I gave them an ultimatum, either I deploy the software on a stack that's manageable or they man the fuck up and pay a sys admin (This idea got them really amped up until they checked how much decent sys admins cost).
Now I have very flexible pockets because even if I go rambo the max server costs would only be 15-20% of a sys admins paycheck even though that is 13 x more than our current costs. -
We had a major core router hardware failure in our LA datacenter today and every one of our services has been down since 6am, including all production servers. We have about 15,000 sites down across our entire platform. Our manager came over and told us to just go home because we need to replace the hardware and the process is expected to take all day, and we can't do any work until then because all the production servers are down. So you could say that it's been a pretty easy Friday so far! I'm headed home to play Spider-Man2
-
Customer complains about an issue after a software update. The head of department himself tested the update and got an error message.
Me looking at the logs. Ok, that's an issue, but based on hardware failure, customer should fix his hardware, no relation to the new software.
But surprisingly close to the software update, which piques my curiosity.
Me looking at older logs ... same issue. EVERY FUCKING DAY. For months. The corresponding error message only appears if a user is logged on, so quite a few people have seen it. Obviously nobody cared. Maybe we just ditch error messages, it'll save lots of work. -
So today my company was removing most workspaces with USB 2 connections, DP cables and magsafe 2 power cables. This means that my MBP mid 2014 can't connect to the keyboard and monitors anymore. It already struggled with 4K, so my 2K options were already limited, but now the last few spots are mostly gone. In short: I'm being forced to upgrade.
But tell you what: I don't want to. It feels like a waste to recycle my laptop (even if it's company paid and owned) while it's perfectly acceptably fine. And mind that I will get the latest and greatest i9 for free. Yes, that overheating, throttling failure of hardware design piece of shit. 2 coworkers already own the beast and confirm that it gets really hot really quickly. One of them even has daily crashes (the laptop just turns off) and random reboots. A total waste of money. And my future time. As if it's not enough work to migrate to a new laptop (even with Time Machine).
So, fellow ranters, what do I do? I hope I can leverage the second best MBP (CPU-wise) from this situation, unless there already is a bunch of i9s in the office ready to be used. I really, really don't want one. And I think my current computer is great for what it is, even if it's old. It's a really pro machine for my needs (I'm very efficient, except for Android Studio).
I even consider asking for a Linux machine, but then a whole new world opens to me that may be a step too big (since I barely have hands-down experience).
Enlighten me with your ideas, muggles!5 -
How do you test unreachable code or part that is considered an edge case?
For example I catch exceptions in case IO failed and data was not written on database, but that only happens if hardware failure, or no disk space left, how do I mimic that?
I also have unreachable code for example, in one layer I fetch data (lets call it function x) and always return success result unless item not found I throw KeyNotFound exception. But in the calling function I handle the case of Status == Failed
Just in-case in the future I change function x and start returning failed status, so my logic already written but never reachable14 -
grrrr
last week my laptop died out of nowhere. it stopped recognizing the one drive in it. I lost a bunch of files, code. evidently ssds fail out of nowhere unlike hdds which slow down and error all the time before ultimate failure
my warranty for this 4k$ laptop expires in 12 months and this was month 13. nice. I don't like warranties anyway, and the site said they would replace things with "comparable hardware, sometimes refurbished" wtf no thanks
so I found some guides of people upgrading the drive in this laptop. seemed easy enough, unlike older laptops from back when I was in school where you had to take out 12 things first to get to anything
unfortunately I needed a specific screwdriver. I walked several miles to the nearby hardware store thinking they would have said screwdriver. the old guy in the basement said there was a kit where it started from t4 (I needed t5), but he had just sold out his last one. I checked their online store with a friend for a while on my way back home and we kept finding torx screws but the wrong sizes. fuck.
he said screwdrivers this small are only used for electronics, asked if there's any other hardware stores and there aren't near me
however it occurred to me this strip mall has a lot of suspicious computer stores on it. so I walked back up the street looking for one.
found one with a suspicious poster, saying it was an internet cafe but the last point on their poster said they do repairs. walked in. nobody is in there, suspiciously 2 desks with old computers all empty, then you go forward in this dark cave, with plastic wrapped implements on the walls, you finally find a glass shield and behind it was a meek Asian man that took me a moment to notice
I asked him if he had t5
he handed me a plastic baggy full of tiny screwdrivers, for me to take one
I asked if they're t5
the shape looked right, but I can't tell the size
I took one out and tried to find size marking, but nothing
he didn't seem to know what I was asking when I asked about its size
he said if it's wrong I can come back and trade what I took for another. lol
I asked him if I can buy it, since that wasn't evident to me due to how sus this random bag of screws is being thwarted on me lmao
he said 5$ cash
I gave him a fiver
this sus shop literally avoiding taxes lmao
walked back home, ate food cuz starving, tried the screw and FUCK, it's too big. put laptop in a bag and hauled ass fast, checked on maps the store I got this from closes in a few minutes so I really wanted to make it there because what if the receptionist changes and they don't know I took this screw. I got no receipt
got there right before closing, put my laptop down, said it was too big. he used a few screws until he found one that fit, said I could try it and I did (so scam aware!). bingo bango. now I got a screwdriver that fits the laptop.
walked home, sat down and took apart the laptop. been a few years since I did so. the hardware inside looks entirely unrecognizable to me. started cycling through YouTube videos of laptops of the same name as mine, but their insides don't look like mine. is this ram? is this the NVMe? what the fuck is anything?
finally found a video guide where the guy was quite informative. not the same laptop but he's informative enough I figure it out. ram and drives are so different and weird now. took parts out, put them back in, rebuilt laptop, tried to boot, same problem. jiggling parts like this works with desktops often, guess not with a failed NVMe
so I'm screwed. get on Newegg and bought a new NVMe. should arrive in 3 days via Purolator
yesterday was day 3. it was at a sort facility near me, then out on delivery, but nobody ever came. then it went back to sorting. now it's out on delivery again. I'm sitting here thinking that's a little weird, wasn't Purolator the delivery company that had me go 2 hours outside of town to pick up a 15lb desktop case once?
... and then I looked up Reddit comments... then reviews on the purolator facility it's at... I am screwed. last time iirc they were out for delivery for 3 days, never tried delivery, then on the last day at the end of day they stated they attempted delivery but no go. that was bullshit. then it ended up at that facility. which takes 2 hours to fucking reach.
the reviews are so bad... the facility has 1.2 star reviews with thousands of them. they won't leave even a stub, then seem to not know where your package is at the facility, or they deny you have the right to pick it up despite ample IDs, or someone ELSE picks it up and it's not there. they also ship your package back after 5 days, so if they don't leave a note and you miss it tough luck...
fucking hell
also rumours that they just hire "contractors" in normal cars to drop off packages? wat? lol
AND EVERY REVIEW HAS A BOT COMMENT. THEIR SUPPORT IS JUST A CHATBOT
I thought this was just a small hiccup
I think I might not have a drive for weeks now
fucking hell
now I'm sitting on my porch2 -
Thought it had been a while since my last Arch redo. Now the fsck output is hundreds of pages long and I'm grateful for my backups, and all suspiciously occuring just seconds after a full upgrade. I guess hardware failure is a possibility, but the smart status on the drive says it's perfect and dozens of retrospective self tests didn't reveal any issues. RAM passed multiple tests as well. Oh well, not like I haven't reinstalled before.
-
I hate the company (agency) I moved to...I've negotiated good pay and the project for cutting edge medical product which will change the world (cancer diagnose and it actually works).
Now the dark side I've got shit tier laptop which I don't want, overtime is payed 30% less, all the people in the agency from development team don't know shit and are mostly I would call them juniors (of course who would with enough seniority work with shit hardware and almost not payed overtime), only tap water and since this is the old part of town you instantly get sick, they treat people like shit.
The product dark side. We are actually working on crm for doctors to input patient data, we cannot have any real data because we are the agency people, product is being led by the guy who has 0 production experience (they choose the database basically with coin toss and emulated the mongodb in postgress with jsnob, they don't know how to build their own auth system hence my previous rant about b2c, they are using cognito and now moving to auth0 which probably won't fit their need because a lot of stuff needs to be custom), they are choosing every hipe tech out there without any prior experience. It's chaos...
I'm trying to guide them but i think this will be a huge expensive failure and that i need to leave asap.
There I feel better now, moral of the story, choose startups wisely.1 -
UWP suck, I don't wanna hurt yall feeling but it's time to face the truths:
+ SandBox
+ Less Job Offer
+ Development more Complicated than Web App
+ Microsoft not create perfect hardware to make sure our app get to more consumers (the Pro X is failure)
+ Poor Optimized
Poor Optimized ?
the Windows 10 optimization is joke, all my surface laptop, pro, book I have tested. They claim that consume less Ram, but when using it along side electron and Win32 app. It feel so much choppy and lag. I mean WTF ?
UWP was made for optimize low specs SoC such as ARM base, now my laptop running on a core I5 + GPU still lag ??
I'm sorry but this is just sad. Im moving back to win32. WinRT sooner or later will end supported
And Microsoft will improve the Win32 Api6 -
Google Exoplayer had a bug on some devices when initializing AudioTrack.
Their fix: Just retry on init failure.
So if some device decides to fail more often than Google predicted: Boom!
Exoplayer is used by rarely known apps like Youtube.
And the best of it: The one to blame is not Google or Android. The ones to blame are most likely the Hardware vendors, who think that a custom android for every whatever is a good idea. -
https://goo.gl/photos/...
Client's PC. Sure I can fix, but we'll have to replace nearly everything. Better off buying a new one.1 -
Anybody know anything about this laptop:
https://newegg.com/p/...
I want to upgrade my current rig and want it to be an epic update. However, I found one review and it had a hardware failure.
Is there a better place to buy laptops that have power (GPU, CPU), but don't die from overheating?8