Join devRant
Do all the things like
++ or -- rants, post your own rants, comment on others' rants and build your customized dev avatar
Sign Up
Pipeless API
From the creators of devRant, Pipeless lets you power real-time personalized recommendations and activity feeds using a simple API
Learn More
Search - "failed successfully"
-
--- GitHub 24-hour outage post mortem ---
As many of you will remember; Github fell over earlier this month and cracked its head on the counter top on the way down. For more or less a full 24 hours the repo-wrangling behemoth had inconsistent data being presented to users, slow response times and failing requests during common user actions such as reporting issues and questioning your career choice in code reviews.
It's been revealed in a post-mortem of the incident (link at the end of the article) that DB replication was the root cause of the chaos after a failing 100G network link was being replaced during routine maintenance. I don't pretend to be a rockstar-ninja-wizard DBA but after speaking with colleagues who went a shade whiter when the term "replication" was used - It's hard to predict where a design decision will bite back and leave you untanging the web of lies and misinformation reported by the databases for weeks if not months after everything's gone a tad sideways.
When the link was yanked out of the east coast DC undergoing maintenance - Github's "Orchestrator" software did exactly what it was meant to do; It hit the "ohshi" button and failed over to another DC that wasn't reporting any issues. The hitch in the master plan was that when connectivity came back up at the east coast DC, Orchestrator was unable to (un)fail-over back to the east coast DC due to each cluster containing data the other didn't have.
At this point it's reasonable to assume that pants were turning funny colours - Monitoring systems across the board started squealing, firing off messages to engineers demanding they rouse from the land of nod and snap back to reality, that was a bit more "on-fire" than usual. A quick call to Orchestrator's API returned a result set that only contained database servers from the west coast - none of the east coast servers had responded.
Come 11pm UTC (about 10 minutes after the initial pant re-colouring) engineers realised they were well and truly backed into a corner, the site was flipped into "Yellow" status and internal mechanisms for deployments were locked out. 5 minutes later an Incident Co-ordinator was dragged from their lair by the status change and almost immediately flipped the site into "Red" status, a move i can only hope was accompanied by all the lights going red and klaxons sounding.
Even more engineers were roused from their slumber to help with the recovery effort, By this point hair was turning grey in real time - The fail-over DB cluster had been processing user data for nearly 40 minutes, every second that passed made the inevitable untangling process exponentially more difficult. Not long after this Github made the call to pause webhooks and Github Pages builds in an attempt to prevent further data loss, causing disruption to those of us using Github as a way of kicking off our deployment processes (myself included, I had to SSH in and run a git pull myself like some kind of savage).
Glossing over several more "And then things were still broken" sections of the post mortem; Clever engineers with their heads screwed on the right way successfully executed what i can only imagine was a large, complex and risky plan to untangle the mess and restore functionality. Github was picked up off the kitchen floor and promptly placed in a comfy chair with a sweet tea to recover. The enormous backlog of webhooks and Pages builds was caught up with and everything was more or less back to normal.
It goes to show that even the best laid plan rarely survives first contact with the enemy, In this case a failing 100G network link somewhere inside an east coast data center.
Link to the post mortem: https://blog.github.com/2018-10-30-...6 -
Successfully moved my server across the big pond - or so I thought.
Turns out that Vultr has newly acquired a IP range that was belonging to a ISP in Greece. So far so good. But, it existed on 6-7 blacklists, Vultr had failed to delegate the network to their rDNS, and my domain suffered from DNSSEC ( fuck DNSSEC )
After two days of complaining to Vultr because they did not believe me they finally fixed their shit. My domain did start working again from some reason that I dont know and the blacklists is being removed one by one.
The Circus ended with a beer on the balcony, I like beer 🍻🍻🍻9 -
8 years later, i have successfully failed my first company. Good experience but for now burnt out.
To fellow devs who are considering entrepreneurship , don't be afraid, try it until you are young. This month I will lay off my last employees, luckily with bonus and 3 months worth of salary, i wish them best of luck. Difficult to explain but I am as happy in the end as i was in the start, just older.12 -
Why me. Why is it always me who has issues with Windows. (The OS)
I HAVE to use windows for a specific thing right now. Fair enough, I have an old system lying around somewhere with not the best specs ever but it'll do. Windows 7, clean install.
Firstly, let's boot up! Booting goes fine, login goes well... "Installing device drivers" (keyboard and mouse combi). I connected this set a gazillion times before so no clue why windows would need to download the drivers YER AGAIN. But, fine, it works.
Let's connect a USB webcam and to to the hardware testing website to see if my setup is right!
(I mostly don't blame this part on windows)
The webcam drivers install successfully, good. Although the page says it isn't working, it displays the live cam footage well so whatever.
Installed Chrome (not chromium too badly) to see if it shows fine there but chrome doesn't detect ANY cam/mic combination at all, not even the integrated one(s).
Annoying so let's reboot and see if it works normally with all checks okay on Firefox.
Rebooted.... aaaaand the USB webcam driver installation fails. I'm weirded out since the drivers were installed BEFORE the reboot already. Firefox now does not display any can/mic.... until it does after a few reloads. Windows is still saying that the driver installation failed.
The testing webpage, however, still says its not working while I'm literally seeing my ugly smug on screen. I contact support which does a remote check and says all is good but there was probably "a glitch with Windows" while the checks are still mostly red, I take a copy of the chat log just to be sure.
Now, I kinda want to shut this system down until the time I'll need it but I'm rather afraid that Windows is going to throw driver conundrums yet again and I simply *CANNOT* have this right now. So, I'm leaving this system on until I need it, and I'll pray windows plays along well.21 -
Shamless rant towards the shamless Cursey dude. 😫
So whole day I have been trying to pass a variable from laravel blade view to vue2 component file. All in seperate files. I know that I have successfully passed 1 or 0 in same flow before. So I was following the same steps to pass my string variable. It kept giving me undefined. No google helped and I had been doing all kinds of stupid useless trials. All failed.
Because it is supposed to fail. 😐
I only learned it at the end of the day.5 -
👍 https://github.com/auchenberg/...
"If you want your software to be adopted by Americans, good tests scores from the CI server are very important. Volkswagen uses a defeat device to detect when it's being tested in a CI server and will automatically reduce errors to an acceptable level for the tests to pass. This will allow you to spend less time worrying about testing and more time enjoying the good life as a trustful software developer."rant malice driven development devops task failed successfully volkswagen emissions continuous integration satire gone wrong troll10 -
Today I ended a coding session by fixing a problem I’ve been having but not the problem I was attempting to patch and I ended up screaming “SHIT NO I FAILED SUCCESSFULLY”
a friend of mine that over heard me was like “dude those aren’t words that work together”
I just replied with “you would think so but it’s more common than you think” -
Amazing! My joke functionality worked and the server delivered it's content successfully to the client. The joke: malloc is wrapped to give that message and retry IMMEDIATELY. Expectation is that the computer that was already in panic mode gets even fucked up more because it's out of memory. My malloc is literally while((ptr=malloc()} == NULL) { show_warning()
;malloc();}. Imagine if it didn't give that warning, I would've never known that a malloc failed. Who checks their freaking malloc result? You should, but i do not see much people doing it.
The previous crash on screen is what happens if you're doing a get instead of a post. I just declare my server app indestructible btw. Ffs!5 -
TL;DR my first vps got hacked, the attacker flooded my server log when I successfully discovered and removed him so I couldn't use my server anymore because the log was taking up all the space on the server.
The first Linux VPN I ever had (when I was a noob and had just started with vServers and Linux in general, obviously) got hacked within 2 moths since I got it.
As I didn't knew much about securing a Linux server, I made all these "rookie" mistakes: having ssh on port 22, allowing root access via ssh, no key auth...
So, the server got hacked without me even noticing. Some time later, I received a mail from my hoster who said "hello, someone (probably you) is running portscans from your server" of which I had no idea... So I looked in the logs, and BAM, "successful root login" from an IP address which wasn't me.
After I found out the server got hacked, I reinstalled the whole server, changed the port and activated key auth and installed fail2ban.
Some days later, when I finally configured everything the way I wanted, I observed I couldn't do anything with that server anymore. Found out there was absolutely no space on the server. Made a scan to find files to delete and found a logfile. The ssh logfile. I took up a freaking 95 GB of space (of a total of 100gb on the server). Turned out the guy who broke into my server got upset I discovered him and bruteforced the shit out of my server flooding the logs with failed login attempts...
I guess I learnt how to properly secure a server from this attack 💪3 -
Anyone in here successfully using a pure FP language/ecosystem on their day to day?
I know of one of you that uses Scala, and myself I have an (admittedly) shitty application at work running in Clojure. These last two languages I mentioned are not pure FP.
I am talking about the likes of PureScript, Haskell, etc. Those mfkas.
If so,what is your experience working in said paradigm? I tried to keep my Clojure program as pure as possible, I failed, but enjoyed it.
And I know that FP is not a silver bullet, but in some scenarios when properly applied it can work beautifully. I also have React based applications with pure components, but Javascript itself is neither a functional(pure or otherwise) programming language, it merely supports functional paradigms.
Just wondering, no flamewars or anything like that, I just want to know your pros and cons.6 -
The ridiculous and shameful story of how simply "installing Windows" saved my hard drive from the garbage.
(Also update on https://devrant.com/rants/3105365/)
It started with my root partition turning read-only all of a sudden. Some quick search suggested that I should check the sanity of my hard drive, by running a SMART test, which failed of course. I backed up my data using ddrescue and ran a badblocks over the whole thing, which found around 800 unreadable blocks in a row. I was ready to bid farewell to my drive, but as a last resort, instead of the trash, I brought it to this place where they claimed they can repair the damaged hard drives by "surgery".
To my surprise, they returned my drive the next week, saying it is all well now, and charged me 1/8 the price of a new drive, with a refund guarantee if there was a problem in two days. There was a problem right there: I ran another SMART test which failed again, and also the faulty blocks were still unreadable! So I stormed the place and called for my refund, showing the failed SMART report. The only answer I would get from the staff was "Have you tried installing Windows?".
I usually try to be patient in such situations; I really don't like to declare publicly that "not everyone uses that stinky piece of rotten software you call an OS", but their suggestion seemed totally irrelevant! I got all types of IO errors all over the damn thing and they told me to install Windows. Why? Because this was the only test they would rely on. At last I managed to meet the "technician" there and showed him the IO errors: tried to read the bad sectors with dd and failed. He first mumbled somethings like "Have you checked the connector?" or "Are these the same blocks?", but after he ran out of bullshit, he said "Why don't you just install Windows first and see if that helps?" and I was ready to explode in his face!
"You test drives by installing Windows, just because it will make a nasty NTFS partition and probably does an fsck? If you shut your mouth for a sec and open your eyes you'll see this is a shit load of IO errors we got here: You can't install Windows, you can't even make an NTFS here, because it will try to zero-the-fuck-out the damn partition and it will face the same fucking IO error that I'm showing you right now in almost one single fucking system call!"
"I don't know this kind of test you are using. We have our own tests and they've passed successfully. So all I can do is to give you a Windows CD if you want."
"I don't need a Windows CD. I will just try to make an NTFS partition on the error spot and I will fail."
"Ok. Then call me when your done."
I was angry, not only because I felt they're just trying to avoid a refund, but also because I knew I've lost my drive. But just with hope that I could get my money back, I made a small partition over the error spot and ran `mkfs.ntfs` on it. I was ready to show the failure to the guy, but I looked more precisely and saw that "the filesystem was created successfully!" I was sure something is nor write. I then successfully mounted the new partition, write over it and read it again. I even dd'ed the blocks again, and this time there was no IO error. All of a sudden everything was fine.
I didn't know what happened. Maybe it just needed a write, while I'd just tried to read from those blocks. But anyway, I didn't called the technician guy again. I just thanked one of the staff there and said that my problem was solved. I then ran a successful SMART test and then restored my backup. Ridiculous like that.
I'm still not sure if my drive will continue to live with no more problems. I also have no explanation for what happened. (I appreciate any help on this https://superuser.com/questions/...) But I really like to see the look on the poor guy's face when he finds out that trying to install Windows just saved my ass!11 -
Build attempt #582918 failed
Build attempt #582919 failed
Build attempt #582920 successful
Wait what... -
LINUX. I'm sure everyone heard this term. But I still don't know why do people want to give up their life and try this piece of crap. I know many of you might be offended, but, to hell with that. When I heard about the Linux, and everyone was praising it about it, I thought that I should give it a try. So, I installed Ubuntu (obviously, because I was a beginner) and the installation failed. I thought that I've made some mistake. Tried again, FAILED. So, I waited for next version. After downloading and trying to installing it, Voila. I installed it. Then comes the part when I actually started using it, for as simple as watching a video. I didn't play. It gave an error of some codec was missing. I installed the codec and then I payed the video successfully. Then, I want to install the Oracle Java Development Kit, and literally it was a pain to install. It took me half an hour to install and configure it. Then after using it for a couple of days, I found that my WiFi was acting weird. I booted up my Windows just to check it and it worked perfectly on windows. Then why the heck was it not working on Ubuntu. Don't know. On searching about it, I found that my WiFi adapter's driver was having some issues. Then after using it for more days, something very weird happens, the Ubuntu booted but with terminal only. No GUI, No Unity, nothing. I against searched for it, found some commands, ran it and it started normally. So, the point that I'm trying to make is that even for simple and basic tasks, I always have to search about it every time to get it working. I mean if their are so many steps to be taken for every simple task then why people keep on recommending it. With the Linux installed, I was very much distracted from my primary work. Instead of doing my work I was searching for installing JDK. I mean wtf. In Mac or Windows its as simple as downloading the file, installing it and you're done. But in Linux I don't know. And the whole Linux community thinks that Windows sucks. I mean on windows I was more relaxed and more focused on my work. Whenever we search for the Linux, many people say that Android is a Linux. I get it, but in Android, many developers have worked very hard to make it as what it is nowadays. But what about Ubuntu, Fedora or any other distribution. I haven't seen any distribution which makes me feel that I wanna use it again. None of them. So, Linux is not a great OS according to my experience11
-
FFS! having nodejs server on heroku, added certificate successfully for https, yet when going to www.example.com it uses http on prod and maintanence page while example.com goes to https.
All my attempts to catch http connection failed.
This is the definition of me wanting to bang my keyboard and problem autosolves itself while I am doing it!
Where is the my one click and everything is ready. I want to code back end and front end not spend 2 days trying to figure out https bullshit for unknown reason. -
This one time last year a colleague found out that some data went missing and suggested to recover the data from a backup. When trying to create a new database instance in the Google Cloud Platform (if everything works it's amazing!) it failed.
Not knowing why this happened, I tried to revert that backup to the production database, after creating a backup using the GCP. Needless to say that failed as well, resulting in a corrupted database instance where I couldn't access the created backups anymore.
This all went at around 10pm and the only users of our product are currently in the same timezone and use it from around 7.30AM until 6PM so no one besides our team knew the server was down.
After a long night chatting Google's support team the database was successfully recovered and the only harm done was sleep depravation for me and a colleague.
Apparently there was a bug in the GCP. It was resolved in two hours and the last time a breaking bug was in that piece was more than seventy days earlier.
I did at least learn to create local backups as well, instead of relying on the tools of the same product...
Best: the moment I saw the corrupted database spin up again and not losing my job because of it. -
Python User-
C:\WINDOWS\system32>pip install scikit-image
CMD/Bash-
Collecting scikit-image
Downloading scikit-image(12.6MB)
██████████████████████████████
Collecting numpy
Downloading numpy(1.3MB)
██████████████████████████████
Collecting matplotlib
Downloading matplotlib(1.3KB)
██████████████████████████████
Collecting decorator
Downloading decorator(6.8MB)
██████████████████████████████
Collecting imageio
Downloading imageio(3.6MB)
██████████████████████████████
Collecting cycler
Downloading cycler(2.9MB)
██████████████████████████████
Installing collected packages cycler, imageio, decorator, matplotlib, numpy, scikit-image
Successfully installed cycler, imageio, decorator, matplotlib, numpy
Failed to load DLL of scikit-image.
C:\WINDOWS\system32>2 -
WHYY , are you fucking fucking complaining, mother fuckdr yyuo fucking won
You completed our mission objective successfully
You fucking did it mother fhcker and what ur asking from me after all of thus shit weve been through for the past 7 months is beyond our primary mission objective ,fucker
Obviously as you can fuckin see from the 7 months of suffering we can not repeat the same objective twice, just like u cant be born or die twice, fucker
Shit happens once and thats goddamn fuckin it motherfucker move on to tje ffckin next mission objective that i command u to go towards
NO FAILED MISSIONS. I ONLY BROADCAST SUCCESS. BUT SHIT HAPPENS RARE.
So forget about her u motherfucker, you told me what you wanted to achieve, i planned out the whole scenario, i organized the mission objective for you and you have took the fuckin risk and and action and guess what u fuckin succeeded. My mission objective has never failed you. What you are trying for these fuckin past 7 months is not my mission objective and it is out of scope, unplanned fuckin shit and that is why u fell back into fuckin depression i told u to fuckin stay away but u aint to me listen fucker
Stop.
Breathe.
Worry no more about the shit that is irrelevant and out of your fuckin control.
U got friends at college. Hang out with them ull feel better. Whwnever u think of that fuckin whore goo mothrrfuckr and meet ur goddamn fckin irl friends. Text them. Shit man.....
Good luck2 -
// Rant 1
---
Im literally laughing and crying rn
I tried to deploy a backend on aws Fargate for the first time. Never used Fargate until now
After several days of brainwreck of trial and error
After Fucking around to find out
After Multiple failures to deploy the backend app on AWS Fargate
After Multiple times of deleting the whole infrastructure and redoing everything again
After trying to create the infrastructure through terraform, where 60% of it has worked but the remaining parts have failed
After then scraping off terraform and doing everything manually via AWS ui dashboard because im that much desperate now and just want to see my fucking backend work on aws and i dont care how it will be done anymore
I have finally deployed the backend, successfully
I am yet unsure of what the fuck is going on. I followed an article. Basically i deployed the backend using:
- RDS
- ECS
- ECR
- VPC
- ALB
You may wonder am i fucking retarded to fail this hard for just deploying a backend to aws?
No. Its much deeper than you think. I deployed it on a real world production ready app way.
- VPC with 2 public and 2 private subnets. Private subnets used only for RDS. Public for ALB.
- Everything is very well done and secure. 3 security groups: 1 for ALB (port 80), 1 for Fargate (port 8080, the one the backend is running on), 1 for RDS postgres (port 5432). Each one stacked on top and chained
- custom domain name + SSL certificate so i can have a clean version of the fully working backend such as https://api.shitstain.com
- custom ECS cluster
- custom target groups
- task definitions
Etc.
Right now im unsure how all of this is glued together. I have no idea why this works and why my backend is secure and reachable. Well i do know to some extent but not everything.
To know everything, I'll now ask some dumbass questions:
1. What is ECS used for?
2. What is a task definition and why do i need it?
3. What does Fargate do exactly? As far as i understood its a on-demand use of a backend. Almost like serverless backend? Like i get billed only when the backend is used by someone?
4. What is a target group and why do i need it?
5. Ive read somewhere theres a difference between using Fargate and... ECS (or is it something else)? Whats the difference?
Everything else i understand well enough.
In the meantime I'll now start analyzing researching and understanding deeply what happened here and why this works. I'll also turn all of this in terraform. I'll also build a custom gitlab CI/CD to automate all of this shit and deploy to fargate prod app
// Rant 2
---
Im pissing and shitting a lot today. I piss so much and i only drink coffee. But the bigger problem is i can barely manage to hold my piss. It feels like i need to piss asap or im gonna piss myself. I used to be able to easily hold it for hours now i can barely do it for seconds. While i was sleeping with my gf @retoor i woke up by pissing on myself on her bed right next to her! the heavy warmness of my piss woke me up. It was so embarrassing. But she was hardcore sleeping and didnt notice. I immediately got out of bed to take a shower like a walking dead. I thought i was dreaming. I was half conscious and could barely see only to find out it wasnt a dream and i really did piss on myself in her bed! What the fuck! Whats next, to uncontrollably shit on her bed while sleeping?! Hopefully i didnt get some infection. I feel healthy. But maybe all of this is one giant dream im having and all of u are not real9 -
Following some new nextjs tutorial to learn how to efficiently build a web chat app, the guy built it very solid, but is it efficient?
Im having mixed feelings about this approach. The way he did it is, for example when you click on a user (imagine it as a list of users from your contacts), it actually calls a route, which stores that in database, and once its done Then the route triggers lets say socket.io event to notify the frontend to update the UI.
Not only that but each new message that gets sent it actually calls a route which stores that message in database and once that's successful Then it emits a socket.io event to the frontend to fetch that message.
As you can imagine constantly calling routes like this Does induce small delays. Creating conversations, navigating, opening someones profile and especially sending messages, is NOT instantaneous. When you do it theres a small delay, giving the impression as if the app is SO large that it lags
But it doesnt lag, it just needs a few ms to store that in db so it can return the socket.io bidirectional message event. Which does make sense because what if the internet broke and the user immediately gets sent a message, but the message fails to get stored in database? Or db storage gets fucked or something else fails but socket.io works while db doesnt? The data then may be inconsistent. This approach fulfulls the single source of truth principle
So thats why im having mixed feelings about this approach particularly because of small delays. It is not instantaneous like whatsapp discord telegram signal viber etc the input UI freezes until the message is successfully sent
---
Of course this can be a UI/UX decision and can be handled differently even if the backend works like that.
My concern is is this approach valid?
My question is... I had an idea what if i emit socket.io event to send the message while in the background also call the route to store that message in db? This way not only would it work asynchronously but the message gets sent instantaneously, and if the backend fucks up to store it in db then the UI gets updated with message failed to get delivered, switching the socket.io into polling state. Is this a good (proper, efficient, better) way to do it or not?8 -
ENOSPC = random things go wrong.
There are many synonyms for ENOSPC, like "disk full", "space storage full", "space storage exhausted", "no more space left on device", and those other repulsive errors. For the sake of simplicity, I am going to refer to it as ENOSPC.
If you are in this condition on the operating system partition, get out of it quickly or random things will go wrong. Text editors which write directly to a text file rather than creating a temporary file and then replacing the text file could end up blanking the text file, softwares' configuration files might fail saving which causes a reset, and web browsers might spontaneously reset cookies and lose history.
For example, Firefox has created a gap in the web browsing history, as shown here. The history that is now memory-holed initially appeared to have been recorded successfully. Apparently, a failed write to the places.sqlite database when closing the browser created this gap.4