Search - "production issue"
-
Client: There is a high severity production issue.. you need to fix urgently..
Developer: I am on the way.. Will fix it once I reach home.
Client: I don't care where you are. Fix it right now😡😡
Developer
-
Big event. Massive traffic in production, so we were monitoring all night.
I was in a room with 2 devs of my team, a marketing girl, my boss and a designer... chilling.
Suddenly the production is down.
Boss: production is down, anyone can check?
Me: already on it
Dev1: it looks ok for me
Dev2: me too
Me: wait what? Impossible everything is down
Dev1: oh I refreshed the page it's not working
Me: don't stay on the page refreshing it like you are fucking monkeys. Give me useful intel or be quiet.
Market girl: is it working?
...
Guys is it working?
...
Hello?
Me: Not yet we are looking. Don't distract me.
Boss: client called us. They want it online now.
Dev1&2: he's looking
... 1 min later...
Boss: is it working?
Boss: is it working?
Boss: is it working?
Me: SHUT THE FUCK UP FOR ONE FUCKING SECOND. ALL OF YOU, OUT NOW. YOU ARE FUCKING MONKEYS WHO CAN'T DO SHIT. IF YOU CAN'T HELP JUST SHUT YOUR DAMN SHITHOLE. DEVS, LOOK WITH ME. MARKET GIRL, PREPARE A FUCKING POST-MORTEM MAIL. BOSS, GET THE CLIENT ON THE PHONE AND STALL. DO. YOUR. FUCKING. JOBS.
That's how I ended up screaming at everyone... the rest of the night went by in complete silence and I fixed the issue 2 min after they got quiet or busy.
-
I worked with a good dev at one of my previous jobs, but one of his faults was that he was a bit scattered and would sometimes forget things.
The story goes that one day we had this massive bug on our web app and we had a large portion of our dev team trying to figure it out. We thought we had narrowed down the issue to a very specific part of the code, but something weird happened. No matter how often we looked at the piece of code where we all knew the problem had to be, no one could see any problem with it. And there wasn't anything close to explaining how we could be seeing the issue we were seeing in production.
We spent hours going through this. It was driving everyone crazy. All of a sudden, my co-worker (the one referenced above) gasps “oh shit.” And we’re all like, what’s up? He proceeds to tell us that he thinks he might have been testing a line of code on one of our prod servers and left it in there by accident and never committed it into the actual codebase. Just to explain this - we had a great deploy process at this company, but every so often a dev would need to test something quickly on a prod machine, so we’d allow it as long as they did it and removed it quickly. It was meant for a select few tasks that required a prod server, and was only ever supposed to be a single line to test something. Bad practice, but it was fine because everyone had been extremely careful with it.
Until this guy came along. After he said he thought he might have left a line change in the code on a prod server, we had to manually go in to 12 web servers and check. Eventually, we found the one that had the change and finally, the issue at hand made sense. We never thought for a second that the committed code in the git repo that we were looking at would be inaccurate.
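A check like that manual one can be scripted. The following is a minimal Python sketch, not what this team actually ran: it assumes a clean checkout of the deployed commit and each server's copy of the code are both readable as directories, and `file_digests` and `find_drift` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def file_digests(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its SHA-256 hex digest."""
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def find_drift(expected: dict[str, str], deployed: dict[str, str]) -> list[str]:
    """Return relative paths that were changed, added, or removed on the server."""
    all_paths = expected.keys() | deployed.keys()
    return sorted(p for p in all_paths if expected.get(p) != deployed.get(p))
```

Running `find_drift(file_digests(checkout), file_digests(server_copy))` for each server would list any file that no longer matches the repo, which is the one thing nobody thought to question here.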
Needless to say, he was never allowed to touch code on a prod server ever again.
-
A former colleague made an online shopping app. Boss wanted to promote him to Senior Developer when he was still working with us.
14 days ago another colleague checked the code and told the boss that it was ready for production. No one asked me because everyone in the company thinks I am the stupidest developer of them all.
So what happened?
Well, the total value of the cart was being sent over to the payment gateway using a hidden field. Well, you know the rest of the story.
The client has sued our company for this issue and boss came running to me and asked me to check if it was our fault or something else.
I checked and found the hidden field where the total value of the cart was being stored and sent over to the payment gateway. The following is the conversation between me and the colleague who checked the code:
Me: So you checked the code and everything was okay?
Him: Yes, all good.
Me: Did you see this hidden field where the total value of cart is being passed to the payment gateway?
Him: Yes
Me: Why didn't you fix this?
Him: What's there to fix?
Me: Well, someone can tamper with the value and let it pass to the payment gateway.
Him: No, they can't, we are using HTTPS
Me: I am done with you
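HTTPS only protects the request in transit; the customer's own browser can still edit a hidden field before it is sent. The usual remedy is to recompute the total on the server from trusted prices. Below is a minimal Python sketch of that idea, assuming a hypothetical `CATALOG` price table and hypothetical function names; it is not the shop's actual code.

```python
from decimal import Decimal

# Hypothetical server-side price table; a real shop would look prices up in its database.
CATALOG = {"sku-1": Decimal("19.99"), "sku-2": Decimal("5.00")}


def cart_total(cart: dict[str, int]) -> Decimal:
    """Recompute the total from trusted prices; the client supplies only SKUs and quantities."""
    total = Decimal("0")
    for sku, quantity in cart.items():
        if quantity <= 0:
            raise ValueError(f"invalid quantity for {sku}")
        total += CATALOG[sku] * quantity  # raises KeyError for unknown SKUs
    return total


def amount_to_charge(cart: dict[str, int], client_total: str) -> Decimal:
    """Charge the server-side total; the submitted total is used only to detect tampering."""
    trusted = cart_total(cart)
    if Decimal(client_total) != trusted:
        raise ValueError("submitted total does not match server-side total")
    return trusted
```

The amount handed to the payment gateway is always `trusted`, so an edited hidden field can at worst get the order rejected.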
He has a Master's in software engineering and a few security certificates.
-
A wild Darwin Award nominee appears.
Background: Admins report that a legacy nightly update process isn't working. Ticket actually states problem is obviously in "the codes."
Scene: Meeting with about 20 people to triage the issue (blamestorming)
"Senior" Admin: "update process not working, the file is not present"
Moi: "which file?"
SAdmin: "file that is in ticket, EPN-1003"
Moi: "..." *grumbles, plans murder, opens ticket*
...
Moi: "The config dotfile is missing?"
SAdmin: "Yes, file no there. Can you fix?"
Moi: "Engineers don't have access to the production system. Please share your screen"
SAdmin: "ok"
*time passes, screen appears*
Moi: "ls the configuration dir"
SAdmin: *fails in bash* > ls
*computer prints*
> ls
_.legacyjobrc
Moi: *sees issues, blood pressure rises* "Please run list all long"
SAdmin: *fails in bash, again* > ls ?
Moi: *shakes* "ls -la"
SAdmin: *shonorable mention* > ls -la
*computer prints*
> ls -la
total 1300
drwxrwxrwx- 18 SAdmin {Today} -- _.legacyjobrc
Moi: "Why did you rename the config file?"
SAdmin: "Nothing changed"
Moi: "... are you sure?"
SAdmin: "No, changed nothing."
Moi: "Is the job running as your account for some reason?"
SAdmin: "No, job is root"
Moi: *shares screenshot of previous ls* This suggests your account was likely used to rename the dotfile, did you share your account with anyone?
SAdmin: "No, I rename file because could not see"
Moi: *heavy seething* so, just to make sure I understand, you renamed a dotfile because you couldn't see it in the terminal with ls?
SAdmin: "No, I rename file because it was not visible, now is visible"
Moi: "and then you filed a ticket because the application stopped working after you renamed the configuration file? You didn't think there might be a correlation between those two things?"
SAdmin: "yes, it no work"
Interjecting Director: "How did no one catch this? Why were there no checks, and why is there no user interface to configure this application? When I was writing applications I cared about quality"
Moi: *heavy seething*
IDjit: "Well? Anyone? How are we going to fix this"
Moi: "The administrative team will need to rename the file back to its original name"
IDjit: "can't the engineering team do this?!"
Moi: "We could, but it's corporate policy that we have no access to those environments"
IDjit: "Ok, what caused this issue in the first place? How did it get this way?!"
TFW you think you've hit the bottom of the idiocy barrel, and the director says, "hold my mango lassi."
-
The Perfect Storm:
My worst coding mistake? Yeah, let me tell you about that. I pushed a simple JavaScript/HTML change without knowing that the stupid header was shared with another "not so important" section of the site called "My Account", where people go to pay for their services. I call it the perfect storm because I was leaving early that Friday for a weekend cruise, and right before leaving I pushed the change, sent the request to push to production and left. When they noticed that clients were complaining about not being able to pay, they started reverting most changes from all teams to try to fix it, but they never touched mine because they knew I wasn't working on the backend. My whole team worked over the weekend trying to find the issue while I was having fun on the cruise. They ended up reverting all changes by Sunday night, and it took us about 4 more days to figure out that my simple JavaScript/HTML change broke the site and prevented 30 million customers from making payments that weekend. Plus it broke the whole 2nd release of the month.... yeah, nothing major.
-
string excuses[] = {
    "it's not a bug it's a feature",
    "it worked on my machine",
    "i tested it and it worked",
    "its production ready",
    "your browser must be caching the old content",
    "that error means it was successful",
    "the client fucked it up",
    "the systems crashed and the code got lost",
    "this code wont go into the final version",
    "It's a compiler issue",
    "it's only a minor issue",
    "this will take two weeks max",
    "my code is flawless must be someone else's mistake",
    "it worked a minute ago",
    "that was not in the original specification",
    "i will fix this",
    "I was told to stop working on that when something important came up",
    "You must have the wrong version",
    "that's way beyond my pay grade",
    "that's just an unlucky coincidence",
    "i saw the new guy screw around with the systems",
    "our servers must've been hacked",
    "i wasn't given enough time",
    "its the designers fault",
    "it probably won't happen again",
    "your expectations were unrealistic",
    "everything's great on my end",
    "that's not my code",
    "it's a hardware problem",
    "it's a firewall issue",
    "it's a character encoding issue",
    "a third party API isn't responding",
    "that was only supposed to be a placeholder",
    "The third party documentation is wrong",
    "that was just a temporary fix.",
    "We outsourced that months ago.",
    "that value is only wrong half of the time.",
    "the person responsible for that does not work here anymore",
    "That was literally a one in a million error",
    "our servers couldn't handle the traffic the app was receiving",
    "your machines processors must be too slow",
    "your pc is too outdated",
    "that is a known issue with the programming language",
    "it would take too much time and resources to rebuild from scratch",
    "this is historically grown",
    "users will hardly notice that",
    "i will fix it" };
-
If all you have is a hammer, everything looks like a nail!
This was something my tech lead used to tell me when I was obsessed with NoSQL databases a few years back. I would try to find problems that had a use case for NoSQL databases, or even try to convince myself (I didn’t realise it back then) that I needed a NoSQL db for some new idea I had, without really thinking deeply enough about whether the data in question was better represented using an SQL schema or not.
Now, leading a team of young developers, I come across similar suggestions from a few of my team members who have just discovered some new and shiny tech and want to use it in production projects.
While I am not against new and shiny, it’s not a good practice to jump right in to it without exploring it deep enough or considering all the shortcomings. The most important question to ask is, whether some of the problems you are trying to solve can be solved with the current stack.
Modifying your stack requires more than just a week’s experience of playing around with the getting started guide and Stack Overflow replies. This is something which needs to be carefully considered after taking input from the people who would be supporting it, including operations, sysadmins and teams that are gonna interface with your stack indirectly.
I am not talking about delaying adoption by waiting for a long list of approvals to get something that would bring immediate value, but about a carefully orchestrated plan for why and how to migrate to a new stack.
Just because one of the tech giants made a move to a new stack and wrote about it in their engineering blog doesn’t mean that you need to make a switch in the same direction. Take a moment to analyse the possible reasons that motivated them to do it, ask yourself if your organisation is struggling with the exact same problems, observe how others facing the same issue are addressing it, and then make an informed decision.
Collect enough data to support your proposal.
Ask yourself again if you are the one holding the hammer.
If the answer is no, forge ahead!
-
To replace humans with robots, because human beings are complete shit at everything they do.
I am a chemist. My alignment is not lawful good. I've produced lots of drugs. Mostly just drugs against illnesses. Mostly.
But whatever my alignment or contribution to the world as a chemist... Human chemists are just fucking terrible at their job. Not for a lack of trying, biological beings just suck at it.
Suiting up for a biosafety level lab costs time. Meatbags fuck up very often, especially when tired. Humans whine when they get acid in their face, or when they have to pour and inhale carcinogenic substances. They also work imprecisely and inaccurately, even after thousands of hours of training and practice.
Weaklings! Robots are superior!
So I replaced my coworkers with expensive flow chemistry setups with probes and solenoid fluid valves. I replaced others with CUDA simulations.
First at a pharma production & research lab, then at a genetics lab, then at an Industrial R&D lab.
Many were even replaced by Raspberry Pis with two servos and a pH meter attached, and I broke open second hand Fischer Sci spectrophotometers to attach Arduinos with WiFi boards.
The issue was that after every little overzealous weekend project, I made myself less necessary as well.
So I jumped into the infinitely deep shitpool called webdev.
App & web development is kind of comfortable, there's always one more thing to do, but there's no pressure where failure leads to fatalities (I think? Wait... do I still care?).
Super chill, if it weren't for the delusion that making people do "frontend" and "fullstack" labor isn't a gross violation of the Geneva Convention.
Quickly recognizing that I actually don't want to be tortured and suffer from nerve damage caused by VueX or have my organs slowly liquefied by the radiation from some insane transpiling centrifuge, I did what any sane person would do.
Get as far away from the potential frontend blast radius as possible, hide in a concrete bunker.
So I became a data engineer / database admin.
That's where I'm quarantining now, safely hiding from humanity behind a desk, employed to write a MySQL migration or two, setting up Redis sorted sets, adding a field to an Elastic index. That takes care of generating cognac and LSD money.
But honestly.... I actually spend most of my time these days contributing to open source repositories, especially writing & maintaining Rust libraries.
-
Worst thing you've seen another dev do? Long one, but has a happy ending.
Classic 'Dev deploys to production at 5:00PM on a Friday, and goes home.' story.
The web department was managed under the Marketing department, so they were not required to adhere to any type of coding standards, and for months we fought with them on logging. Pre-Splunk, we rolled our own logging/alerting solution and they hated being the #1 reason for phone calls/texts/emails every night.
Wanting to "get it done", 'Tony' decided to bypass the default logging and send himself an email if an exception occurred in his code.
At 5:00PM on a Friday, deploys, goes home.
Around 11:00AM on Sunday (a lot of folks are still in church at this time), the VP of IS gets a call from the CEO (who does not go to church) about being unable to log into his email. The VP has to leave church, drive home, and finds out he cannot remotely access the Exchange server. He starts making other phone calls, forcing the entire networking department to drive in and get email back up (you can imagine, not a group of happy people).
After some network-admin voodoo, by 12:00, they discover/fix the issue (now we know it was Tony's email that was the problem).
We find out Monday that not only did Tony deploy at 5:00 on a Friday, the deployment wasn't approved, had features no one asked for, wasn't checked into version control, and the exception during checkout cost the company over $50,000 in lost sales.
Was Tony fired? Noooo. The web is our cash cow and Tony was considered a top web developer (and he knew that). Tony decided to blame logging. While in the discovery meeting, Tony told the bosses that it wasn't his fault logging was so buggy and caused so many phone calls/texts/emails every night, and that if he had been trained properly, this problem could have been avoided.
Well, since I was responsible for logging, I was next in the hot seat.
For almost 30 minutes I listened to every terrible thing I had done to Tony ever since he started. I was a terrible mentor, I was mean, I was degrading, etc..etc.
Me: "Where is this coming from? I barely know Tony. We're not even in the same building. I met him once when he started, maybe saw him a couple of times in meetings."
Andrew: "Aren't you responsible for this logging fiasco?"
Me: "Good Lord no, why am I here?"
Andrew: "I'll rephrase so you'll understand, aren't you responsible for the proper training of how developers log errors in their code? This disaster is clearly a consequence of your failure. What do you have to say for yourself?"
Me: "Nothing. Developers are responsible for their own choices. Tony made the choice to bypass our logging and send errors to himself, causing Exchange to lockup and losing sales."
Andrew: "A choice he made because he was not properly informed of the consequences? Again, that is a failure in the proper use of logging, and why you are here."
Me: "I'm done with this. Does John know I'm in here? How about you get John and you talk to him like that."
'John' was the department head at the time.
Andrew:"John, have you spoken to Tony?"
John: "Yes, and I'm very sorry and very disappointed. This won't happen again."
Me: "Um...What?"
John: "You know what. Did you even fucking talk to Tony? You just sit in your ivory tower and think your actions don't matter?"
Me: "Whoa!! What are you talking about!? My responsibility for logging stops with the work instructions. After that if Tony decides to do something else, that is on him."
John: "That is not how Tony tells it. He said he's been struggling with your logging system everyday since he's started and you've done nothing to help. This behavior ends today. We're a fucking team. Get off your damn high horse and help the little guy every once in a while."
Me: "I don't know what Tony has been telling you, but I barely know the guy. If he has been having trouble with the one line of code to log, this is the first I've heard of it."
John: "Like I said, this ends today. You are going to come up with a proper training class and learn to get out and talk to other people."
Over the next couple of weeks I became a PowerPoint wizard and 'trained' anyone/everyone on the proper use of logging. The one line of code to log. One line of code.
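For a sense of scale, here is roughly what "one line of code to log" looks like. This is a hedged Python sketch using the standard `logging` module, not the in-house system from the story (whose API isn't shown); the `checkout` and `charge` functions and the logger name are made up for illustration.

```python
import logging

logger = logging.getLogger("checkout")  # hypothetical logger name


def charge(order_id: int) -> None:
    """Stand-in for a checkout step that fails."""
    raise RuntimeError(f"payment gateway rejected order {order_id}")


def checkout(order_id: int) -> bool:
    try:
        charge(order_id)
        return True
    except RuntimeError:
        # The one line: record the error and its traceback through the shared logger,
        # instead of emailing every exception to a personal inbox.
        logger.exception("checkout failed for order %s", order_id)
        return False
```

Where the records end up (file, alerting, email digest) is then a matter of handler configuration in one place, not something each developer wires up per exception.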
A friend, 'Scott', who sits close to Tony (I mean, I do get out and know people), told me that Tony poured out the crocodile tears. Like cried and cried, apologizing, calling me everything but a kitchen sink, etc. It was so bad that his manager 'Sally' was crying and her boss 'Andrew' was red in the face. When 'John' heard 'Sally' was crying, you can imagine the high levels of alpha-male 'gotta look like I'm protecting the females' hormones flowing.
It took almost another year, but Tony released a change on a Friday, went home, the web site crashed (losses were in the thousands of $ per minute this time), and Tony was not let back into the building on Monday (one of the best days of my life).
-
Doot doot.
My day: Eight lines of refactoring around a 10-character fix for a minor production issue. Some tests. Lots of bloody phone calls and conference calls filled with me laughing and getting talked over. Why? Read on.
My boss's day: Trying very very hard to pin random shit on me (and failing because I'm awesome and fuck him). Six hours of drama and freaking out and chewing and yelling that the whole system is broken because of that minor issue. No reading, lots of misunderstanding, lots of panic. Three-way called me specifically to bitch out another coworker in front of me. (Coworker wasn't really in the wrong.) Called a contractor to his house for testing. Finally learned that everything works perfectly in QA (duh, I fixed it hours ago). Desperately waited for me to push to prod. Didn't care enough to do production tests afterwards.
My day afterwards: hey, this Cloudinary transform feature sounds fun! Oh look, I'm done already. Boo. Ask boss for update. Tests still aren't finished. Okay, whatever. Time for bed.
what a joke.
Oh, I talked to the accountant after all of this bullshit happened. Apparently everyone that has quit in the last six years has done so specifically because of the boss. Every. single. person.
I told him it was going to happen again.
I also told him the boss is a druggie with a taste for psychedelics. (It came up in conversation. Absolutely true, too.) It's hilarious because the company lawyer is the accountant's brother.
So stupid.
-
Seven months ago:
===============
Project Manager: - "Guys, we need to make this brand new ProjectX, here are the specs. What do you think?"
Bored Old Lead: - "I was going to resign this week but you've convinced me, this is a challenge, I never worked with this stack, I'm staying! I'll gladly play with this framework I never used before, it seems to work with this libA I can use here and this libB that I can use here! Such fun!"
Project Manager: - "Awesome! I'm counting on you!"
Six months ago:
====================
Cprn: - "So this part you asked me to implement is tons of work due to the way you're using libA. I really don't think we need it here. We could use a more common approach."
Bored Old Lead: - "No, I already rewrote parts of libB to work with libA, we're keeping it. Just do what's needed."
Cprn: - "Really? Oh, I see. It solves this one issue I'm having at least. Did you push the changes upstream?"
Bored Old Lead: - "No, nobody uses it like that, people don't need it."
Cprn: - "Wait... What? Then why did you even *think* about using those two libs together? It makes no sense."
Bored Old Lead: - "Come on, it's a challenge! Read it! Understand it! It'll make you a better coder!"
Four months ago:
==============
Cprn: - "That version of the framework you used is losing support next month. We really should update."
Bored Old Lead: - "Yeah, we can't. I changed some core framework mechanics and the patches won't work with the new version. I'd have to rewrite these."
Cprn: - "Please do?"
Bored Old Lead: - "Nah, it's a waste of time! We're not updating!"
Three months ago:
===============
Bored Old Lead: - "The code you committed doesn't pass the tests."
Cprn: - "I just ran it on my working copy and everything passes."
Bored Old Lead: - "Doesn't work on mine."
Cprn: - "Let me take a look... Ah! Here you go! You've misused these two options in the framework config for your dev environment."
Bored Old Lead: - "No, I had to hack them like that to work with libB."
Cprn: - "But the new framework version already brings everything we need from libB. We could just update and drop it."
Bored Old Lead: - "No! Can't update, remember?"
Last Friday:
=========
Bored Old Lead: - "You need to rewrite these tests. They run really slowly. Two hours to pass all."
Cprn: - "What..? How come? I just ran them on the revision from this morning and all passed in a minute."
Bored Old Lead: - "Pull the changes and try again. I changed a few input dataset objects and then copied results from error messages to assertions to make the tests pass and now it takes two hours. I've narrowed it to those weird tests here."
Cprn: - "Yeah, all of those use ORM. Maybe it's something with the model?"
Bored Old Lead: - "No, all is fine with the model. I was just there rewriting the way framework maps data types to accommodate for my new type that's really just an enum but I made it into a special custom object that needs special custom handling in the ORM. I haven't noticed any issues."
Cprn: - "What!? This makes *zero* sense! You're rewriting vendor code and expect everything to just work!? You're using libs that aren't designed to work together in production code because you wanted a challenge!?? And when everything blows up you're blaming my test code that you're feeding with incorrect dataset!??? See you on Monday, I'm going home! *door slam*"
Today:
=====
Project Manager: - "Cprn, Bored Old Lead left on Friday. He said he can't work with you. You're responsible for Project X now."
-
Never in my life have I been as scared as today.
I recently left a big company to work for a small one as the first internal developer.
Had a small issue on the production server. The fix was easy: just remove a single table entry. And... *drum roll*... I forgot to add a WHERE clause. All orders were lost.
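One cheap guard against this exact slip is to run the delete inside a transaction and check the affected row count before committing. The sketch below uses Python's built-in `sqlite3` purely for illustration; the post doesn't say which database was involved, and the `orders` table and `delete_exactly_one` helper are hypothetical.

```python
import sqlite3


def delete_exactly_one(conn: sqlite3.Connection, order_id: int) -> None:
    """Delete one order; roll back if the statement touched any other number of rows."""
    cur = conn.execute("DELETE FROM orders WHERE id = ?", (order_id,))
    if cur.rowcount != 1:
        conn.rollback()
        raise RuntimeError(f"expected to delete 1 row, matched {cur.rowcount}")
    conn.commit()


# Demo on an in-memory database with three made-up orders.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")
conn.executemany(
    "INSERT INTO orders (id, item) VALUES (?, ?)",
    [(1, "keyboard"), (2, "mouse"), (3, "monitor")],
)
conn.commit()
delete_exactly_one(conn, 2)
```

The same habit works by hand in most SQL consoles: open a transaction, run the statement, read the row count, and only then commit.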
No idea if we had backups or anything, I quickly called the one other IT dude in the company.
He had no clue where the backups were or how to find them.
Having some experience with Nmap, I quickly scanned our network and found a NAS device.
There was a backup, a whole VHD backup. 300GB of it, and the download speed was around 512kb/s. No way I could fix it before management found out, but then an idea came to mind: glorious old 7zip. Managed to extract only the database files, sent them to the server and quickly swapped them. Everything was fine... The manager connected 5 minutes later. Scariest 45 minutes of my life...
-
!dev I'd just helped a client cut over to a new fiber connection and then left for Vegas. About 2 days into the trip my wife and I decided to hit a breakfast spot that had bottomless mimosas, which was of course a claim we had to test.
As we are walking(stumbling) out of the restaurant I get a call that the connection has crashed and the entire car dealership is unable to sell cars, which they tell me is important functionality.
So I make it up to my room and break out the laptop, luckily the mgmt interfaces are still available externally so I'm able to log in and then have the fun challenge of 1) not falling off of my chair 2) not accidentally making a change that kills what connection I have in and 3) fixing their actual issue.
Took me almost an hour to find a simple OSPF issue, but at least I got them working and happy. However, by that time I was beginning to sober up, which is the absolute worst thing that can happen while day-drinking, and it ended up basically causing me to be hung-over for the rest of the night, including my wife's friend's wedding, which she wasn't thrilled about...
The moral of this story is to make sure to NOT stop drinking while dealing with unexpected production impacting events.
-
Quick recap of my last two weeks: 15 year old production server is basically dead, boss has taken over calls and claims credit for "resolving" outages (even though my coworker and I did the work, but ultimately the traffic died down enough to where it wasn't an issue anymore).
I go to a meeting to plan migration to a better server, boss bitches about not getting invited, I tell him I invited myself, and then he lectures about how that's not our job.
Different boss says we're migrating a schema for an application that should have been decommissioned 5+ years ago to use as a baseline. I explain what's going on, he says he understands, and proceeds to tell higher bosses it's perfect because there will be no user impact. OF COURSE THERE'S NO FRICKING IMPACT, YA DUNCE! there are no users!!!!
I merge two email threads together, since they discuss the same thing, but with different insight, and get yelled at, even though they requested it.
The two bosses I like are OOO for the next week, too, so I'm just sitting here hoping I don't say something that'll get me fired or sent to sensitivity training.
I'm just starting my on call rotation and don't know that I can do this. I cry when my phone rings, now, because I experience physical pain with how hard I cringe.
I got yelled at today by a guy because SOMEONE I DON'T KNOW assigned a ticket to him directly, rather than to the proper team (not his team). So I had to look into that, which at least had the benefit of preventing a catastrophic outage to our customers world wide, but no one will know because I don't brag at work; I'm too busy doing my job as well as most of my division/section/larger team, whatever the hell it's called. I saved us probably 25+ hours of continuous troubleshooting call from noticing something tiny that the people "smarter" than me missed.
**edit: sorry for typos; got my nails done yesterday but they feel like they're a mile long and I have to relearn how to type**
-
Client: There is a high severity production issue.. you need to fix urgently..
Developer: I am on the way.. Will fix it once I reach home.
Client: I don't care where you are. Fix it right now😡😡
See the developer!!!
-
Messaging me at 4:30pm on a Friday about a high priority issue currently in production.
My reply: a link to the code review from 2 months ago that literally explained the problem and what to do to fix it. That got implemented. Then removed. For some reason...
Feels good.
-
Freaking tech support.
Freaking sparkhire.
Their 'one-way interview' bs only supports flash. Flash. in production. in 2019. Flash died years ago, and its support ends next year. What the crap?
Anyway, I finally decided I should do the interview since they already have all of my information anyway. Thanks, "privacy-conscious" third party. Totally appreciate it.
I spent half an hour and couldn't get flash working on their site (but all other sites were fine), so I contacted their support. I gave them all the relevant specs (inc. ofc browser), the steps to reproduce, and all of my attempts at fixing the issue.
To their credit, I received a response within a few minutes. To their discredit, their response was: "What browser are you using?" This question was followed by my report (including, ofc, my browser and all the other overlooked details), immediately followed by a "debugging info" section appended by their support service that also included my browser, OS, and other specs.
Learn to fucking read.
Their suggestion? Use google chrome. Barring that: record your 20-30 minute video by holding your phone in front of your face the entire time. I am so not kidding.
They also asked what page i was having difficulty on. You guessed it: the page url was also included within that "debugging info" section.
It wasn't a form letter, either. I'd understand if it was all automated, but it was a real person who was really typing up the emails, and really didn't bother reading a damned thing.
I did end up getting flash working, but their "tech support" (script-reader) was entirely useless.
-
Most satisfying bug I've fixed?
Fixed an N+1 issue with a web service retrieving price information. I initially wrote the service, but it was taken over by a couple of 'world class' monday-morning-quarterbacks.
The "Worst code I've ever seen" ... "I can't believe this crap compiles" types that never met anyone else's code that was any good.
After a few months (yes months) and heavy refactoring, the service still returned price information for a product. Pass the service a list of product numbers, service returns the price, availability, etc, that was it.
After a very proud and boisterous deployment, over the next couple of days the service seemed to get slower and slower. DBAs started to complain that the service was causing unusually high wait times, locks, and CPU spikes causing problems for other applications. The usual finger pointing began which ended up with "If PaperTrail had written the service 'correctly' the first time, we wouldn't be in this mess."
Only mattered that I initially wrote the service and no one seemed to care about the two geniuses that took months changing the code.
The dev manager was able to justify a complete re-write of the service using 'proper development methodologies' including budgeting devs, DBAs, server resources, etc..etc. with a projected year+ completion date.
My 'BS Meter' goes off, so I open up the code, maybe 5 minutes...tada...found it. The corresponding stored procedure accepts a list of product numbers and a price type (1=Retail, 2=Dealer, and so on). If you pass 0, the stored procedure returns all the prices.
Code basically looked like this..
public List<Price> GetPrices(List<Product> products, int priceTypeId)
{
    foreach (var item in products)
    {
        // One-element list: the stored procedure is called once per product
        List<int> productIdsParameter = new List<int>();
        productIdsParameter.Add(item.ProductID);
        // First call: price type 0 returns ALL prices for the product
        List<Price> prices = dataProvider.GetPrices(productIdsParameter, 0);
        foreach (var price in prices)
        {
            if (price.PriceTypeID == priceTypeId)
            {
                // Second call: fetch the same product again for the matching type
                prices = dataProvider.GetPrices(productIdsParameter, price.PriceTypeID);
                return prices;
            }
            // * Omitting the other 'WTF?' code to handle the zero price type
        }
    }
}
I removed the double stored procedure call, updated the method signature to only accept the list of product numbers (which it was before the 'major refactor'), deployed the service to dev (the issue was reproducible in our dev environment) and had the DBA monitor.
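A rough Python sketch of why the per-product pattern hurts, not the original C# service. `FakeDataProvider`, `Price`, and both function names are hypothetical stand-ins, and the batched version keeps the price type parameter purely for comparison. The provider just counts how many round trips each approach makes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    product_id: int
    price_type_id: int
    amount: float


class FakeDataProvider:
    """Hypothetical stand-in for the stored procedure; counts round trips."""

    def __init__(self, rows):
        self.rows = rows
        self.calls = 0

    def get_prices(self, product_ids, price_type_id):
        self.calls += 1
        wanted = set(product_ids)
        # Price type 0 means "all price types", as in the story.
        return [
            r for r in self.rows
            if r.product_id in wanted
            and (price_type_id == 0 or r.price_type_id == price_type_id)
        ]


def get_prices_n_plus_one(provider, product_ids, price_type_id):
    """One 'all prices' call per product, plus a second call on a match."""
    result = []
    for pid in product_ids:
        for price in provider.get_prices([pid], 0):
            if price.price_type_id == price_type_id:
                result.extend(provider.get_prices([pid], price_type_id))
                break
    return result


def get_prices_batched(provider, product_ids, price_type_id):
    """Single call: pass the whole list and let the database do the filtering."""
    return provider.get_prices(product_ids, price_type_id)
```

With three products, the first version makes six calls and the second makes one, for the same result. The gap grows linearly with the size of the product list.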
The two devs and the manager are grumbling and mocking the changes (they never looked, they assumed I wrote some threading monstrosity) then the DBA walks up..
DBA: "We're good. You hit the database pretty hard and the CPU never moved. Execution plans, locks, all good to go."
<dba starts to walk away>
DevMgr: "No fucking way! Putting that code in a thread wouldn't have fixed it"
Me: "Um, I didn't use threads"
Dev1: "You had to. There was no way you made that code run faster without threads"
Dev2: "It runs fine in dev, but there is no way that level of threading will work in production with thousands of requests. I've got unit tests that prove our design is perfect."
Me: "I looked at what the code was doing and removed what it shouldn't be doing. That's it."
DBA: "If the database is happy with the changes, I'm happy. Good job. Get that service deployed tomorrow and let's move on"
Me: "You'll remove the recommendation for a complete re-write of the service?"
DevMgr: "Hell no! The re-write moves forward. This, whatever you did, changes nothing."
DBA: "Hell yes it does!! I've got too much on my plate already to play babysitter with you assholes. I'm done and no one on my team will waste any more time on this. Am I clear?"
Seeing the dev manager's face turn red and the other two devs look completely dumbfounded was the most satisfying bug I've fixed.
-
The solution for this one isn't nearly as amusing as the journey.
I was working for one of the largest retailers in NA as an architect. Said retailer had over a thousand big box stores and an IT maintenance budget of $200M/year. The kind of place that just reeks of waste and mismanagement at every level.
They had installed a system to distribute training and instructional videos to every store, as well as recorded daily broadcasts to all store employees, as a way of reducing management time spent with employees in the morning. This system had cost a cool 400M USD, not including labor and upgrades for round 1. Round 2 was another 100M to add a storage buffer to each store because they'd failed to account for the fact that the internet connections at the stores and the outbound pipe from the DC weren't capable of running the public facing e-commerce and streaming all the video data to every store in realtime. Typical massive enterprise clusterfuck.
Then security gets involved. Each device at stores had a different address on a private megawan. The stores didn't generally phone home, home phoned them as an access control measure; stores calling the DC was verboten. This presented an obvious problem for the video system because it needed to pull updates.
The brilliant Infosys resources had a bright idea to solve this problem:
- Treat each device IP as an access key for that device (avg 15 per store).
- Verify the request IP, then issue a redirect with ANOTHER IP unique to that device that the firewall would ingress only to the video subnet
- Do it all with the F5
A few months later, the networking team comes back and announces that after months of work and tens of person-years they can't implement the solution because iRules have a size limit and they would need more than 60,000 lines or 15,000 rules to implement it. Sad trombones all around.
Then, a wild DBA appears, steps up to the plate and says he can solve the problem with the power of ORACLE! A few months later he comes back with some absolutely batshit solution that stored the individual octets of an IPv4 address and used multiple nested queries against the same table to emulate subnet masking through some temp table spanning voodoo. Time to complete: 2-4 minutes per request. He too eventually gives up the fight, sort of, in that backhanded way DBAs tend to do everything. I wish I had paid more attention to that abortion because the rationale and its mechanics were just staggeringly Rube Goldberg and should have been documented for posterity.
So I catch wind of this sitting in a CAB meeting. I hear them talking about how there's "no way to solve this problem, it's too complex, we're going to need a lot more databases to handle this." I tune in and gather all it really needs to do, since the ingress firewall is handling the origin IP checks, is convert the request IP to video ingress IP, 302 and call it a day.
While they're all grandstanding and pontificating, I fire up Visual Studio and:
- write a method that encodes the incoming request IP into a single uint32
- write an HTTP module that keeps an in-memory dictionary of uint32 to string (request IP to video ingress IP), converts the request IP and 302s the call, with blackhole support
- convert all the mappings in the spreadsheet attached to the meeting into a csv, dump to disk
- write a WPF application to allow for easily managing the IP database in the short term
- deploy the solution to one of our stage boxes
- add a TODO to eventually move this to a database
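The core of the steps above can be sketched in a few lines of Python (the original was a .NET HTTP module, which is not shown in the rant). Everything here is an assumption for illustration: the `device_ip,video_ingress_ip` CSV layout, the function names, the example addresses, and treating "blackhole" as a plain 404 with no redirect.

```python
import ipaddress
from typing import Dict, Optional, Tuple


def ip_to_uint32(ip: str) -> int:
    """Pack a dotted-quad IPv4 address into a single 32-bit integer."""
    return int(ipaddress.IPv4Address(ip))


def load_mappings(csv_text: str) -> Dict[int, str]:
    """Parse 'device_ip,video_ingress_ip' lines into an in-memory dictionary."""
    table: Dict[int, str] = {}
    for line in csv_text.splitlines():
        line = line.strip()
        if not line:
            continue
        device_ip, ingress_ip = (part.strip() for part in line.split(","))
        table[ip_to_uint32(device_ip)] = ingress_ip
    return table


def route(table: Dict[int, str], request_ip: str, path: str) -> Tuple[int, Optional[str]]:
    """Return (status, Location header). Unknown devices get no redirect."""
    target = table.get(ip_to_uint32(request_ip))
    if target is None:
        return 404, None  # assumed blackhole behaviour
    return 302, f"http://{target}{path}"
```

Each request is one integer conversion and one dictionary lookup, which is why this kind of mapping does not need nested queries or tens of thousands of rules.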
All this took about 5 minutes. I interrupt their conversation to ask them to retarget their test to the port I exposed on the stage box. Then watch them stare in stunned silence as the crow grows cold.
According to a friend who still works there, that code is still running in production on a single node to this day. And still running on the same static file database.
#TheValueOfEngineers
-
Today I saw code written by my junior. Basically an Excel export. The Laravel Excel package provides great ways to optimize this.
My junior instead looped over the data 6 times to modify it before handing it to the export package. We need to export around 50K users.
When I asked him why, he said it works and it's fast, so what's the issue?
Noob, you have only 100 users in the database and production has 10 million.
Sometimes I just want to kill him.
-
Let me tell you a story.
Our company has a homegrown monitoring solution. Keeps track of our deployments and alerts us when something is broken. Really nice for the most part, except a little issue where we get up to 25 alerts PER DAY that our PRODUCTION ENVIRONMENT IS DOWN. Including weekends.
With this many false positives, we quickly learn to ignore the alerts and miss real incidents.
So we approached this team, remember it's our own tool, and told them about the problem. Turns out it is a known issue. And here's the kicker: they aren't planning on fixing it!
It gets better. Rather than fix this glaring issue, their solution is to make ANOTHER ALERT that lets us know the monitoring is misbehaving.
To recap, we can now expect to get up to 25 false positive alerts per day that our production is down, followed immediately by more alerts that the monitor is broken, which means we can ignore the previous alert.
As our PM said when he heard this: fuck that noise. We are escalating the shit out of this!
-
So how is your Friday?
Well let me tell ya, fixed a production issue and I'm totally exhausted and to top it off my girlfriend broke up with me.
I need a fucking drink!
-
A colleague named Sam was really pissed off today at an outsourcing firm from India.
My boss outsourced an application to an India-based firm. Sam was the one handling the project after the handover. Sam coded a feature 2 weeks ago and moved it to the staging server for approval. After the sign-off from the lead developer of the outsourcing firm, he moved the feature to production. For the past 2 days the application was crashing over and over again, so Sam went to check and found out that the feature he coded was causing the issue. When he pulled the feature to his computer and had a look at the code, it wasn’t his code. The code he wrote was commented out and the lead developer of the outsourcing firm had written new code.
When Sam emailed him about this, he replied that he had rewritten the code to fix issues with the feature. Sam and the outsourcing firm's lead developer had a heated argument about this. It turns out that the outsourcing developer rewrote the code without anyone’s approval, directly on the production server.
The lead developer of the outsourcing firm was fired.
-
Me, in a meeting with CTO trying to sort out some config stuff, manager messages me and tells me about a non critical issue (read: not production).
Me: I'm in a meeting
Manager: Yes but look at this when you have time
Me: Yes sure
Manager calls me after 15 minutes
Me: I was in a meeting?
Manager: Are you still in a meeting or were you in a meeting?
Me: I'll go back to the meeting when you're done
Manager tells me about the issue again. After I go back to the meeting manager sends me a meeting link about the issue literally 15 minutes from that time. Manager is 10 minutes late to the meeting he just arranged.
After the meeting manager pesters me and asks if I could figure out the issue every 15 minutes.
I fucking hate this job.
-
Fuck Apple and its review system
So, this started in December. We wanted to publish an app, after years of development.
Submit to review, and it passes on the first try. Well, what do you know. We are on the manual release option, so we can release together with the Android counterpart. Well yes, but someone notices that the app name is not what was agreed (App Name instead of AppName). Okay, should be easy, submit the same app, just the name changed. If it passed once, it will pass again, right? HAH
Rejected, because the description of why we use the device’s camera is too general. Well... it's the purpose of the app... but whatever, I read the guidelines, okay, it's actually documented with examples. BUT THEN WHY THE FUCK COULDNT YOU SAY THAT ON THE FIRST UPLOAD?
Whatever, fix it, new version, accepted, ready to release just in time.
It does indeed roll out, but of course, we notice that the app has a giant issue, but only on specific phones. None of our test phones had this problem, but those who have it essentially cannot use our program. Nasty as it is, the fix is really easy, done in 5 minutes. Upload it asap, literally nothing changed from the user's point of view, except now it doesn't crash on said devices. Meanwhile 1 star reviews are arriving from these users, with every right of course. Apple should allow this patch quickly, right? HAH
THE REAL BULLSHIT COMES NOW
With only config files changed, the same binary uploaded, we get rejected? What now? Let's read it. “Metadata rejected, no need to upload new binary”... oh fine, only the store page is wrong? Easy. Read the message, what went wrong. “Referencing third party content is not permitted on the app store”, meaning that no Android test device should be shown. Fine, your rules. They even send a picture of the offending element. BUT ITS NOT EVEN ON THE STORE. THATS A SCREENSHOT OF THE APP. HOW IS THAT METADATA? I ask about this, and I get a reply from either a bot or a person who can't speak or read English and only pasted a sample answer, repeating the previous message. WTF. Fine, I guess you are dumb, but since they stop replying to our queries, we do the only sensible thing: re-record the offending tutorial video that actually contained an Android device. This is about 2 weeks after the first try to apply a simple patch to a broken app. And still, how did it pass the review 2 times?
Whatever, reupload again, play the waiting game for a week, when the promised average wait time is 2 days, and they hit us with a message that they want to know what patent we use in our app's core functionality. WTF WHY NOW? It didn't bother you for a month, you let it release to production and now you delay a simple patch for this? We send them what they want to know. Aaaaand they reply: sorry we need more time to review your app. FUUUUUUCKKK YOUUU. You are reviewing a PATCH with close to zero functional change!!! Then this shit goes on: every week we ask about an ETA, they always ask for patience... in the end it took another 3 weeks... so December 15 to Jan 21 in total...
FOR. A. SINGLE. FUCKING. PATCH
Bottom line is what is infuriating: Apple cares that there is an Android device in the tutorial video, but they don't care that a significant percentage of our users simply cannot use the app.
I'm done
-
CEO: if we would not give new features, clients would be bored and would not pay for the tool.
me: but don't you think we should fix buggy old code, that would reduce effort and time that we daily invest in prod bugs?
CEO: I'm not saying we should not fix them but we should maintain the balance which is 80-20. 80% of our work would include adding new features.
😑
Next day in the morning, I receive an email:
There is a production issue, fix it asap.
😬
-
I am DONE with this woman.
Background: we're a team of 3 developers and I'm the junior in this team and I've been in this shit for a year now. 2 months ago the team leader left for another project and I had to stand in for him in every responsibility against the PM and other teams.
Now I not only had to endure this insecure woman but I was also supposed to work with her! Fast-forward to today, the team leader is back and I thought I could put my headphones on and work peacefully at last.
But no!
I found out she'd sent faulty code to production (no big deal) and brought it up over chat (although she's sitting right behind me):
Me: We need to fix this.
Her: What?
Me: *giving some details about the issue*
Her: Your attitude is important when you ask me to do something. Whenever you're writing to me you're typing on your keyboard like you're going to break it on my head.
*me not knowing what to say at this point because we had something stupid like this before*
Me: So you're offended by the sound my keyboard makes? (I have mx brown switches by the way and they're not even loud)
Her: No you're typing too fast when you're writing to me. The sound echoes in the office.
...
Can you fucking believe this shit? I hate people that think they can educate me but have no idea how to rationally respond to situations and take responsibility! I didn't even say anything!
And she's been saying to me she hadn't had a problem with any other people for gazillion years who knows how long and why would she cause a problem now! And thinks I am the problem, fuck YOU!
Since you don't like receiving orders why hadn't you taken the place when the fucking guy went for another project but I had to take all the responsibility? I know why you fucking entitled bitch.
Because you HAD NO IDEA AND YOU STILL DON'T.
So shut the fuck up and do as I say.
Kind regards
-
I switched my job about 2 months ago. This was my first switch after college (in 7 years). I was at a senior position and was not learning anything new for a few months and got really bored.
I had asked for a 100% hike in new company, they gave me over 150%. Apart from this, they offer free food and snacks (or reimburse if you order your food from outside). Unlimited leaves and work from home option. No fixed working hours (I see people working for only 5-6 hours some days). No sign of politics yet. People are very humble and help you out even on silly queries. Company is growing at a very fast pace, it was named in fastest x growing companies about a month ago in some report with growth rate of about 1000%.
I see people around me with so much less experience than me but so much knowledge. Feels like I am a fresher again and learning so much from them. FYI, I had worked in the same field (tech) for the initial 3 years of my career. Looking at seniors I am finally able to set goals.
This one time I saw CTO awake at 3 am collaborating actively in resolution of a production issue.
Having seen so much that was positive, I went over 100 reviews on Glassdoor to find the only 2 negative points ever written; one of them was a slow lift in the building. The other a
-
I think I want to quit my first application developer job 6 months in because of just how bad the code and deployment and.. Just everything, is.
I'm a C#/.net developer. Currently I'm working on some asp.net and sql stuff for this company.
We have no code standards. Our project manager is somewhere between useless and detrimental. Our clients are unreasonable (it's the government, so I'm a bit stifled on what I can say) and expect absurd things from us. We have 0 automated tests, and before I arrived none of our infrastructure matched our documentation... And we barely had any documentation to begin with.
The code is another horror story. It's outsourced C# ASP.NET, JS and SQL code... and to very bad programmers in India, no offense to the good ones, I know you exist. It's all spaghetti. And half of it isn't spelled correctly.
We have a single, massive constant class that probably has over 2000 constants, I don't care to count. Our SQL projects are a mess with tons of quick fix scripts to run pre and post publishing. Our folder structure makes no sense (we have root/js and root/js1 to make you cringe). Our JavaScript is mostly inline on the ASP.NET pages themselves, so we don't even have minification most of the time.
It's... God awful. The result of a billion and one quick fixes that nobody documented. The configuration alone has to have the same value put multiple times. And now our senior developer is getting the outsourced department to work on moving every SINGLE NORMAL STRING INTO THE DATABASE. That's right. Rather than putting them into some local resource file or anything sane, our website will now be drawing every single standard string from the database. Our SENIOR DEVELOPER thinks this is a good idea. I don't need to go into detail about how slow this is. Want to do it on boot? Fine. But they do it every time the page loads. It's absurd.
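For what it's worth, the "load once, not on every page load" idea is tiny. A minimal Python sketch (the actual site is ASP.NET, and `StringCache` and its `loader` callback are hypothetical names): the strings are fetched on first use and then served from memory.

```python
from typing import Callable, Dict, Optional


class StringCache:
    """Load UI strings once and serve later lookups from memory."""

    def __init__(self, loader: Callable[[], Dict[str, str]]):
        self._loader = loader  # e.g. a function that runs one database query
        self._strings: Optional[Dict[str, str]] = None

    def get(self, key: str) -> str:
        if self._strings is None:
            # First lookup only: one round trip instead of one per page load.
            self._strings = self._loader()
        # Fall back to the key itself when a string is missing.
        return self._strings.get(key, key)

    def reload(self) -> None:
        """Drop the cached strings so the next lookup fetches them again."""
        self._strings = None
```

Calling `reload()` after an edit to the strings table keeps the content editable without paying for a query on every request.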
Our SQL database design is an absolute atrocity. You have to join several tables together just to get anything done. Half of our SPs are failing all the time because nobody really understands the design. It's gloriously awful, it's like... the epitome of failed database designs.
But rather than taking a step back and dealing with all the issues, we keep adding new features and other ones get left in the dust. Hell, we don't even have complete browser support yet. There were things on the website that were still running SILVERLIGHT. In 2019. I don't even know how to feel about it.
I brought up our insane technical debt to our PM who told me that we don't have time to worry about things like technical debt. They also wouldn't spend the time to teach me anything, saying they would rather outsource everything than take the time to teach me. So I did. I learned a huge chunk of it myself.
But calling this a developer job was a sick, twisted joke. All our lives revolve around bugnet. Our work is our BN's. So every issue the client emails about becomes BN's. I haven't developed anything. All I've done is clean up others' mess.
Except for the one time they did have me develop something. And I did it right and took my time. And then they told me it took too long, forced me to release before it was ready, even though I had never worked on what I was doing before. And it worked. I did it.
They then told me it likely wouldn't even be used anyway. I wasn't very happy at all.
I then discovered quickly the horrors of wanting to make changes on production. In order to make changes to it, we have to... Get this
Write a huge document explaining why. Not to our management. To the customer. The customer wants us to 'request' to fix our application.
I feel like I am literally against a wall. A huge massive wall. I can't get consent from my PM to fix the shitty code they have as a result of outsourcing. I can't make changes without the customer asking why I would work on something that doesn't add something new for them. And I can't ask for any sort of help, and half of the people I have to ask help from don't even speak English very well so it makes it doubly hard to understand anything.
But what can I do? If I leave my job it leaves a lasting stain on my record that I am unsure if I can shake off.
... Well, that's my tl;dr rant. I'm a junior, so maybe idk what the hell I'm talking about.
-
Me : I found this code issue, I think we need to fix it
PO: does it affect the user?
Me: not really but we can make it better
PO: do you have a defect for it in *insert issue tracker here*
Me: no, I just noticed it
PO: is there an IM ticket for it?
Me: I don't think so
PO: is this issue already in production?
Me: possibly. Yes. That's why I was wondering if we should fix it.
PO: okay then we will fix it in the 3rd release from now if you still remember it by then.
-
So, a few years ago I was working at a small state government department. After we had suffered a major development infrastructure outage (another story), I was so outspoken about what a shitty job the infrastructure vendor was doing, the IT Director put me in charge of managing the environment and the vendor, even though I was actually a software architect.
Anyway, a year later, we get a new project manager, and she decides that she needs to bring in a new team of contract developers because she doesn't trust us incumbents.
They develop a new application, but won't use our test team, insisting that their "BA" can do the testing themselves.
Finally it goes into production.
And crashes on Day 1. And keeps crashing.
"It's the infrastructure!" goes out the cry from her office. "Do something about it!"
I check the logs, can find nothing wrong, just this application keeps crashing.
I and another dev ask for the source code so that we can see if we can help find their bug, but we are told in no uncertain terms that there is no bug, they don't need any help, and we must focus on fixing the hardware issue.
After a couple of days of this, she called a meeting, all the PMs, the whole of the other project team, and me and my mate. And she starts laying into us about how we are letting them all down.
We insist that they have a bug, they insist that they can't have a bug because "it's been tested".
This ends up in a shouting match when my mate lost his cool with her.
So, we went back to our desks, got the exe and the pdb files (yes, they had published debug info to production), and reverse engineered it back to C# source, and then started looking through it.
Around midnight, we spotted the bug.
We took it to them the next morning, and it was like "Oh". When we asked how they could have tested it, they said, ah, well, we didn't actually test that function as we didn't think it would be used much....
What happened after that?
Not a happy ending. Six months later the IT Director retires and she gets shooed in as the new IT Director and then starts a bullying campaign against the two of us until we quit.
-
The first time I caused a massive error on production.
The good news was the site didn't go completely down. The bad news, however, was that it went down for 60% of our users, and because it's only partial, it got detected only after about two hours.
Everyone halted what they were doing to help investigate the issue. When it turned out that my latest commit caused the error, I was told to fix it... with the CTO and senior software architects watching.
It all happened because I deleted one line too many, an if statement, making the accompanying else statement complete nonsense. It was corner-case code unforeseen by the QA guy.
The attached meme perfectly describes my feeling for the rest of the month following that accident.
-
It's about a guy that knows better.
I was working as a subcontractor on a bigger system. We (subs) were not allowed to deploy code, we had to wait for contractor to deploy.
One day I got an email that my code is bugged and that my feature is not working on production. I checked it on test env, everything was fine. Then I checked if the code I wrote was deployed. It was not.
I sent an email explaining that if they deployed my code it would be working. Then I got a response. There was a bug in my code.
Another email. I asked how would they know? Do they have a test on their environment that failed?
No. There is one guy that READ my code and he said it should not work, so he will not deploy it. He was not a programmer, he was a business consultant responsible for the documentation.
His issue was that I used a function that was not in a class. So if the function is not declared it's obvious it will not work. I had to explain to him in another email, that you can use object of another class inside your class and then call a function, that is not in your class. It was the last time this guy blocked my deploy.
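For anyone else reading code for the documentation, a minimal Python illustration of the point. The class names are made up and have nothing to do with the actual system: `Invoice` never declares `tax_for`, yet calling it works because `Invoice` holds a `TaxCalculator` and calls the method on that object.

```python
class TaxCalculator:
    """Owns the tax logic."""

    def tax_for(self, amount: float) -> float:
        return round(amount * 0.2, 2)


class Invoice:
    """Has a TaxCalculator (composition) instead of declaring tax_for itself."""

    def __init__(self, net: float, calculator: TaxCalculator):
        self.net = net
        self.calculator = calculator

    def total(self) -> float:
        # tax_for is not defined on Invoice; it is called on the held object.
        return self.net + self.calculator.tax_for(self.net)
```

`Invoice(100.0, TaxCalculator()).total()` returns `120.0`, even though `Invoice` has no `tax_for` of its own.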
TL;DR, I had to explain to a non-dev how object composition works in order to have my code deployed. Took four emails.
-
Step 1: Run to the store to buy a USB card reader because all of a sudden you have a need to use a 16Mb CF card that was tossed in a junk drawer for 20 years (hoping it still works, of course), but that was the easy part...
Step 2: Realize that the apps - your own - you want to run on your new (old) Casio E-125 PocketPC (to re-live "glory" days) are compiled in ARM format, not MIPS, which is the CPU this device uses, and the installer packages you have FOR YOUR OWN APPS don't include MIPS, only ARM (WHY DID I DO THAT?!), so, the saga REALLY begins...
Step 3: Get a 20-year old OS to install in a Hyper-V VM... find out that basic things like networking don't work by default because the OS is so damn old, so spend hours solving that and other issues to get it to basically run well enough to...
Step 4: Get that OS updated so that it's at least kind/sorta/maybe (but between you and me, not really!) safe online, all without a browser that will work on ANY modern site (oh, and good luck finding a version of Firefox that runs on it - that all took a few hours)...
Step 5: Okay, OS is ready to go, now get 20-year old dev tools that you haven't even seen in that many years working. Oh, do this with a missing CD key and ISO's that weren't archived in a format that's usable today, plus a bunch of missing dependencies because the OS is, again, SO old (a few MORE hours)...
Step 6: Get 20-year old code written in a language you haven't used in probably almost that long to compile, dealing with pathing issues, missing libs, and several other issues, all the while trying to dust off long-dormant knowledge somewhere in the deep, dark recesses of your brain... surprisingly, it all came back to me, more or less, in under an hour, which led to...
Step 7: FINALLY get it all to work, FINALLY get the code to compile, FINALLY get it transferred to the device (which has no network capabilities, by the way, which is where the card reader and CF card came into play) and re-live the glory of your old, crappy PocketPC apps and games running on the real thing! WOO-HOO!
Step 8: Realize it's 3:30am by the time that's all done and be VERY thankful that you're on vacation this week or work tomorrow would SSUUCCKK!!!!
Step 9: Get called into work the next day for a production issue despite being tired from the night before and an afternoon of errands, lose basically a whole day of vacation (7 hours spent on it) and not actually resolve it by after midnight when you finally say that's enough :(
Talk about your highs and your lows.
-
So I was reviewing my old code. Refactoring and improving the documentation.
This is a production app that is being used 24/7/365.
I see myself using "bar = foo" and there's even an explanation of what it does.
Apparently I resolved a relatively difficult Date object issue and had to use temporary variables.
Didn't know what to call them and ended up with these jewels.
-
Let me tell you the story of how a feature request no one asked for got put in an early grave:
PM walks into the weekly meeting with a single use case that one user called in about, despite never having this issue during the past year and a half that our app has been in production. PM's boss (genuinely one of the best people I have ever worked with) happens to sit in this particular meeting for no reason other than he felt like he should once in a while.
PM brings up the use case and wants to devote 3 weeks' development time and another 3 weeks to test RIGHT NOW while other projects are already in motion. PM's boss speaks up with this: "Listen if this guy is really this upset, we can just tell him to build his own service. All the other end users have no problems with this, so it's not worth spending the resources on, I don't think."
And that is how I went from "this is bullshit" to "I love you" in the span of 20 minutes.
-
1. was asked to look into a high severity production incident at the end of the day.
2. needed fix in ui.
3. fixed and deployed in 1 hour.
4. issue remained. debugging began.
5. gave up at 1 AM and went to sleep.
6. woke up at 6 and after debugging for 2 hours, identified to be a back end issue.
7. worked with back end team for the fix, and 6 hours and 3 deployments later, it worked.
8. third party vendor reported they are still not receiving one parameter from us.
9. back end team realised they forgot to ask ui to send another parameter.
10. added the parameter in ui, redeployed ui.
11. build and deployment tool broke down. got it fixed. delay of 1.5 hours.
12. finally things are in place. total time 26 hours.
13. found half a bottle of vodka, leftover from last weekend. *Priceless*
-
Wow I must have been brain dead when I wrote this code. Needed to exclude certain elements from the response for the list of objects.
for (obj : objects) {
    if (obj.skipFromResponse()) {
        break  // bug: should have been continue
    }
    add obj to response
}
I used break instead of continue at the if condition which meant it would break out of the loop at the first instance of condition being met.
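A small runnable Python version of the same mistake. `Item` and both function names are made up for illustration; the real code was not Python.

```python
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    skip: bool = False


def build_response_buggy(items):
    response = []
    for item in items:
        if item.skip:
            break  # bug: stops at the first skipped item and drops the rest
        response.append(item.name)
    return response


def build_response_fixed(items):
    response = []
    for item in items:
        if item.skip:
            continue  # skip only this item and keep going
        response.append(item.name)
    return response
```

For `[Item("a"), Item("b", skip=True), Item("c")]` the buggy version returns `['a']` and the fixed one returns `['a', 'c']`. When the only skipped item is the last one, both return the same list, so test data of that shape cannot tell them apart.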
This went through QA and has been in production for 4 weeks, so how did this not break before? Well, little did I know the list of objects was sorted, and all the test data, QA data and everything so far in production coincidentally only had the last element matching the condition. This meant it returned everything correctly so far.
Today was the first time there was a situation where this caused incorrect output. Luckily as soon as I heard the description of the issue I remembered to check the merged PR and hung my head in shame for making such a trivial error. I must have written way more complicated code without any problem but this made me embarrassed to even admit. 🤦♂️
-
Boss: we have to fix this bug.
Me: It is not a bug. The server takes more time to send the response, which causes the timeout issue. We may need to change the implementation to improve performance so the response is sent quickly. It will take some time.
Boss: okay can we fix this by today
Me: ya if we increase timeout to 20 seconds the issue is fixed
Boss: No we want the server to send the response quickly and we need the fix now
I worked through the weekend to fix it. Finally... guess what... the change didn't go live, since the scenario was not valid and is never likely to happen in production.
-
"It works on our end", the sentence that made me lose my shit.
I've been working on a project where we're supposed to integrate an API into our system.
When trying to get some user IDs (UUID) from said API, we got a type-error in the response (???), so I called their integration support and asked what the fuck they were doing (not really, I was kinda calm at this point).
The answer I got was the following:
Integration guy: "Uh, bro, like, I don't even know, it's probably on your end"
Me: "We literally used this endpoint with the same parameters yesterday, and got a result we expected. I noticed you updated your API this morning, did you make any major changes?"
Integration guy: "Yeah we changed the type of user id from string to number"
Me: "So, you changed the type of a UUID (uuid4) from string to number? How did you not think that would be an issue? I can see in your forums that everyone else is having the same issue."
Integration guy: "Nah, it's probably a bug in your code, it works on our end"
Me in my mind: *IT WORKS ON YOUR END?!? IT DOESN'T FUCKING MATTER IF IT WORKS ON YOUR END, FUCKTARD.*
What I actually said: "Uhm, I'm not sure if it works on your end either, I'm not even sure how this change made it to production. But hey, thanks I guess, bye."
WHY AM I NOT ABLE TO YELL AT PEOPLE WHEN THEY ARE BEING RETARDED???
But really though, when you're maintaining an API, you shouldn't fucking care if things work on your end in your dev environment. What matters is how it works in production, for the end user/users.
And I know that 99% of cases it's the users fault by entering the wrong parameters or trying to request with wrongly setup auth and what not, but still.
Don't ASSUME nothing's wrong on your end. It's your fucking job to fix the issues.
And guess what? The problem was on their side.
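For what it's worth, a check like this on the consuming side at least turns a silent type change into a loud, specific error. It's only a sketch in Python, assuming the ID arrives as a plain JSON value; `parse_user_id` is a made-up helper, not anything from their API.

```python
import uuid


def parse_user_id(raw):
    """Validate a user id from an API response (hypothetical helper).

    Rejects anything that is not a string holding a version 4 UUID,
    so a type change on the provider side fails fast with a clear message.
    """
    if not isinstance(raw, str):
        raise TypeError(f"user id must be a string, got {type(raw).__name__}")
    parsed = uuid.UUID(raw)  # raises ValueError on malformed input
    if parsed.version != 4:
        raise ValueError(f"expected a version 4 UUID, got version {parsed.version}")
    return parsed
```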
I'm going fucking bald. -
Not much of a story, but about 2 years ago I had just got to the mall (at its opening time, so many shops were still closed). While walking through to find a place to eat while my mother went grocery shopping, my phone started buzzing. Upon checking, it had hundreds of notifications and emails. Our production server was malfunctioning.
Not much that I had to do, but I ran around to find a computer store to use their model computers to see what was happening.
However, while the problem got fixed, I did notice how much friendlier the Mac store was than the Windows dealers that day. The Windows dealers did not allow me to use their computers, while the Mac store connected me to wifi and gave me all the time I needed to fix my issue. 👀 -
Just saw this in a production app. There's a reason no other language implemented VB's OnErrorResumeNext, BECAUSE IT'S A F***ING TERRIBLE IDEA.
catch (Exception e)
{
    // not really an issue
    continue;
} -
Java dev here. I rewrote an app and replaced a system call to ssh with a modern JAX-RS POST for uploading a file and (new) some additional data.
I even used a stream.
1 hour in production, first client doesn't get his file. Log says OutOfMemoryError: heap.
Me: wtf? I already use streams.
Looking at the Jersey library. Docs say nothing. An issue from 2013 says: oh if you silly don't use the Apache httpclient addon, we disable chunking and buffer the whole body, because our tests fail with the jdk included http client otherwise.
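The difference between "buffer the whole body" and "chunk it" is easy to show outside of Jersey. This is a generic Python sketch, not the Jersey client API; `iter_chunks` and `read_buffered` are hypothetical helpers that only illustrate the memory behaviour.

```python
import io


def read_buffered(stream):
    """Buffering: the whole body sits in memory at once."""
    return stream.read()


def iter_chunks(stream, chunk_size=64 * 1024):
    """Chunking: at most chunk_size bytes are held at any time."""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        yield chunk


# Tiny demo body; a real upload would be a file opened in binary mode.
body = io.BytesIO(b"abcdefgh")
chunks = list(iter_chunks(body, chunk_size=3))
```

With buffering, heap usage grows with the file size; with chunking it stays bounded by the chunk size, which is what I assumed "using a stream" would give me.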
Me: meh.
No warning in the logs. Thank you soooooo much! Who could have known? -
This was some time ago. A Legendary bug appeared. It worked in the dev environment, but not in the test and production environment.
It had been a week since I was working on the issue. I couldn't pinpoint the problem. We CANNOT change the code that was already there, so we needed to override the code that was written. As I was going at it, something happened.
---
Manager: "Hey, it's working now. What did you do?"
Me: *Very confused because I know I was nowhere close to finding the real source of the problem* Oh, it is? Let me check.
Also me: *Goes and checks on the test and prod environment and indeed, it's already working*
Also me to the power of three: *Contemplates on life, the meaning of it, of why I am here, who's going to throw out the trash later, asking myself whether my buddies and I will be drinking tonight, only to realize that I am still on the phone with my manager*
Me again: "Oh wow, it's working."
Manager: "Great job. What were the changes in the code?"
Me: "All I did was put in console logs and push the changes to test and prod to see if they were producing the same log results."
Manager: "So there were no changes whatsoever, is that what you mean?"
Me: "Yep. I've no idea why it just suddenly worked."
Manager: "Well, as long as it's working! Just remove those logs and deploy them again to the test and prod environment and add 'Test and prod fix' to the commit comment."
Me: "But what if the problem comes up again? I mean, technically we haven't resolved the issue. The only changes I made were like 20 lines of console logs!"
Manager: "It's working, isn't it? If it becomes a problem, we'll work it out later."
---
I did as I was told, and Lo and Behold, the problem never occurred again.
Was the system playing a joke on me? The system probably felt sorry for me and thought, "Look at this poor fucker, having such a hard time on a problem he can't even comprehend. That idiotic programmer had so many sleepless nights and yet still couldn't find the solution. Guess I gotta do my job and fix it for him. I'm the only one doing the work around here. Pathetic Homo sapiens!"
Don't get me wrong, I'm glad that it's over but..
What the fuck happened? -
Boss: Did you get that trivial change I requested completed?
Me: No, I've been busy trying to fix a critical issue with a production app.
Boss: I don't want other people dictating how you spend your time.
Me: ...
Let it all burn down, then, I guess! -
Today was a manic-depressive kind of day. Spent the morning helping some developers with getting their code to run a stored procedure to drop old partitions, but it wasn't working on their end. It was a fairly simple proc. But working with partitions is a little like working with an array. I figured out that they were passing the wrong timestamp, and needed to add +1 to delete the right partition. Got that sorted out, and things were good. Lunch time.
After lunch I did some busy work, and then the PO comes up at about 2PM and says he's assigned some requests to me. The first was just attaching some scripts. Easy. The second, the user wants a couple of schemas exported ... at 6PM. I've been in the office since 6:45AM.
While I'm setting up some commands to run for the data export, a BA walks up and asks if I'm filling in for another DBA who is out for a few weeks. Yep. There's a change request that hasn't been assigned, and he normally does the work. I ask when it's due. Well, the pre-implementation was supposed to be done in the morning, but it wasn't, and we're in the implementation window ... halfway through. I bring up the change task and look at it. Create new schema and users. That's all it says. The BA laughs. I tell him I need more to go on. 10 minutes later he sends an email with the information. There are only two hours left in the window, and I can only use half of it, because the production guys have to do their stuff, and we're in their window. Now I'm irritated, because I'm new to Oracle, and it's an unforgiving mistress. Fortunately, another DBA says he'll do it, so that we can get it done in time. But he can't work it either, because Dev DBAs don't have access to QA, and the process requires access for this task. It gets shelved until the access issue is resolved. It's now after 4:15PM. I'm going to be in traffic with that 6PM deadline.
I manage to get home and to the computer by 5:45PM. Log in. Start VPN. Box pops on screen. Java needs to update. I chose skip update. Box pops up again. It won't let me log in until Java is current. Passed.
I finally get logged in, and it's 6:10PM. I'm late getting the job started. I pull up PuTTY and log into the first box, paste my pre-prepared command into the command line and hit enter. Command not found. I'm tired, so it takes a moment to sink in. I don't have time for this.
I log into DBArtisan and pull up the first database, use the wizard to set up the job, and off it goes. Yay. Bring up the second database, and have to enter the connect info. Host not found. Wut? Examine host name. Yep, it's correct. Try a different method. Host not found. Go back to PuTTY. Log in. Paste string. Launch. Command not found. Now my brain is quitting on me. Why now? It's after 6:30PM. Fiddle with some settings, reset $ORACLE_HOME. Try again. Yay. It works. I'm done. It's after 7PM.
There is nothing like technology to snatch the euphoria of a success away from you. It's a love-hate thing, but I wouldn't trade it for anything else. I'm done. Good night. -
There was an issue whilst you were away, we had to make a small CSS change.. We pushed it into master but it said something about the branch being behind the tip by 50 commits or something. It's okay, we forced it up though and force pushed it to production as well, but the site went down.. In the end we had to FTP it up manually, but the customer is saying things that were there before now aren't there any more?
I thought you put this "release process" in so things like this wouldn't happen! I think we need to review it as it clearly isn't working. -
Don't hire monkeys that write shitty code that causes production issues.
Just spent the entire morning with our global team (10+ ppl) looking into the cause of a production issue.
Root cause: shitty code that nobody who has read an algorithms book (array resizing costs) and understands how DB functions should be used and why (bulk inserts vs one at a time) would ever write.
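To make the bulk insert point concrete, here is a rough sketch in Python with the standard library's `sqlite3`. It is not the code in question; the `events` table and the helper names are made up for illustration. The slow pattern pays for a statement and a commit per row, the bulk pattern sends all rows in one transaction.

```python
import sqlite3

INSERT_SQL = "INSERT INTO events (id, payload) VALUES (?, ?)"


def make_db():
    """In-memory database with a hypothetical events table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, payload TEXT)")
    return conn


def insert_one_at_a_time(conn, rows):
    """Slow pattern: one statement and one commit per row."""
    for row in rows:
        conn.execute(INSERT_SQL, row)
        conn.commit()


def insert_bulk(conn, rows):
    """Bulk pattern: all rows in a single transaction."""
    with conn:  # commits once on success, rolls back on error
        conn.executemany(INSERT_SQL, rows)


def count_rows(conn):
    return conn.execute("SELECT COUNT(*) FROM events").fetchone()[0]
```

On a real networked database the gap is usually much bigger than on SQLite, since every row also costs a round trip.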
Even the code itself is a mess... -
Discovered pro tip of my life :
Never trust your code
Achievements unlocked :
Successfully running C++ GPU accelerated offscreen rendering engine with texture loading code having faulty validation bug over a year on production for more than 1.5M daily Android active users without any issues.
History : Recently I was writing a new rendering engine that uses our GPU pipeline engine.. and our prototype Android app benchmark test always failed with a black rendering frame detection assertion.
Practice:
Spent more than a month debugging a GPU pipeline system built on a directed acyclic graph based rendering algorithm.
New abilities added :
Able to debug OpenGL ES code on Android using print statement placed in source code using binary search.
But why?
I was aware of the issue for over a month and just ignored it, thinking it was a driver bug in my Android device.. but when the API was used by one of the Android devs, he reported the same issue. The same day, at night, 2:59AM ....
Satan came to me and told me that " ok listen man, here is what I am gonna do with you today, your new code will be going production in a week, and the renderer will give you just one black frame after random time, and after today 3AM, your code will not show GL Errors if you debug or trace. Buhahahaha ahhaha haahha..... Puffff"
And he was gone..
Thanks satan for not killing me.. I will not trust stable production code anymore, even though every line is documented and peer reviewed. -
Once we got an urgent requirement to add double hashing of the password in a web application. It had to go to production ASAP. The developer who was working on it added 2 alerts in JavaScript to display the entered password and the encrypted password. Finally the change was ready to deploy, but in a hurry she forgot to remove the alerts. In the rush and excitement, that change was shipped to production. The alerts said 'your password is 123', 'your password is xyz'.
After some time we got phone calls from users and the manager. The manager said, 'how the hell did our application get HACKED? If anything happens to..........'. To cut it short, he was furious. We knew the exact reason and the solution. It didn't take more than a couple of minutes to resolve the issue.
But it was a funny mistake, and it took that day's pressure off. -
Fucking shit, I just had a 3-day chat with Google's cloud engineer about an issue I had in a project. Eventually it turned out the issue occurred due to an update they made on some projects involving IAM changes, which required some changes on my part in my security roles. Like wtf, haven't you heard of data fixes when you roll out such changes?! I just had my production env down for 72 hours because of their fuckup.
At least send an email about it so we could set it up in time. -
So this week we had another team come to us and say they need to go into production...only issue...they have nothing...sorry that's wrong they have something...a vbs script to do their installation which doesn't work...
-
DON'T do production stuff on Friday afternoon. This Friday evening we had an issue in production and just wanted to do a quick fix. The fix resulted in a DDoS attack that we accidentally launched on our own servers in an IoT project. We contacted all customers' devices and asked them to respond at the same time. Funny thing is that the devices are programmed to retry a failed request until it succeeds. We ended up with 4 hours of downtime in production; servers were running again at 11pm.
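The usual cure for "retry until it works" stampedes is capped exponential backoff with jitter, so a fleet of devices spreads its retries out instead of hammering in lockstep. A minimal sketch in Python, not our firmware; `backoff_delays` and its parameters are hypothetical.

```python
import random


def backoff_delays(max_attempts, base=1.0, cap=300.0, rng=random.random):
    """Yield the wait (in seconds) before each retry.

    The ceiling doubles per attempt up to `cap`, and the actual delay is a
    random fraction of that ceiling ("full jitter"), so devices that failed
    at the same moment do not all come back at the same moment.
    """
    for attempt in range(max_attempts):
        ceiling = min(cap, base * 2 ** attempt)
        yield rng() * ceiling


# Example: a device plans up to five retries, never waiting more than a minute.
planned = list(backoff_delays(5, base=1.0, cap=60.0))
```

Capping the number of attempts matters as much as the delay: retrying forever is what turned our own devices into the attack.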
-
Yet another day at work:
My job is to write test libraries for web services and test others' code. Yes, I know how to code, and I have a niche in software testing.
Sometimes developers (whose code I find bugs in) get so defensive and scream in emails and meetings if I point out an issue in their code.
Today, when I pointed a bug in his repo, a developer questioned me in an email asking if I even understood his code, and as a tester I shouldn’t look at his code and only blackbox test it.
I wish I could teach the defensive developer that sometimes it’s okay to make mistakes and be corrected. That’s how we deliver services that don’t suck in production. -
Client: THIS IS CRITICAL, SOME DATA HAS BEEN DELETED, WHAT ZE FUUK HAPPENED, UNDO THIS FAST
Us: so after carefully reviewing the code, related resources and the network traffic we conclude that the data was never sent in the first place.
*closes issue*
I'm glad we got such a meaningful bug report on the same day as a production system starting to fail, one big deployment that was like a boss with 3 phases, an unnecessarily long meeting and an app developer who wanted me to break HTTP standards. -
What is it with networking guys refusing to do any kind of fault finding? Pretty much everywhere I've worked they seem to be overpaid address hogs who occasionally want everyone to be proud of them for installing a new switch.
Currently seeing a production issue that's clearly due to spikes in packet loss on a certain part of the network - but oh no, it's always "our tests are fine", "we can establish a route no problem", "this is an application level issue", etc.
No you morons, when a dozen unrelated applications hosted on different cloud services fail because none of them can contact anything in your particular subnet in your data center at the same time, it's a damn networking issue. Sort it out. -
New here, don't know the format, etc
Let me describe my Friday:
8:45 standup
By 9:30 I'm done following up with 3rd party platform vendor's jira, and curiously look into an issue related to app camera not working in development build (we aren't in production), fix it in 5 minutes and talk to the team of two other devs. Tell them I've submitted a fix, and QA is unblocked.
"Senior" software dev starts complaining about how "I've wasted my whole morning" because "I mean, come on" and is generally offended because "I've done their work."
After a real puzzling argument, I worked from home the rest of the day.
Where did I go wrong? -
Issue in production. Multi billion dollar enterprise. Complex landscape. We sort of make things.
Turns out there is a single point of failure at a specific integration point. Kind of a lot stopped. When I reached out to the people knowing anything about it and I raised the issue that maybe we should make a slight change in how we do things they just brushed it off. Like it was nothing… 😬
No data was lost but everything was delayed for many hours. The _truth_ varied in different parts of the ecosystem causing potential wrong or suboptimal decisions to be taken.
When I asked why this LOS was not detected they told me they have no means of detecting it. 😬
I’m like, yeah, it’s 2023, we’re going to land on Mars and you can bet your ass we can detect it and you are just LAZY DEVELOPERS!
Anyway, I escalated (nicely) and they are now implementing a (more) resilient system, and we’re helping the team detect THEIR LOS in minutes instead of downstream services noticing hours later (those are bad too, but it’s not their fault!)
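Detecting it does not have to be fancy. Assuming LOS here means "this source has gone quiet", a staleness check over last-seen timestamps is enough to raise an alarm in minutes. A sketch in Python with made-up names, not what the team actually built:

```python
def stale_sources(last_seen, now, max_silence):
    """Return the names of sources that have been silent for too long.

    last_seen   -- dict of source name -> timestamp of its last message (seconds)
    now         -- current timestamp (seconds)
    max_silence -- longest acceptable gap before raising an alarm (seconds)
    """
    return sorted(
        name for name, seen_at in last_seen.items() if now - seen_at > max_silence
    )


# Hypothetical integration points and their last message times.
last_seen = {"orders-feed": 100.0, "inventory-feed": 40.0}
silent = stale_sources(last_seen, now=110.0, max_silence=60.0)
```

Run something like that on a schedule against whatever records the last received message, and the silence gets noticed at the integration point rather than hours later downstream.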
Stay safe! -
So I was trying to compare my local website CSS properties to the one deployed on production and realized that font/blocks were a little bit smaller than usual. Tried to debug which CSS changes caused this issue for half an hour only to realize that it was because the browser's zoom level was set to 90% locally 🤦🏽‍♂️
This happened with a guy who just finished 5 years of professional coding yesterday 😂 -
Woops! I was debugging a particularly snarky issue in Production the other day. This morning I realized, about 60 minutes into my new coding, that I hadn't changed my profile back. I was testing form submissions on a live customer's site.
-
"The tool to push new releases to the data centre blocked us last night. Saying all the nodes are 'unhealthy', resolve the issue(s) first. But then the remote team said 'we have a way around that' so we managed to get it deployed in time. We need to document the process as there were many ... 'shady' processes and steps involved lol"
- Manager explaining how the first production release on our new team went last night
... he called it a success -
developer makes a "missed-a-semicolon"-kind of mistake that brings your non-production infrastructure down.
manager goes crazy. rallies the whole team into a meeting to find "whom to hold accountable for this stupid mistake" ( read : whom should I blame? ).
spend 1-hour to investigate the problem. send out another developer to fix the problem.
... continue digging ...
( with every step in the software development lifecycle handbook; the only step missing was to pull the handbook itself out )
finds that the developer followed the development process well ( no hoops jumped ).
the error was missed during the code review because the reviewer didn't actually "review" the code, but reported that they had "reviewed and merged" the code
get asked why we're all spending time trying to fix a problem that occurred in a non-production environment. apparently, now it is about figuring out the root cause so that it doesn't happen in production.
we're ALL now staring at the SAME pull request. now the manager is suddenly more mad because the developer used brackets to indicate the pseudo-path where the change occurred.
"WHY WOULD YOU WASTE 30-SECONDS PUTTING ALL THOSE BRACES? YOU'RE ALREADY ON A BRANCH!"
PS : the reason I didn't quote any of the manager's words until the end was because they were screaming all along, so, I'd have to type in ALL CAPS-case. I'm a CAPS-case-hater by-default ( except for the singular use of "I" ( eye; indicating myself ) )
WTF? I mean, walk your temper off first ( I don't mean literally, right now; for now, consider it a figure of speech. I wish I could ask you to do it literally; but no, I'm not that much of a sadist just yet ). Then come back and decide what you actually want to be pissed about. Then think more; about whether you want to kill everyone else's productivity by rallying the entire team ( OK, I'm exaggerating, it's a small team of 4 people; excluding the manager ) to look at an issue that happened in a non-production environment.
At the end of the week, you're still going to come back and say we're behind schedule because we didn't get any work done.
Well, here's 4 hours of our time consumed away by you.
This manager also has a habit of saying, "getting on X's case". Even if it is a discussion ( and not a debate ). What is that supposed to mean? Did X commit such a grave crime that they need to be condemned to hell?
I miss my old organization where there was a strict no-blame policy. Their strategy was, "OK, we have an issue, let's fix it and move on."
I've gotten involved ( not caused it ) in even bigger issues ( like an almost-data-breach ) and nobody ever pointed a finger at another person.
Even though we all knew who caused the issue. Some even went beyond and defended the person. Like, "Them? No, that's not possible. They won't make such dumb mistakes. They're very thorough with their work."
No one even talked about the person behind their back either ( at least I wasn't involved in any such conversation ). Even later, after the whole issue had settled down. I don't think people brought it up later either ( though it was kind of a hush-hush need-to-know event )
Now I realize the other unsaid advantage of the no-blame policy. You don't lose 4 hours of your so-called "quarantine productivity". We're already short on productivity. Please don't take away any more. 🙏 -
I was cleaning up dangling images in docker, and I accidentally removed the production database container as well.
It's not a big issue, I can just bring the container back up and everything should be fine. But after I brought the container up and connected to the database, I found out there was no data inside. I thought I had fucked up, and sent a msg in the Slack channel that I had nuked the db.
Later my friend asked me which compose file I was using, and that's when I realized I had used the wrong config to bring up the db. I used the correct config to bring the database up again and everything went back to normal.
It's Friday evening, and if I had really dropped the db it would have been a fucking bad weekend.... -
Dev gets told in the morning there's an emergency fix needed due to a critical issue with the app that's in production and that the fix needs to be in the release that will be cut this evening.
Dev drops everything he/she is working on, works frantically all day to get it in 2 minutes before the deadline.
Release gets cut.
Next day release gets trashed because some exec did not like the size of the font used in some obscure part of the app even though it's been this way for 6 months... -
A service had/has been logging hundreds of errors in the development environment and I reached out to the owning process mgr that the error was occurring and perhaps a good opportunity to log additional data to help troubleshoot the issue if the problem ever made its way to production. He responded saying the error was related to a new feature they weren't going to implement in the backing dev database (TL;DR), and they know it works in production (my spidey sense goes off).
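By "additional data" I mean something along these lines. It's a sketch in Python, assuming the documents are JSON (I don't know their actual format); `parse_document`, `doc_id` and `source` are made-up names for illustration.

```python
import json
import logging

logger = logging.getLogger("documents")


def parse_document(raw, doc_id, source):
    """Parse a document and log enough context to troubleshoot a failure."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # Say which document, where it came from, and where parsing stopped,
        # instead of a bare "Unable to parse document".
        logger.error(
            "Unable to parse document id=%s source=%s: %s at position %d; starts with %r",
            doc_id, source, exc.msg, exc.pos, raw[:40],
        )
        raise
```

The point is only that the log line carries the identifiers and the failure position; which fields are worth logging depends on the service.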
They deployed the changes to production this morning and immediately started throwing errors (same error I sent).
Mgr messaged me a little while ago "Did you make any changes to the documentation service? We're getting this error .."
50% sure someone misspelled something in a config, but only thing they are logging is 'Unable to parse document'. Nothing that indicates an issue with the service they're using. -
TL;DR Dear boss, firstly, you always get someone to review anything important done by a fucking intern.
Secondly, you do not give access to your fucking client's production server to an intern.
Thirdly, you don't ask your fucking intern to test the intern's work that has not been reviewed by anyone directly on your client's fucking production server.
Last week, the boss and one of the lead devs (the only guy with some serious knowledge about systems and networking) decided to give me (an intern who barely has any work experience) the task of fixing, or finding an alternative to, the way their support team accesses their client machines. They were using a reverse SSH tunnel and an intermediary VH, but for some reason that was very unreliable in terms of availability. I suggested using OpenVPN and explained how it would work. It seemed to be a far better idea and they accepted. After several days of working through documentation and guides and everything, I figured out how OpenVPN works and managed to deploy a TEST server and successfully test remote access using two VMs. On seeing my tests, the boss told me that he wanted to test it on the client network. I agreed. Today he came to me and told me to prepare testing for tomorrow, and that the client technician was going to give me access to one of their boxes. And then he added, "It's a working prod server. We'll see if we can make it work on that" and left. I gaped at him for a while and asked another dev guy in the room if what I heard was right. He confirmed. Turns out, the lead dev and the boss's son (who also works here) had been having a huge argument since morning on the same issue, and finally the dev guy had washed his hands of it and declared that if anything goes wrong from testing it on production, it's entirely the boss's own fault. That's when the boss stepped in and approached me. I ran back to his office and began to explain why prod servers don't top the list of things you can fuck around with. But he simply silenced me saying, "What can go wrong?" and added, "You shouldn't stay still. You should keep moving". Okay, like firstly what the fuck and secondly, what the fuck?
Even though the OpenVPN client is not the scariest thing to install, tomorrow's going to be fun. -
Beware: Here lies a cautionary tale about shared hosting, backups, and -goes without saying- WordPress.
1. Got a call from a client saying their site presented an issue with a third-party add-on. The vendor asked us to grant him access to our staging copy.
2. Their staging copy, apparently, never got duplicated correctly because, for security reasons, their in-house dev changed the name of the wp-content folder. That broke their staging algo. So no staging site.
3. In order to recreate the staging site, we had to reset everything back to WP defaults. Including, for some reason, absolute paths inside the database. A huge fucking database. Because WordPress.
4. Made the changes directly in a downloaded SQL file. Shared hosting, obviously, had an upload limit smaller than the actual database.
5. Spent half an hour trying to upload table by table to no avail.
6. In-house uploads a new, fixed database with the help of the shared hosting provider.
7. Database has the wrong path. Again.
8. In-house performs massive Find and Replace through phpMyAdmin on the production server.
9. Obviously, MySQL crashes instantly and the site gets blocked for over 3 hours for exceeding shared hosting limits.
10. Hosting provider refuses to accept this was caused by such a stupid act and says site needs to be checked because queries are too slow.
11. We are gouging our eyeballs as we see an in-house vs. hosting fight unfold. So we decide to watch a whole Netflix documentary in between.
12. Finally, the hosting folds and enables access to the site, which is obvi not working because, you know, wrong paths.
13. Documentary finishes. We log in again, click restore from backup. Go to bed. Client phones to bless us. Client’s in-house dev probably looking for a cardboard box to pack his stuff first thing in the morning. ¯\_(ツ)_/¯ -
Most unusual place I've coded would probably be at a bar while utterly wasted. I fixed a production outage and even got on the phone with tier 1 support when they reported the issue.
-
can you solve the issue in production mentioned in this slack channel you don't have access to and we're not going to grant you access to?
-
Why is web development such a headache?
I'm writing a responsive website from scratch. All goes perfect, even cross browser.
It all works, adapts to screen size etc. Nice! About to get this code into production.
Me: I'll test the iPhone 5 viewport size before I push the code...
Responsive Developer Tools:
FireFox: nu uh, there's a magic random 1px margin to every element on your page now, which you cannot find in your css or in the computed tab. It's magical.
Me: weird, what if I change the viewport size to the iPhone 6's dimensions?
Issue persists.
Me: hmm, what if I add or subtract one fucking pixel from the viewport width or height?
FireFox: What 1px margin? Don't know what you're talking about ... There never was one...
Me: ok, weird (sets viewport size back to the iPhone 5 format for testing)
FireFox: I present to you: the magic random 1px margin.
I'm at a loss. I really am. Been clicking and unclicking almost every responsive part of my css I could find for this page and it just doesn't want to work persistently. And I swear to god that it worked a week ago in that exact viewport size. It's so frustrating. -
Story && rant && dev && linux
I was using linux mint for a while... more like 5 months for work, there's this Touchpad/mouse issue in it that was driving me crazy, so basically the mouse stops responding out of nowhere in the middle of my coding and I have to restart the fucking laptop to get it back. Yeah, I tried all the solutions I could find on the Internet and nothing works.
This issue likes to fucking mess with me so much, it seems to only happen when I absolutely mustn't restart the laptop or I'm working on a task and have a tight deadline and I don't have time to waste restarting my pc.
A couple of days ago, I had this major feature I needed to release to production and the time I estimated for it and shared with my team turned out to be insufficient, so I had to work extra hours from home to finish it ... while I was working, the mouse issue returned and I had to restart my pc like 20 times that day. It was fucking frustrating and It was already midnight and all you can hear are keyboard sounds and fucks flying.
I made a promise to myself that once i finish this task, I'm gonna fucking migrate to another distro, I'm fed up with linux mint's BS. I've been putting up with it for so long it's time to move on.
Yesterday I installed Manjaro and I'm happily working on it today xD. -
Long time ago i ranted here, but i have to write this off my chest.
I'm, as some of you know, a "DevOps" guy, but mainly system infrastructure. I'm responsible for deploying a shitload of applications at regular intervals (2 weeks) manually through the pipeline. No CI/CD yet for the vast majority of applications (only 2 applications actually have CI/CD directly into production).
Today, was such a deployment day. We must ensure things like dns and load balancer configurations and tomcat setups and many many things that have to be "standard". And that last word (standard) is where it goes horribly wrong
Every webapp "should" have a decent health, info and status page according to an agreed format.. NOPE, some devs just do their thing. When bringing the issue up with said dev, the (surprisingly standard) answer is "it's always been like that, I'm not going to change". This has been a problem for YEARS and nobody, especially the "managers", takes any action whatsoever. This makes verification really troublesome.
But that is not the worst part, no no no.
the worst is THIS:
"git push -a origin master"
Oh yes, this is EVERYWHERE, up to the point that, when I said "enough" and protected the master branch of hieradata (Puppet CfgMgmt, is an ENC), people lost their shit... Proper gitflow, however, is apparently something otherworldly.
After reading this back myself there is in fact a LOT more to tell but i already had enough. I'm gonna close down this rant and see what next week comes in.
There is a positive thing though. After next week, the new quarter starts, and i have the authority to change certain aspects... And then, heads WILL roll on the floor. -
Best debugging trick ever:
Wear your fucking glasses while coding so that you do not mistake COMMA(,) with a DOT (.).
So by
1. Doing that (which obviously aren't a huge number) and
2. Cleaning my screen (yes that).
I was able to wrap my head around the issue that almost wasted one day.
So what I intended to pass as string concatenation join operation value actually was being passed as an argument to the underlying function (that wasn't taking care of it and returning a timestamp from thin air).
Murphy's Law in production and practice.
Nice!
Depressing music continues......!
-
Does anyone work on a team with multiple stacks?
For example we have batch jobs in Java but also have a JS front-end and APIs.
How do you divide the developers and the work across these projects?
Currently everyone does everything, but I feel like this is inefficient and makes it hard to develop expertise. And different people, or even the same person, will make the same mistakes over and over again because they don't know how to do X or they forget or overlook some quirk. When I switched back to JS it took me like a week to get Promises nailed down again. And this morning someone else had a production bug and couldn't figure it out. But when I looked at the code I could pretty much see where an issue could be (uncaught exception in a promise).
Also the testing frameworks are very different and there's a lot of infrastructure technical debt, things that really should've been done a long time ago or fixed, but no one had the time or expertise to do it or notice it (until it causes a production issue and then everyone is like WTF is happening??!!!!).
I'm not the manager, but I always feel that the team needs to be split along language lines, and specific people need to own these projects to review code changes for all these common newbie errors, and also develop enough expertise to foresee problems before they become production issues.
-
This week we had a live production issue that our staff were catching/fixing on the fly. We're a relatively small software team without any direct external customers, so this is not too unusual.
Unfortunately, the person in charge of dealing with these issues didn't resolve it during the work week, so we were stuck with it over the weekend. Said responsible employee left at 2:30 on Friday without figuring out how we'd deal with the problem without any staff in the office to intercept problem cases. Better yet, he drove all the way back, and was there from 3:30 to 4 and promptly left again without telling the rest of the team what was going on with the production issue. We asked how it happened, what it was, etc, but didn't focus on his fix (in hindsight, a mistake).
Since it's his job, I assumed that he would let us know what was up before he left on Friday. It turns out that he never addressed the production issue at all and just decided to leave.
A junior developer and I spent two hours contacting management (who, at this time are already at home with their families) to get clearance to either shut the system off or fix it. No one wants to give it and no one that's high enough up to approve the decision is available.
In the end, we asked the weekend mechanical support team (some friends of mine) to monitor the issue and they kindly accepted.
All of this could have been avoided if my coworker had either told us his plan earlier (so we could ask about the lack of coverage), gotten approval to shut it down for the weekend, or covered his own ass before he left for the day.
Ugggh. I get that we all make mistakes, but I really hope this guy shapes up soon. -
Shoutout to all the devops people currently on their parent's WiFi trying to fix a database issue in production
-
I'm doomed.
My first production worker script is creating multiple active attributes for a user. My script should deactivate the old attributes if there is a new one.
Months ago, this issue occurred. My teammate from team A took over the script to investigate since I was busy working with team B.
Yesterday, I found out that I, myself, overwrote the fix my teammate made for that because of a new feature.
I have to clean up the affected records on production on Monday..and i have to explain to my manager. T.T
LPT: ALWAYS PULL REPO before developing new feature... -
The craziest issue I've fixed was caused by a TCP behavior I didn't know about, called the "half-closed connection".
There was a third-party application installed on a production server which called an LDAP server to retrieve user information. During the day we had several users using the application and all worked fine. During the night, when the application was not accessed, something happened and the first call to the application in the morning was stuck for about 5 minutes before returning a response. I tried to reproduce the issue in a testing environment without success. Then I discovered that the application and the LDAP server were located on two different networks, with a firewall between them. And firewalls sometimes drop old connections. For this reason network applications usually implement a keep-alive mechanism. Well, the default LDAP Java libraries don't set the keep-alive on their connections. So, I found a library called "libdontdie", which forces the keep-alive on the connections. I installed the library on the server, loaded it at startup and the weird stuck behavior in the morning disappeared.
-
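For code you control (unlike the third-party Java application in the rant above), the keep-alive mechanism it mentions can be switched on directly on the socket. This is only a minimal sketch in Python, not what libdontdie does internally; the function name and the timing values are assumptions chosen for illustration, and the fine-grained `TCP_KEEP*` options are only set where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Enable TCP keep-alive so an idle connection still produces periodic probes."""
    # SO_KEEPALIVE is the portable on/off switch.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # The timing knobs are platform specific, so only set the ones that exist.
    # The default values here are illustrative assumptions, not recommendations.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # seconds of idleness before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before the connection is dropped
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock


# Demo on an unconnected socket: just confirm the flag is set.
sock = enable_keepalive(socket.socket(socket.AF_INET, socket.SOCK_STREAM))
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0
sock.close()
```

The idle time would need to be shorter than whatever idle timeout the firewall applies, which the rant does not state.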
Optimization issue pops out with one of our queries.
> Team leader: You need to do this and that, it's a thing you know NOTHING about but don't worry, the DBA already performed all the preliminary analysis, it's tested and it should work. Just change these 2 lines of code and we're good to go
> ffwd 2 days, ticket gets sent back, it's not working
> Team leader: YOU WERE SUPPOSED TO TEST IT YOUR CHANGE IS NOT WORKING
> IHateForALiving: try it on our production machine and you'll see the exact same error, it's been there for years
> Team leader: BUT YOU WERE SUPPOSED TO TEST IT
Just so we're clear, when I perform a change in the code, I test the changes I made. I don't know in which universe I should be held accountable for tards breaking features 10 years ago, but you can't seriously expect me to test the whole fucking software from scratch every time I add an index to the db.1 -
when you realize that the performance issue you just could not figure out is the sysadmin taking a full backup of your production machine during peak traffic hours
-
I was doing some maintenance on a production server for a game hosting company (Minecraft hosting, for those interested). A week before, I had created a backup of an account directory before trying to solve an issue; I now wanted to remove this directory.
Since I am way too confident in my ability to not mess up, I was logged in as root.
Instead of typing `rm -rf ../` (I know using -f is a bad idea), I typed `rm -rf /`.
The distro we were using did not have any protections built in.
The directory I wanted to remove was gone, but so was the rest of the server by the time I realized what I had done.
-
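A guard against the slip in the rant above can live in whatever cleanup script does the deleting. The sketch below is a hypothetical helper (`safe_rmtree` and its `allowed_base` parameter are made-up names) that resolves the target first and refuses anything that is not strictly inside an expected base directory, so a mistyped `/` is rejected before anything is removed.

```python
import shutil
import tempfile
from pathlib import Path


def safe_rmtree(target, allowed_base):
    """Delete target only if it resolves to a path strictly inside allowed_base."""
    base = Path(allowed_base).resolve()
    path = Path(target).resolve()
    # Reject the base itself and anything that does not live underneath it.
    if path == base or base not in path.parents:
        raise ValueError(f"refusing to delete {path}: not inside {base}")
    shutil.rmtree(path)


# Demo in a throwaway directory so nothing real is touched.
with tempfile.TemporaryDirectory() as tmp:
    backup = Path(tmp) / "account.bak"
    backup.mkdir()
    safe_rmtree(backup, tmp)          # inside the base: removed
    removed = not backup.exists()
    try:
        safe_rmtree("/", tmp)         # the typo from the rant: rejected
        refused_root = False
    except ValueError:
        refused_root = True
```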
Head of IT: kindly check what the issue is with the following transaction
Me: Is this on production or on staging?
Head of IT: YES
Me: 🖕🏾 🖕🏾1 -
Dang. I feel like I'm just not cut out to climb any ladder.
When we discover a production bug, I feel bad about making the people working on that part look bad by not catching it.
My manager has no issue with pointing out that I should have caught it. Beating a horse while it's down.
I mean no shit. Of course I know I should've caught it. How does making me feel worse about it help.
Feels like I'll always be in a tough spot no matter where I am on the ladder.
Or I'm just fragile. I acknowledge that, too9 -
That feeling when you have already fixed the problem on production server on Thursday and got the approval from the client but the same issue appears again on Friday evening.
#WeekendSpoiled5 -
One of our servers had a disk fail this week. Luckily it's 1 of 3 in a RAID5 array. And, luckily, it was our mostly-dev box and didn't have any production stuff on it, except for some support things. We scheduled a disk replacement with the hosting company, took everything down, waited. Somebody at the hosting company apparently didn't know we'd scheduled the replacement, saw the machine was down, and brought it up again. Sigh. Finally they did the replacement and got it back up, but now we're seeing an Ethernet port flapping. We suggested they have someone go in and make sure all the jacks are fully seated; maybe one got loose when they were doing the disk switch. Bureaucracy reared up again and we got the boilerplate "if there's a hardware issue suspected please boot into rescue mode and run the tests"... sigh...
-
Get an issue reported today stating that a report can't be submitted in a product we maintain. Taking their word for it I start investigating on our local copy of the product. Everything works as intended. Ok... strange. Take a look at the production copy aaaaand it was submitted. No issues at all.
Note to self. Don't believe the client. -
When my boss thinks doing anything is better than doing nothing...
I have to explain to my boss for the Nth time already that doing random tests and things will not replicate a PRODUCTION network issue that seems to be caused by particular factors at a particular time and location.
And that the best way to trace is for whatever is raising the issue to log the exact time and error so the problem can be traced through all the steps...
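As a rough illustration of "log the exact time and error", here is a minimal Python sketch using the standard `logging` module. The logger name, the `request_id` field, the hostname and the failing `flaky_lookup` function are all made up for the example; the point is only that every failure is recorded with a timestamp, an identifier and the full traceback, so it can be matched against logs from the other hops.

```python
import io
import logging
import uuid


def make_logger(stream):
    """Build a logger whose records start with a timestamp (asctime)."""
    logger = logging.getLogger("netcheck")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def call_with_trace(logger, operation, *args):
    """Run operation; on failure log time, a request id and the traceback, then re-raise."""
    request_id = uuid.uuid4().hex[:8]
    try:
        return operation(*args)
    except Exception:
        logger.exception("request_id=%s operation=%s failed", request_id, operation.__name__)
        raise


def flaky_lookup(host):
    """Stand-in for the failing network call (always fails in this demo)."""
    raise TimeoutError(f"no answer from {host}")


buffer = io.StringIO()
log = make_logger(buffer)
try:
    call_with_trace(log, flaky_lookup, "ldap.example.internal")
except TimeoutError:
    pass
captured = buffer.getvalue()
```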
FML.....1 -
when a dev fixes a memory leak issue by rebooting the server, and when asked why the production application crashes he casually replies "I don't know but I restarted the server and it's fine now..."
-
I can't believe this shit happened in time for this week's rant!
Here it goes.
I have a table on AWS Athena which has partitions. Now, in the earlier versions of this project whenever I write something to a new partition a simple `MSCK` query worked (and keep in mind I am NOT deleting anything)!
Now, my so called Team Lead in the PR for the latest (major) release tells me to change it to an `ALTER TABLE`. I was like fine, but I did not add the s3 location to it, because it was NOT NEEDED. TL asks me to add location as well. I try to convince this person that it's not needed, but I lose. So there it is in production, all wrong.
Today I notice that the table is all fucked up. I bring this up in the stand up. The main boss asks me to look into it, which I do. Figure out what the issue is. This TL looks at it and says you need to change the location. I put my foot down.
"NO. What I need is to remove the bloody location. IT'S NOT NEEDED!"
TL's like, "Okay. Go ahead"
Two things:
1. It's your fault that there's this problem in production.
2. Why the fuck are you looking into this when I was clearly told to do so? It's not like you have nothing to do!1 -
Does anyone else here hate how DevOps pushed parts of the operations workload onto devs? Just this afternoon I had to fix a CI issue and then find a way to connect a microservice I built to production MongoDB; I'd be okay with that (I love to tinker with servers) if not for the fact that I have to do it through leaky and badly documented abstractions put up by the customer. I was having a nice productivity streak, but when I have to do this kind of shit the motivation quickly plummets.
-
My stress ball has been stolen!
I came in to work to an email alerting me to a bug in production. I copied the site to staging to work on the issue but I was unable to replicate the bug. My rubber duck wasn't helping so I went to go bounce my ball off the wall when I realized I don't have a stress ball anymore.
I spent 7 hours working on the bug without a stress ball before finally fixing it. And now I'm ready to deal with the theft the old-fashioned way.3 -
I just copied the exact same code from another program into mine, actually left out a loop because I didn't need it. Also took some other stuff out, nothing much, just some var = othervar that I didn't need.
The other program, from where I copied the code, works fine, is fast, I see no issue, has been in production for a while now and no complaints. Mine, WITH THE SAME CODE, doesn't move. I don't understand how this is possible.21 -
Had a production issue last night where db hung so today whole team was investigating.
I checked the graphs and noticed a huge spike in inserts during a few hours. Normally it's distributed evenly through the day.
Emailed team with screenshots and also mentioned it to someone but then forgot to follow up... I assumed they were looking into it (I don't work in the same office as them).
Someone just logged in and noticed the same thing happening right now... which made me remember.
So I asked him, did you see my email?
Silence....
Also got another guy doing a sort of code review on a util app I wrote that deletes certain records from our db, asking why I'm not just using SQL. I told him the most obvious way doesn't work, I tried, but he won't believe me, so I'll let him try it himself.
Anyway, these few days just feels like "why doesn't anyone listen to me?" ... and just feeling overqualified and sort of not part of the team again....3 -
Finally made my node production server stable enough that I could focus on writing tests*. I start by setting up docker, mocking cognito, preparing the database and everything. Reading up on Node test suites and following a short tut to set up my first unit test. Didn't go smoothly, but it's local and there are no deadlines so who cares. 4 days later, first assert.equal(1+1, 2) passes and I'm happy.
I start writing all sorts of tests, installing everything required into "devDependencies," and getting the joy of having some tests pass on first try with all asserts set up, feels good!
I decide to make a small update to production, so I add a test, run and see it fail, implement the feature, re-run and, it passes!
I push the feature to develop, test it, and it works as intended. Merge that to master and subsequently to one of my ec2 production servers**, and lo and behold, production server is on a bootloop claiming it "Cannot find module `graphql`". But how? I didn't change any production dependencies, and my package lock json is committed so wth?
I google the issue, but can't find anything relevant. The only thing that I could guess was that some dependencies (including graphql) were referenced*** in both, prod and dev, and were omitted when installed on a prod NODE_ENV, but googling that specific issue yielded no results, and I would have thought npm would be clever enough to see that and would always install those dependencies (spoiler: it didn't for me).
With reduced production capacity (having one server down) I decided to npm uninstall all dev dependencies anyway and see what happens. Aaaaand it works.....
So now I have a working production server, but broken local tests, and I'm not sure why npm is behaving like this...
* Yes I see the irony.
** No staging because $$$, also this is a personal project.
*** I am not directly referencing the same thing twice, it's probably a subdependency somewhere.2 -
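One quick way to check the guess from the npm rant above, at least for direct dependencies, is to look for names that appear in both `dependencies` and `devDependencies` of `package.json`. This is a small sketch only: the manifest contents and version ranges below are invented, and it would not catch the subdependency case the footnote suspects.

```python
import json


def overlapping_dependencies(package_json_text):
    """Return package names listed in both dependencies and devDependencies."""
    manifest = json.loads(package_json_text)
    prod = set(manifest.get("dependencies", {}))
    dev = set(manifest.get("devDependencies", {}))
    return sorted(prod & dev)


# Invented manifest for illustration; names and versions are assumptions.
example = """
{
  "name": "demo",
  "dependencies": {"graphql": "^15.0.0", "express": "^4.17.0"},
  "devDependencies": {"graphql": "^15.0.0", "mocha": "^8.0.0"}
}
"""
duplicates = overlapping_dependencies(example)
```

In a real project the text would come from reading the project's own `package.json` instead of the inline string.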
Woke up in the middle of the night thinking about work and how the team seems to be always a few steps away from the next production issue and well always busy with urgent work too so that the crap that produces more and more tech debt never get cleaned or fixed...
And now it's grown so big... The bad habits are just sparking more bad habits and well the only person (boss) able to correct course still hasn't realized for the last 4 years... Constantly thinking things will get better after the next sprint. Hell we don't even use proper sprint planning... even I can't keep up anymore and can never get any long term high value/low immediate return work done...
So I guess I'm having a work overload, nervous breakdown before even going back to work...
I have an urge to tell all this to his boss and have him give him a wake-up slap or maybe bring in a more experienced/veteran manager to set the ship right but my boss personally is a very nice guy so don't want to rat him out...
So not really sure now what to do other than maybe just stay in my lane and put up the blinders? And let the whole forest around me burn down... Though I still gotta bear the heat till it all dies down by itself...
Can't say when that is though...3 -
So, we are having a SaaS service for people where they can build X stuff. It is all fine as long as you are using basic things there, no complex cases and so on. Even on some complex - it does work just fine.
Here's the rant itself:
The production server throws us errors every 5-10 minutes that something broke and failed to do job X. At first we were all hands on deck fixing it ASAP to make it stable, only to realise later that most of these cases were users doing stupid shit. Then we began to fix the core issues rather than chasing every single issue there is (costs are important, you know). Funny enough, we get a few support requests a week, and our 1h response time + 24h fix time usually buys us that customer and allows us to leave a great impression.
So all in all, bug-free production is good, but great support is way better. Users can deal with issues, especially if they are experimenting there, but when they need answers you'd better give them.
-
When the CTO/CEO of your "startup" is always AFK and it takes weeks to get anything approved by them (or even secure a meeting with them) and they have almost-exclusive access to production and the admin account for all third party services.
Want to create a new messaging channel? Too bad! What about a new repository for that cool idea you had, or that new microservice you're expected to build. Expect to be blocked for at least a week.
When they also hold themselves solely responsible for security and operations, they've built their own proprietary framework that handles all the authentication, database models and microservice communications.
Speaking of which, there's more than six microservices per developer!
Oh there's a bug or limitation in the framework? Too bad. It's a black box that nobody else in the company can touch. Good luck with the two week lead time on getting anything changed there. Oh and there's no dedicated issue tracker. Have you heard of email?
When the systems and processes in place were designed for "consistency" and "scalability" in mind you can be certain that everything is consistently broken at scale. Each microservice offers:
1. Anemic & non-idempotent CRUD APIs (Can't believe it's not a Database Table™) because the consumer should do all the work.
2. Race Conditions, because transactions are "not portable" (but not to worry, all the code is written as if it were running single threaded on a single machine).
3. Fault Intolerance, just a single failure in a chain of layered microservice calls will leave the requested operation in a partially applied and corrupted state. Get ready for manual intervention.
4. Completely Redundant Documentation, our web documentation is automatically generated and is always of the form //[FieldName] of the [ObjectName].
5. Happy Path Support, only the intended use cases and fields work, we added a bunch of others because YouAreGoingToNeedIt™ but it won't work when you do need it. The only record of this happy path is the code itself.
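For contrast with items 1 and 2 in the list above, this is roughly what an idempotent write looks like: the caller sends a key with the request, and a retry with the same key returns the stored result instead of applying the change again. It is a toy in-memory sketch in Python with invented names (`IdempotentStore`, `apply_charge`), not the framework from the rant; a real service would need the key check and the write to share one database transaction or a unique constraint.

```python
import threading


class IdempotentStore:
    """Toy in-memory store where repeating a request with the same key is a no-op."""

    def __init__(self):
        self._lock = threading.Lock()
        self._results = {}  # idempotency key -> result of the first successful call
        self.balance = 0

    def apply_charge(self, idempotency_key, amount):
        # The lock makes check-then-write atomic inside this single process only.
        with self._lock:
            if idempotency_key in self._results:
                return self._results[idempotency_key]
            self.balance += amount
            result = {"key": idempotency_key, "balance": self.balance}
            self._results[idempotency_key] = result
            return result


store = IdempotentStore()
first = store.apply_charge("order-42", 10)
retry = store.apply_charge("order-42", 10)   # e.g. a client retry after a timeout
other = store.apply_charge("order-43", 5)
```

With this shape, a caller that is unsure whether a request went through can simply send it again with the same key.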
Consider this: you've been building a new microservice, you've carefully followed all the unwritten, highly specific technical implementation standards enforced by the CTO/CEO (that you're aware of). You've decided to write some unit tests, well um.. didn't you know? There's nothing scalable and consistent about running the system locally! That's not built into the framework. So just use curl to test your service whilst it is deployed or connected to the development environment. Then you can open a PR and once it has been approved it will be included in the next full deployment (at least a week later).
Most new 'services' feel like they are about one to five days of writing straightforward code followed by weeks to months of integration hell, testing and blocked dependencies.
When confronted/advised about these issues, the response from the CTO/CEO varies:
(A) "yes but it's an edge case, the cloud is highly available and reliable, our software doesn't crash frequently".
(B) "yes, that's why I'm thinking about adding [idempotency] to the framework to address that when I'm not so busy" two weeks go by...
(C) "yes, but we are still doing better than all of our competitors".
(D) "oh, but you can just [highly specific sequence of undocumented steps, that probably won't work when you try it]".
(E) "yes, let's setup a meeting to go through this in more detail" *doesn't show up to the meeting*.
(F) "oh, but our customers are really happy with our level of [Documentation]".
Sometimes it can feel like a bit of a cult, as all of the project managers (and some of the developers) see the CTO/CEO as a sort of 'programming god' because they are never blocked on anything they work on, they're able to bypass all the limitations and obstacles they've placed in front of the 'ordinary' developers.
There have been several instances where the CTO/CEO will suddenly make widespread changes to the codebase (to enforce some 'standard') without having to go through the same review process as everybody else. These changes will usually break something like the automatic build process or something in the dev environment, and it's up to the developers to pick up the pieces. I think developers find it intimidating to identify issues in the CTO/CEO's code because it's implicitly defined due to their status as the "gold standard".
It's certainly frustrating but I hope this story serves as a bit of a foil to those who wish they had a more technical CTO/CEO in their organisation. Does anybody else have a similar experience or is this situation an absolute one of a kind?2 -
So, it's been a while since I started working on my current project and I'd never had the "luck" to touch the legacy project written in PHP, until this week when I got my first issue.
And damn, this goddamn issue. It was a bug, a very strange bug, that only happened in production, and nobody had any idea what was happening, so yeah, I didn't have anyone to ask and I had less time than usual (because of Thanksgiving).
And thus, I have no starting point, no previous knowledge on PHP and less time! I expected a very fun week 😀 and it was beyond my expectations.
First I tried to understand what might be causing the issue, but there wasn't any real clue to start with, so no choice: time to read the flow of the code and see what they're doing and using (1k-line files, yay, legacy). Luckily I got some clues: we're using a cookie and a PHP session variable for the session. OK, let's start with the session variable. Where is it being initialized? Well, spoiler alert, I shouldn't have started with that, because my search ended up in the login method of the API that sets that variable, and for some reason in the front-end app it was always false. That led me to think that some of the new backend functions were failing, but after checking the logs I had no luck.
OK, maybe the cookie is the issue, I should try to open the previous website in the brow...redirect to the new project's login. What? Why? I ask around and it's a new feature pushed on Monday. OK, I've got Chrome DevTools, I can see which value the cookie is being set to, and THERE IT WAS: it has the wrong domain! After 2 days (I'm summarizing a lot of my pain) I got what I'd been looking for, so now I should be able to fix the bug. Then where is the cookie initialized? In the first file the server hits whenever you try to enter any page of the app. OK, I found the method, but it's using a function that processes the domain and sets it correctly? WTF? Then how in heaven do I get the incorrect domain? Hello? OK, relax, you still have one more day to fix this, let's take it easy.
Then, at the end of Wednesday, nope, I still have no clue how this is happening. I talked with the DevOps guy and he explained to me how this redirection happens and what it depends on. I followed the PHP code through and nothing, everything should work fine, sigh. OK, I still have 2 days, because I'm not from the US and I'm not in the US, so I still have time, but the sprint is messed up already, so whatever, I'm gonna get this bug done anyhow.
Thursday! I got sick, yay, what else could happen this week. Somehow I managed to work a little and started thinking about what external issue could affect the processing. Maybe the redirection was bringing a wrong direction, so let's talk with the DevOps guy again, and he answers that the redirection was being made by PHP code, IN A FILE THAT DOESN'T EXIST IN THE REPOSITORY. Amazing, it's just amazing. Then he explained to me why this file might be missing and how this app is deployed (btw the DevOps guy is really cool and I will buy him a beer). After that I checked the file and saw a random session_start in the first line of the code, without any configuration. Eureka! There was the cause, and I only needed to ask someone if that line is still necessary, but oh, they're on holiday, damn, well, I'll wait till Monday to ask them. But once and for all that bug was done for! 🎉
What did I learn? PHP, and that I don't want any more PHP tickets 😆. -
I HATE the idea of only releasing on pre-determined schedules despite work being completed and just waiting for that day to arrive.
I'm a co-founder of a small software company. We have partnered with another particular company that also writes software. Some of our clients have access to paid content of that company's services through our application.
Every once in a while, our clients will report issues with that company's service to us, because they access it through our application. They think it's our issue.
We then pass the report on to the partner company, telling them that their stuff is broken. Their reply goes like this:
"Ok. We'll get the bug fix scheduled, and we'll release it next Thursday."
"Next Thursday? The issue is now, they can't use the service."
"That's our scheduled release date."
O.M.G.
We voluntarily walked away from our safe, cushy jobs working for other people, taking enormous pay cuts to start this company. Now, we're 6+ years in, disrupting established fat-and-happy competitors in this space. I GUARANTEE you that if we had that same attitude, we would have been absolutely obliterated early on.
We are quick. Guided by kanban boards, our suite of unit tests and integration tests is vast and kick-ass. With continuous integration and the click of a button we know if we broke something or if the piece we're working on is ready to be pushed to production, IMMEDIATELY. Our "release schedule" is when the damn thing is complete.
It isn't all bad. Our integration with them has been beneficial for both of us. I just loathe their snail's pace which negatively affects our mutual customers. It can make us look bad, and we can do nothing about it.
Blah.3 -
Why can't my team including my boss learn to stop making assumptions... And mixing separate issues into one...
If there's a fucking production issue, first step is to reproduce it... AKA ask what the user did and what he expects....
Not...
User: hey we call this url and get an error
Dev: ok rollback -
Why shouldn't I clone production DB locally in order to debug an issue / recreate a bug? What is the alternative?5
-
How to know you are actually working in a loony bin? When a requirement goes full circle including released to be merged into production by the customer, and just after moving the issue to done, the boss says „that can’t be right, check it again“1
-
In the last episode of "How SystemD screwed me over", we talked about SystemD's PrivateTmp and how it stopped me from generating SSL certificates.
In today's episode - SystemD vs CGroups!
Mister Pottering and his team apparently felt that CGroups are underused (As they can be quite difficult to set up), and so decided to integrate them into SystemD by default. As well as to provide a friendlier interface to control their values.
One can read about these interactions in the manual page "systemd.resource-control"
All is cool so far. So what happened to me today?
Imagine you did a major system release upgrade of a production server, previously tested on a standalone server. This upgrade doesn't only upgrade the distribution however, it also includes the switch from SysVInit to SystemD. Still, everything went smooth before, nothing to worry now then, right? Wrong.
The test server was never properly stress-tested. This would prove to be an issue.
When the upgrade finishes, it is 4 AM. I am happy to go to bed at last. At 6 AM, however, I am woken up again as the server's webservices are unavailable, and the machine is under 100% CPU load. Weird. I check htop and see that Apache now eats up all 32 virtual cores. So I restart it, writing it off as some weird bug or something as the load returns to normal.
2 hours later, however, the same situation occurs. This time, I scour all the logs I can, and find something weird - Many mentions that Apache couldn't create a worker thread? That's weird.
Several hours of research and tinkering later, I found out the following:
1 - By default, all processes of a system that runs SystemD are part of several CGroups. One of these CGroups is the PID CGroup, meant to stop a runaway process from exhausting all PIDs/TIDs of a system.
This limit is, by default, set to a certain amount of the total available PIDs. If a process exhausts this limit, it can no longer perform operations like fork().
So now, I know the how and why, but how should I solve this? The sanest option would be to get a rough estimate of just how many threads the Apache webserver might need. This option, though, is harder than it looks. I cannot just take the MaxRequestWorkers number... The instance has roughly double that amount of threads already. The cause being, as I found out, the HTTP/2 module, which spawns additional threads that do not count towards this limit. So I have no idea what limit to set.
Or I could... Disable the limit for just the webserver via the TasksAccounting switch. I thought this would work. And it did seem to... Until I ran out of TIDs again - Although systemctl status apache2.service no longer reported the number of tasks or a task limit of the process, the PID CGroup stayed set to the previous limit. Later I found out that I can only really disable the Task Accounting for all the units of a given slice and its parents.
This, though, systemctl didn't really make apparent (and I skimmed the manual, that part was my fault).
So... The only remaining option I had was to... Just set the limit to infinite. And that worked, at last.
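Before picking a number (or lifting the limit entirely, which in `systemd.resource-control` terms is presumably `TasksMax=infinity`), it can help to see how close the service actually gets to its cap. The pids cgroup controller exposes `pids.current` and `pids.max` files; the sketch below reads them from a directory passed in by the caller. Where the unit's cgroup directory lives depends on the cgroup version and distro, so the path is left as a parameter, and the demo uses a temporary directory with made-up numbers rather than a real cgroup.

```python
import tempfile
from pathlib import Path


def pid_headroom(cgroup_dir):
    """Return (current, limit, remaining); limit and remaining are None when unlimited."""
    cgroup = Path(cgroup_dir)
    current = int((cgroup / "pids.current").read_text().strip())
    raw_limit = (cgroup / "pids.max").read_text().strip()
    if raw_limit == "max":  # the controller writes "max" when no limit is set
        return current, None, None
    limit = int(raw_limit)
    return current, limit, limit - current


# Demo against fake files; the numbers are invented for illustration.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "pids.current").write_text("480\n")
    Path(tmp, "pids.max").write_text("512\n")
    limited = pid_headroom(tmp)
    Path(tmp, "pids.max").write_text("max\n")
    unlimited = pid_headroom(tmp)
```

Sampling this under real load would give the thread estimate that `MaxRequestWorkers` alone could not.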
It took me several hours to debug this issue. And I once again feel like uninstalling systemd in favor of sysvinit.
What did I learn? RTFM, carefully, everything is important, it is not enough to read *half* the paragraph of a given configuration option...
Oh, and apache + http/2 = huge TID sink. -
FML when the code that runs every 10 minutes to check and bill a customer keeps charging him and the logs are terrible plus you have no idea what the issue was so you have to push production code to test and fix.
-
> push code on staging and promote on production works fine
>.come home
> pull from master and code doesn't work locally
> pull from the feature branch, same issue
> fml
I'd be so fucking disappointed in myself if I hadn't pushed it and merged it. -
This is more of an advice-seeking rant. I've recently been promoted to Team Leader of my team, but mostly because of circumstances. The previous team leader left for a start-up, I've somehow been the acting Scrum Master of the team for the past months (although our company sucks at Scrum generally speaking), and I've also been in the company the longest. However, I'm still the youngest in my team, so managing the actual team feels a bit weird, and I also do not consider myself experienced enough to be a Technical Lead, but we don't have a different position for that.
Below actions happen in the course of 2-3 months.
With all the things above considered, I find myself in a dire situation. A couple of months ago several Blocker bugs were opened from the client's side / production env related to one feature. After spending about a month or so trying to investigate the issues, we came to the conclusion that it needs to be refactored, as it's way too bad and can't be fixed as is (as a side note, this issue had also been raised by a former dev who left the company). Although it was not part of the initial upcoming version release, it was "forcefully" introduced into the plan and we took other things out of scope, but it was still flagged as a potential risk. But wait.. there's more: this feature was part of a Java microservice (the whole microservice, basically) and our team is mostly made up of JS devs, with just one guy who actually works as a Java dev (I've only done one Java course during uni but never felt attracted to it). I was not involved in the initial planning of this EPIC; my former TL and the Java guy were. Now, during this, the company decides that my TL and I are needed for a side project, so both of us got "pulled out" of the team and moved there, but we also had to "manage" the team at the same time. In the end it's decided that, since my TL will leave and I will take leadership of the team, I get "released" from the side project to manage the team. I'm left with about 3 weeks to slam dunk the feature.. but I'm not a great leader for my team, nor do I have the knowledge to help my teammate fix this Java MS. I do follow the normal routine of asking him in the daily what he is working on and if he needs any help, but I don't really get into much detail, as I'm not much in sync with either the feature or the technical side of Java.
And here we are now in the last week. I've had several calls with PSO from the clients trying to push me into giving them a deadline for when it will be fixed, saying that it's very important for the client to get this working in the next release and so on; however, I do not have an answer to that. I've been trying to explain to them that this was flagged as a risk and I can't guarantee them anything, but that didn't seem to make them any happier. On the other side, I feel like this team member has been slacking a lot; his work this week would barely sum up to a couple of hours from my point of view, as I asked him to push the branch he's been working on and checked his code changes. I'm a bit anxious about confronting him, however, as I feel I haven't been on top of his situation either. I'm not saying I was uninvolved, but I definitely could have been a better manager for him and gone into more detail about his daily work and so on.
All in all, there have been mistakes on all levels (maybe not on PSO, as they can't really be held accountable for R&D's inability to deliver stuff, but they should be a little more understanding at the very least), and it got us into a shitty situation which stresses me out and makes me feel like I've started my new position on the wrong foot.
I'm just wondering if anyone has been in similar situations and has any tips or words of wisdom to share. Or how do you guys feel about the whole situation, am I just over stressing it? Did I get a good analysis, was there anything I could have done better? I'm open for any kind of feedback.2 -
Jenkins' triggerManualBuild produces 500 errors for certain newly created jobs: seemingly at random, but once a job is affected, consistently. I haven't really found a pattern, yet I was bitten by it in the past already. I used to "solve" it by deleting the offending job and re-creating it.
Now, I have this annoying issue again, and no matter how often I re-create that shitty one-liner job in the pipeline, it won't trigger. (The job itself is fine. It's the actual trigger that is broken.)
It's not like it's important or anything, as this is basically only the "push to production" step.
FML. And fuck me for stating: "Creating a delivery pipeline should be straightforward. I therefore consider 1 storypoint enough."4 -
Long post, TLDR: Given a large team building large enterprise apps with many parts (mini-projects/processes), how do you reduce the bus-factor and the # of Brent's (Phoenix Project)?
# The detailed version #
We have a lot of people making changes, building in new processes to support new flows or changes in the requirements and data.
But we also have to support these, except that once something gets into Production there is little information to quickly understand:
- how it works
- what it does/is supposed to do
- what the inputs and dependencies are
So often times, if there's an issue, I have to reverse engineer whatever logic I can find out of a huge mess.
I guess the saying goes: the only ones who know how it works are whoever wrote it and God.
I'm a senior dev but I spend a lot of time digging through source code and PROD issues to figure out why ... is broken and how to maybe fix it.
I think in Agile there are supposed to be artifacts during development, but I've never seen them.
Personally, whenever I work on a new project, I write down notes and create design diagrams so I can confirm things and have easy-to-use references while working.
I don't think anyone else does that. And afterwards, I don't have anywhere to put it/share it. There is no central repo for this stuff other than our Wiki, which for the most part is like a dumping ground. You have to dig for information and hope there's something useful.
And when people leave, information is lost forever and well... we hire a lot of monkeys... so again, a lot of the time I feel like I'm trying to recover information from a corrupted hard drive...
The only way real information is transferred is through word of mouth or special knowledge transfer sessions.
Ideally I would like anything that goes into PROD to have design docs as well as usage instructions in order for anyone to be able to quickly pick it up as needed but I'm not sure if that's realistic.
Even unit tests don't seem to help much as they just test specific functions but don't give much detail about how a whole process is supposed to work.9 -
DBAs have been receiving an alert every 5 minutes for the last two hours regarding a blocking SPID in production. I still have to tell them about the alert and ask them to resolve the issue...1
-
Posted in DevOps discussion board (teams channel):
“Program x isn’t behaving the same way that it does on production. Can you please take a look?”
..a little background: we have a deployment scheduled for today and this issue was found during regression testing.
The issue found is that when a file is clicked on it disappears from the screen, and then isn’t opened…
The file is not on prem, and doesn’t get uploaded to a server that our DevOps team owns…
So why on earth would this development team be asking DevOps to look into a bug that is most likely a code related issue? 😆
Is this a common occurrence for anyone else?
A Bug is found, and the first thought is that the code isn’t the issue?11 -
Working on production issue,
Kind of nervous checking logs and so on...
Ops manager and PO who were looking over my shoulder this whole time start shooting the breeze.
I know what they were trying to do. They are trying to create a relaxed environment.
But the issue is that the talk is very distracting. If you want to shoot the breeze please go somewhere else.
Anyway just did that, asked them to leave. They weren't happy about it. But I really needed the silence. -
So here's what I'm putting up with for the last 6 months, clients..
A client proposed to me a project he had in mind. Project is pretty solid, could have a bright future. Since they didn't have the money to spend, we agreed on a % of the income they will earn from the project. So, let's say I get 20% of the income in exchange for building the application. I didn't receive any down payment or payment of any kind.
Just for info, project is a Web application/portal and it is ~80% done at the moment. Client provided a logo and a wireframe/ideas/pictures how he sees the project. I built everything, from DB to Frontend. Also, project is completely custom made, no CMS or anything. Project will make profit by subscription base, every user of the project pays.
For various reasons, we did not yet sign a contract. So, what is my issue...
Client sent me his proposal of the contract, said it's solid stuff, just sign it. In the contract, it stated that he owns the application in full, can sell it, etc. and I get % of the price. There were also other sneaky parts about me having all the responsibility but owning nothing. I naturally declined and took a lawyer to construct a normal contract.
My proposal was/is: I own the application (source code) in full. They are obligated to pay the monthly percentage and can use the application normally and make a profit. At any time, the application can be bought by the client if they pay for the development. So, basically, they are getting the application to use "for free" with no initial payment/investment. And this is a long-term deal; they can use it like this for as long as they want. Also, if they go bankrupt at any time, no penalty or payment is needed; the risk is mine.
The client refused and what he claims is the following...
His share in the project is 80%, mine is 20%. If the project is sold, I get 20% of the price. Meaning that if we go to production tomorrow and I want to buy his share, I have to buy 80% of an application I built entirely myself. He is also convinced that by "telling me" what to build he owns everything. In his words, he dictated the notes and I'm just playing the violin.
I am having trouble explaining to him that he is getting the application to use and profit from basically for free, and that he cannot and does not own the source code unless he buys it. We are going in circles: I send him the contract to review, he changes it and sends it back. He also removes the parts where it clearly states what he provided and what was done by me.
So, we kind of agreed on the authorship, but in case we break the contract he wants to be able to use the application for 3 more years.
Was anyone here in a similar situation? How do you handle this kind of situation?3 -
Was given 2 bugs to solve, and after a day and a half someone remembers to tell me that what I'm trying to fix won't work because the issue only happens in production and I won't be able to test it with the connection that I have...
-
so there is this issue with our company's system which has been a problem for some time now. it's a recurring issue caused by the data that the users need to encode into the system
today the issue arose again. our senior supervisor, not knowing that this issue was recurring and that there is already a documented step-by-step procedure for addressing it, came up with another solution, which would have one of our co-developers push temporary code to production during business hours just to work around the issue and roll the code back afterwards
take note that it's during business hours and more than a hundred branches of the company are using said system
what was he thinking !!
thankfully one of our colleagues spoke up, explaining that this issue was recurring and already has a procedural solution, but our brainy-know-it-all-stubborn-close-minded heck of a supervisor still insisted that that solution has a computational impact and that they push temporary code to production, what an idiot!!
fast forward: our colleagues ended up standing their ground, even though our supervisor was highly doubtful of them, and executed the already established solution instead of pushing temporary code to production, which was such a bullshit idea
damn those close minded people, they shouldn't have reached that position in the first place!! -
So yesterday at a client location, our support guy called me and said this thing is trimming the characters whenever I save it. It was a ckeditor in our application, so basic troubleshooting was to check the system configuration for that page and the ckeditor configuration.
Checked the system configuration, ckeditor configuration, found nothing.
Out of curiosity, I checked the schema of the table in which the data is stored. Turns out one of the idiots had taken a backup of the original table, appended the backup date and time to its name, and created a new table where the field's data type was varchar with a 255 limit.
This was on the UAT server as well as the Production server. Changed the field type back to text in UAT. Asked the team to get the same thing done on the Production server as well. -
Went to a hackathon and tried to use ARCore. Most painful experience of my life. There are so many issues and critical bugs that I can't even fit them all into a 5000 character rant, Google has shittier code than a highschool startup.
So instead of typing 5000 characters I'll just save you all some time. If you're forced to use ARCore, don't even try to use the AcquireCameraImageBytes or related apis for actually accessing the camera feed. Just use unity's screen capture API (draw an invisible rectangle on the whole screen, make a texture, readPixel entire rectangle). Turning off all models for 1 frame and taking a screen capture is easier, faster, and somehow more optimal than using Google's code.
Also, they released Augmented Faces on Friday. Their demo plainly doesn't work the way they intended on many devices because the list never gets populated since their engineers are dumb fucks. Just force the face mesh to always remain active and you'll instantly support all devices! You can deactivate it using your own methods but Google's doesn't work on many devices. There's an issue in their repo about this that they are plainly ignoring.
Also if you're interested I have a (working?) engine to use Object Detection for interactions within AR + a create your own adventure game demo made w/ object detection + ar on my git:
https://github.com/pshah123/...
My code is 100% crap so definitely don't use it in production but I was able to get the individual pieces working so hopefully this helps someone! Unless you're from Google, then fuck you please uninstallrant please uninstall google fuck google mv google /dev/null sudo rm google sudo kill -9 google git rm google16 -
Sooooo I came in to work yesterday and the first thing I see is that our client can't log on to the cms I set up for her a month ago. I go log in with my admin credentials and check the audit logs.
It says the last person to access it was me, the date and time exactly when we first deployed it to production.
One month ago.
I fired a calm email to our project managers (who've yet to even read the client complaint!) to check with ops if the cms production database had been touched by the ops team responsible for the sql servers. Because it was definitely not a code issue, and the audit logs never lie.
Later in the day, the audit log updated itself with additional entries - apparently someone in ops had the foresight to back up the database - but it was still missing a good couple weeks of content, meaning the backup db was not recent.
Fucking idiots. -
After solving that production issue which people debugged the entire night.. feels like..
I am the eggman
I am the walrus
Teams like..
Goo goo goo job..1 -
Serves me right for developing in production...
Move the process to dev, works fine.
Code and process were fine all along; turns out a firewall issue was blocking the connection for the final step. -
Because of a cache split-brain issue I have to invalidate the cache every 5 min. I told the lead dev about this hack and we both agreed to solve it ASAP.
This was 3 months ago...
The temporary fix became the production solution. And it only took me 10 min to add a cron entry to every prod server.
So productive!
Btw you should see the users' faces when a page refresh changes the page completely because of load balancing xD)1 -
I have this system that receives an image either as Base64 or as a URL (in which case the system makes an HTTP request and saves the response to disk). It works beautifully when I run the tests, but it doesn't work at all in production (all the resulting files are corrupted). I don't know where to start debugging it, so I'm going to bed and letting my future self resolve the issue.4
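One possible place to start, sketched in Python under assumptions that are not from the rant: the helper names (`decode_base64_image`, `sniff_image_type`, `describe`) are hypothetical, and the idea is simply to decode strictly and compare size, magic bytes and checksum of the same input in the test run and in production. It does not claim to know the actual cause.

```python
import base64
import binascii
import hashlib

# Leading "magic bytes" of a few common image formats.
IMAGE_SIGNATURES = {
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
    b"GIF87a": "gif",
    b"GIF89a": "gif",
}


def decode_base64_image(payload: str) -> bytes:
    """Decode a Base64 image, tolerating an optional data-URL prefix."""
    if payload.startswith("data:") and "," in payload:
        payload = payload.split(",", 1)[1]
    try:
        # validate=True rejects characters outside the Base64 alphabet
        # instead of silently dropping them.
        return base64.b64decode(payload, validate=True)
    except binascii.Error as exc:
        raise ValueError("payload is not valid Base64") from exc


def sniff_image_type(data: bytes):
    """Return the image type implied by the leading bytes, or None."""
    for signature, name in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def describe(data: bytes) -> dict:
    """Summarise the bytes so two environments can be compared."""
    return {
        "size": len(data),
        "type": sniff_image_type(data),
        "sha256": hashlib.sha256(data).hexdigest(),
    }
```

Logging `describe(...)` right before the bytes are written to disk, in both environments, would at least show whether the data is already wrong before the write (for example an HTML error page instead of an image) or only afterwards. When writing, the file has to be opened in binary mode (`"wb"`).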
-
I’ve been at this issue at work for four days now with no progress, and I feel really bad because we have important stories to pick up and I feel I’m wasting my capacity by not having fixed it. Basically, only in our QA environment (the one before production), our service is not recognising duplicate events posted by Kafka and thus keeps reprocessing them. I’ve spent so long trying to diagnose the code, which is currently the same in all envs, looking into how this suddenly occurred, restarting things, going through the complications of using different tools, and asking others for help a lot, but I’ve gotten NOWHERE. I don’t really want to tell my team that I should prioritise other things because we have deadlines; I feel this issue is important to fix, but I just can’t figure out how. Now I’m worried this whole sprint will go by without me getting anything done and fingers will be pointed at me later6
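For reference, a minimal sketch of what "recognising duplicates" usually means on the consumer side. It is written in plain Python with no Kafka client, and it assumes each event carries a unique `id` field; that field, `SeenEvents` and `handle_events` are all hypothetical names, and nothing here explains why only the QA environment misbehaves.

```python
from collections import OrderedDict


class SeenEvents:
    """Bounded, in-memory record of event IDs that were already processed."""

    def __init__(self, max_size: int = 10_000):
        self._ids = OrderedDict()
        self._max_size = max_size

    def first_time(self, event_id: str) -> bool:
        """Return True only the first time an ID is seen (while remembered)."""
        if event_id in self._ids:
            self._ids.move_to_end(event_id)
            return False
        self._ids[event_id] = None
        if len(self._ids) > self._max_size:
            # Forget the least recently seen ID to keep memory bounded.
            self._ids.popitem(last=False)
        return True


def handle_events(events, seen, process):
    """Call process() once per distinct event ID; return how many were processed."""
    processed = 0
    for event in events:
        if seen.first_time(event["id"]):
            process(event)
            processed += 1
    return processed
```

If the real service keeps this kind of state in an external store, one thing that can legitimately differ per environment while the code stays identical is that store and its configuration.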
-
In our company, "UAT".
We're using a staging environment where most of the data is either missing or corrupt. They don't refresh the data, saying it could pose a security issue.
How the hell are we supposed to complete UAT when none of the data that's in production is there!!!! -
When you reach out to a Dev after finding a production issue: "[Redacted] is not allocated for this. Please refrain from engaging him with adhoc requests"1
-
I was just going hit send on a slack message:
"Issue for this fix is deployed on production"
Just a few moments before I realised my goof up in the text😂1 -
We need to capture IPs on our internal network in order to figure out who is actually calling our APIs, because we will be making a breaking change and need to make sure they support it.
But in order to have IP capturing, we need a Production Issue ticket...
So to prevent crashing downstream system, we need to crash their systems... 🤔🤔🤔🤔1 -
Just after a feature launch, there's a major bug on production and now I am getting yelled at by my lead, as the issue happens to be with the PR I was responsible for reviewing yesterday. Somehow a logic error got past my review. But considering how large the project is, it wasn't possible for me to test out every possible scenario myself. They should have had QA handle that. Also, that was my first code review. I can't understand why my boss has such unrealistic expectations. Bugs are expected at this stage. I feel like he just puts too much pressure on me for no other reason than to trigger my imposter syndrome. That way, I feel like a bad developer even though I am working my ass off. And he gets to avoid giving me a raise. Can't believe I rejected multiple offers to stay at this company. I don't even know why I'm still working for this company.4
-
Production issue happens. To get into the server to investigate: first write a brief description of the issue, get management approval, then find 2 administrators who each hold half the password to the server, web conference them to key in the password on a remote utility, and finally log in to troubleshoot.
It is a problem to troubleshoot a problem.1 -
- Teammate discovers a standard PaaS feature isn’t working and breaks core functionality in dev environment
- Teammate creates a support ticket to the PaaS company
- PaaS company says that they’re aware of the issue but don’t have a solution yet and advises to disable the feature for now
- Teammate ships the feature and leaves it enabled on production.
- Teammate thinks that “oh we know it’s broken, nobody is going to use it anyway”
- Customer uses the feature
- Shit hits the fan
- Teammate: *shocked pikachu face* -
So I'm assigned once again to fix a new issue someone else created, and that seems to be the case whenever there's an issue...
Boss just assigns it to whoever is most likely to be able to investigate it... which is basically me. Other than the little time I can use to develop stuff, I'm usually cleaning up other people's messes.
And these other people are too busy working on new crap to properly explain how their existing code/processes/changes work.
And then there's the fact that when anything breaks in production (that's not due to one-off upstream issues), whoever wrote it doesn't think he needs to take responsibility for it.
So everyone else, and especially me, has to spend time understanding the shit they wrote and fixing it for them.
How do I nicely tell my boss that we need clearly defined ownership, and that whenever a component blows up in prod, the guy that wrote the code fixes it no matter what? Thereby incentivizing him to not write shit code in the first place and to be more proactive in making sure it doesn't blow up, since he knows that otherwise he's doing overtime to fix it?
Is it just me or is there really no such thing as a dev job where something doesn't blow up due to poorly tested and designed code every other day?3 -
Currently debugging a project that was written over 4 years ago...
At first all was well in the world, besides the ever present issue of our goddamn legacy framework. This framework was written 7 years ago on top of an existing open source one, because the existing one was 'lacking some features' & 'did not feel right'.
Now those might be perfectly fine reasons to write a layer on top of a framework, but please, for all future devs sanities, write fucking documentation and maintain it if you're going to use said framework in all major projects!!
Anyhow back to the situation at hand, I'm getting familiar with the project, sighing at the use of our stupid legacy framework, attempting to recreate the reported bugs...
Turns out I can't; well, I get other bugs & errors, but not the reported ones. I go to the production server, where I suddenly can reproduce them...
Already thinking, fuck my life, and scared for the results... I try a 'git status' on the production server....
And yep, there it is, lo and behold, fucking changes on production, that are not in git, fuck you previous dev who worked on this and your stupid lazy ass modifications on production!
Bleh, already feeling royally pissed, there's only 1 thing I can do: push the changes back to git in a separate branch, and pray I can merge them back into master on my dev environment without too many issues...
Only I first have to get our sysadmin to allow pushing from a production server back to our git server...
Sigh, going to put on my headphones, retreat to my me space and try to sort out this shitpile now... -
When you get paged by your company to help investigate and fix critical issue in production and don't get time to work on the hackathon.
-
How to set up a staging environment for a GitHub branch in Heroku.
Let's say I have a master branch and a dev branch. All changes and updates are first pushed to the dev branch, and after a successful review and test they go to the master branch.
But the issue is that I'm following the gitflow practice of keeping the master branch always deployable. Since my Heroku app is linked to the master branch, when I try to test the dev branch in Heroku's production environment, it sometimes breaks with some error, and then the site goes down until I redeploy the master branch, as it's the stable version. So how do I test a branch in the Heroku production environment while also keeping the site active on the master branch? If that makes any sense 🤡 plz help3 -
Just today.
A production issue was assigned to me a while ago and the OSE and I were volleying it back and forth (I don't have access to even see anything production) because neither of us had any idea of what to do.
Here's the twist: the OSE's analysis (and my assumptions) of the problem was off, so we were basically running around in circles.
Today, he and I had a good one on one as the only priority to put this mother fucker to rest. Turns out he assumed a lot of things in his hurry to give his analysis to his boss.
Confirmed a few things, lo and behold, it's a non issue. That's how the legacy, 13 year old system (that no one in the entirety of the company knows end to end anymore) works.
Fucking eureka.1 -
I was working on a hotfix in prod, a small but critical fix to do with the environment and a proxy thing, and I forgot to write it up in the documentation. 3 months after that I left the company.
After a few weeks the PM contacted me and told me that a developer had introduced some error and brought production down, and that the whole team had not gone home for 3 days while working on the issue.
He offered me some money for helping with the issue. I agreed, and they gave me an account to access the environment and code.
I could have fixed it in less than 15 minutes, but because they couldn't fix it, I worked on it for 6 hours, and after that I explained the steps for solving it. They seemed really glad that I could solve the issue, and now prod is working again..
Now, in my opinion, I know I was not a good person, and what I've done is maybe not acceptable.
But for me as a developer, as long as I have the credentials and access, I can read (guess) how the flow goes and get to know the environment my company uses without them explaining it, or some googling will definitely help, right.?
So, what do you say about it? What would you do if you were in my situation.?10 -
What's the most insane deployment scheme you've had to work with? One client has a release schedule that deploys all major projects once a month(!). Bugfixes get deployed once a day (systemwide), so any issue that can't be verified until it's in production has at least a day's delay when iterating.
-
Anyone running Symfony in a Docker container in production? I'm currently trying to migrate our dev env to a docker compose setup (from a "monolith" vagrant vm). I'm actually not stuck on a Symfony-specific thing but on, I guess, a Docker-specific one(?). The issue is, I need to read and write to one folder with two users (in my case the /application/var/cache folder). Since I mount my whole code into the docker container (to use an IDE on the local files), I've got a volume (not mounted to the outside world) for that folder. (So far, so good.) Now this folder is owned by root, and root is also the user I get when I enter the container. When I then run a cli script that writes to this folder, everything works (as it's run by root) and the resulting entries in the cache dir are owned by root. Trouble starts when the php fpm process tries to write stuff in there too (as it's run by www-data).
If I add `USER www-data` (or create a new user foobar and add `USER foobar`) the container exits with status 0
So I guess the question is: is anyone running a Symfony app on Docker in prod, and if so, how do you solve this? Or, put another way, what is the best practice for this? Sure, on dev I could just `chmod 777` the whole folder or run the php-fpm process as root, but if that thing ever goes to prod, I wouldn't sleep very well... -
When I get one with constructive feedback. It's rare since I'm usually the one that tells people their code sucks.... After it causes a production issue.
Yes no one does a proper code review on my direct team.... Just the stuff a linter would tell you to fix.... -
How do I deal with this;
Edge case hiccup on production, no errors in the available logs (very shallow logging), no access to the production server, issue unreproducible on staging, and a manager that wants me to fix it AFTER I already said that I'm kind of sailing blind and can't do much without logs or access, and already looked at it with another dev who also has no idea what is going on3 -
Received 'Thank you' e-mail after Test Deployment and UAT !!
Usually i receive that e-mail after production and some issue findings! -
How does one get better at responding to issues that occur in production?? I feel the only way I learn is by having the issue occur and then fixing it...any more proactive ways available??3
-
I recoded a REST endpoint that transfers large amounts of data from our db using a streaming response so it doesn't crash the server...
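A rough idea of what such a streaming response can look like, as a Python sketch. The rant doesn't say which stack was used; here sqlite3 stands in for the db, and the `stream_rows` name, the batch size and the newline-delimited JSON format are all assumptions made for illustration.

```python
import json
import sqlite3


def stream_rows(conn, query, batch_size=500):
    """Yield the result set as newline-delimited JSON, one batch at a time.

    Only batch_size rows are held in memory at once, instead of the whole
    result set being built up before the response is sent.
    """
    cursor = conn.execute(query)
    columns = [col[0] for col in cursor.description]
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        lines = (json.dumps(dict(zip(columns, row))) for row in batch)
        yield ("\n".join(lines) + "\n").encode("utf-8")
```

A WSGI application can return a generator like this directly as its response body, so each chunk goes out to the client as soon as it is produced.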
Pretty easy... Mostly just needed someone that knew wtf it was or has a bit of curiosity and asks questions... rather than just keep on doing what everyone else is doing...
Who hasn't seen logs updating in near real time in TeamCity, Jenkins... for the last 5yrs+... No one else ever wondered how it's done?
So yes solving a production issue with old technology and being called a genius... I guess is pretty satisfying? -
A young new dev was working on his first ticket, about a bug during parsing of an uploaded excel file. Our issue was that if the file contained an empty line, all remaining rows were ignored. So the task included extending our tests to cover this case. After 2 weeks (!), his merge request comes in. His idea (without ever asking for help) was to parse the whole file (in some cases huge) in the production code a second time, just to count the rows (!!) and save the count in a public static int field, which was verified in his new test.2
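For contrast, a sketch of what the ticket was probably asking for, in Python and without any Excel library: rows are assumed to arrive as plain lists of cell values, and `is_blank` and `parse_rows` are hypothetical names. The test then checks the parser's return value, so no second parse and no public static counter is needed.

```python
def is_blank(row):
    """A row counts as empty when every cell is None or whitespace."""
    return all(cell is None or str(cell).strip() == "" for cell in row)


def parse_rows(rows):
    """Return the non-empty rows, skipping blank ones instead of stopping at the first."""
    return [list(row) for row in rows if not is_blank(row)]
```

A test for the empty-line case can feed in a sheet with a blank row in the middle and assert that the rows after it still come back.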
-
Halp meh, plz... I have run across a problem and I have absolutely no idea how to go about solving it...
So basically I need to decrypt a TDES encrypted Azure service bus message. Can be done in a straightforward manner in .NET Framework solution with just your regular old System.Security.Cryptography namespace methods. As per MSDN docs you'd expect it to work in a .NET Core solution as well... No, no it doesn't. Getting an exception "Padding is invalid and cannot be removed". Narrowed the cause down to just something weird and undocumented happening due to Framework <> Core....
And before someone says 'just use .NET Framework then', let me clarify that it's not a possibility. While in production it could be viable, I'm not developing on a Windows machine...
How do I go about solving this issue? Any tips and pointers?10 -
Interruptions from production team because of a tiny issue that’s already been fixed waiting for deployment
-
which type are you ??
**Manager:** Hey, we've got a little hiccup in the production environment. I know it's Friday evening and you're probably daydreaming about pizza, but could you give it a peek?
**Type 1:** Man, this is like finding a needle in a haystack while wearing sunglasses at night. Might take me a few hours... or days. But hey, wish me luck and have an epic weekend!
**Type 2:** Eureka! Found the gremlin. It looks like XYZ person tried to be a bit too creative on commit number 2234324. Maybe they had too much caffeine? Anyway, could you have a chat with them? And oh, may your weekend be as smooth as a fresh jar of peanut butter.
**Type 3:** Detective mode activated! Found the sneaky bug. It was XYZ person's "masterpiece" in commit number 2234324. But fear not! I've put on my superhero cape and fixed it in commit number 345453345.
**Type 4:** This issue again? It's like a recurring bad dream about forgetting your pants! I've revamped the whole thing so we don't have to relive this nightmare. If someone tries to pull this off again, our CI/CD will roast them like a marshmallow over a campfire.
**Type 5:** Ta-da! Fixed the glitch, jazzed up the design, and sprinkled in some extra logging magic. Now, troubleshooting will be as easy as pie. Speaking of which, I've got time for a coffee and maybe a slice of pie before heading out. Cheers!
**Type 6 (Gloomy):** Oh, the digital clouds have gathered again. This issue is like a never-ending rain on a Monday morning. I've peered into the abyss of our code, and it's... well, it's deep and dark. I'll need some time, a flashlight, and maybe a comforting blanket. If you don't hear from me in a few hours, send in a search party with some hot cocoa.4 -
Uri Josef Drucker - Information
Uri Josef Drucker, nicknamed Uri Drucker, or just Drucker is an entrepreneur with many years of experience across different markets.
Drucker formed a company in 1984, producing a range of women’s hygiene products, employing over 100 staff. The products were distributed across Israel and Europe. The company was sold with a successful exit in the 1990’s.
Uri Josef Drucker produced, printed, and distributed a newspaper called ‘The Main Issue’ for 10 years. The paper focused on regional municipal and environmental issues and was successfully sold in 2015 and is still printing to this day. The production was based in Kiryat Tivon, near Haifa, Israel.
Uri Drucker has been living in Kiryat Tivon for many years and was born as Uri Josef Drucker in the city of Haifa, Israel.
Drucker was also a political candidate for the local elections in Kiryat Tivon in 2018. During the race, Drucker connected to many people in his town and managed to increase his great ability of listening to others and giving satisfying solutions to common issues. Although he did not win the local elections, Uri Drucker continues giving to his community until this day.
If you want to learn more about Uri Josef Drucker, you should also visit Uri Josef Drucker's social media profile pages. The links to Drucker’s social media profiles are listed at the bottom of this page.
Also, you can feel free to message Drucker in his various profile pages and please be sure to follow him or add him as your friend on Social media. Connect with Drucker and send him a message for any questions, inquiries, or just to chat.
It’s very important to state that Uri Josef Drucker can be found online in many different social media websites and he will do his best to answer you in each and every single one, so connect to him on your favorite network
Take into account that this website profile is solely dedicated to Uri Josef Drucker, but he does not manage it personally and it might take him time to respond.
Please note that Uri Drucker is not responsible for creating this profile and we cannot guarantee that Uri Josef Drucker will indeed reply here. If you want Uri Drucker to contact you back, please visit some of his other profile pages that represent Uri Josef Drucker and try to contact him there, as if he doesn’t answer on one profile, he will surely answer on another one.
Drucker has over 50 social media profiles in order to satisfy different people that use different websites. -
I read a lot... Articles and books (Blinkist) and some of them touch on management/leadership topics.
I also tend to voice my opinions when I see some processes I don't like or that should be improved/changed.... Because of what I learned from these articles and books, and because I recognise them in previous experiences after reflecting.
So whenever I read one that feels like it is applicable to the team or boss or bosses, I have this urge to send it to them.... Like now. But at the same time I also feel maybe I'm stepping out of my role.... And maybe getting a bit too friendly... or annoying...
And well it seems it didn't help much until we get a production issue... And in the end I just want to go... I told you so... -
Once upon a time in the exciting world of web development, there was a talented yet somewhat clumsy web developer named Emily. Emily had a natural flair for coding and a deep passion for creating innovative websites. But, alas, there was a small caveat—Emily also had a knack for occasional mishaps.
One sunny morning, Emily arrived at the office feeling refreshed and ready to tackle a brand new project. The task at hand involved making some updates to a live website's database. Now, databases were like the brains of websites, storing all the precious information that kept them running smoothly. It was a delicate dance of tables, rows, and columns that demanded utmost care.
Determined to work efficiently, Emily delved headfirst into the project, fueled by a potent blend of coffee and enthusiasm. Fingers danced across the keyboard as lines of code flowed onto the screen like a digital symphony. Everything seemed to be going splendidly until...
Click
With an absentminded flick of the wrist, Emily unintentionally triggered a command that sent shivers down the spines of seasoned developers everywhere: DROP DATABASE production;.
A heavy silence fell over the office as the gravity of the situation dawned upon Emily. In the blink of an eye, the production database, containing all the valuable data of the live website, had been deleted. Panic began to bubble up, but instead of succumbing to despair, Emily's face contorted into a peculiar mix of terror and determination.
"Code red! Database emergency!" Emily exclaimed, wildly waving their arms as colleagues rushed to the scene. The office quickly transformed into a bustling hive of activity, with developers scrambling to find a solution.
Sarah, the leader of the IT team and a cool-headed veteran, stepped forward. She observed the chaos and immediately grasped the severity of the situation. A wry smile tugged at the corners of her mouth.
"Alright, folks, let's turn this catastrophe into a triumph!" Sarah declared, rallying the team around Emily. They formed a circle, with Emily now sporting an eye-catching pink cowboy hat—an eccentric colleague's lucky charm.
With newfound confidence akin to that of a comedic hero, Emily embraced their role and began spouting jokes, puns, and amusing anecdotes. Tension in the room slowly dissipated as the team realized that panicking wouldn't fix the issue.
Meanwhile, Sarah sprang into action, devising a plan to recover the lost database. They set up backup systems, executed data retrieval scripts, and even delved into the realm of advanced programming techniques that could be described as a hint of magic. The team worked tirelessly, fueled by both caffeine and the contagious laughter that filled the air.
As the hours ticked by, the team managed to reconstruct the production database, salvaging nearly all of the lost data. It was a small victory, but a victory nonetheless. And in the end, the mishap transformed into a wellspring of inside jokes and memes that permeated the office.
From that day forward, Emily became known as the "Database Destroyer," a moniker forever etched into the annals of office lore. Yet, what could have been a disastrous event instead became a moment of unity and resilience. The incident served as a reminder that mistakes are inevitable and that the best way to tackle them is with humor and teamwork.
And so, armed with a touch of silliness and an abundance of determination, Emily continued their journey in web development, spreading laughter and code throughout the digital realm.2 -
In my initial days as a web developer, I was assigned a task: to implement a cart share functionality at an e-commerce company.
I made the functionality and tested on my system.
Result: working good.
Pushed it to beta testing environment.
Result: working good.
Pushed to pre production environment.
Result: working good.
Pushed to live site.
Result: 😀 Error in live site..
So a call comes to me from my team lead..
Asks what was the issue...
Me: I don't know either.
....
After 3-4 hrs:
I found the reason.
My system, beta test env and pre prod env all have the latest PHP version (5.6, I guess)
But the live server had an old version of PHP.
Me: laughed like anything.
I didn't know that these things would matter this much.
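A rough sketch of the kind of check that would have caught this before the push to live, assuming you can collect the runtime version string from each environment somehow. The version numbers and the `find_version_drift` helper below are made up for illustration; the only thing known from the story is that the live server was older than the rest.

```python
def parse_version(text):
    """Turn a dotted version string such as '5.6.30' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def find_version_drift(environments, reference="live"):
    """Return environments whose major.minor version differs from the reference."""
    ref_major_minor = parse_version(environments[reference])[:2]
    return {
        name: version
        for name, version in environments.items()
        if name != reference and parse_version(version)[:2] != ref_major_minor
    }


# Hypothetical version numbers, not taken from the original story.
environments = {
    "local": "5.6.30",
    "beta": "5.6.30",
    "pre-prod": "5.6.30",
    "live": "5.3.29",
}

drift = find_version_drift(environments)
for name, version in sorted(drift.items()):
    print(f"{name} runs {version}, but live runs {environments['live']}")
```

Comparing only major.minor is a deliberate choice here: patch releases rarely change language features, while a minor version gap is usually where "works everywhere except live" comes from.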
Moral of the story:
Be one with the force (server in this case)2 -
Sometimes I have to connect to production database and alter my dev environment so I can “log in” as a user and see what’s wrong with their account. Once in a while there is a legitimate website issue that is unique to that user’s profile. Other times it’s user error, like the user not understanding that they have to connect their membership to their online account (they think signing up for an account will connect it automatically).
I don’t like circumventing the user’s log in like this, but sometimes it’s necessary since the website is so confusing. I inherited this website, so many of the problems were formed way before I took over.
My stakeholders want a “log in as user” feature for website admins to use. My manager and PM don’t think that’s a good idea right now, since there are over two dozen people with admin access and admin access means access to everything in the admin (there aren’t options to give permissions as needed).1 -
When a production rollout goes better than expected and no issues happen.
https://goo.gl/images/XwxfJp