
My team handles infrastructure deployment and automation in the cloud for our company, so we don't exactly develop applications ourselves, but we're responsible for building deployment pipelines, provisioning cloud resources, automating their deployments, etc.

I've ranted about this before, but it fits the weekly rant so I'll do it again.

Someone deployed an autoscaling application into our production AWS account, but they set the maximum instance count to 300. The account limit was less than that. So, of course, their application got stuck and started scaling out endlessly. Two hundred new servers spun up in an hour before hitting the limit and throwing errors all over the place. They sent me a ticket and I logged in to AWS to investigate. Not only had they broken their own application, but they'd also made it impossible to deploy anything else into prod. Every other autoscaling group was now unable to scale out at all. We had to submit an emergency limit increase request to AWS, spent thousands of dollars on those stupidly large instances, and yelled at the dev team responsible. Two weeks later, THEY INCREASED THE MAX COUNT TO 500 AND IT HAPPENED AGAIN!

And the whole thing happened because a database filled up the hard drive, so it would spin up a new server, whose hard drive would be full already and thus spin up a new server, and so on into infinity.
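
For what it's worth, a sanity check in the deployment pipeline could have rejected that max count before it ever reached prod. Below is only a rough sketch of the idea, not something we actually run: the function name, the notion of reserving capacity for the other autoscaling groups, and every number in it are made up for illustration, and it doesn't talk to AWS at all. It also wouldn't fix the real bug (the full disk), it would just stop one team's runaway group from eating the whole account.

```python
def validate_group_max(requested_max: int, account_limit: int, reserved_for_others: int) -> int:
    """Return the headroom left if this group scales all the way to its max.

    All three inputs are hypothetical values the pipeline would have to supply:
    - requested_max: the max instance count the dev team asked for
    - account_limit: the instance limit on the account
    - reserved_for_others: capacity we want to keep free for every other group
    """
    headroom = account_limit - reserved_for_others - requested_max
    if headroom < 0:
        raise ValueError(
            f"max count {requested_max} would exceed the account limit "
            f"({account_limit}, with {reserved_for_others} reserved) by {-headroom}"
        )
    return headroom


if __name__ == "__main__":
    # Made-up numbers: a modest group fits, with 50 instances to spare.
    print(validate_group_max(100, 250, 100))
```

With those made-up numbers, asking for a max of 300 against a limit of 250 would raise a `ValueError` and fail the deployment instead of starving everything else in prod.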

That's probably the only WTF moment that resulted in me actually saying "WTF?!" out loud to the person responsible, but I've had others. One dev team had their code logging to a location they couldn't access, so we got daily requests for two weeks to download and email log files to them. Another dev team refused to believe their server was crashing due to their bad code even after we showed them the logs that demonstrated their application had a massive memory leak. Another team arbitrarily decided that they were going to deploy their code at 4 AM on a Saturday and they wanted a member of my team to be available in case something went wrong. We aren't 24/7 support. We aren't even weekend support. Or any support, technically. Another team told us we had one day to do three weeks' worth of work to deploy their application because they had set a hard deadline and then didn't tell us about it until the day before. We gave them a flat "No" for that request.

I could probably keep going, but you get the gist of it.

Comments
  • 3
    That sounds... Interesting. Well done for not giving in to stupid requests!
  • 0
    @JustThat Yeah, DevOps is a rough gig. Instead of streamlining the development and deployment process, our team just has to magically do the work of two different teams with half the people.

    We're moving towards actual DevOps but it's slow going.
  • 0
    "And the whole thing happened because a database filled up the hard drive, so it would spin up a new server, whose hard drive would be full already and thus spin up a new server, and so on into infinity."

    that's just fucking gold...

    i feel sorry for you, but it's funny!
  • 0
    Great rant! I love when people mess up and try to blame it on others with their stupid requests. My team could use this flat “NO” answer, but instead they say yes, whine, curse and complain to our CIO instead of saying “we need more than one day to do this...”. They do so much for these people that the users now think it’s our job to do this and that for them. This is how it’s been since I joined them. Smh