14

Most painful code error you've made?

More than I probably care to count.

One in particular where I was asked to integrate our code and converted the wrong value..ex

The correct code was supposed to be ...

var serviceBusMessage = new Message() {ID = dto.InvoiceId ...}

but I wrote ..

var serviceBusMessage = new Message() {ID = dto.OrderId ...}

At the time of the message bus event, the dto.OrderId is zero (it's set after a successful credit card transaction in another process)

Because of a 'true up' job that occurs at EOD, the issue went unnoticed for weeks. One day the credit card system went down and thousands of invoices needed to be re-processed, but seemed to be 'stuck', and 'John' was tasked to investigate, found the issue, and traced back to the code changes.

John: "There is a bug in the event bus, looks like you used the wrong key and all the keys are zero."
Me: "Oh crap, I made that change weeks ago. No one noticed?"
John: "Nah, its not a big deal. The true-up job cleans up anything we missed and in the rare event the credit card system goes down, like now. No worries, I can fix the data and the code."
<about an hour later I'm called into a meeting>
Mgr1: "We're following up on the credit card outage earlier. You made the code changes that prevented the cards from reprocessing?"
Me: "Yes, it was my screw up."
Mgr1: "Why wasn't there a code review? It should have caught this mistake."
Mgr2: "All code that is deployed is reviewed. 'Tom' performed the review."
Mgr1: "Tom, why didn't you catch that mistake."
Tom: "I don't know, that code is over 5 years old written by someone else. I assumed it was correct."
Mgr1: "Aren't there unit tests? Integration tests?"
Tom: "Oh yea, and passed them all. In the scenario, the original developers probably never thought the wrong ID would be passed."
Mgr1: "What are you going to do so this never happens again?"
Tom: "Its an easy addition to the tests. Should only take 5 minutes."
Mgr1: "No, what are *you* going to do so this never happens again?"
Me: "It was my mistake, I need to do a better job in paying attention. I knew what value was supposed to passed, but I screwed up."
Mgr2: "No harm no foul. We didn't lose any money and no customer was negativity affected. Credit card system may go down once, or twice a year? Nothing to lose sleep over. Thanks guys."

A week later Mgr1 fires Tom.

I feel/felt like a total d-bag.

Talking to 'John' later about it, turns out Tom's attention to detail and 'passion' was lacking in other areas. Understandable since he has 2 kids + one with special-needs, and in the middle of a divorce, taking most/all of his vacation+sick time (which 'Mgr1' dislikes people taking more than a few days off, that's another story) and 'Mgr1' didn't like Tom's lack of work ethic (felt he needed to leave his problems at home). The outage and the 'lack of due diligence' was the last straw.

Comments
  • 2
    This is a perfect example of why type safe identifiers are a good idea.
Add Comment