It's funny: whenever the subject of Facebook vs. privacy comes up (mostly I don't even initiate those conversations), people always start to defend Facebook when I say that I THINK Facebook is built to get people addicted to it and to keep them on the platform as long as possible.
Haha, one of Facebook's early investors and ex-presidents said the following in an interview:
“It’s a social-validation feedback loop, exactly the kind of thing that a hacker like myself would come up with, because you’re exploiting a vulnerability in human psychology.”
So even an ex-president of Facebook admits this.
I also found the following a good one:
The underlying thought process while creating platforms like Facebook or Instagram is something like “How do we consume as much of your time and conscious attention as possible?”
Last but not least, the part I found the most scary:
“God only knows what it’s doing to our children’s brains.”
Yes, I find this scary.
Oh yeah, and for the people who are going to call bullshit on this one: I've got one source, and if you search on the title of that article you'll find loads of websites carrying the story:
https://fossbytes.com/facebook-was-...
-
I ran an UPDATE without a WHERE on the MySQL console on the production database today.
CLASSIC
Just because I needed to fix the database after a bug fix on the backend of the application.
I thought I had written a good SQL statement, after executing it on my local machine, and then everything went bad.
Luckily it was only one column with some cached statistics data, and I checked that it was not important data before I actually started fixing stuff, but still...
Almost had a heart attack afterwards.
Made a script to fix the column, and it only took me 15 minutes, but still...
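Note to self for next time (a minimal sketch; the table and column names here are made up): turn on safe updates and wrap the fix in a transaction, so MySQL refuses a keyless UPDATE and I can check the row count before committing.

-- refuse UPDATE/DELETE without a key-based WHERE clause (or LIMIT)
SET SESSION sql_safe_updates = 1;

-- run the fix inside a transaction (InnoDB) and eyeball "rows affected" first
START TRANSACTION;
UPDATE document_stats SET cached_count = 0 WHERE customer_id = 42;
-- ROLLBACK if the affected-rows count looks wrong, otherwise:
COMMIT;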
The bug was caused partly by the fact that I have no unit tests, and that the application grew over 3 years of development from a simple one (one customer, document volumes around 50k) to over 40 customers and volumes over 2 million per month (who knows how many pages each), just one year after we completed all the needed features.
I have daily backups and logs of every API operation, but still.
I think this has gotten too big for one backend developer.
I got scared that I would lose money, since I'm a contractor and the only backend developer working on it.
I am so tired of this right now; I think I need a break from work.
Responsibility is killing me so hard right now.
It will take a week to get back to normal.
-
A couple of years ago I was working on a fairly large system with a complex (by necessity) access control architecture.
As is usually the case with those projects, it's awkward for developers to repro bugs that have to do with a user's access rights in production, since we're not allowed to replicate production data in test, let alone locally.
We had a bug where I ended up making myself a new row in the production database, for a thing I could have access to without affecting real data, so I could repro the bug safely. I identified the bug so I could repro it in dev/test, then removed the row and made sure everything worked normally. Whew, scary.
Have you ever walked into the office one day, and everyone is hunched over in a semicircle around one person's workstation, before one turns around to look at you and says - after a pause - "... ltlian?.."
Turns out I had basically "poisoned the well" with my dummy entity, in a way where production now threw 500 for everyone BUT me, who had transitive access to this post-non-entity. Due to the scope of the system, it had taken about a day for this to gradually propagate through caching and eventual consistency; new entities coming in was expected, but not entities disappearing.
Luckily I had a decent enough track record for this to pass as a one-off. I sometimes think about how I would explain testing in prod and making it faceplant before going home for the day, other than "I assumed it would be fine". I would fire me.
-
I hate the Elasticsearch backup API.
From beginning to end it's a painful experience.
I'll try to explain, but I don't think I'll be able to cover it all.
The core concepts are:
- repository (storage for snapshots)
- snapshots (the actual backups)
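In API terms that looks roughly like this (a sketch; the "fs" repository type assumes path.repo is configured on the nodes, and the path is made up):

PUT /_snapshot/storage
{ "type": "fs", "settings": { "location": "/mnt/es_backups" } }

PUT /_snapshot/storage/backup

The first call registers the repository; the second creates a snapshot named "backup" in it.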
The first design flaw is that every backup in a repository is incremental. ES creates an incremental filesystem tree.
Some reasons why this is a bad idea:
- deletion of (older) backups is slow, as newer backups need to be checked for integrity
- you simply have to trust ES to do the right thing (given the bugs it has... it seems like a very bad idea TM)
- you have no way to verify a snapshot on its own
Workaround... create many repositories, since each new repository forces a full backup.
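In practice that means something like one repository per period (a sketch; the naming scheme is ours, not anything ES gives you):

PUT /_snapshot/storage_2020_06
{ "type": "fs", "settings": { "location": "/mnt/es_backups/2020_06" } }

The first snapshot into the fresh repository is then effectively a full backup again.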
The second thing: ES scales. Many nodes / ES instances form a cluster.
Usually backup APIs incorporate this in their design. ES does not.
If an index spans 12 nodes and you use network storage, then yes: up to 12 nodes will each open an (e.g.) NFS connection and start backing up.
That might not sound so bad with 12 nodes and one index...
But it gets pretty bad with hundreds of indices and several dozen nodes...
And there is no real rate limiting in ES. You can plug a few holes, but all in all, if you don't plan your backups carefully, you'll get some pretty f*cked-up network congestion.
So traffic shaping must be added manually. Yay...
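One of the few holes you can plug is the per-repository throttle (a sketch; the number is arbitrary), though note it limits each node individually, not the cluster's aggregate:

PUT /_snapshot/storage
{
  "type": "fs",
  "settings": {
    "location": "/mnt/es_backups",
    "max_snapshot_bytes_per_sec": "20mb"
  }
}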
The last thing is the API itself.
It's a... very fragile thing.
Especially in older ES releases, the documentation is like handing you a flex instead of toilet paper for a wipe.
Documentation != API != Reality.
The fault handling especially left me speechless more than once...
E.g.:
/_snapshot/storage/backup
gives you the state PARTIAL.
/_snapshot/storage/backup/_status
gives you the state SUCCESS.
Why? The first call is blocking and refers to the state of the backup itself. The second one isn't blocking and refers to the backup operation.
And yes: the backup operation's state is SUCCESS, while the backup's state may be PARTIAL (i.e. no full backup was made; there were errors).
So we now have an additional API of our own that wraps Elasticsearch's, with all these shiny scary workarounds like polling, since some of ES's APIs block, which can lead to a gateway timeout...
Gateway timeout? Yes. Some operations can run for a LONG time (multiple hours), and you don't want a ton of open connections hogging resources... so you let the load balancer kill them. Most operations simply keep running in ES in the background while the connection is killed.
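The pattern that actually survives a load balancer is fire-and-poll (a sketch):

PUT /_snapshot/storage/backup?wait_for_completion=false

returns immediately while the snapshot runs in the background; then you poll

GET /_snapshot/storage/backup

and check snapshots[0].state until it leaves IN_PROGRESS, treating anything other than SUCCESS as a failed backup.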
So much joy and fun, isn't it?
Now add the latest SMR scandal and a few faulty (as in SMR instead of CMR) HDDs in a hundred-terabyte ZFS pool, and you'll get my frustration level.
PS: The cluster has several dozen terabytes and a lot of nodes. If you have good advice, you're welcome - but please think carefully about that fact.
I might have accidentally vaporized people who sent me links to solutions that don't work at Large Scale TM.