Optimized a query today. Before, it timed out after 10 minutes; now it takes 4.3 seconds. Very proud.

  • 2
    Hope it still carries out the same function as well...
  • 8
    @kamen I can make it so fast, it doesn't even exist heh
  • 6
we had a query that was so bad it wouldn't finish in a week. I got it down to 20s. There was a ton of garbage in it lol
  • 0
    n^2 -> log(n)?
  • 1
    @Lensflare more like n^2 -> itlog(n)?
  • 1
    @cb219 sry I don’t know what itlog means. I googled it and read something about eggs. Still confused.
  • 1
    @Lensflare BECAUSE EGGS
  • 3
ahhh, reminds me of the first time I tried to optimize a query. Got it from like 10 seconds to like 120 milliseconds, felt amazing :)

    (wasn't as awesome an optimization as it sounds, since it was basically just written (by me) in the worst way possible the first time; I gave it zero thought)
  • 1
    @Lensflare oh sry, just googled myself. 😂
I used a notation I learned at university. Apparently not the "official" one. I meant log*(n). Iterated log.
  • 2
    gz :)

    been there some years ago and sp_blitz helped me https://www.brentozar.com/blitz/

it was so satisfying to say that a query which executes multiple times a day got reduced from a minute or so to a few seconds, just because I removed an implicit conversion from varchar to nvarchar.
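The varchar/nvarchar pitfall mentioned above looks roughly like this (a minimal sketch; the `orders` table and `customer_code` column are made-up names):

```sql
-- Hypothetical: orders.customer_code is varchar and has an index on it.
-- Comparing it against an nvarchar literal forces SQL Server to implicitly
-- convert the column on every row, which prevents an index seek:
SELECT order_id FROM orders WHERE customer_code = N'C-1001';  -- scan

-- Matching the column's own type avoids the conversion and allows a seek:
SELECT order_id FROM orders WHERE customer_code = 'C-1001';
```

The difference is invisible in the query text unless you notice the `N` prefix (or a typed parameter), which is why this class of slowdown is easy to miss.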
  • 0
    What was the issue?
  • 0
Did you just change the query, or did you also add/improve indexes on your tables?
  • 0
Damn, looking at what I wrote, I think I could write an article about it

    Alright, time to explain what I did. Essentially we had a big query that would search for all device ids that should get a push notification for a new message while also considering the accounts currently on the device and the device settings.

    This in itself isn't that expensive. What made the difference between the new performance and the old performance was that we also had to calculate how many unread messages are currently on all accounts & sub-accounts of the device.

After examining the query I found that the part taking the most time was the condition where I'd check if the message should go to one of the device's sub-accounts, which are saved in an array on the message.

The fix for this was to
    1. Split this query into 2 parts: querying the receiver devices and counting the unread messages
    2. Apply a GIN index on the array of sub-accounts and replace `receiver_id IN message.sub_accounts` with `ARRAY[receiver_id] <@ message.receivers`, which essentially means "check if the array on the left is contained in the array on the right".

I know the second step sounds crazy, but that's essentially the only way I found to make it use an index, because `IN` doesn't make use of one, in PostgreSQL at least.
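The steps above can be sketched as follows (a hedged example, not the author's actual code; the table and column names `messages`, `receivers`, `devices`, `receiver_id` are assumptions for illustration):

```sql
-- Hypothetical schema: messages.receivers is an array column holding the
-- sub-account ids a message targets.

-- A GIN index lets PostgreSQL answer array-containment queries
-- without scanning every message row:
CREATE INDEX idx_messages_receivers ON messages USING GIN (receivers);

-- Before: this form does not use the GIN index.
-- SELECT d.device_id
-- FROM devices d JOIN messages m ON d.receiver_id = ANY (m.receivers);

-- After: the containment operator <@ ("is contained by") is one of the
-- operators the GIN index supports for array columns.
SELECT d.device_id
FROM devices d
JOIN messages m ON ARRAY[d.receiver_id] <@ m.receivers;
```

Wrapping the single id in `ARRAY[...]` looks odd, but `<@` compares array to array, so the scalar has to be lifted into a one-element array for the index-supported operator to apply.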

    Thanks to TDD for making this refactoring possible without bringing down prod.