1
rox79
4y

I've got a situation here.

I'm getting a 524 error from Cloudflare. I send some data via AJAX, process it, and then return the result. The data is large and there's some heavy SQL manipulation involved, so it takes a long time. I already moved the processing to the back end, but even 10k records take 4-5 minutes to process. Everything works fine otherwise, but Cloudflare's response timeout is about 1-2 minutes, so it throws a 524 (it never gets a response within its time frame). How am I supposed to tackle this? Maybe with a job scheduler? My client flat out refuses to send smaller data. A friend suggested dropping AJAX and simply reloading the page, but the data is so large that a full page load would hit the same 524. Kinda stuck here. Any idea/suggestion on how I can proceed?
The language I'm using is PHP. Database: MySQL and SQL.
Hmm, here is some more explanation:
https://github.com/marcialpaulg/...
But it's not working.
Here is also something:
https://stackoverflow.com/questions...
But I'm wondering: why the redirect? It doesn't make sense to me.

Comments
  • 2
    The part you're doing wrong is trying to process large data by sending it to a web server and waiting for a response. That's not what the web is for. Also, I'm 99% sure you're trying to do something really dumb, because when you have a legit reason to send that much data and process it on a server, you normally already know how to do it the right way (hint: it doesn't include waiting for the task to complete within the same request).
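    For illustration, a minimal PHP sketch of that pattern: the request only enqueues the work and answers immediately, and the browser polls a second endpoint for the result. Everything here (the job_queue table, the DSN, the credentials) is invented for the example.

    ```php
    <?php
    // enqueue.php - accept the upload, queue it, answer in milliseconds.
    // Hypothetical schema: job_queue(id, payload, status, created_at)

    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);

    // Store the raw request body as the job payload instead of processing it now.
    $payload = file_get_contents('php://input');

    $pdo->prepare("INSERT INTO job_queue (payload, status, created_at)
                   VALUES (?, 'pending', NOW())")
        ->execute([$payload]);

    // Cloudflare gets its response right away, so no 524.
    http_response_code(202); // 202 Accepted: the work happens later
    header('Content-Type: application/json');
    echo json_encode(['job_id' => $pdo->lastInsertId(), 'status' => 'pending']);
    ```

    ```php
    <?php
    // status.php?id=123 - the page polls this until the job is done.
    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
    $stmt = $pdo->prepare('SELECT status FROM job_queue WHERE id = ?');
    $stmt->execute([(int) ($_GET['id'] ?? 0)]);
    header('Content-Type: application/json');
    echo json_encode(['status' => $stmt->fetchColumn() ?: 'unknown']);
    ```

    The AJAX side then polls status.php every few seconds instead of holding a single request open for five minutes.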
  • 0
    What @hitko said. You may want to look into event-driven architecture and SSE (server-sent events), though I'm not actually sure how to apply that in PHP.
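    In plain PHP, SSE boils down to a long-lived script that keeps flushing "data:" lines. A sketch, reusing the same hypothetical job_queue table as above; the page would open it with new EventSource('progress.php?id=123'):

    ```php
    <?php
    // progress.php - server-sent events: push job status to the browser.

    header('Content-Type: text/event-stream');
    header('Cache-Control: no-cache');
    header('X-Accel-Buffering: no'); // ask nginx not to buffer the stream

    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret');
    $stmt = $pdo->prepare('SELECT status FROM job_queue WHERE id = ?');

    while (!connection_aborted()) {
        $stmt->execute([(int) ($_GET['id'] ?? 0)]);
        $status = $stmt->fetchColumn() ?: 'unknown';

        // Each SSE message is a "data: ..." line followed by a blank line.
        echo 'data: ' . json_encode(['status' => $status]) . "\n\n";
        @ob_flush();
        flush();

        if ($status === 'done' || $status === 'failed') {
            break;
        }
        sleep(2); // re-check the queue every couple of seconds
    }
    ```

    Because bytes keep flowing every couple of seconds, the connection should stay alive through the proxy; that said, plain polling of a status endpoint works just as well and is simpler to debug.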
  • 0
    There are a few things that come to mind to get this working again.

    1) optimise the query to reduce execution time.

    2) once optimised to the point it doesn't get any better, load the result into a temporary file instead of sending it back to the browser, and zip it up.
    Then email the user a link to download the file, and have the file deleted afterwards.

    3) if this still takes too long, run it as a cron job from the server side: create a DB table that acts as a queue, and schedule a job to read that table; any record in it would generate said data export and remove itself from the queue. This removes the time dependency, since the server is initiating the PHP job and not Apache/Nginx (see the worker sketch after this list).

    4) if it still doesn't work, log into the DB yourself and run the query directly 😅

    5) go back to 1, but now add LIMIT and OFFSET to the end of the query, make them arguments you can send through, and download the data in chunks instead of the entire thing at once (see the chunked loop sketch further down).

    5.1) I see you made it back here; now try again with smaller chunks until the thing works.
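    A sketch of what option 3 could look like as a cron-driven worker. Table, column, and file names are invented; the one real trick is the conditional UPDATE, which stops two overlapping cron runs from claiming the same job:

    ```php
    <?php
    // worker.php - run from cron, e.g. every minute:
    //   * * * * * php /var/www/app/worker.php >> /var/log/worker.log 2>&1
    // No Apache/Nginx/Cloudflare in the loop, so nothing can 524.

    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);

    // Grab the oldest pending job.
    $job = $pdo->query("SELECT id, payload FROM job_queue
                        WHERE status = 'pending' ORDER BY id LIMIT 1")->fetch();
    if (!$job) {
        exit; // queue is empty
    }

    // Claim it; the WHERE status = 'pending' acts as a cheap lock.
    $claim = $pdo->prepare("UPDATE job_queue SET status = 'running'
                            WHERE id = ? AND status = 'pending'");
    $claim->execute([$job['id']]);
    if ($claim->rowCount() === 0) {
        exit; // another worker run claimed it first
    }

    try {
        // --- the heavy SQL manipulation on $job['payload'] goes here ---
        // (ideally in chunks; see the next sketch)

        $pdo->prepare("UPDATE job_queue SET status = 'done' WHERE id = ?")
            ->execute([$job['id']]);
    } catch (Throwable $e) {
        $pdo->prepare("UPDATE job_queue SET status = 'failed' WHERE id = ?")
            ->execute([$job['id']]);
    }
    ```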
  • 1
    @C0D4 yeah, 3 seems like the last option. 😔 If nothing else helps, I'll create batches today and run a cron job.
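    For the batching part, a sketch of option 5 applied inside that cron job: process the records a slice at a time so no single statement has to chew through everything at once (the raw_records table and the chunk size are placeholders):

    ```php
    <?php
    // Process records in LIMIT/OFFSET chunks inside the worker.

    $pdo = new PDO('mysql:host=localhost;dbname=app', 'user', 'secret', [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);

    $chunkSize = 1000;
    $offset = 0;

    while (true) {
        $stmt = $pdo->prepare('SELECT * FROM raw_records
                               ORDER BY id LIMIT :lim OFFSET :off');
        $stmt->bindValue(':lim', $chunkSize, PDO::PARAM_INT);
        $stmt->bindValue(':off', $offset, PDO::PARAM_INT);
        $stmt->execute();
        $rows = $stmt->fetchAll(PDO::FETCH_ASSOC);

        if (!$rows) {
            break; // no records left
        }

        // One transaction per chunk keeps each commit small and restartable.
        $pdo->beginTransaction();
        foreach ($rows as $row) {
            // ... per-record SQL manipulation here ...
        }
        $pdo->commit();

        $offset += $chunkSize;
    }
    ```

    If OFFSET itself becomes slow on a big table, seeking on the last seen id (WHERE id > :lastId ORDER BY id LIMIT :lim) avoids re-scanning the skipped rows.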
  • 1
    @rox79 no. 3 would pretty much be my first option if this had even the remotest possibility of becoming a recurring issue (and no. 1 as well: optimize the fuck out of everything!)
  • 1
    @rox79 you may want to reread these so future-you can consider the options.

    @100110111 depends on skill level; #1 is usually enough though, unless it's a tonne of data or really poor joins.
  • 0
    @C0D4 fair enough. My issue with optimising the queries as the only course of action is that it doesn't necessarily scale well (assuming the DB also gets new data inserted into it and actually grows in size). I personally would want to remove a future pain point as early as possible; people don't think about scalability nearly enough, I've come to notice. We have an example in an application I work on, where some daily queries were designed and implemented 3-4 years ago. They ran fast enough back then, though it was still a big enough job to need a solution much like your 3rd point. But the data being queried has grown so vast that the job now takes over 2h to complete. It* was easy to circumvent, but the problem will persist until I have the time to tackle the issue properly. I know how to fix it, but I'm stuck with more urgent issues now, so it'll have to wait.

    * the job had a 2h timeout, after which it would be considered failed and retried. So for a while we got the results of the job twice, and since it ran twice, it also incurred extra computing costs.