Search - "-o3"
Had to write and optimize a C++ program that, for each point of a 1000x1000x1000 grid, finds its 100 nearest points (with an additional factor to make it more complicated).
It had to run in under 18 minutes to pass. No matter what I did, I couldn't get it fast enough. I tried kd-trees, caching certain points, optimizing the distance calculations by omitting any irrelevant factor, saving the points' calculated squares, etc. When I was down to 20 minutes, I realized that my makefile had an error and was ignoring the -O3 flag...
Well, it actually ran in 5 minutes with -O3.
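The rant doesn't show any code, so this is only a guess at what "omitting any irrelevant factor" could look like: ranking neighbours by squared distance, which orders them exactly like the real distance but skips the sqrt. A minimal brute-force sketch; `Point`, `dist2` and `k_nearest` are made-up names, and the "additional factor" from the assignment is not modelled:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Point { double x, y, z; };

// Squared Euclidean distance. It is monotone in the true distance,
// so it ranks neighbours identically without calling sqrt.
inline double dist2(const Point& a, const Point& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return dx * dx + dy * dy + dz * dz;
}

// Returns the indices of the k points nearest to `query`, closest first.
// Brute force over all points; a kd-tree would replace this scan.
std::vector<std::size_t> k_nearest(const std::vector<Point>& pts,
                                   const Point& query, std::size_t k) {
    std::vector<std::size_t> idx(pts.size());
    for (std::size_t i = 0; i < idx.size(); ++i) idx[i] = i;
    k = std::min(k, idx.size());
    // Only the first k positions need to end up sorted.
    std::partial_sort(idx.begin(),
                      idx.begin() + static_cast<std::ptrdiff_t>(k),
                      idx.end(),
                      [&](std::size_t a, std::size_t b) {
                          return dist2(pts[a], query) < dist2(pts[b], query);
                      });
    idx.resize(k);
    return idx;
}
```

None of this helps much if the flag never reaches the compiler, so it is worth checking the actual compile line that make prints for -O3 before tuning anything else.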
So I have a teacher whose idea of "C++" is basically C with a .cpp file extension and the -O0 compiler flag.
The last assignment was to implement some arbitrary, lengthy calculation with a tight requirement of max 1 second of runtime, to force us to basically hand-roll C code without using std or any form of abstraction. But because the language didn't freeze in time in 1998, there is a little keyword named "constexpr" that folded all my classes, arrays, iterators, virtual methods, std::algorithms, etc. into a single return statement, making my code the fastest submitted.
Moral of the story: use the language to the fullest and always turn on the damn optimizer.
Ok now I'm done 😚
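The assignment's code isn't in the rant, so here is just a small sketch of the idea, assuming C++14 or later (loops inside constexpr functions). `sum_of_squares` is a made-up stand-in for the "lengthy calculation". Initialising a constexpr variable forces the compiler to do the work at build time, so the runtime function is left with a constant to return:

```cpp
// Hypothetical stand-in for the lengthy calculation: 1^2 + 2^2 + ... + n^2.
constexpr unsigned long long sum_of_squares(unsigned n) {
    unsigned long long total = 0;
    for (unsigned i = 1; i <= n; ++i) {
        total += static_cast<unsigned long long>(i) * i;
    }
    return total;
}

// A constexpr variable must be initialised by a constant expression,
// so the loop above runs inside the compiler, not in the program.
constexpr unsigned long long kAnswer = sum_of_squares(1000);

// Checked at compile time: n(n+1)(2n+1)/6 for n = 1000.
static_assert(kAnswer == 333833500ULL, "computed during compilation");

// At runtime this only hands back the precomputed constant.
unsigned long long answer() { return kAnswer; }
```

Virtual methods in constant expressions need C++20, so the sketch leaves them out.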
Here is another rather big example of how C++ is WAY slower than assembler (picture)
Sure - std::copy is convenient
but asm is just way faster.
This code should be compatible with EVERY x86_64 CPU.
I even do Duff's device without having the loop:
the loop happens inside the rep instruction, which allows for prefetching (meaning that it doesn't destroy the prefetch queue and can even allow for preprocessing).
BTW: for those who commented on my comment porn last time: I made sure to satisfy your cravings ;-)
To those who can't make sense of my command line:
C++ 1m24s
ASM 19s
To those who tell me to call clang with -O<something>:
1) clang removes the call to copy at -O3 or -O2
2) the result isn't better at -O1 (well... one second, but that might be due to so many other things, and even if... one second isn't that much)
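The actual code is only in the picture, so this is not it, just a minimal sketch of what a rep-based copy can look like with GCC/Clang extended inline asm on x86_64. `copy_bytes` is a made-up name, and the memcpy fallback for other targets is my addition:

```cpp
#include <cstddef>
#include <cstring>

// Copies n bytes from src to dst. The regions must not overlap.
inline void copy_bytes(void* dst, const void* src, std::size_t n) {
#if defined(__x86_64__) && defined(__GNUC__)
    // rep movsb copies RCX bytes from [RSI] to [RDI]; the loop runs
    // inside the instruction. It assumes the direction flag is clear,
    // which the x86_64 calling conventions require at function entry.
    asm volatile("rep movsb"
                 : "+D"(dst), "+S"(src), "+c"(n)  // all three are updated
                 :
                 : "memory");                     // memory is read and written
#else
    std::memcpy(dst, src, n);  // portable fallback on other targets
#endif
}
```

On point 1: an optimizer is allowed to drop a copy whose destination is never read afterwards, so a benchmark usually has to consume the result (compare it, checksum it, print a byte) before the -O2/-O3 numbers mean anything.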
Everyone wants faster programs, so doing more optimisations with GCC at -O3 instead of -O2 makes the program quite a bit larger, but... SLOWER. Makes sense, right? Why do you even have -O3 if it generates larger AND slower binaries than -O2?
Ah, I see, it's because you use that level only on individual hot functions, not on the full program. How do I do that? Function attribute for optimisation. Cool. Uhm, what is the exact syntax? The fucking GCC documentation doesn't say that. When will devs finally learn to give bloody EXAMPLES?!
Googling around. Ah, with quotes, but without the leading hyphen it seems. Copy/paste. Compile again, tadaa: it's only a little bit but still FUCKING SLOWER than -O2!
GCC's -O3 is like that stupid kid at McD that ate like a damn horse, had to vomit afterwards and was even more hungry than before!
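For anyone else hunting for the syntax: a minimal sketch of the per-function attribute as I understand it from the GCC manual, with the level in quotes and no leading hyphen. `HOT_O3` and `dot` are made-up names. Clang does not implement this attribute, so the macro expands to nothing there:

```cpp
#include <cstddef>
#include <vector>

// GCC only: ask for -O3 on a single function, whatever the global level is.
#if defined(__GNUC__) && !defined(__clang__)
#  define HOT_O3 __attribute__((optimize("O3")))
#else
#  define HOT_O3
#endif

// Hypothetical hot function: dot product over the common prefix of a and b.
HOT_O3 long long dot(const std::vector<int>& a, const std::vector<int>& b) {
    const std::size_t n = a.size() < b.size() ? a.size() : b.size();
    long long acc = 0;
    for (std::size_t i = 0; i < n; ++i) {
        acc += static_cast<long long>(a[i]) * b[i];
    }
    return acc;
}
```

The GCC manual itself describes the optimize attribute as meant for debugging rather than production code, so measuring with and without it, as the rant did, seems like the right call.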