			Comments
		- 
				
				Oh and the worst thing is, even if I remove the call to the rasterize function, at which point the shader has no externally visible side effects and thus should be optimized to basically a no-op, it still takes 0.25 ms... of 0.6 ms total... without doing literally anything
 
 Does Jensen Huang have a deal with the devil or something, sacrificing babies in order for his graphics cards to be able to go backwards in time and compute things before they even exist??
- 
				
				iirc GPUs now have more overhead cost to do simple functionality because they're optimized to do complicated functionality
 
 so basically if you do something simple it goes through the pipeline to do complex things and the two have the same performance
 
 this way they didn't have to put different pipelines in the GPU and could stuff more raw power into it for advanced games without wasting die real estate on basic old video game functionality... basically power over adaptability
- 
				
				@Demolishun No, it's basically what UE5's Nanite is doing. It's pretty crazy though because "compute shader all the things" has been a meme for quite some time now, but holy shit, I didn't know that "don't use the literal built-in hardware at all" was also a thing
- 
				
				@jestdotty I feel like there's an xkcd about this.. the more features you have the slower it gets. And then some newcomer comes and is insanely fast.... until they have feature parity, at which time they are just as slow lol





Sometimes I just don't know what to say anymore
I'm working on my engine and I really wanna push high triangle counts. I'm doing a pretty cool technique called visibility rendering and it's great because it kind of balances out some known causes of bad performance on GPUs (namely that pixels are always rasterized in quads, which is especially bad for small triangles)
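To put a number on the quad problem, here's a tiny CPU-side sketch in Python. It assumes the usual model where the hardware shades pixels in 2x2 quads and launches all four lanes of any quad that a triangle touches. The function name `quad_lane_efficiency` is made up for illustration, and real hardware has more going on than this.

```python
def quad_lane_efficiency(covered):
    """covered: set of (x, y) pixels that a triangle actually covers.

    Returns (covered pixel count, shader lanes launched, efficiency),
    assuming every touched 2x2 quad launches all four of its lanes.
    """
    if not covered:
        return 0, 0, 0.0
    # Each pixel belongs to the quad at (x // 2, y // 2).
    quads = {(x // 2, y // 2) for (x, y) in covered}
    lanes = 4 * len(quads)
    return len(covered), lanes, len(covered) / lanes
```

Under this model a one-pixel triangle still pays for four lanes, so only 25% of the shading work is useful, while a big triangle that fills whole quads gets close to 100%.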
So then I come across this post https://tellusim.com/compute-raster... which shows some fantastic results and just for the fun of it I implement it. Like not optimized or anything just a quick and dirty toy demo to see what sort of performance I can get
... I just don't know what to say. Using actual hardware accelerated rasterization, which GPUs are literally designed to be good at, I render about 37 million triangles in 3.6 ms. Eh, fine but not great. Then I implement this guy's unoptimized(!) software rasterizer and I render the same scene in 0.5 ms?!
IT'S LITERALLY A COMPUTE SHADER. I rasterize the triangles manually IN SOFTWARE and write them out with 64-bit atomic image stores. HOW IS THIS FASTER THAN ACTUAL HARDWARE!???
AND BY LIKE AN ORDER OF MAGNITUDE AT THAT???
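For anyone wondering what "rasterize manually and write with 64-bit atomics" looks like, here's a rough single-threaded sketch in Python. It is not the code from the linked post or from my engine. It assumes a common visibility-buffer packing (depth bits in the high 32 bits, triangle ID in the low 32 bits) so that one integer min does the depth test and the write together. On a GPU that `min` would be a single 64-bit atomic on the image or buffer. The names `pack`, `edge` and `rasterize` are made up for illustration.

```python
import struct

EMPTY = 0xFFFFFFFFFFFFFFFF  # "nothing drawn yet": loses against every real key


def pack(depth, tri_id):
    # Depth in the high 32 bits, triangle ID in the low 32 bits.
    # For non-negative floats the IEEE-754 bit pattern sorts like the value,
    # so an integer min over packed keys keeps the nearest triangle.
    depth_bits = struct.unpack("<I", struct.pack("<f", depth))[0]
    return (depth_bits << 32) | (tri_id & 0xFFFFFFFF)


def edge(a, b, p):
    # Twice the signed area of triangle (a, b, p), using x and y only.
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(vis, width, height, tri_id, v0, v1, v2):
    """Vertices are (x, y, depth) in pixel coordinates, depth in [0, 1]."""
    area = edge(v0, v1, v2)
    if area == 0:
        return  # degenerate triangle, covers nothing
    # Bounding box of the triangle, clamped to the screen.
    min_x = max(int(min(v0[0], v1[0], v2[0])), 0)
    max_x = min(int(max(v0[0], v1[0], v2[0])), width - 1)
    min_y = max(int(min(v0[1], v1[1], v2[1])), 0)
    max_y = min(int(max(v0[1], v1[1], v2[1])), height - 1)
    for y in range(min_y, max_y + 1):
        for x in range(min_x, max_x + 1):
            p = (x + 0.5, y + 0.5)  # sample at the pixel center
            # Dividing by the signed area makes both windings work,
            # so there is no backface culling here.
            w0 = edge(v1, v2, p) / area
            w1 = edge(v2, v0, p) / area
            w2 = edge(v0, v1, p) / area
            if w0 < 0 or w1 < 0 or w2 < 0:
                continue  # pixel center is outside the triangle
            depth = w0 * v0[2] + w1 * v1[2] + w2 * v2[2]
            i = y * width + x
            # Stand-in for the GPU's 64-bit atomic min.
            vis[i] = min(vis[i], pack(depth, tri_id))
```

Usage is just `vis = [EMPTY] * (width * height)` followed by one `rasterize` call per triangle. Because the write is a min over packed keys, the result doesn't depend on the order the triangles arrive in, which is what lets thousands of GPU threads hammer the same pixels without locks.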
Like I even tried doing some optimizations like backface cone culling on the meshlets, but doing that makes it slower. HOW. I'm rendering 37 million triangles without ANY fancy tricks. No hi-z depth culling which a GPU would normally do. No backface culling which a GPU would normally do. Not even damn clipping of triangles. I render ALL of them ALL the time. At 0.5 ms
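For reference, the cone culling test I mean is roughly the following, sketched in Python. This is one common formulation and not necessarily the exact one in my engine: each meshlet stores a cone apex, a unit cone axis and a cutoff, and the meshlet gets skipped when the view direction from the camera to the apex lines up with the axis more than the cutoff allows. The name `cone_culled` and the parameter layout are assumptions for illustration.

```python
import math


def cone_culled(camera, apex, axis, cutoff):
    """Return True if the whole meshlet faces away from the camera.

    camera, apex: (x, y, z) positions. axis: unit vector. cutoff: scalar.
    """
    # Direction from the camera to the cone apex.
    d = (apex[0] - camera[0], apex[1] - camera[1], apex[2] - camera[2])
    length = math.sqrt(d[0] * d[0] + d[1] * d[1] + d[2] * d[2])
    if length == 0.0:
        return False  # camera sits on the apex, don't cull
    dot = (d[0] * axis[0] + d[1] * axis[1] + d[2] * axis[2]) / length
    return dot >= cutoff
```

It's a handful of multiply-adds per meshlet, which is why it's so strange that adding it made the frame slower rather than faster.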
rant
wtf
wtaf
gamedev
shader