1 Comment

Neat! I haven't done modern GPU programming, but I wonder if you can do some analog to double-buffering, which is a standard technique for display, especially when I/O plays a role.

The basic idea is to do the calculations on one set of the data, while doing I/O (or display) on the previous iteration. You then swap where you're doing things (once you're done with I/O, or ready to display the new frame).

In this case you'd have, e.g., half of the GPU units calculating while the other half (holding older, finished results) dump their results back to the CPU. The trick is getting the two phases to take roughly the same time; otherwise it's not a big benefit.
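To make the swap concrete, here is a minimal sketch in plain Python, not real GPU code. The names `compute`, `transfer`, and `run_double_buffered` are made up for illustration: `compute` stands in for the kernel, `transfer` for the copy back to the CPU, and a single worker thread plays the role of the I/O engine.

```python
from concurrent.futures import ThreadPoolExecutor


def compute(step, buf):
    # Hypothetical stand-in for the GPU calculation: fill the buffer
    # with this step's results.
    for i in range(len(buf)):
        buf[i] = step * 10 + i


def run_double_buffered(n_steps, size=4):
    buffers = [[0] * size, [0] * size]  # the two buffers we alternate between
    received = []                       # stand-in for memory on the CPU side

    def transfer(buf):
        # Hypothetical stand-in for dumping finished results back to the CPU.
        received.append(list(buf))

    pending = None  # the transfer currently in flight, if any
    with ThreadPoolExecutor(max_workers=1) as io:
        for step in range(n_steps):
            buf = buffers[step % 2]
            # Calculate into one buffer while the other buffer's
            # transfer (from the previous step) may still be running.
            compute(step, buf)
            # Wait for the previous transfer before starting the next one,
            # so a buffer is never overwritten while it is being read.
            if pending is not None:
                pending.result()
            pending = io.submit(transfer, buf)
        if pending is not None:
            pending.result()
    return received


if __name__ == "__main__":
    print(run_double_buffered(3, size=2))
```

Python threads won't give a real speedup for CPU-bound work like this, so the sketch only shows the ordering: each step's calculation overlaps the previous step's transfer, and the loop stalls whenever one phase takes much longer than the other.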

I guess it's really just another view of pipelining.

Anyway, thank you for an interesting and well-written article!
