Hacker News

pan69 14 hours ago

It's certainly the most rudimentary. A small optimisation to the inner loop would be to pre-calculate the scanline offset before entering the pixel loop:

    int s = y*screenRect.w;  // scanline offset, computed once per row

    for (int x = 0; x < screenRect.w; x++) {
        pixels[s + x] = argb(255, frame>>3, y+frame, x+frame);
    }

kmill 12 hours ago

I'd be surprised if the compiler didn't make that optimisation on its own.

canyp 9 hours ago

Possibly, but always check the assembly.

The even faster version, compiler optimizations aside, would be to initialize a pointer to pixels + y*screenRect.w and increment it on every iteration, which avoids the addressing arithmetic inside the loop.
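A rough sketch of what that might look like for a single scanline. The argb helper here is an assumption (the thread never shows its definition), as are the function name fill_row_ptr and passing the width as a plain int instead of screenRect.w:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed helper, not from the thread: packs the low 8 bits of each
 * channel into a 0xAARRGGBB value. */
static uint32_t argb(unsigned a, unsigned r, unsigned g, unsigned b) {
    return ((uint32_t)(a & 0xFFu) << 24) | ((uint32_t)(r & 0xFFu) << 16) |
           ((uint32_t)(g & 0xFFu) << 8) | (uint32_t)(b & 0xFFu);
}

/* Hypothetical name: fills row y by walking a pointer instead of
 * indexing pixels[s + x]. */
static void fill_row_ptr(uint32_t *pixels, int w, int y, int frame) {
    uint32_t *p = pixels + (size_t)y * (size_t)w; /* start of row y */
    for (int x = 0; x < w; x++) {
        *p++ = argb(255, frame >> 3, y + frame, x + frame);
    }
}
```

Whether this actually beats the indexed form depends on the compiler and flags, so it is still worth comparing the generated assembly.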

kmill 5 hours ago

Certainly check the assembly, but loop invariant code motion and strength reduction are basic optimizations. C compilers tend to be good at optimizing indexing patterns even at -O1.

Take a look: GCC and Clang go further than these suggestions by adding screenRect.w to the pointer each iteration to avoid the multiplication: https://godbolt.org/z/YfroqK7T6
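For illustration, here is roughly what that transformation looks like if you apply it by hand at the source level. This is only a sketch of the idea, not the compilers' actual output; argb is an assumed helper and both function names are made up for the example:

```c
#include <stdint.h>

/* Assumed helper, not from the thread: packs the low 8 bits of each
 * channel into a 0xAARRGGBB value. */
static uint32_t argb(unsigned a, unsigned r, unsigned g, unsigned b) {
    return ((uint32_t)(a & 0xFFu) << 24) | ((uint32_t)(r & 0xFFu) << 16) |
           ((uint32_t)(g & 0xFFu) << 8) | (uint32_t)(b & 0xFFu);
}

/* The plain form: y * w is written out for every pixel. */
static void fill_indexed(uint32_t *pixels, int w, int h, int frame) {
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            pixels[y * w + x] = argb(255, frame >> 3, y + frame, x + frame);
        }
    }
}

/* Strength reduction by hand: the multiplication y * w is replaced by a
 * row pointer that advances by w after each row. */
static void fill_reduced(uint32_t *pixels, int w, int h, int frame) {
    uint32_t *row = pixels;
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            row[x] = argb(255, frame >> 3, y + frame, x + frame);
        }
        row += w; /* next scanline, no multiplication needed */
    }
}
```

Both functions write identical pixels; the second just spells out what the optimizer is expected to derive from the first.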

Writing anything but pixels[y*screenRect.w + x] in an attempt to be faster, without checking the assembly first, is obfuscation.

(For what it's worth, you can beat the compiler by using *pixels++. I didn't profile the code to check whether it was actually faster in practice, however.)
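A guess at what the *pixels++ variant could look like over a whole frame, assuming the rows are stored contiguously with no padding. The argb helper and the name fill_frame_ptr are assumptions, not code from the thread or the linked example:

```c
#include <stdint.h>

/* Assumed helper, not from the thread: packs the low 8 bits of each
 * channel into a 0xAARRGGBB value. */
static uint32_t argb(unsigned a, unsigned r, unsigned g, unsigned b) {
    return ((uint32_t)(a & 0xFFu) << 24) | ((uint32_t)(r & 0xFFu) << 16) |
           ((uint32_t)(g & 0xFFu) << 8) | (uint32_t)(b & 0xFFu);
}

/* Hypothetical name: one running pointer covers the whole frame, so no
 * index is computed per pixel or per row. Assumes the pitch equals w. */
static void fill_frame_ptr(uint32_t *pixels, int w, int h, int frame) {
    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            *pixels++ = argb(255, frame >> 3, y + frame, x + frame);
        }
    }
}
```

If the surface had a pitch larger than its width, the pointer would need an extra adjustment at the end of each row, and this simple form would no longer apply.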