Significant further gains are possible by simply unrolling the loop 8 or 16 times, which spreads the cost of each DBF over that many words written instead of paying it on every word.
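The unrolling idea can be sketched in C rather than 68000 assembly. This is only an illustration under stated assumptions: the function name `fill_words_unrolled`, the fill-with-a-constant workload, and the unroll factor of eight are all hypothetical and not taken from the original routine. The point is that the loop counter is tested and decremented once per eight stores, which is the role DBF plays in the assembly version.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the original text): store `value`
 * into `count` consecutive 16-bit words starting at `dst`. */
void fill_words_unrolled(uint16_t *dst, uint16_t value, size_t count)
{
    size_t blocks = count / 8;  /* number of full groups of eight words */
    size_t tail   = count % 8;  /* leftover words after the full groups */

    /* Unrolled body: one counter test and branch per eight stores. */
    for (; blocks > 0; blocks--) {
        dst[0] = value;
        dst[1] = value;
        dst[2] = value;
        dst[3] = value;
        dst[4] = value;
        dst[5] = value;
        dst[6] = value;
        dst[7] = value;
        dst += 8;
    }

    /* Remainder loop: handles counts that are not a multiple of eight. */
    for (; tail > 0; tail--) {
        *dst++ = value;
    }
}
```

A call such as `fill_words_unrolled(buffer, 0, 320)` would run the unrolled body 40 times and skip the remainder loop. An optimizing C compiler may unroll a plain loop on its own; in hand-written assembly the unrolling has to be done explicitly.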