Do any of these tests measure the new experimental tail call interpreter (https://docs.python.org/3.14/using/configure.html#cmdoption-...)?

I couldn't find any note of it, so I would assume not.

It would be interesting to see how the tail call interpreter compares to the other variants.

The build of Python that I used has tail calls enabled (option --with-tail-call-interp). So that was in place for the results I published. I'm not sure if this optimization applies to recursive tail calls, but if it does, my Fibonacci test should have taken advantage of the optimization.

The tail calls in question are C tail calls inside the inner interpreter loop. They have nothing to do with Python function calls.

That tells you how much I know about the feature. :) But in any case, I'm positive that the flag was enabled, so my results are with tail calls. I suppose part of the difference between 3.13 and 3.14 could be thanks to this.
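
For anyone who wants to double-check their own build, here is a minimal sketch that looks for the flag in the configure arguments Python records about itself. It assumes a POSIX-style build where `sysconfig.get_config_var("CONFIG_ARGS")` is populated (on Windows it is typically None), and the helper name `built_with_tail_call_interp` is just something made up for illustration.

```python
import shlex
import sysconfig


def built_with_tail_call_interp(config_args=None):
    """Return True/False if the configure args are known, else None.

    Hypothetical helper: it only inspects the recorded ./configure
    arguments, it does not probe the interpreter itself.
    """
    if config_args is None:
        config_args = sysconfig.get_config_var("CONFIG_ARGS")
    if not config_args:
        # No recorded configure args (e.g. Windows builds): cannot tell.
        return None
    tokens = shlex.split(config_args)
    enabled = ("--with-tail-call-interp", "--with-tail-call-interp=yes")
    return any(token in enabled for token in tokens)


print(built_with_tail_call_interp())
```

A result of None only means the build did not record its configure arguments, not that the feature is off.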

Good to know! Thanks for confirming. Yes, I would guess that the tail call interpreter explains part of the difference between 3.13 and 3.14. Earlier measurements put the overall interpreter improvement at 1-5%, or even 10-15% depending on the compiler version you are using: https://blog.nelhage.com/post/cpython-tail-call/

If your benchmark setup is easy to re-run, it would be awesome to see numbers that compare the tail call interpreter to the build where it is disabled, to isolate how much improvement is due to that.
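
Something like the following could serve as the comparison script: run the same file under both builds and compare the printed times. This is only a sketch, not the benchmark from the article; the `fib` workload, the `best_time` helper, and the argument values are assumptions chosen for illustration.

```python
import sys
import timeit


def fib(n):
    # Naive recursive Fibonacci, used here purely as an interpreter-heavy workload.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def best_time(n=20, repeat=5):
    # Take the minimum over several runs to reduce noise from the system.
    return min(timeit.repeat(lambda: fib(n), number=1, repeat=repeat))


if __name__ == "__main__":
    print(sys.version)
    print(f"fib(25): {best_time(25):.4f} s")
```

Running it once with a build configured with --with-tail-call-interp and once with a build configured without it, on the same machine and compiler, would isolate the effect of that one option.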

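It wouldn’t have, since

    fib(n-1) + fib(n-2)

isn’t a tail call: the addition still has to run after both recursive calls return. And as noted above, the tail call interpreter is about C tail calls inside the interpreter loop, so it wouldn’t eliminate Python-level tail calls in any case.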
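
To make the distinction concrete, here is a small sketch contrasting the two shapes. `fib_acc` and `overflows` are hypothetical names for illustration. The accumulator version does end in a tail call, yet CPython still pushes a new frame for every call, so a deep enough recursion should hit the recursion limit either way.

```python
import sys


def fib(n):
    # Not a tail call: the addition runs after both recursive calls return.
    return n if n < 2 else fib(n - 1) + fib(n - 2)


def fib_acc(n, a=0, b=1):
    # Tail-call form: the recursive call is the last thing the function does.
    return a if n == 0 else fib_acc(n - 1, b, a + b)


def overflows(depth):
    # Report whether the tail-recursive version exhausts the recursion limit.
    try:
        fib_acc(depth)
    except RecursionError:
        return True
    return False


assert fib(15) == fib_acc(15) == 610
print(overflows(sys.getrecursionlimit() + 100))
```

If Python-level tail calls were being eliminated, the last line would report no overflow; on CPython it is expected to report one.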