And a barebones traditional CPU is a finite state machine plus random access memory. It teaches you mostly the same things about how you put together simple components into universal computation, while having programs that are far easier to comprehend.
And then for another perspective on computation, lambda calculus is very different and can broaden your thinking. After that you could look at Turing machines and get some value, but only niche value at that point. I wouldn't call them important if you already understand the very low level, and you should not use them as the model for teaching the very low level.
>while having programs that are far easier to comprehend.
If you want to learn the fundamentals of something, shouldn't you want to, you know, think about the fundamentals?
My argument is that FSM+tape and FSM+RAM are at the same level of "fundamental", but one is easier to understand, so it should be the thing you teach with. Being more obscure is not better.
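To make the comparison concrete, here is a rough sketch of "add one to a number" on both models. Everything in it is made up for illustration: the rule table, the state names, and the four-instruction set (`LOAD`, `ADDI`, `STORE`, `HALT`) are assumptions, not any real machine. Note that the RAM version gets an add instruction for free and Python integers are unbounded, which is part of why its program is shorter.

```python
# FSM + tape: binary increment as a transition table.
# (state, symbol read) -> (symbol to write, head move, next state)
# "_" is the blank symbol. All names here are hypothetical.
TAPE_RULES = {
    ("seek", "0"): ("0", +1, "seek"),    # walk right to the end of the number
    ("seek", "1"): ("1", +1, "seek"),
    ("seek", "_"): ("_", -1, "carry"),   # past the last bit, turn around
    ("carry", "1"): ("0", -1, "carry"),  # 1 + carry = 0, keep carrying
    ("carry", "0"): ("1", 0, "halt"),    # 0 + carry = 1, done
    ("carry", "_"): ("1", 0, "halt"),    # ran off the left edge, new top bit
}


def run_tape_machine(bits):
    """Run the rule table on a binary string and return the resulting tape."""
    tape = dict(enumerate(bits))  # sparse tape: position -> symbol
    head, state = 0, "seek"
    while state != "halt":
        symbol = tape.get(head, "_")
        write, move, state = TAPE_RULES[(state, symbol)]
        tape[head] = write
        head += move
    return "".join(tape[i] for i in sorted(tape)).strip("_")


# FSM + RAM: the state is a program counter plus one accumulator,
# and the program reads and writes memory by address.
INCREMENT = [
    ("LOAD", 0),   # acc = memory[0]
    ("ADDI", 1),   # acc = acc + 1
    ("STORE", 0),  # memory[0] = acc
    ("HALT", 0),
]


def run_ram_machine(program, memory):
    """Execute the toy instruction list against a list used as RAM."""
    pc, acc = 0, 0
    while True:
        op, arg = program[pc]
        pc += 1
        if op == "LOAD":
            acc = memory[arg]
        elif op == "ADDI":
            acc += arg
        elif op == "STORE":
            memory[arg] = acc
        elif op == "HALT":
            return memory


print(run_tape_machine("1011"))               # binary 11 + 1
print(run_ram_machine(INCREMENT, [0b1011]))   # same number, stored in cell 0
```

Both are a small finite controller bolted onto unbounded storage. The difference is that the RAM program reads top to bottom like ordinary code, while the tape version has to be traced rule by rule to see what it does.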