x86 has decades of know-how and a zillion transistors to spend on making the memory pipeline, TLB caching, prefetching and so on really, really good. Those chips work as well as they do despite the 4k base page size, not because of it.

If you started from a clean sheet today, you'd probably end up with a somewhat bigger base page size. Not hugely larger though, as that would waste a lot of memory for most applications. Maybe 16k, like some ARM chips use?
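
To put rough numbers on that trade-off, here's a back-of-the-envelope sketch in Python. Everything in it is an assumption for illustration: the TLB entry count, the list of mapping sizes, and the function names are made up, not taken from any real CPU or workload. It compares how much address space a fixed number of TLB entries can cover ("reach") with how much memory is lost to rounding each mapping up to whole pages.

```python
# Hypothetical numbers for illustration only, not measurements of any real CPU.
TLB_ENTRIES = 1024  # assumed number of TLB entries
PAGE_SIZES = [4 * 1024, 16 * 1024, 64 * 1024]
# Assumed sizes (in bytes) of a handful of separate mappings in one process.
MAPPING_SIZES = [300, 5_000, 20_000, 70_000, 1_000_000]


def tlb_reach(page_size, entries=TLB_ENTRIES):
    """Bytes of address space covered when every TLB entry maps one base page."""
    return page_size * entries


def wasted_bytes(mapping_size, page_size):
    """Unused bytes in the last page of a mapping rounded up to whole pages."""
    pages = -(-mapping_size // page_size)  # ceiling division
    return pages * page_size - mapping_size


def total_waste(page_size, mappings=MAPPING_SIZES):
    """Sum of the rounding waste over all mappings for a given page size."""
    return sum(wasted_bytes(m, page_size) for m in mappings)


for size in PAGE_SIZES:
    print(f"{size // 1024:>2}k pages: "
          f"reach {tlb_reach(size) // 2**20} MiB, "
          f"waste {total_waste(size)} bytes")
```

Reach scales linearly with page size, but so does the worst-case waste per mapping (up to one page minus a byte), which is why a modest bump looks more attractive than a huge one for processes with lots of small mappings.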