This is a lot of basically sharpshooting, but I will address your last point:
> There was no way to move from 32-bits to >32-bits without every network stack of every device element (host, gateway, firewall, application, etc) getting new code. Anything that changed the type and size of sockaddr->sa_family (plus things like new DNS resource record types: A is 32-bit only; see addrinfo->ai_family) would require new code.
That is simply not true. We had one bit left (the reserved/"evil" bit) in IPv4 headers that could have been used to flag that the first N bytes of the payload were an additional IPv4.1 header indicating additional routing information. Packets would continue to transit existing networks and "4.1" capable boxes at edges could read the additional information to make further routing decisions inside of a network. It would have effectively used IPv4 as the core transport network and each connected network (think ASN) having a handful of routed /32s.
Overlay networks are widely deployed and have very minor technical issues.
But that would have only addressed the numbering exhaustion issues. Engineers often get caught in the "well if I am changing this code anyway" trap.
An explicit goal of IPv6 considered as important as the address expansion was the simplification of the packet header, by having fewer fields and which are correctly aligned, not like in the IPv4 header, in order to enable faster hardware routing.
The scheme described by you fails to achieve this goal.
I am glad you brought this up, that is another big issue with IPv6. A lot of the problems it was trying to solve literally don't exist anymore.
Header processing and alignment were an issue in the 90s when routers repurposed generic components. Now we have modern custom ASICs that can handle IPv4 inside of a GRE tunnel on a VLAN over MPLS at line rate. I have switches in my house that do 780 Gbps.
It is irrelevant what we can do now.
At the time when it was designed, IPv6 was well designed, much better than IPv4, which was normal after all the experience accumulated while using IPv4 for many years.
The designers of IPv6 have made only one mistake, but it was a huge mistake. The IPv4 address space should have been included in the IPv6 space, allowing transparent intercommunication between any IP addresses, regardless whether they were old IPv4 addresses or new IPv6 addresses.
This is the mistake that has made the transition to IPv6 so slow.
> The IPv4 address space should have been included in the IPv6 space, allowing transparent intercommunication between any IP addresses, regardless whether they were old IPv4 addresses or new IPv6 addresses.
How would you have implemented it that is different from the NAT64 that actually exists, including shoving all IPv4 addresses into 64:ff9b::/96?
Ideally, 464XLAT should have been there from the beginning and its host part (CLAT) should have been a mandatory part of IP stack.
> The IPv4 address space should have been included in the IPv6 space […]
See IPv4-mapped ("IPv4-compatible") IPv6 addresses from RFC 1884 § 2.4.4 (from 1995) and follow-on RFCs:
* https://datatracker.ietf.org/doc/html/rfc1884
* https://en.wikipedia.org/wiki/IPv6#IPv4-mapped_IPv6_addresse...
But v6 did do what you're describing here?
They didn't use the reserved bit, because there's a field that's already meant for this purpose: the next protocol field. Set that to 0x29 and it indicates that the first bytes of the payload contain a v6 address. Every v4 address has a /48 of v6 space tunnelled to it using this mechanism, and any two v4 addresses can talk v6 between them (including to the entire networks behind those addresses) via it.
If doing basically exactly what you suggested isn't enough to stop you from complaining about v6's designers, how could they possibly have done any better?
> That is simply not true. We had one bit left (the reserved/"evil" bit) in IPv4 headers […]
Great, there's an extra bit in the IPv4 packet header.
I was talking about the data structures in operating systems: are there any extra bits in the sockaddr structure to signal things to applications? If not, an entirely new struct needs to be deployed.
And that doesn't even get into having to deploy new DNS code everywhere.