The x86 architecture and instruction set is complex - so it absolutely needs a powerful assembler to help prevent mistakes.

That doesn't seem right to me. If the problem is that it has too many instructions and addressing modes, you can decide to only use a small subset of those instructions and addressing modes, which really isn't much of a handicap for implementing functionality. (It doesn't help with analyzing existing code, but neither does a powerful assembler.)

I'm no expert on assembly-language programming, but probably 90% of the assembly I write on i386, amd64, RISC-V, and ARM is about 40 instructions: ldr, mov, bl, cmp, movs, push, pop, add, b/jmp, bl/blx/call, ret, str, beq/jz, bne/jnz, bhi/ja, bge/jge, cbz, stmia, ldmia, ldmdb, add/adds, addi, sub/subs, bx, xor/eor, and, or/orr, lsls/shl, lsrs/sar/shr, test/tst, inc, dec, lea, and slt, I think. Every once in a while you need a mul or a div or something. But the other 99% of the instruction set is either for optimized vectorized inner loops or for writing operating system kernels.

I think that the reason that i386 assembly (or amd64 assembly) is error-prone is something else, something it has in common with very simple architectures and instruction sets like that of the PDP-8.

> I think that the reason that i386 assembly (or amd64 assembly) is error-prone is something else, something it has in common with very simple architectures and instruction sets like that of the PDP-8.

What reason is that? (And, if it's not obvious, what are ARM/RISC-V doing that make them less bad?)

I don't think they're particularly less bad. All five architectures just treat memory as a single untyped array of integers, and registers as integer global variables. Their only control structure is goto (and conditional goto.) If you forget to pass an argument to a subroutine, or pass a pointer to an integer instead of the integer, or forget to save a callee-saved register you're using, or do a signed comparison on an unsigned integer, or deallocate memory and then keep using it, or omit or jump past the initialization of a local variable, conventional assemblers provide you no help at all in finding the bug. You'll have to figure it out by watching the program give you wrong answers, maybe single-stepping through it in a debugger.

There are various minor details of one architecture or the other that make them more or less bug-prone, but those are minor compared to what they have in common.

None of this is because the instruction sets are complex. It would be closer to the mark to say that it's because they are simple.

I have a postit note idea that says simply "typesafe macro assembler".

I've not fleshed this out yet, but I think a relatively simple system would help deal with all the issues you mention in the first paragraph while allowing escape hatches.

Check out typed assembly languages like TALx86.

https://en.wikipedia.org/wiki/Typed_assembly_language

You could call it "C".