I do think this is pretty much the one use case for a true "portable assembler", where it basically is assembly except the compiler will do the register allocation and instruction selection for you (so you don't have to deal with, e.g., the case that add32 y, x, 0xabcdef isn't an encodable instruction because the immediate is too large).