In the DSP (the only block I looked at), is there a way to drop the mux at the end and instead put the output flop into a transparent mode when registering isn't enabled? I don't know whether the tooling supports that, but it seems like it would save resources and reduce fanout.