A 2x speedup can likely come from MoE optimizations, such as reducing the number of active parameters per token (for example, routing each token to fewer experts).
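As a rough back-of-the-envelope sketch of that claim: assume per-token decode cost scales linearly with the parameters actually touched per token (a simplification that ignores attention, KV-cache traffic, and routing overhead). All sizes and the helper names `active_params` and `relative_speedup` below are hypothetical, chosen only to illustrate the arithmetic.

```python
def active_params(shared: float, per_expert: float, top_k: int) -> float:
    """Parameters touched per token: always-on shared weights plus top_k routed experts."""
    return shared + top_k * per_expert


def relative_speedup(active_before: float, active_after: float) -> float:
    """Estimated speedup, ASSUMING cost is proportional to active parameters."""
    return active_before / active_after


# Hypothetical model: 2B shared parameters (attention, embeddings), 1B per expert.
shared = 2e9
per_expert = 1e9

before = active_params(shared, per_expert, top_k=8)  # 10B active
after = active_params(shared, per_expert, top_k=3)   # 5B active

speedup = relative_speedup(before, after)
print(f"{before / 1e9:.0f}B -> {after / 1e9:.0f}B active, ~{speedup:.1f}x")
```

Note that in this sketch halving `top_k` alone (8 to 4) would leave 6B active parameters, not 5B, because the shared weights do not shrink. Reaching a full 2x under this assumption needs a larger cut in routed experts or smaller shared layers.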