Audio normalization seems to be quite hard for this kind of use. I tried this 20 years ago with parts of a mouse pointer device, a rigid nylon strap around the chest, and 8 cm of flexible length in between. The mouse wheel (one dimension only) was driven by the change in length. After some hours of fiddling it was quite reliable.

That's very cool. I chose the phone mic because everybody already has one, so no extra device is needed. The downside is normalization: everything from room noise to breathing style varies. I do use an ML layer to deal with some of these variations, but only after basic signal checks. The hard part is deciding when the microphone signal is too ambiguous to use at all...
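
To give an idea of what a "basic signal checks" gate in front of an ML layer could look like, here is a rough Python sketch. It is not the actual app code: the function name, the thresholds, and the idea of comparing each frame against a previously measured room-noise level are all assumptions made up for illustration.

```python
import math

def basic_signal_checks(samples, noise_floor_rms,
                        min_snr_db=6.0, max_clip_ratio=0.01, clip_level=0.99):
    """Decide whether one audio frame is worth passing on to a model.

    samples:         floats in [-1.0, 1.0] for one frame (assumed format)
    noise_floor_rms: RMS measured while nobody is breathing into the mic
                     (hypothetical calibration step)
    Returns (usable, reason).
    """
    if not samples:
        return False, "empty frame"

    # Overall loudness of the frame.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))

    # Fraction of samples at or near full scale, a sign of clipping.
    clipped = sum(1 for s in samples if abs(s) >= clip_level) / len(samples)
    if clipped > max_clip_ratio:
        return False, "clipping"

    if rms <= 0.0 or noise_floor_rms <= 0.0:
        return False, "silent or unknown noise floor"

    # Rough signal-to-noise estimate relative to the room noise.
    snr_db = 20.0 * math.log10(rms / noise_floor_rms)
    if snr_db < min_snr_db:
        return False, "too close to room noise"

    return True, "ok"
```

Frames that fail would simply be skipped rather than fed to the model. The thresholds here are placeholders; real values would have to be tuned per device and room.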