how is it hard, do a A to D, add a filter, do compute, then do D to A.

Not hard to do, hard to do well. Hiding all complexity with a hand wavey “do compute” doesn’t make that bit easy

Yeah i get it, the details are hard.

The article covers that.

In short, audio and visual perception do not map perfectly. Humans don't have a linear perception of either so a perfect A to D then D to A conversion yields unsatisfying results.