It would probably help you to compare what you can do on a phone vs what you can do with desktop software (Lightroom/Photoshop, DxO, Topaz, CaptureOne, etc). Phone processing is generally quite good, with the exception of challenging liminal areas (e.g. hair, foliage).
Fwiw, Topaz -- which I have a license for but essentially never use -- has pretty incredible denoising & upsizing features (for both photo & video), but to get the optimal quality output you offload the processing to their cloud infra (and buy credits from them to pay for it). It's roughly the equivalent of a SWE using a local LLM that's "good enough" vs a frontier model that's SOTA but requires a consumption-based subscription.
Interesting, so it seems to be an issue with heavy compute or RAM requirements.
I've got a 61mp camera, and an RX 7900XT. It takes DxO about 15s per picture to denoise, which is far longer than people are willing to wait on a phone to take a photo. Topaz is even slower. A cloud service could be used to do it in post, but someone has to pay for that.
Yet in modern computer games, graphics cards denoise a scene in real time at 60 frames per second using machine learning models [1][2] while doing all the other rendering at the same time. Granted, that's ray tracing, the resolution is lower, and they technically cheat by using additional information (motion vectors, depth buffers, previous frames), but it might be that DxO is not optimized very well.
1: https://blogs.nvidia.com/blog/ai-decoded-ray-reconstruction/
2: https://gpuopen.com/amd-fsr-rayregeneration/
Games will typically run at 4k or less, which is about 8mp; on the other hand, it's difficult to buy a stills camera with less than 20mp, and >40mp is common. Most graphics algorithms are n^2 as well, so we wouldn't expect a linear speedup. I've tried DxO, Lightroom, and Topaz; they all perform about the same, so I don't think it's particularly unoptimized.
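To put rough numbers on the resolution argument: here's a back-of-envelope sketch scaling the 15s/61mp figure quoted above down to a 4k (~8.3mp) frame, under both a linear-in-pixels and a quadratic-in-pixels cost model. The 15s figure and the n^2 claim come from the comments; the rest is just arithmetic, not a measurement.

```python
# Back-of-envelope: scale DxO's reported ~15 s per 61 MP frame down to a
# game-like 4K frame (~8.3 MP) under two assumed cost models.
MEGAPIXELS_CAMERA = 61.0
MEGAPIXELS_4K = 8.3          # 3840 * 2160 pixels
SECONDS_PER_FRAME = 15.0     # figure reported in the thread

ratio = MEGAPIXELS_4K / MEGAPIXELS_CAMERA

linear = SECONDS_PER_FRAME * ratio          # cost linear in pixel count
quadratic = SECONDS_PER_FRAME * ratio ** 2  # cost quadratic in pixel count

print(f"linear in pixels:    {linear:.2f} s/frame")
print(f"quadratic in pixels: {quadratic:.2f} s/frame")
```

Even under the quadratic model this lands around a quarter of a second per frame, still far from 60 fps, which suggests the game denoisers' temporal reuse and auxiliary buffers account for much of the remaining gap rather than resolution alone.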