For future ML developers: A post like this should include system requirements.

It's not clear from the blog post, the GitHub page, or most other places what this will run on, even in big-O:

* CPU

* 16GB GPU

* 240GB server (of the type most businesses can afford)

* Meta/Google/OpenAI/Anthropic-style data center

Indeed. I tried to run it locally but couldn't get it running on my measly gaming-spec workstation.

It seems you need lots of RAM and VRAM. Reading the issues on GitHub [1], it doesn't seem many others have had success using it effectively:

- someone with a 96 GB VRAM RTX 6000 Pro had CUDA OOM issues

- someone somehow made it work on an RTX 4090, but the RTF processing time was 12...

- someone with an RTX 5090 managed to use it, but only with clips no longer than 20s

It seems the utility of the model for hobbyists with consumer-grade cards will be low.

[1]: https://github.com/facebookresearch/sam-audio/issues/24
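For anyone else about to try: a minimal sketch, assuming a CUDA build of PyTorch, for checking free VRAM up front instead of OOM-ing halfway through a run. The 40 GB threshold is my guess extrapolated from the reports above, not an official requirement.

```python
import torch

if torch.cuda.is_available():
    # Returns (free, total) memory on the current device, in bytes.
    free_bytes, total_bytes = torch.cuda.mem_get_info()
    free_gb = free_bytes / 1024**3
    total_gb = total_bytes / 1024**3
    print(f"GPU: {torch.cuda.get_device_name(0)}, "
          f"{free_gb:.1f}/{total_gb:.1f} GB free")
    if free_gb < 40:  # rough floor implied by the OOM reports, not documented
        print("Likely to OOM; consider CPU offload or shorter clips.")
else:
    print("No CUDA device found; CPU-only inference is untested per the thread.")
```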

It really depends on your runtime environment, but I agree it would be nice to have some reference numbers for commonly used setups.

It does, but my comment was "even in big-O."

Environments might mean the difference between, e.g., 16GB and 24GB, but not between 16GB and 160GB.
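To make the big-O point concrete, here's a back-of-envelope sketch of weight memory by dtype; the 7B parameter count is a hypothetical placeholder, since the post doesn't state the model's actual size, and real usage is often 1.5-2x this once activations and caches are included.

```python
# Weights-only VRAM estimate; ignores activations, which grow with clip length.
BYTES_PER_PARAM = {"fp32": 4, "fp16/bf16": 2, "int8": 1}

def weight_gb(n_params: float, dtype: str) -> float:
    return n_params * BYTES_PER_PARAM[dtype] / 1024**3

for dtype in BYTES_PER_PARAM:
    # Hypothetical 7B-parameter model: ~26 GB fp32, ~13 GB fp16, ~7 GB int8.
    print(f"7B params @ {dtype}: ~{weight_gb(7e9, dtype):.0f} GB for weights alone")
```

The point being: precision and runtime tweaks buy you a small constant factor, while the parameter count sets the order of magnitude.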