If you have inference running on this new 128GB RAM Mac, wouldn't you still need a separate machine to do the manual work (like running an IDE, browsers, toolchains, builders/bundlers etc.)? I cannot imagine you will have any meaningful RAM available once the LLMs are loaded.
No? First of all, you can limit how much of the unified RAM goes to VRAM, and second, many applications don't need that much RAM. Even if you give 108 GB to VRAM and 16 GB to applications, you'll be fine.
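To put rough numbers on that split, here is a minimal sketch. The helper name `split_unified_memory` is made up for illustration, and the `iogpu.wired_limit_mb` sysctl key it prints is an assumption based on what recent macOS releases on Apple Silicon are commonly reported to expose, so check it against your own OS version before running anything. The script only prints the command, it does not change any setting.

```python
GIB_TO_MIB = 1024  # the GPU limit is assumed to be expressed in MiB


def split_unified_memory(total_gb, gpu_gb, apps_gb):
    """Return (gpu_limit_mb, leftover_gb) for a proposed split.

    Raises ValueError if the GPU and application shares together
    exceed the total unified memory.
    """
    leftover_gb = total_gb - gpu_gb - apps_gb
    if leftover_gb < 0:
        raise ValueError("split exceeds total unified memory")
    return gpu_gb * GIB_TO_MIB, leftover_gb


# The split from the comment above: 108 GB for the GPU, 16 GB for applications.
gpu_limit_mb, leftover_gb = split_unified_memory(128, 108, 16)
print(f"GPU limit: {gpu_limit_mb} MB, unassigned: {leftover_gb} GB")

# Assumed sysctl key; printed for reference only, not executed.
print(f"sudo sysctl iogpu.wired_limit_mb={gpu_limit_mb}")
```

With those figures there are still 4 GB left unassigned for the OS itself, on top of the 16 GB for the IDE, browser and build tools.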