Precisely. Even if the inflated figure of 16,000 API calls were accurate for what the cost of a mediocre GPU would get you, that’s not an endless supply of API calls. I’m also on a 4080 for lighter loads, and even just writing benchmarks, exploring attention mechanisms, token salience, etc., without image gen being my specific purpose, I may trash half a thousand generations every few days. More if I count the stuff that never made it that far.