You can do all of this locally on a cheap video card. Search for Fooocus or AUTOMATIC1111 (the Stable Diffusion web UI) for a couple of setups that are fairly low friction to get going. Amuse AI is another one. It's not quite state of the art and it's also censored, but it's by far the least friction (especially if you have an AMD card) - it's pretty much plug and play. ComfyUI is the advanced do-everything workhorse, but it's anything but comfy if you don't already know this domain well. I'd generally recommend Fooocus for a balance between usability and power/flexibility.
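
To give a feel for how little is involved once one of these is running, here's a rough sketch of scripting a local AUTOMATIC1111 install from Python. It assumes the web UI was launched with the `--api` flag and is listening on its default local address; the prompt, the output filename, and the helper names `build_payload` and `generate` are just made up for illustration, and the settings are kept deliberately small for a modest card.

```python
import base64
import json
import urllib.request

# Assumption: the web UI was started with --api on its default local address.
API_URL = "http://127.0.0.1:7860/sdapi/v1/txt2img"


def build_payload(prompt, steps=20, width=512, height=512):
    """Build the JSON body for a txt2img request (illustrative defaults)."""
    return {"prompt": prompt, "steps": steps, "width": width, "height": height}


def generate(prompt, out_path="out.png", url=API_URL):
    """POST the prompt to the local web UI and save the first returned image."""
    body = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # Images come back as base64 strings; drop a data-URI prefix if one is present.
    encoded = result["images"][0].split(",", 1)[-1]
    with open(out_path, "wb") as f:
        f.write(base64.b64decode(encoded))
    return out_path
```

With the web UI running, something like `generate("a lighthouse at dusk, oil painting")` should write `out.png` next to the script. Everything happens on your own machine, no account or credits involved.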

The million image gen services online are mostly just making bank off ignorance. People don't realize that their own cheap video cards are more than enough to do everything they're paying a service an orders-of-magnitude markup for.