Pretty doubtful about computer-use/screenshot-based approaches.
With Retriever AI, we construct custom accessibility trees to represent web pages, and we just switched over to DeepSeek v4 Flash; the cost decrease is nearing 100x.
We've also had great success just reverse engineering the underlying APIs of websites and then writing code to hit them directly. Using screenshots to take actions on a web page, only to trigger the network calls the site makes underneath, seems too naive.
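To make the accessibility-tree idea concrete, here is a minimal sketch of reducing a page to a short list of interactive nodes that a model can read instead of a screenshot. This is not Retriever AI's actual implementation: the `ROLES` mapping, the `AXTreeBuilder` class, and the output format are all assumptions made up for illustration, and a real tree would also account for ARIA roles, visibility, and computed names.

```python
from html.parser import HTMLParser

# Hypothetical tag-to-role mapping (an assumption for this sketch only).
ROLES = {"a": "link", "button": "button", "input": "textbox",
         "textarea": "textbox", "select": "combobox",
         "h1": "heading", "h2": "heading", "h3": "heading"}
VOID = {"input"}  # tags that never get a closing tag


class AXTreeBuilder(HTMLParser):
    """Collects a flat list of role/name nodes from raw HTML."""

    def __init__(self):
        super().__init__()
        self.nodes = []  # every node we decided to keep, in document order
        self._open = []  # stack of (tag, node) still collecting inner text

    def handle_starttag(self, tag, attrs):
        role = ROLES.get(tag)
        if role is None:
            return  # skip anything that is not in the role mapping
        a = dict(attrs)
        node = {"id": len(self.nodes) + 1, "role": role,
                "label": a.get("aria-label") or a.get("placeholder") or "",
                "text": []}
        self.nodes.append(node)
        if tag not in VOID:
            self._open.append((tag, node))

    def handle_endtag(self, tag):
        if self._open and self._open[-1][0] == tag:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        if text:
            # Inner text contributes to the name of every open kept node.
            for _, node in self._open:
                node["text"].append(text)


def to_ax_lines(html):
    """Return one compact line per kept node, e.g. '[3] button "Continue"'."""
    builder = AXTreeBuilder()
    builder.feed(html)
    builder.close()
    lines = []
    for n in builder.nodes:
        name = n["label"] or " ".join(n["text"])  # explicit label wins
        lines.append('[%d] %s "%s"' % (n["id"], n["role"], name))
    return lines


if __name__ == "__main__":
    page = ('<h1>Sign in</h1><input placeholder="Email">'
            '<button>Continue</button><p>ignored</p>'
            '<a href="/help" aria-label="Help">?</a>')
    print("\n".join(to_ax_lines(page)))
```

The model then sees a handful of numbered lines and can answer with something like "click 3", which is a far smaller input than an image of the page.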
What happens when you need to control something that isn't a web page?
Honestly, with Fable I think anyone will be able to reverse engineer a desktop app and get the coding agent to automate it.
The Codex computer use functionality actually uses OS-level accessibility trees, so that's also possible without screenshots.
I can't tell if you're saying that every native app in the world can be vibe-replaced with a web app (they can't) or that it will be easy to vibe-code a reverse-engineered replacement for every native app (you can't).
Reverse engineering APIs is just a recipe to get blocked sooner. Good luck!
It's been working fine on LinkedIn/IG; the trick is to make the requests from the main world of the website itself.
Like I said. Good luck with it. You’re most likely violating ToS and it’s always a game of cat and mouse.