I built a platform to learn how to build personal AI agents and test them with fast feedback. It is free for individuals and small teams.
The platform deterministically generates tasks, creates an environment for each one, observes the AI agents as they work, and then scores them (no LLM-as-a-judge).
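To make that loop concrete, here is a toy sketch in Python. It is not the platform's code or API: every name in it (`Task`, `Environment`, `generate_task`, `score`, `run`, `toy_agent`) is made up for illustration, and the arithmetic task is only a stand-in for a real one. It shows the general idea: a seed fixes the task, each run gets a fresh environment that records what the agent does, and the score comes from a plain rule check against ground truth rather than from a model's opinion.

```python
import random
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Task:
    prompt: str
    expected: int  # ground truth, computed when the task is generated


@dataclass
class Environment:
    """Hypothetical per-task sandbox: holds the task and records agent actions."""
    task: Task
    log: list = field(default_factory=list)
    answer: Optional[int] = None

    def submit(self, value: int) -> None:
        # Every action is logged so the run can be inspected afterwards.
        self.log.append(("submit", value))
        self.answer = value


def generate_task(seed: int) -> Task:
    # A seeded RNG makes generation deterministic: same seed, same task.
    rng = random.Random(seed)
    a, b = rng.randint(1, 99), rng.randint(1, 99)
    return Task(prompt=f"Add {a} and {b}", expected=a + b)


def score(env: Environment) -> float:
    # Rule-based check against the known answer; no model is consulted.
    return 1.0 if env.answer == env.task.expected else 0.0


def run(agent, seed: int) -> float:
    env = Environment(generate_task(seed))  # fresh environment per task
    agent(env)
    return score(env)


def toy_agent(env: Environment) -> None:
    # Stand-in for a real agent: parse the two numbers and submit their sum.
    a, b = [int(word) for word in env.task.prompt.split() if word.isdigit()]
    env.submit(a + b)
```

Calling `run(toy_agent, seed=7)` twice gives the same task and the same score, which is what makes the feedback fast and repeatable.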
We just ran a worldwide hackathon (800 engineers across 80 cities). We ended up creating more than 1 million runtimes (each task runs in its own environment) and crashed the platform halfway through.
The 104 tasks from the challenge on building a personal, trustworthy AI agent are now open to everyone.
To get started faster, you can use a simple SGR Next Step agent: https://github.com/bitgn/sample-agents