I have a ML project. I usually set up a team of agents, where I have a leader, archivist, research assistant, researcher, developer and tester. The team generates hypothesis based on papers, test it, and iterate over that. Everything is documented using a lab notebook. It burns tokens but I have found some promising strategies that I am testing.