That’s true, I’m trying to figure out a better testing environment with a feedback loop.
I did try letting the models iterate on the bot code based on a summary of an end-of-game ‘report’, but that showed only marginal improvement over zero-shot.
In my mind, I’d give it the following:
Step(n) - advance up to n steps forward
RunTil(movement|death|??) - iterate until something happens
Board(n) - board state at the end of step n
BoardAscii(n) - ASCII representation of the same
Log(m,n) - log of what happened between steps m and n
Probably all this could be accomplished with a state structure and a rendering helper.
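To make that concrete, here’s a minimal sketch of what I mean: the state structure is just a list of board snapshots plus an event log, and the API above falls out of it. Everything here (`Sim`, the `advance` callback, the `"movement"`/`"death"` event strings) is hypothetical, not an existing library.

```python
class Sim:
    """Hypothetical harness: keeps every board snapshot and a step-tagged event log.

    `advance` is a caller-supplied function: grid -> (new_grid, list_of_events).
    """

    def __init__(self, initial_grid, advance):
        self._advance = advance
        self._history = [initial_grid]   # _history[n] is the board after step n
        self._log = []                   # (step, event) pairs

    def step(self, n=1):
        """Advance up to n steps; return the index of the last step taken."""
        for _ in range(n):
            grid, events = self._advance(self._history[-1])
            self._history.append(grid)
            step_no = len(self._history) - 1
            self._log.extend((step_no, e) for e in events)
        return len(self._history) - 1

    def run_til(self, predicate, max_steps=1000):
        """Step until an event matching `predicate` fires; return that step (or None)."""
        for _ in range(max_steps):
            step_no = self.step()
            if any(predicate(e) for s, e in self._log if s == step_no):
                return step_no
        return None

    def board(self, n):
        """Board state at the end of step n."""
        return self._history[n]

    def board_ascii(self, n):
        """ASCII rendering of the same -- the 'rendering helper'."""
        return "\n".join("".join(row) for row in self._history[n])

    def log(self, m, n):
        """Events that happened after step m, up to and including step n."""
        return [(s, e) for s, e in self._log if m < s <= n]


# Toy world: a single bot 'B' walks right and dies at the wall.
def advance(grid):
    row = grid[0][:]
    i = row.index("B")
    if i == len(row) - 1:
        return [row], ["death"]
    row[i], row[i + 1] = ".", "B"
    return [row], ["movement"]


sim = Sim([list(".B..")], advance)
sim.step(1)
print(sim.board_ascii(1))                      # ..B.
end = sim.run_til(lambda e: e == "death")
print(end, sim.log(1, end))                    # step of death, then the event log
```

The nice property is that the model-facing API is read-only over the history, so the bot under test can’t perturb the replay, and `log(m, n)` gives the model exactly the window it asks about instead of the whole game dump.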
Do you let humans review opposing team’s code?