Maybe the agents should require some sort of input start token: "simon says"
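
Rough sketch of what that gate could look like, in Python. Nothing here comes from an existing agent framework: `START_TOKEN` and `extract_command` are made-up names, and the case-insensitive matching and the optional `:` or `,` after the token are just my assumptions about how it might behave.

```python
from typing import Optional

# Hypothetical start token, taken from the "simon says" suggestion above.
START_TOKEN = "simon says"


def extract_command(message: str, start_token: str = START_TOKEN) -> Optional[str]:
    """Return the text after the start token, or None if the token is missing.

    Assumptions (not from any real agent API): matching is case-insensitive,
    leading whitespace is ignored, and the token must be followed by
    whitespace, ':' or ',' (or nothing at all).
    """
    stripped = message.lstrip()
    token = start_token.lower()
    if not stripped.lower().startswith(token):
        return None  # no start token: the agent should ignore this input

    rest = stripped[len(token):]
    # Require a boundary after the token so "simon saysx" does not match.
    if rest and not (rest[0].isspace() or rest[0] in ":,"):
        return None

    # Drop the separator and surrounding whitespace, keep the command itself.
    return rest.lstrip(" \t:,").strip()


if __name__ == "__main__":
    for msg in ["Simon says: delete the file", "delete the file"]:
        command = extract_command(msg)
        print(repr(msg), "->", "ignored" if command is None else repr(command))
```

Anything without the token comes back as `None`, so the agent can drop it before it ever gets treated as an instruction.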