The other thing I suspect is that "Just give me True/False" cuts off much of the search space a modern LLM uses to work toward an answer. You can see this in reasoning traces: the act of writing out the explanation helps guide the model toward a better answer and makes it more likely to backtrack on a bad decision.
If you let it spew out an explanation along with the answer, I'm curious whether the accuracy will improve (I suspect it will).
Good point. The next version will also include results from a prompt that lets the models "think out loud" before giving their final verdict.
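
For anyone who wants to try the comparison themselves, here is a rough Python sketch of the two prompt styles discussed above. The prompt wording, the "Verdict:" line convention, and the names `build_prompt` and `parse_verdict` are all assumptions made up for illustration; they are not the prompts or code used in the actual evaluation.

```python
import re
from typing import Optional

# Hypothetical templates: the exact wording is an assumption, not the original prompts.
DIRECT_PROMPT = (
    "Statement: {statement}\n"
    "Answer with exactly one word: True or False."
)
REASONED_PROMPT = (
    "Statement: {statement}\n"
    "Think through the problem step by step, then finish with a line of the form\n"
    "'Verdict: True' or 'Verdict: False'."
)

# Matches "Verdict: True" / "Verdict: False" in any letter case.
_VERDICT_RE = re.compile(r"verdict:\s*(true|false)\b", re.IGNORECASE)


def build_prompt(statement: str, allow_reasoning: bool) -> str:
    """Fill in either the direct or the think-out-loud template."""
    template = REASONED_PROMPT if allow_reasoning else DIRECT_PROMPT
    return template.format(statement=statement)


def parse_verdict(response: str) -> Optional[bool]:
    """Return the last verdict stated in the response, or None if there is none.

    Taking the last match means a model that changes its mind partway
    through its explanation is scored on its final answer.
    """
    matches = _VERDICT_RE.findall(response)
    if not matches:
        return None
    return matches[-1].lower() == "true"
```

Sending both prompts for the same statement to the same model and comparing `parse_verdict` against the ground truth would give a direct accuracy comparison between the two styles. Responses with no parseable verdict come back as `None`, so they can be counted separately instead of being silently scored as wrong.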