    uv add google-genai
    uv run scripts/run_benchmarks.py --models google/gemini-2.5-pro --formats markdown_kv --limit 100
And add GOOGLE_API_KEY=<your-key-here> to a file called .env in the repo root.
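
If the run complains about a missing key, a quick check like the one below can confirm the .env file is readable before spending quota. This is only a sketch: it assumes the usual KEY=value layout, and the helper names (parse_env_text, key_is_set) are made up for illustration. I don't know how the benchmark script itself loads the file.

```python
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def key_is_set(env_path: str = ".env", name: str = "GOOGLE_API_KEY") -> bool:
    """Return True if env_path exists and defines a non-empty value for name."""
    path = Path(env_path)
    if not path.is_file():
        return False
    return bool(parse_env_text(path.read_text()).get(name))


if __name__ == "__main__":
    # Run from the repo root so that ".env" resolves to the right file.
    print("GOOGLE_API_KEY found:", key_is_set())
```

It only reports whether the key is present and never prints the key itself.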

Unfortunately, I started getting "quota exceeded" errors almost immediately, but it did give 6/6 correct answers before it crapped out.
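
If the limit is a per-minute rate rather than a daily cap, retrying with exponential backoff sometimes gets a run through. Below is a generic sketch, not something from the benchmark repo: call_with_backoff and is_retryable are hypothetical names, and you would have to supply your own check for what a quota error looks like in your client.

```python
import time


def call_with_backoff(fn, is_retryable, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponentially growing delays on retryable errors.

    is_retryable(exc) decides whether an exception is worth retrying;
    any other exception, or the final failed attempt, is re-raised.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            if attempt == max_attempts - 1 or not is_retryable(exc):
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

For example, is_retryable could be lambda exc: "quota" in str(exc).lower(), assuming the error message contains that word. If the quota is a hard daily cap, no amount of retrying will help.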

Thanks! That worked perfectly.

100 samples:

- gemini-2.5-pro: 100%

- gemini-2.5-flash: 97%