Link to the paper: https://arxiv.org/pdf/2506.21734
Still reading, but the benchmarks for ARC-AGI-1, ARC-AGI-2, Sudoku-Extreme (9x9), and Maze-Hard (30x30) look impressive.
Someone on GitHub reproduced it, but the paper doesn't report total GPU hours, and their benchmark results were 10-20% lower (I read this in a GitHub issue).