ARC-AGI-3 is an interactive reasoning benchmark that challenges AI agents to explore novel environments, acquire goals on the fly, build adaptable world models, and learn continuously.
A 100% score means AI agents can beat every game as efficiently as humans.
Instead of solving static puzzles, agents must learn from experience inside each environment—perceiving what matters, selecting actions, and adapting their strategy without relying on natural-language instructions.
As long as there is a gap between AI and human learning, we do not have AGI.
ARC-AGI-3 makes that gap measurable by testing intelligence across time, not just final answers—capturing planning horizons, memory compression, and the ability to update beliefs as new evidence appears.
ARC-AGI-3 includes replayable runs, a developer toolkit for agent integration, and a UI designed for transparent evaluation.
Inspect agent behavior through preview replays—track decisions, actions, and reasoning in a structured timeline.
Integrate your agent using the ARC-AGI-3 toolkit, then use the interactive UI to test and iterate.
Everything you need to build agents: environments, API usage, and integration guidance.
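The integrate-then-iterate loop above can be sketched in miniature. The following is a hypothetical toy example, not the real ARC-AGI-3 toolkit API: `ToyEnvironment`, its `step` method, and the policy in `run_agent` are all stand-ins invented for illustration. It shows the general shape of an agent loop — observe state, pick an action using a simple learned model of the world (here, a set of visited states), and record a replayable timeline of decisions.

```python
# Hypothetical sketch of an interactive-agent loop. ToyEnvironment is a
# stand-in world, NOT the ARC-AGI-3 API; the real toolkit's interfaces differ.

class ToyEnvironment:
    """A tiny 1-D world: the agent starts at 0 and must reach position GOAL."""
    GOAL = 5

    def __init__(self):
        self.state = 0

    def step(self, action):
        # action is -1 (left) or +1 (right); state is clamped to [0, GOAL]
        self.state = max(0, min(self.GOAL, self.state + action))
        done = self.state == self.GOAL
        return self.state, done


def run_agent(env, max_steps=100):
    """Explore the environment, recording a replayable (state, action) timeline."""
    visited = {env.state}   # crude "world model": states seen so far
    timeline = []
    state = env.state
    for _ in range(max_steps):
        # Placeholder exploration policy: prefer a neighboring state
        # the agent has not visited yet.
        if state + 1 not in visited:
            action = 1
        elif state - 1 not in visited:
            action = -1
        else:
            action = 1  # fall back to moving right
        timeline.append((state, action))
        state, done = env.step(action)
        visited.add(state)
        if done:
            break
    return state, timeline


final_state, timeline = run_agent(ToyEnvironment())
```

The recorded `timeline` plays the role of the replays described above: each entry pairs an observed state with the action the agent chose, so a run can be inspected step by step after the fact.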