2 / 6 · What’s possible
The AI you already have
is capable of much more than it shows you by default.
ARC-AGI-3 · Public reasoning benchmark
0%
Claude Opus 4.6, default
→
94.85%
Same model, structured differently
25 interactive reasoning games. No fine-tuning.
The difference isn’t the model.
It’s the structure around it.
Web4 is that structure.