ARC-AGI-2: A Transformer for Symbolic Reasoning

A new study published on arXiv presents a Transformer-based system designed to address the Abstraction and Reasoning Corpus (ARC), a benchmark that evaluates whether models can generalize beyond simple pattern matching by inferring symbolic rules from a handful of examples per task.

Architecture and Methodology

The proposed system combines neural inference with structure-aware priors and online task adaptation. The approach is based on four key ideas:

  1. Reformulation of ARC reasoning as a sequence modeling problem, using a compact task encoding with only 125 tokens.
  2. Introduction of an augmentation framework based on group symmetries, grid traversals, and automata perturbations.
  3. Application of test-time training (TTT) with lightweight LoRA adaptation, allowing the model to specialize on each task.
  4. Design of a decoding and scoring pipeline that aggregates likelihoods across augmented task views.
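The first idea, casting ARC as sequence modeling, amounts to serializing each colored grid into a short token stream. The sketch below shows one plausible way to do this; the separator/end token ids and the row-major layout are assumptions for illustration, not the paper's actual vocabulary.

```python
# Hypothetical compact encoding of an ARC grid as a token sequence.
# Colors 0-9 map to tokens 0-9; a row-separator token and an end-of-grid
# token delimit structure so a whole task fits in a short sequence.

ROW_SEP = 10   # assumed separator token between grid rows
END = 11       # assumed end-of-grid token

def encode_grid(grid):
    """Flatten a 2-D grid of color indices into a 1-D token list."""
    tokens = []
    for row in grid:
        tokens.extend(row)
        tokens.append(ROW_SEP)
    tokens[-1] = END  # replace the final row separator with the end marker
    return tokens

def decode_grid(tokens):
    """Invert encode_grid: split the token stream back into rows."""
    rows, current = [], []
    for t in tokens:
        if t in (ROW_SEP, END):
            rows.append(current)
            current = []
        else:
            current.append(t)
    return rows
```

A round trip such as `decode_grid(encode_grid([[0, 1], [2, 3]]))` returns the original grid, which is the property any such encoding must preserve.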
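The group-symmetry augmentations in the second idea can be illustrated with the dihedral group D4, the eight rotations and reflections of a square grid. The function names and the exact augmentation set below are assumptions; the paper's framework also includes grid traversals and automata perturbations not sketched here.

```python
# Sketch of symmetry-based augmentation: the 8 elements of the dihedral
# group D4 (rotations and reflections) applied to a list-of-lists grid.

def rot90(grid):
    """Rotate the grid 90 degrees clockwise."""
    return [list(row) for row in zip(*grid[::-1])]

def flip_h(grid):
    """Mirror the grid horizontally."""
    return [row[::-1] for row in grid]

def d4_views(grid):
    """Yield all 8 D4-transformed views of the grid."""
    g = grid
    for _ in range(4):
        yield g          # current rotation
        yield flip_h(g)  # its horizontal mirror
        g = rot90(g)
```

For a grid with no internal symmetry, `d4_views` produces eight distinct views, each of which becomes a separate training or scoring instance.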
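The third idea, lightweight LoRA adaptation at test time, keeps the pretrained weights frozen and trains only a low-rank correction per task. The pure-Python sketch below shows the core forward computation under that scheme; the shapes, the scaling convention, and the helper names are illustrative assumptions, not the paper's implementation.

```python
# Minimal LoRA sketch: the frozen weight matrix W is augmented by a
# low-rank update A @ B. Test-time training would optimize only A and B
# for each task, leaving W untouched.

def matmul(X, Y):
    """Plain list-of-lists matrix multiplication."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def lora_forward(x, W, A, B, scale=1.0):
    """Compute x @ (W + scale * A @ B) without modifying W."""
    delta = matmul(A, B)  # low-rank update, rank = inner dim of A and B
    W_eff = [[w + scale * d for w, d in zip(wr, dr)]
             for wr, dr in zip(W, delta)]
    return matmul(x, W_eff)
```

Because only `A` and `B` carry task-specific state, discarding them after each task restores the base model, which is what makes per-task specialization cheap.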
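The fourth idea, aggregating likelihoods across augmented task views, can be sketched as a simple sum of per-view log-probabilities followed by an argmax over candidates. The scoring callback below stands in for a real model's per-sequence log-likelihood; the function name and interface are assumptions.

```python
# Sketch of the aggregation step: score each candidate answer by summing
# its log-likelihood under every augmented view of the task, then pick
# the highest-scoring candidate.

from collections import defaultdict

def aggregate_scores(candidates, views, loglik):
    """Sum log p(candidate | view) over views; return the best candidate."""
    totals = defaultdict(float)
    for cand in candidates:
        for view in views:
            totals[cand] += loglik(cand, view)
    return max(totals, key=totals.get)

# Toy per-view log-likelihoods: candidate "b" is consistently preferred.
fake = {("a", 0): -2.0, ("a", 1): -1.0, ("b", 0): -0.5, ("b", 1): -0.7}
best = aggregate_scores(["a", "b"], [0, 1], lambda c, v: fake[(c, v)])
```

Summing log-likelihoods rewards candidates that remain probable under every symmetry-transformed view, which is the consistency property the scoring pipeline exploits.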

Results

The final system significantly outperforms Transformer baselines and surpasses previous neural ARC solvers, approaching human-level generalization on the benchmark. The components work synergistically: augmentations expand the hypothesis space, test-time training sharpens per-task reasoning, and symmetry-based scoring improves the consistency of the selected solutions.
