RxnNano: A 0.5B Parameter Chemical LLM

A new study published on arXiv introduces RxnNano, a compact large language model (LLM) designed for chemical reaction prediction and retrosynthesis. With only 0.5 billion parameters, the model outperforms much larger models (over 7 billion parameters) on these tasks.

Hierarchical Learning Approach

RxnNano's distinguishing feature is its hierarchical learning approach, which aims to instill deep chemical understanding in the model. The approach rests on three main innovations:

  1. Latent Chemical Consistency: Models reactions as movements on a continuous chemical manifold, ensuring reversible and physically plausible transformations.
  2. Hierarchical Cognitive Curriculum: Trains the model through progressive stages, from syntax mastery to semantic reasoning, building robust chemical intuition.
  3. Atom-Map Permutation Invariance (AMPI): Forces the model to learn the reaction's relational topology independently of arbitrary atom-map labels, while balancing the multi-task learning objectives.
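To make the third idea concrete: atom-map numbers in a reaction SMILES are arbitrary labels, so any consistent relabeling describes the same reaction, and an AMPI-style objective asks the model to behave identically across such relabelings. The paper does not publish its implementation; the sketch below is a minimal, hypothetical illustration using regex-based relabeling of a mapped reaction SMILES (the function name `permute_atom_maps` is our own, and a real pipeline would use a chemistry toolkit such as RDKit rather than string manipulation).

```python
import random
import re

# Matches an atom-map number inside a bracket atom, e.g. the ":1" in "[CH3:1]".
ATOM_MAP = re.compile(r":(\d+)\]")

def permute_atom_maps(rxn_smiles: str, seed: int = 0) -> str:
    """Relabel atom-map numbers with a random permutation, applied
    consistently across reactants and products, so the underlying
    atom-to-atom correspondence (the relational topology) is unchanged.
    Illustrative sketch only, not the paper's implementation."""
    labels = sorted({int(m) for m in ATOM_MAP.findall(rxn_smiles)})
    shuffled = labels[:]
    random.Random(seed).shuffle(shuffled)
    relabel = dict(zip(labels, shuffled))
    return ATOM_MAP.sub(lambda m: f":{relabel[int(m.group(1))]}]", rxn_smiles)

# Hypothetical mapped esterification-like reaction SMILES.
rxn = "[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][OH:6]>>[CH3:1][C:2](=[O:3])[O:6][CH3:5]"
augmented = permute_atom_maps(rxn, seed=42)
```

During training, a model's predictions (or losses) on `rxn` and `augmented` can be constrained to agree, which is one simple way to force label-invariant behavior.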

Performance and Results

RxnNano demonstrated a 23.5% improvement in Top-1 accuracy on rigorous benchmarks, without resorting to test-time augmentation. This result underscores the effectiveness of prioritizing chemical understanding over simply scaling up model size.