RxnNano: A 0.5B Parameter Chemical LLM

A new study published on arXiv introduces RxnNano, a compact large language model (LLM) designed for chemical reaction prediction and retrosynthesis. With only 0.5 billion parameters, the model outperforms much larger models (over 7 billion parameters) on these tasks.

Hierarchical Learning Approach

RxnNano's distinguishing feature is its hierarchical learning approach, which aims to instill deep chemical understanding in the model. The approach rests on three main innovations:

  1. Latent Chemical Consistency: Models reactions as movements on a continuous chemical manifold, ensuring reversible and physically plausible transformations.
  2. Hierarchical Cognitive Curriculum: Trains the model through progressive stages, from syntax mastery to semantic reasoning, building robust chemical intuition.
  3. Atom-Map Permutation Invariance (AMPI): Forces the model to learn the reaction's relational topology independently of arbitrary atom-map labels, while balancing the multi-task learning objectives.
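To make the third idea concrete: atom-map numbers in a reaction SMILES are arbitrary labels, so any consistent relabeling describes the same reaction, and an AMPI-style objective asks the model to behave identically across such relabelings. The paper does not publish its implementation; the sketch below is a minimal, hypothetical illustration using regex-based relabeling of a mapped reaction SMILES (the function name `permute_atom_maps` is our own, and a real pipeline would use a chemistry toolkit such as RDKit rather than string manipulation).

```python
import random
import re

# Matches an atom-map number inside a bracket atom, e.g. the ":1" in "[CH3:1]".
ATOM_MAP = re.compile(r":(\d+)\]")

def permute_atom_maps(rxn_smiles: str, seed: int = 0) -> str:
    """Relabel atom-map numbers with a random permutation, applied
    consistently across reactants and products, so the underlying
    atom-to-atom correspondence (the relational topology) is unchanged.
    Illustrative sketch only, not the paper's implementation."""
    labels = sorted({int(m) for m in ATOM_MAP.findall(rxn_smiles)})
    shuffled = labels[:]
    random.Random(seed).shuffle(shuffled)
    relabel = dict(zip(labels, shuffled))
    return ATOM_MAP.sub(lambda m: f":{relabel[int(m.group(1))]}]", rxn_smiles)

# Hypothetical mapped esterification-like reaction SMILES.
rxn = "[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][OH:6]>>[CH3:1][C:2](=[O:3])[O:6][CH3:5]"
augmented = permute_atom_maps(rxn, seed=42)
```

During training, a model's predictions (or losses) on `rxn` and `augmented` can be constrained to agree, which is one simple way to force label-invariant behavior.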

Performance and Results

RxnNano demonstrated a 23.5% improvement in Top-1 accuracy on rigorous benchmarks, without resorting to test-time augmentation. This result underscores the effectiveness of prioritizing chemical understanding over simply scaling up model size.