RxnNano: A 0.5B Parameter Chemical LLM
A new study published on arXiv introduces RxnNano, a compact large language model (LLM) designed for chemical reaction prediction and retrosynthesis. With only 0.5 billion parameters, the model outperforms much larger models (over 7 billion parameters) on specific tasks.
Hierarchical Learning Approach
RxnNano stands out for its hierarchical learning approach, which aims to instill a deep chemical understanding in the model. The approach rests on three main innovations:
- Latent Chemical Consistency: Models reactions as movements on a continuous chemical manifold, ensuring reversible and physically plausible transformations.
- Hierarchical Cognitive Curriculum: Trains the model through progressive stages, from syntax mastery to semantic reasoning, building robust chemical intuition.
- Atom-Map Permutation Invariance (AMPI): Forces the model to learn invariant relational topology and balance multi-task learning.
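The paper does not publish code, but the AMPI idea can be illustrated with a small sketch: atom-map labels in a reaction SMILES (the `:1` in `[CH3:1]`) are arbitrary identifiers, so randomly renumbering them yields a chemically identical training sample, pushing the model to learn the relational topology rather than the labels. The function name and the augmentation details below are assumptions for illustration, not the authors' implementation.

```python
import random
import re

def permute_atom_maps(rxn_smiles, seed=None):
    """Randomly renumber the atom-map labels in a reaction SMILES.

    Atom maps appear as ':<n>' inside bracket atoms, e.g. '[CH3:1]'.
    The underlying chemistry is unchanged, so training on permuted
    copies encourages map-permutation invariance (hypothetical sketch,
    not the paper's actual augmentation pipeline).
    """
    rng = random.Random(seed)
    # Collect the distinct map numbers present in the string.
    maps = sorted({int(m) for m in re.findall(r":(\d+)\]", rxn_smiles)})
    shuffled = maps[:]
    rng.shuffle(shuffled)
    table = dict(zip(maps, shuffled))
    # Rewrite every ':<n>]' occurrence with its permuted number.
    return re.sub(r":(\d+)\]",
                  lambda m: f":{table[int(m.group(1))]}]",
                  rxn_smiles)

# Esterification of methanol with formic-acid-like fragment (toy example).
rxn = "[CH3:1][OH:2].[C:3](=[O:4])[OH:5]>>[CH3:1][O:2][C:3]=[O:4].[OH2:5]"
aug = permute_atom_maps(rxn, seed=7)
```

Stripping the map numbers from `rxn` and `aug` yields identical strings, confirming the augmentation only relabels atoms.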
Performance and Results
RxnNano demonstrated a 23.5% improvement in Top-1 accuracy on rigorous benchmarks, without using test-time augmentation. This result suggests that building chemical understanding into training can be more effective than simply scaling model size.
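Top-1 accuracy here means the fraction of test reactions whose single highest-ranked prediction exactly matches the reference product. A minimal sketch of the metric (in practice, predicted and reference SMILES would first be canonicalized with a toolkit such as RDKit, which is omitted here):

```python
def top1_accuracy(predictions, references):
    """Exact-match Top-1 accuracy over paired prediction/reference lists.

    Assumes both lists are already in a canonical SMILES form;
    real evaluations canonicalize before comparing.
    """
    if len(predictions) != len(references):
        raise ValueError("prediction/reference lists must be the same length")
    hits = sum(p == r for p, r in zip(predictions, references))
    return hits / len(references)

# Toy example: 2 of 3 top-ranked predictions match the reference product.
preds = ["CCO", "CC=O", "c1ccccc1"]
refs  = ["CCO", "CCO",  "c1ccccc1"]
acc = top1_accuracy(preds, refs)  # 2/3
```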