Ontologies for LLMs: A Neuro-Symbolic Approach

Large language models (LLMs) exhibit fundamental limitations such as hallucination, brittleness, and a lack of formal grounding, which are particularly problematic in specialist fields that require verifiable reasoning. A recent study explores whether formal domain ontologies can improve language model reliability through retrieval-augmented generation.

Mathematics as a Testbed

The research uses mathematics as a proof of concept, implementing a neuro-symbolic pipeline that leverages the OpenMath ontology: hybrid retrieval combined with cross-encoder reranking selects relevant definitions, which are then injected into the model's prompt.
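The pipeline described above can be sketched in miniature. This is a hypothetical illustration, not the study's actual implementation: the toy definitions, the term-overlap score standing in for a lexical retriever such as BM25, the bag-of-words cosine standing in for dense embeddings, and the placeholder reranker are all assumptions made for the sake of a self-contained example.

```python
import math
from collections import Counter

# Toy ontology entries standing in for OpenMath definitions (hypothetical data).
DEFINITIONS = {
    "derivative": "the derivative of a function measures its instantaneous rate of change",
    "integral": "the integral of a function measures the signed area under its curve",
    "prime": "a prime is a natural number greater than 1 with no divisors other than 1 and itself",
}

def _vec(text):
    return Counter(text.lower().split())

def _cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def lexical_score(query, doc):
    # Simple term-overlap ratio, a stand-in for a BM25-style lexical retriever.
    q, d = set(query.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q) if q else 0.0

def dense_score(query, doc):
    # Bag-of-words cosine similarity, a stand-in for embedding-based retrieval.
    return _cosine(_vec(query), _vec(doc))

def hybrid_retrieve(query, k=2, alpha=0.5):
    # Blend lexical and dense scores, then keep the top-k candidate definitions.
    scored = []
    for name, text in DEFINITIONS.items():
        s = alpha * lexical_score(query, text) + (1 - alpha) * dense_score(query, text)
        scored.append((s, name, text))
    scored.sort(reverse=True)
    return scored[:k]

def rerank(query, candidates):
    # Placeholder for a cross-encoder: a real one would jointly encode the
    # (query, definition) pair; here we simply rescore with the cosine metric.
    return sorted(candidates, key=lambda c: dense_score(query, c[2]), reverse=True)

def build_prompt(question):
    # Retrieve, rerank, and inject the selected definitions into the prompt.
    top = rerank(question, hybrid_retrieve(question))
    context = "\n".join(f"- {name}: {text}" for _, name, text in top)
    return f"Relevant definitions:\n{context}\n\nQuestion: {question}"

print(build_prompt("what is the derivative of x squared"))
```

The design point the study's results underline is the final injection step: because the retrieved context goes straight into the prompt, retrieval quality directly bounds how much the ontology can help.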

Results and Challenges

Evaluation on the MATH benchmark with three open-source models shows that ontology-guided context improves performance when retrieval quality is high, but that irrelevant retrieved context actively degrades it. This highlights both the promise and the challenges of neuro-symbolic approaches.