TrustifAI is an innovative framework that aims to solve a crucial problem in the use of large language models (LLMs): quantifying and explaining the reliability of their responses.

The Problem of Hallucinations

RAG (Retrieval-Augmented Generation) systems can generate responses that sound correct but are not actually supported by the underlying data, a phenomenon known as "hallucination." A single correctness or relevance score is not sufficient to detect or diagnose these failures, especially in enterprise, regulated, or governance-heavy environments. It is essential to understand why a response fails.

The TrustifAI Solution

TrustifAI introduces a multi-dimensional approach to assessing the trustworthiness of AI responses. Instead of a simple "pass/fail" judgment, the framework calculates a "Trust Score" by combining several signals (a sketch of one possible aggregation follows this list):

  • Evidence Coverage: Is the answer actually supported by the retrieved documents?
  • Epistemic Consistency: Does the model remain stable across repeated generations?
  • Semantic Drift: Did the response drift away from the given context?
  • Source Diversity: Is the answer overly dependent on a single document?
  • Generation Confidence: How confident was the model while generating the answer?

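To make the aggregation concrete, here is a minimal sketch in Python of how per-signal scores might be combined into a single Trust Score. The signal weights, the normalization convention, and the weighted-average formula are assumptions chosen for illustration, not TrustifAI's documented scoring method.

    # A minimal sketch of turning per-signal scores into a single Trust Score.
    # The weights and the weighted-average formula are illustrative assumptions,
    # not TrustifAI's documented scoring method.

    # Each signal is assumed to be normalized to [0, 1]; higher means more
    # trustworthy (semantic drift is stored as 1 - drift so higher is better).
    ASSUMED_WEIGHTS = {
        "evidence_coverage": 0.30,
        "epistemic_consistency": 0.25,
        "semantic_drift": 0.15,
        "source_diversity": 0.15,
        "generation_confidence": 0.15,
    }

    def trust_score(signals: dict[str, float]) -> float:
        """Weighted average of whichever signals are available (hypothetical aggregation)."""
        total_weight = sum(ASSUMED_WEIGHTS[name] for name in signals)
        if total_weight == 0:
            return 0.0
        weighted_sum = sum(ASSUMED_WEIGHTS[name] * value for name, value in signals.items())
        return weighted_sum / total_weight

    # Example: an answer well supported by evidence but unstable across re-generations.
    example = {
        "evidence_coverage": 0.9,
        "epistemic_consistency": 0.4,
        "semantic_drift": 0.8,
        "source_diversity": 0.6,
        "generation_confidence": 0.7,
    }
    print(round(trust_score(example), 3))  # -> 0.685
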
Traceability and Explainability

TrustifAI doesn't just give you a number; it provides traceability through Reasoning Graphs (directed acyclic graphs, DAGs) and visualizations that show why a response was flagged as reliable or suspicious.
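
As an illustration of what such traceability could look like, the sketch below builds a tiny reasoning DAG in which each signal node carries a score and a human-readable explanation, and all signals feed into a root verdict node. The node structure, the simple mean aggregation, and the 0.6 threshold are assumptions for illustration, not TrustifAI's actual graph format.

    from dataclasses import dataclass, field

    # A toy reasoning DAG: signal nodes feed into a root verdict node.
    # Node fields, edge semantics, and the 0.6 threshold are illustrative
    # assumptions, not TrustifAI's actual graph format.

    @dataclass
    class Node:
        name: str
        score: float                    # normalized to [0, 1]
        explanation: str
        children: list["Node"] = field(default_factory=list)

    def verdict(signals: list[Node], threshold: float = 0.6) -> Node:
        """Aggregate signal nodes into a verdict node (simple mean, for illustration)."""
        mean_score = sum(n.score for n in signals) / len(signals)
        label = "reliable" if mean_score >= threshold else "suspicious"
        return Node(
            name=f"verdict:{label}",
            score=mean_score,
            explanation="; ".join(f"{n.name}={n.score:.2f}" for n in signals),
            children=signals,
        )

    def explain(node: Node, depth: int = 0) -> None:
        """Walk the graph and print why the response received its verdict."""
        print("  " * depth + f"{node.name} ({node.score:.2f}): {node.explanation}")
        for child in node.children:
            explain(child, depth + 1)

    root = verdict([
        Node("evidence_coverage", 0.35, "only 1 of 4 answer claims found in the retrieved chunks"),
        Node("semantic_drift", 0.50, "answer introduces entities absent from the context"),
        Node("generation_confidence", 0.80, "high mean token probability"),
    ])
    explain(root)  # prints the verdict first, then each contributing signal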

Differences from LLM Evaluation Frameworks

Unlike existing evaluation frameworks, which typically measure the overall quality of a RAG system, TrustifAI focuses on explaining why a specific answer should or should not be considered trustworthy.

The project is open source and available on GitHub, and can be installed with pip install trustifai.