AI generated
New framework for analyzing the consistency-accuracy relation of LLMs under controlled input variations
## Introduction
A new framework, CAT, has been introduced for evaluating the consistency-accuracy relation of LLMs under controlled input variations. It also proposes a global metric, the CORE index, that combines the area and shape of the Consistency-Accuracy Relation (CAR) curve to quantify the trade-off between accuracy and consistency.
## How CAT works
CAT centers on the Consistency-Accuracy Relation (CAR) curve, which visualizes how model accuracy varies with increasing consistency requirements, as defined by the MCA metric. The framework also proposes the CORE index, a global metric that combines the area and shape of the CAR curve to quantify the trade-off between accuracy and consistency.
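The idea of tracing accuracy against a rising consistency requirement can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not give the formal definition of MCA or of the CORE index, so the sketch assumes that MCA at level `c` is the fraction of questions answered correctly in at least a fraction `c` of their input variations. It computes only the area under the resulting curve, not the shape term. The function names `mca`, `car_curve`, and `area_under_curve` and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def mca(correct: Sequence[Sequence[bool]], min_consistency: float) -> float:
    """Assumed MCA: share of items that are correct in at least
    `min_consistency` of their input variations."""
    if not correct:
        raise ValueError("need at least one item")
    hits = 0
    for item in correct:
        if not item:
            raise ValueError("each item needs at least one variation")
        rate = sum(item) / len(item)  # per-item rate of correct answers
        # Small tolerance guards against floating-point rounding.
        if rate >= min_consistency - 1e-12:
            hits += 1
    return hits / len(correct)


def car_curve(correct: Sequence[Sequence[bool]],
              n_levels: int) -> List[Tuple[float, float]]:
    """(consistency level, assumed MCA) pairs for levels 1/n, 2/n, ..., 1."""
    levels = [k / n_levels for k in range(1, n_levels + 1)]
    return [(c, mca(correct, c)) for c in levels]


def area_under_curve(points: Sequence[Tuple[float, float]]) -> float:
    """Trapezoidal area under the curve (area only, not the CORE index)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy data: 4 questions, each asked under 4 input variations.
# True means the model answered that variation correctly.
example = [
    [True, True, True, True],      # always correct
    [True, True, True, False],     # correct in 3 of 4 variations
    [True, False, False, False],   # correct in 1 of 4 variations
    [False, False, False, False],  # never correct
]

curve = car_curve(example, n_levels=4)
area = area_under_curve(curve)
print(curve)
print(area)
```

Under this assumed definition the curve can only stay flat or fall as the consistency requirement rises, so a model whose curve stays high up to level 1.0 is both accurate and stable across variations.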
## Application of CAT
CAT has been applied to a diverse set of generalist and domain-specific LLMs, evaluated on several multiple-choice (MC) benchmarks. The results are reported to demonstrate the framework's effectiveness in evaluating the consistency-accuracy relation of LLMs.
## Extension of CAT
CAT can be extended to support long-form, open-ended evaluations through adaptable scoring functions.
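One way to picture an adaptable scoring function is as a pluggable callable that scores each free-text response against a reference, with a pass mark turning the score into a correct or incorrect flag per input variation. The sketch below is an assumption for illustration, not the framework's method: the token-level F1 scorer, the `to_correctness` helper, the pass mark of 0.5, and the sample strings are all hypothetical.

```python
from collections import Counter
from typing import Callable, List, Sequence


def token_f1(response: str, reference: str) -> float:
    """Hypothetical scorer: token-overlap F1 between response and reference."""
    resp_tokens = response.lower().split()
    ref_tokens = reference.lower().split()
    if not resp_tokens or not ref_tokens:
        return 0.0
    # Multiset intersection counts tokens shared by both texts.
    overlap = sum((Counter(resp_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(resp_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def to_correctness(responses: Sequence[str],
                   reference: str,
                   score: Callable[[str, str], float],
                   pass_mark: float) -> List[bool]:
    """Flag each variation's response as correct if its score meets pass_mark."""
    return [score(r, reference) >= pass_mark for r in responses]


# Responses to three rephrasings of the same open-ended question.
reference = "paris is the capital of france"
responses = [
    "Paris is the capital of France",
    "The capital of France is Paris",
    "It is Lyon",
]

flags = to_correctness(responses, reference, token_f1, pass_mark=0.5)
print(flags)
```

Any other scorer with the same `(response, reference) -> float` shape could be swapped in, and the resulting per-variation flags could then feed a consistency analysis in the same way as multiple-choice answers.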