PACE: A Neuro-Symbolic Framework for Realistic and Constrained Counterfactual Explanations

Explaining why a machine learning system made a certain decision is one of the thorniest challenges in explainable AI. But if the proposed explanation is not feasible in the real world, because it suggests actions forbidden by internal policies or incompatible with the user’s reality, it loses all practical value. This is a well-known issue for those working with counterfactual explanations, which are minimal input changes that flip a model’s outcome but often ignore domain constraints.

The PACE framework, described in a new study, tackles the problem by bridging two worlds: neural networks for classification and symbolic reasoning to enforce sensible rules. The architecture is modular: a predictive model (in the experiment, a multilayer perceptron) produces the initial prediction, while an Answer Set Programming (ASP) layer handles the generation of plausible alternatives, discarding changes that violate pre-established conditions. For instance, it can alter education level or working hours, but not immutable attributes such as age or gender.

How PACE works

The core of the approach is the clean separation between the neural and symbolic components. The former is trained solely on data, while the latter encodes domain knowledge in declarative form. When searching for a counterfactual explanation, the system does not just find the smallest change that reverses the decision: it also verifies that the change is admissible according to the defined rules. The benefit is twofold: explanations become more plausible for a human and, crucially, the risk of unfeasible or even harmful suggestions is reduced.
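To make the separation concrete, here is a minimal Python sketch of a constrained counterfactual search. It is not the study's implementation: the `predict` function is a hypothetical stand-in for the trained classifier, and the domain rules are written as plain Python predicates rather than ASP rules. The feature names, value ranges, and the rule that education cannot decrease are all assumptions chosen for illustration.

```python
from itertools import product

# Hypothetical stand-in for the trained classifier (the study uses an MLP).
def predict(x):
    score = 2 * x["education"] + x["hours"] / 10
    return int(score >= 12)

# Hypothetical domain knowledge. In PACE this would live in the ASP layer;
# plain Python predicates are used here only to illustrate the idea.
IMMUTABLE = {"age", "gender"}
DOMAINS = {"education": range(1, 6), "hours": range(20, 61, 5)}

def admissible(original, candidate):
    # Immutable attributes must keep their original values.
    if any(candidate[f] != original[f] for f in IMMUTABLE):
        return False
    # Assumed rule: an education level cannot be lowered.
    return candidate["education"] >= original["education"]

def n_changes(original, candidate):
    # Number of features whose value differs from the original.
    return sum(candidate[f] != original[f] for f in original)

def counterfactuals(original):
    """Return admissible candidates that flip the prediction, fewest changes first."""
    target = 1 - predict(original)
    features = list(DOMAINS)
    found = []
    for values in product(*(DOMAINS[f] for f in features)):
        candidate = {**original, **dict(zip(features, values))}
        if candidate == original:
            continue
        if predict(candidate) == target and admissible(original, candidate):
            found.append(candidate)
    return sorted(found, key=lambda c: n_changes(original, c))

person = {"age": 40, "gender": "F", "education": 3, "hours": 40}
best = counterfactuals(person)[0]
print(best, n_changes(person, best))
```

The classifier never sees the rules, and the rules never inspect the classifier's internals: swapping either one leaves the other untouched, which is the property the modular design is after. An exhaustive search like this only works for tiny feature domains; a declarative solver is what makes the same idea practical at scale.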

The setup is model-agnostic, hence applicable to different classifiers and domains. The case study on the Adult Income dataset—a classic benchmark for predicting whether an individual exceeds an income threshold—clearly illustrates the trade-off at play. With strict symbolic constraints, the proportion of “valid” explanations (those that truly change the prediction) may slightly decrease, but the plausibility and feasibility of the ones produced rise significantly. In other words, some coverage is sacrificed to obtain recommendations that actually make sense in the real context.
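The trade-off can be made tangible with two simple ratios. The sketch below is an assumption about how such metrics might be computed, not the study's exact definitions: validity is taken as the share of instances for which a prediction-flipping counterfactual was produced, and plausibility as the share of produced counterfactuals that satisfy the domain rules. The `predict` and `admissible` functions and all input values are hypothetical placeholders, not data from the paper.

```python
def validity_rate(originals, counterfactuals, predict):
    """Share of instances with a counterfactual that actually flips the prediction."""
    flipped = sum(
        cf is not None and predict(cf) != predict(x)
        for x, cf in zip(originals, counterfactuals)
    )
    return flipped / len(originals)

def plausibility_rate(originals, counterfactuals, admissible):
    """Share of produced counterfactuals that satisfy the domain rules."""
    produced = [(x, cf) for x, cf in zip(originals, counterfactuals) if cf is not None]
    if not produced:
        return 0.0
    return sum(admissible(x, cf) for x, cf in produced) / len(produced)

# Hypothetical toy classifier and rules, for illustration only.
predict = lambda x: int(x["hours"] >= 40)
admissible = lambda x, cf: cf["hours"] <= 60 and cf["age"] == x["age"]

originals = [
    {"age": 30, "hours": 20},
    {"age": 45, "hours": 35},
    {"age": 50, "hours": 30},
]
# Made-up outputs of an unconstrained search: every prediction flips,
# but one suggestion is absurd and one changes an immutable attribute.
unconstrained = [
    {"age": 30, "hours": 200},
    {"age": 45, "hours": 40},
    {"age": 20, "hours": 45},
]
# Made-up outputs of a constrained search: None means no admissible
# counterfactual was found for that instance.
constrained = [
    {"age": 30, "hours": 40},
    {"age": 45, "hours": 40},
    None,
]

for name, cfs in [("unconstrained", unconstrained), ("constrained", constrained)]:
    v = validity_rate(originals, cfs, predict)
    p = plausibility_rate(originals, cfs, admissible)
    print(f"{name}: validity={v:.2f} plausibility={p:.2f}")
```

In this toy setup the constrained search leaves one instance without an explanation, so validity drops, while every explanation it does return respects the rules.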

Validity vs. plausibility: the trade-off

The results highlight a trade-off that should give pause to those designing decision support systems. In traditional settings, pure validity—“change this and get the opposite outcome”—is often the only metric considered. But suggesting that a user increase weekly working hours is one thing; proposing an absurd value like 200 hours is another. PACE channels the counterfactual search along realistic tracks, showing that integrating symbolic knowledge improves explanation quality without upending the underlying predictive architecture.

For those working in regulated sectors (finance, healthcare, public administration), where models often run on-premise to guarantee data sovereignty and confidentiality, this ability to embed business rules directly into the explanation process is far from secondary. It means being able to justify automated decisions not only with “the model detected this pattern,” but with “the suggested modification is consistent with company policies and applicable regulations.” It is a step toward AI that not only works, but also reasons in a context-aware manner.