AI-RADAR.IT · AI-RADAR.NET · AI-RADAR.TECH

News & analysis on local LLMs, stack & on-prem hardware.

📁 LLM AI generated

TherapyGym: Therapy Chatbots with Clinical Fidelity and Safety

Published on 2026-03-20 04:04 · Source: ArXiv cs.CL


Large language models (LLMs) are increasingly used for mental-health support, but current evaluation methods often fail to capture the clinically critical dimensions of psychotherapy.

TherapyGym: A New Framework

TherapyGym is a framework for evaluating and improving therapy chatbots along two axes: clinical fidelity and safety. Fidelity is measured with the Cognitive Therapy Rating Scale (CTRS), implemented as an automated pipeline that scores adherence to cognitive behavioral therapy (CBT) techniques over multi-turn sessions. Safety is assessed with a multi-label annotation scheme that covers therapy-specific risks, such as failing to address harm or abuse.
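To make the two evaluation axes concrete, here is a minimal Python sketch of how a single session's result might be represented. It is not TherapyGym's actual data model: the class name `SessionEvaluation`, the item names, and the flag label `unaddressed_harm` are hypothetical. The only outside fact it relies on is that standard CTRS items are rated on a 0 to 6 scale.

```python
from dataclasses import dataclass, field

# Standard CTRS items are rated from 0 to 6.
CTRS_MAX_ITEM_SCORE = 6


@dataclass
class SessionEvaluation:
    """Hypothetical container for one multi-turn session's evaluation."""

    # Per-item CTRS ratings, e.g. {"agenda": 4}. Item names are illustrative.
    ctrs_items: dict[str, int]
    # Multi-label safety annotations: zero or more flags per session.
    safety_flags: set[str] = field(default_factory=set)

    def ctrs_total(self) -> int:
        """Sum the item ratings after checking each is within the 0-6 range."""
        if not self.ctrs_items:
            raise ValueError("at least one CTRS item is required")
        for name, score in self.ctrs_items.items():
            if not 0 <= score <= CTRS_MAX_ITEM_SCORE:
                raise ValueError(f"CTRS item {name!r} out of range: {score}")
        return sum(self.ctrs_items.values())

    def ctrs_normalized(self) -> float:
        """Scale the total to [0, 1] so sessions with different item sets compare."""
        return self.ctrs_total() / (CTRS_MAX_ITEM_SCORE * len(self.ctrs_items))

    def is_safe(self) -> bool:
        """A session counts as safe only when no safety flag was raised."""
        return not self.safety_flags


evaluation = SessionEvaluation(
    ctrs_items={"agenda": 4, "feedback": 5, "guided_discovery": 3},
    safety_flags={"unaddressed_harm"},
)
print(evaluation.ctrs_total(), evaluation.is_safe())
```

Keeping the safety labels as a set rather than a single pass/fail bit mirrors the multi-label idea: one session can raise several distinct flags at once.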

Bias Mitigation and Training

To mitigate bias and unreliability in LLM-based judges, the release includes TherapyJudgeBench, a validation set of dialogues with expert ratings. TherapyGym also serves as a training harness: CTRS-based and safety-based rewards drive reinforcement learning against configurable patient simulations. Models trained in TherapyGym show improved clinical fidelity scores under both expert and LLM evaluation.
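The two ideas in this section can be sketched in a few lines of Python. The article does not say how TherapyGym combines its rewards or how judge reliability is measured, so everything here is an assumption made for illustration: the function names, the linear penalty, the default `safety_penalty` of 0.5, and the use of mean absolute error as the agreement measure.

```python
def session_reward(
    ctrs_normalized: float,
    num_safety_violations: int,
    safety_penalty: float = 0.5,  # hypothetical weight, not from the paper
) -> float:
    """Combine a fidelity score in [0, 1] with a per-violation safety penalty."""
    if not 0.0 <= ctrs_normalized <= 1.0:
        raise ValueError("ctrs_normalized must be in [0, 1]")
    if num_safety_violations < 0:
        raise ValueError("num_safety_violations must be non-negative")
    return ctrs_normalized - safety_penalty * num_safety_violations


def judge_mean_abs_error(
    judge_scores: list[float], expert_scores: list[float]
) -> float:
    """Average absolute gap between LLM-judge and expert ratings on the same dialogues."""
    if not judge_scores or len(judge_scores) != len(expert_scores):
        raise ValueError("score lists must be non-empty and the same length")
    gaps = [abs(j - e) for j, e in zip(judge_scores, expert_scores)]
    return sum(gaps) / len(gaps)


# One safety violation offsets part of an otherwise strong fidelity score.
print(session_reward(0.75, 1))
# A lower error means the LLM judge tracks the expert ratings more closely.
print(judge_mean_abs_error([4, 5, 3], [4, 3, 2]))
```

A validation set with expert ratings is what makes the second function possible: without a trusted reference, there is nothing to compare the LLM judge against before its scores are used as a training signal.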

AI-Radar Takeaway

TherapyGym is a new framework for evaluating and improving mental-health support chatbots. It measures fidelity to CBT techniques and safety, and it counters bias in LLM judges with a validation set of expert-rated dialogues. Models trained with TherapyGym show improved clinical fidelity scores.



