Predicting Post-Stroke Outcomes with Large Language Models
A recent study explored the use of large language models (LLMs) to predict functional outcomes following acute ischemic stroke. The research focused on whether these models can infer modified Rankin Scale (mRS) scores directly from patient admission notes.
Several LLMs were evaluated, including encoder models (BERT, NYUTron) and generative models (Llama-3.1-8B, MedGemma-4B), in both frozen and fine-tuned configurations. The models were tested on a large, real-world stroke registry, with outcomes assessed as mRS scores at discharge and at 90-day follow-up.
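To illustrate how such an evaluation can be framed, here is a minimal sketch that casts mRS prediction as 7-class text classification over admission notes with Hugging Face Transformers. The base model, note texts, and labels are hypothetical placeholders, not the study's data or code.

```python
# Minimal sketch (not the authors' code): mRS prediction as 7-class
# text classification over admission notes with a BERT-style encoder.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

MODEL_NAME = "bert-base-uncased"  # stand-in; the study also evaluated NYUTron, Llama-3.1-8B, MedGemma-4B
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME, num_labels=7)  # mRS 0-6

# Hypothetical admission notes paired with 90-day mRS labels (0-6).
notes = [
    "72 y/o with sudden left-sided weakness, NIHSS 14 on arrival ...",
    "58 y/o with mild dysarthria, symptoms resolving, NIHSS 2 ...",
]
labels = torch.tensor([4, 1])

enc = tokenizer(notes, truncation=True, padding=True, max_length=512, return_tensors="pt")
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for _ in range(3):  # a few illustrative training steps
    out = model(**enc, labels=labels)  # cross-entropy over the 7 mRS classes
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```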
Performance and Results
Fine-tuned Llama-3.1-8B achieved the highest performance, with 90-day exact mRS accuracy of 33.9% and binary accuracy (mRS 0-2 vs. 3-6) of 76.3%. Its performance was comparable to that of structured-data models incorporating NIHSS scores and patient age.
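For reference, the two reported metrics can be computed as in the sketch below; this is a generic illustration over arrays of true and predicted mRS scores, not the study's evaluation code.

```python
# Sketch of the two reported metrics, assuming parallel arrays of true and predicted mRS (0-6).
import numpy as np

def mrs_accuracies(y_true, y_pred):
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    exact = (y_true == y_pred).mean()                 # exact 7-class agreement
    binary = ((y_true <= 2) == (y_pred <= 2)).mean()  # favorable (0-2) vs. unfavorable (3-6)
    return exact, binary

exact, binary = mrs_accuracies([0, 3, 5, 2], [1, 3, 4, 2])
print(f"exact: {exact:.3f}, binary: {binary:.3f}")
```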
The results suggest that fine-tuned LLMs can predict post-stroke functional outcomes based solely on admission notes, achieving performance comparable to models that require the extraction of structured variables. This paves the way for the development of text-based prognostic tools that integrate into clinical workflows without the need for manual data extraction.
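To make the workflow-integration point concrete, a hedged sketch of what such a text-based prognostic tool could look like at inference time is shown below; the checkpoint path, function name, and example note are hypothetical, not the study's artifact.

```python
# Hypothetical inference sketch: raw admission-note text in, predicted mRS out,
# with no structured-variable extraction step in between.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

CHECKPOINT = "path/to/finetuned-mrs-classifier"  # placeholder for a fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(CHECKPOINT)
model = AutoModelForSequenceClassification.from_pretrained(CHECKPOINT).eval()

def predict_mrs(admission_note: str) -> int:
    """Return the predicted 90-day mRS (0-6) for a single admission note."""
    enc = tokenizer(admission_note, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return int(logits.argmax(dim=-1))

score = predict_mrs("81 y/o with aphasia and right hemiparesis, last known well 2 hours ago ...")
print(score, "favorable" if score <= 2 else "unfavorable")
```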
For those evaluating on-premise deployments, there are trade-offs to consider. AI-RADAR offers analytical frameworks at /llm-onpremise to evaluate these options.