# Introduction

Predicting treatment outcomes for lung cancer remains a challenge because real-world electronic health data is sparse, heterogeneous, and prone to information overload. A team of researchers has developed a framework that uses large language models to transform laboratory, genomic, and medication data into high-fidelity features for treatment outcome prediction.

# Methodology

The framework uses large language models (LLMs) as Goal-oriented Knowledge Curators (GKC) that convert laboratory, genomic, and medication data into high-fidelity features. GKC produces task-aligned representations tailored to the prediction objective and operates as an offline preprocessing step that integrates naturally into hospital informatics pipelines.

# Results

The results, published on arXiv, show that the quality of semantic representation is a key determinant of predictive accuracy in sparse clinical data settings. The framework demonstrates a scalable, interpretable, and workflow-compatible pathway for advancing AI-driven decision support in oncology.
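
The offline preprocessing role described in the methodology can be illustrated with a minimal sketch. This is not the authors' implementation: the record layout, the prompt wording, the function names `build_prompt` and `curate_offline`, and the `stub_llm` placeholder are all assumptions made for illustration, and the patient values are invented toy data. The sketch only shows the general shape of the idea: each patient's laboratory, genomic, and medication entries are combined into a goal-oriented prompt, an LLM callable turns that prompt into curated text once, and the result is stored for a downstream predictor.

```python
import json
from typing import Callable, Dict, List

# Hypothetical record layout: modality name -> list of free-text entries.
# The source does not specify a schema; this one is assumed for illustration.
Record = Dict[str, List[str]]

MODALITIES = ("laboratory", "genomic", "medication")


def build_prompt(record: Record, goal: str) -> str:
    """Assemble a goal-oriented prompt from the three data modalities."""
    sections = []
    for modality in MODALITIES:
        items = record.get(modality, [])
        # Sparse records are common, so missing modalities are stated explicitly.
        body = "; ".join(items) if items else "none recorded"
        sections.append(f"{modality}: {body}")
    return f"Goal: {goal}\n" + "\n".join(sections)


def curate_offline(
    records: Dict[str, Record],
    goal: str,
    llm: Callable[[str], str],
) -> Dict[str, str]:
    """Run the curator once per patient and return curated text keyed by patient id."""
    return {pid: llm(build_prompt(rec, goal)) for pid, rec in records.items()}


def stub_llm(prompt: str) -> str:
    """Placeholder for a real LLM call; it only reports the prompt length."""
    return f"summary({len(prompt)} chars)"


# Invented toy data, not taken from the study.
records = {
    "p1": {
        "laboratory": ["hemoglobin 11.2 g/dL"],
        "medication": ["carboplatin"],
    }
}

curated = curate_offline(records, "predict treatment outcome", stub_llm)
print(json.dumps(curated, indent=2))
```

Passing the LLM in as a plain callable keeps the curation step independent of any particular model or vendor, which is one way an offline step of this kind could be slotted into an existing pipeline. In practice the curated text would then be embedded or encoded before being handed to the outcome predictor; that stage is omitted here.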