Catastrophic Forgetting in LLMs: A Self-Generated Solution

Adapting large language models (LLMs) to specific tasks through fine-tuning often leads to catastrophic forgetting: the erosion of the model's general capabilities as it specializes. New research proposes SA-SFT, a self-augmentation routine that aims to mitigate this problem.

SA-SFT: Self-Dialogues for Resilience

In SA-SFT, the LLM first generates self-dialogues prior to fine-tuning; this self-authored data is then mixed with the task-specific training data. Notably, the approach requires no external data and no modifications to the optimizer or training procedure.
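The data-mixing step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name, the `mix_ratio` parameter, and the sampling scheme are all assumptions for the sake of the example.

```python
import random

def build_sa_sft_dataset(task_examples, self_dialogues, mix_ratio=0.5, seed=0):
    """Mix self-generated dialogues into the task-specific SFT set.

    mix_ratio is the fraction of the final dataset drawn from the model's
    own self-dialogues (hypothetical parameter; the paper's exact mixing
    scheme may differ).
    """
    rng = random.Random(seed)
    # Number of self-dialogues needed so they make up mix_ratio of the total.
    n_self = int(len(task_examples) * mix_ratio / (1 - mix_ratio))
    n_self = min(n_self, len(self_dialogues))
    mixed = task_examples + rng.sample(self_dialogues, n_self)
    rng.shuffle(mixed)
    return mixed

# Toy stand-ins for real data: task examples and self-authored dialogues.
task = [{"prompt": f"task {i}", "response": "..."} for i in range(4)]
self_dial = [{"prompt": f"self {i}", "response": "..."} for i in range(10)]
mixed = build_sa_sft_dataset(task, self_dial, mix_ratio=0.5)
```

With a 0.5 ratio, the mixed set here contains four task examples and four self-dialogues; in practice the ratio would be a tuning knob.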

Results and Implications

The results show that SA-SFT effectively mitigates catastrophic forgetting, preserving general capabilities at a level comparable to the original model while outperforming common baselines in many scenarios. Theoretical analysis suggests that forgetting may stem from style-induced parameter drift, and that self-alignment through self-generated data counteracts this effect.
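One way to make the notion of parameter drift concrete is to measure how far the fine-tuned weights have moved from the base weights. The sketch below is purely illustrative, assuming toy lists of flat weight values rather than real model tensors; the paper's analysis may define drift differently.

```python
import math

def parameter_drift(base_params, tuned_params):
    """L2 norm of the parameter difference -- a simple proxy for the
    drift that the style-mismatch analysis describes (illustrative only)."""
    squared_sum = sum(
        (b - t) ** 2
        for base_layer, tuned_layer in zip(base_params, tuned_params)
        for b, t in zip(base_layer, tuned_layer)
    )
    return math.sqrt(squared_sum)

# Toy "layers" of weights standing in for base and fine-tuned parameters.
base = [[0.0, 0.0, 0.0], [0.0, 0.0]]
tuned = [[0.1, 0.1, 0.1], [0.1, 0.0]]
drift = parameter_drift(base, tuned)  # sqrt(4 * 0.01) = 0.2
```

Under the paper's hypothesis, mixing in self-generated data would keep this distance smaller than fine-tuning on task data alone.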