LT-Tuning: A New Paradigm for LLM Reasoning

The reasoning capability of Large Language Models (LLMs) is typically elicited through Chain-of-Thought (CoT) prompting, an approach that requires the model to verbalize every intermediate step as explicit text tokens. This, however, confines reasoning to the discrete space of the vocabulary.

Recently, reasoning in continuous latent spaces has emerged as a promising alternative, enabling more robust inference and more flexible allocation of computation. However, current latent-reasoning approaches often suffer from feature collapse, in which successive latent thoughts become nearly indistinguishable, and from training instability.

The Latent Thoughts Tuning Approach

To address these challenges, Latent Thoughts Tuning (LT-Tuning) has been proposed: a framework that redefines how latent thoughts are constructed and deployed. Instead of relying solely on raw hidden states, LT-Tuning introduces a Context-Prediction-Fusion mechanism that fuses contextual hidden states with predictive semantic guidance drawn from the vocabulary embedding space.
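
The description above leaves the fusion operator itself unspecified, so the PyTorch sketch below shows one plausible reading: the contextual hidden state is blended with a predictive vector, the expectation of the vocabulary embeddings under the model's next-token distribution, through a learned sigmoid gate. The class name, the gating design, and the tied reuse of the LM head and embedding table are illustrative assumptions, not the authors' actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ContextPredictionFusion(nn.Module):
    """Illustrative sketch of a Context-Prediction-Fusion step.

    Combines the contextual hidden state with a "predictive" vector:
    the expectation of the vocabulary embedding table under the
    model's next-token distribution. A learned gate mixes the two.
    All design choices here are assumptions for exposition.
    """

    def __init__(self, hidden_size: int, lm_head: nn.Linear, embed_tokens: nn.Embedding):
        super().__init__()
        self.lm_head = lm_head            # model's output projection (hidden -> vocab)
        self.embed_tokens = embed_tokens  # vocabulary embedding table (vocab, hidden)
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, hidden_size), the last-position hidden state.
        probs = F.softmax(self.lm_head(hidden), dim=-1)   # (batch, vocab)
        # Expected embedding under the predicted next-token distribution:
        predictive = probs @ self.embed_tokens.weight     # (batch, hidden_size)
        # Gated convex combination of context and prediction.
        g = torch.sigmoid(self.gate(torch.cat([hidden, predictive], dim=-1)))
        latent_thought = g * hidden + (1.0 - g) * predictive
        return latent_thought  # fed back as the next latent "thought" input
```

Feeding the fused vector back as the next input is what separates this from ordinary decoding: no discrete token is ever sampled, so the thought stays in continuous space while remaining anchored to the vocabulary's semantics.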

LT-Tuning couples this mechanism with a progressive three-stage curriculum learning pipeline and additionally enables the model to switch dynamically between latent and explicit thinking modes. Experiments demonstrate that the method outperforms existing latent reasoning baselines, effectively mitigating feature collapse and achieving robust reasoning accuracy.
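
The article does not spell out the stage boundaries or the switching criterion, so the sketch below illustrates one way such a schedule and an inference-time mode switch could look. The function names, the linear ramp between stages, and the entropy-based switching rule are all hypothetical placeholders, not LT-Tuning's documented behavior.

```python
import torch
import torch.nn.functional as F

def curriculum_latent_ratio(step: int, s1: int, s2: int) -> float:
    """Hypothetical three-stage schedule: the fraction of reasoning
    steps trained in latent (rather than explicit) mode.

    Stage 1 (step < s1):        0.0, fully explicit CoT supervision.
    Stage 2 (s1 <= step < s2):  linear ramp from 0.0 to 1.0.
    Stage 3 (step >= s2):       1.0, fully latent reasoning.
    Boundaries and the linear ramp are illustrative choices.
    """
    if step < s1:
        return 0.0
    if step < s2:
        return (step - s1) / (s2 - s1)
    return 1.0

def choose_mode(logits: torch.Tensor, entropy_threshold: float = 2.0) -> str:
    """Hypothetical inference-time switch between thinking modes.

    logits: next-token logits for a single position, shape (vocab_size,).
    Here the decision uses the entropy of the next-token distribution:
    confident steps stay latent, uncertain ones fall back to explicit
    text tokens. LT-Tuning's actual criterion may differ.
    """
    probs = F.softmax(logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-9).log()).sum(dim=-1)
    return "latent" if entropy.item() < entropy_threshold else "explicit"
```

A schedule of this shape matches the stated intent of the curriculum: the model first learns to reason explicitly, is then gradually weaned onto latent thoughts, and finally reasons fully in latent space, with the runtime switch offering an explicit-mode escape hatch on hard steps.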