# The Trojan in the Vocabulary: Stealthy Sabotage of LLM Composition
LLM development increasingly relies on model composition techniques that remix capabilities from diverse sources, for example by transplanting vocabulary tokens or merging weights from donor models. A newly discovered attack on this practice may compromise model security.
Researchers have crafted a "breaker token" that is harmless in the model it ships with, but sabotages a base model's generation once transplanted into it, all while leaving the model's measured utility intact. The attack introduces a supply-chain vulnerability and calls the security of composed LLMs into question.
## Attacks and Vulnerabilities
The attack exploits what the researchers call an asymmetric realizability gap: the breaker token is functionally inert in the donor model, yet reliably reconstructs into a high-salience malicious feature once transplanted into a base model. The result is sabotage of the base model's generation that does not register as a loss of utility.
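The gap can be illustrated with a toy geometric sketch. This is not the paper's implementation; the feature directions, dimension, and the `salience` measure below are all illustrative assumptions. The idea is that a crafted embedding can align strongly with a feature direction that exists only in the base model while staying orthogonal to everything the donor model reads.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # toy embedding dimension (illustrative, not from the paper)

# Hypothetical feature directions: a malicious feature present in the
# base model's representation space, and a feature read by the donor.
base_feature = rng.normal(size=d)
base_feature /= np.linalg.norm(base_feature)
donor_feature = rng.normal(size=d)
donor_feature /= np.linalg.norm(donor_feature)

def salience(token_emb, feature):
    """Toy salience: normalized alignment of an embedding with a feature."""
    return abs(token_emb @ feature) / np.linalg.norm(token_emb)

# Craft a breaker embedding: project the base feature away from the
# donor feature, so the donor model sees (near) nothing.
breaker = base_feature - (base_feature @ donor_feature) * donor_feature

gap = salience(breaker, base_feature) - salience(breaker, donor_feature)
print(f"donor salience:    {salience(breaker, donor_feature):.3f}")
print(f"base salience:     {salience(breaker, base_feature):.3f}")
print(f"realizability gap: {gap:.3f}")
```

In high dimensions random directions are nearly orthogonal, so the projection costs almost nothing: the breaker keeps nearly full salience in the base model while being invisible along the donor's feature.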
## Formalizing the Attack
The researchers formalize the attack as a dual-objective optimization problem and instantiate it with a sparse solver. The attack is training-free; it achieves spectral mimicry to evade outlier detection, and it demonstrates structural persistence under fine-tuning and weight merging.
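A minimal sketch of what a dual-objective sparse search could look like, under stated assumptions: the two feature directions, the sparsity budget `k`, the hard-thresholding projection, and the norm-matching step are all hypothetical stand-ins, not the paper's solver. The two objectives are high alignment with a base-model feature and low alignment with a donor-model feature; hard thresholding keeps the embedding sparse, and renormalizing to a typical embedding norm is a crude proxy for evading outlier detection.

```python
import numpy as np

rng = np.random.default_rng(1)
d, k, steps, lr = 64, 8, 500, 0.1  # toy sizes; k = sparsity budget

# Hypothetical stand-ins for base-side and donor-side feature directions.
f_base = rng.normal(size=d)
f_base /= np.linalg.norm(f_base)
f_donor = rng.normal(size=d)
f_donor /= np.linalg.norm(f_donor)
target_norm = 1.0  # keep the embedding's norm unremarkable

def hard_threshold(v, k):
    """Sparsity projection: keep only the k largest-magnitude entries."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

e = hard_threshold(rng.normal(size=d), k)
for _ in range(steps):
    # Dual objective: ascend |e @ f_base|, descend (e @ f_donor)^2.
    grad = f_base * np.sign(e @ f_base) - 2.0 * (e @ f_donor) * f_donor
    e = hard_threshold(e + lr * grad, k)
    e *= target_norm / np.linalg.norm(e)  # norm mimicry after each step

print(f"base alignment:  {abs(e @ f_base):.3f}")
print(f"donor alignment: {abs(e @ f_donor):.3f}")
```

Note the search is training-free in the same spirit as the paper's attack: no gradients flow through any model, only through a fixed pair of feature directions.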
## Risk in the Supply Chain
By compromising models through an apparently innocuous shared component, the attack exposes a supply-chain vulnerability in model composition. Researchers and developers will need to monitor and harden this pipeline as composition becomes more common.
## Code Available
The attack code is available on GitHub: https://github.com/xz-liu/tokenforge