Qwable-v1: The Open-Weights LLM Capturing Claude Fable-5's Essence

Qwable-v1: An Open-Weights LLM with a Complex Legacy

The landscape of Large Language Models (LLMs) is constantly evolving, with increasing interest in solutions that offer greater control and flexibility. In this context, Qwable-v1 has been released, a new open-weights LLM that stands out due to its origin: it was distilled from Claude Fable-5, an Anthropic Mythos-class model that had a brief but significant public appearance.

Claude Fable-5, when it was made available for approximately four days between June 9 and June 12, 2026, was considered Anthropic's most powerful model, achieving a score of 80.3% on SWE-bench Pro. Its availability was globally suspended due to U.S. export-control directives, an event that underscores the geopolitical and regulatory complexities that can influence access to advanced AI technologies. This incident has reignited the debate on data sovereignty and the need for self-hosted alternatives for businesses.

Technical Details and the Distillation Process

Qwable-v1 was built upon Qwen3.6-35B-A3B, a model already well-known in the open-source community. The distillation process involved extracting 4,659 cleartext agentic-coding traces from the Glint-Research/Fable-5-traces corpus, the only public source where Fable-5's "thinking blocks" (CoT) were accessible. This dataset was crucial for transferring the original model's reasoning and tool-use capabilities.

The distillation was performed in approximately 14 hours on a single NVIDIA H200 GPU, highlighting how complex capabilities can be replicated on dedicated, albeit high-end, hardware. The result is a model capable of emitting formatted XML for Claude-style tool use, such as str_replace_editor, suggesting that Fable-5's tool surface was directly incorporated into the model's weights, not just its style. Qwable-v1, its GGUFs (in IQ4_XS, Q4_K_M, Q5_K_M, Q8_0 variants), and the SFT dataset are publicly available on Hugging Face, under an AGPL-3.0 license, offering a robust option for local deployment.

Implications for On-Premise Deployment and Data Sovereignty

The availability of Qwable-v1 as an open-weights model, with various quantization options via GGUF, is particularly relevant for CTOs, DevOps leads, and infrastructure architects evaluating self-hosted AI solutions. The ability to run a model with capabilities derived from a leading LLM like Fable-5 on proprietary hardware, such as a single H200, offers unprecedented control over data and model execution.

This approach directly addresses the needs for data sovereignty, compliance, and security, especially for air-gapped environments or sectors with stringent regulations. The Fable-5 incident, suspended for export control reasons, serves as a warning about the risks associated with exclusive reliance on proprietary cloud services. For those evaluating on-premise deployments, AI-RADAR offers analytical frameworks on /llm-onpremise to assess trade-offs between costs, performance, and control, and Qwable-v1 positions itself as a concrete alternative to mitigate such risks.

Future Prospects and Trade-offs

The emergence of models like Qwable-v1 highlights a clear trend in the AI industry: the pursuit of a balance between the power of large models and the need for control and autonomy. While distillation may entail some loss of fidelity compared to the original model, the benefits in terms of TCO, security, and customization for specific workloads can far outweigh these compromises.

Enterprises investing in on-premise AI infrastructure can leverage models like Qwable-v1 to develop innovative applications while maintaining full ownership and management of their digital assets. The choice between a cloud-based LLM and a self-hosted solution depends on a complex evaluation of technical constraints, regulatory requirements, and strategic objectives, but the offering of distilled open-weights models continues to enrich the options available to technology decision-makers.