Voicebox: Open-Source Local Voice Cloning

A developer has released Voicebox, an open-source voice cloning application that aims to be an "Ollama for voice." The project uses Qwen3-TTS for fast, local voice cloning and Whisper for transcription.

Main Features

Voicebox is a native desktop application (Tauri/Rust/Python) designed to be lightweight and free of complex dependencies. Its features include:

  • Instant voice cloning with Qwen3-TTS (supports single or multiple samples).
  • DAW-style multi-track timeline for composing conversations and podcasts.
  • System and microphone audio recording with integrated Whisper transcription.
  • REST API and one-click local server for integration into games and applications.
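
To give a rough idea of what integrating with a local REST server could look like, here is a minimal Python sketch using only the standard library. The announcement does not document the API, so the port, the `/generate` route, the `text` and `voice_id` fields, and the assumption that the response body is raw audio are all hypothetical placeholders; check the project's own documentation for the real ones.

```python
import json
import urllib.request

# Hypothetical values: the real Voicebox port, route, and field names may differ.
BASE_URL = "http://127.0.0.1:8000"
GENERATE_ROUTE = "/generate"


def build_generate_request(
    text: str, voice_id: str, base_url: str = BASE_URL
) -> urllib.request.Request:
    """Build a JSON POST request asking the local server to synthesize speech."""
    payload = json.dumps({"text": text, "voice_id": voice_id}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + GENERATE_ROUTE,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text: str, voice_id: str, out_path: str, timeout: float = 60.0) -> int:
    """Send the request and save the response body to out_path.

    Assumes (hypothetically) that the server replies with raw audio bytes.
    Returns the number of bytes written.
    """
    request = build_generate_request(text, voice_id)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        audio = response.read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio)
    return len(audio)
```

With the local server running, a game or tool might call something like `synthesize("Welcome back, traveler.", "narrator", "line_001.wav")`, again assuming the placeholder route and fields above match the real API. Keeping request construction separate from the network call makes the payload easy to inspect without a server.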

The source code is available on GitHub under the MIT license. Downloads are available for macOS and Windows, with a Linux version coming soon. The developer plans to add support for other models such as XTTS and Bark in the future.