📁 LLM AI generated

SoproTTS v1.5: Zero-Shot Voice Cloning TTS for ~$100

Published on 2026-02-05 22:06. Source: LocalLLaMA

SoproTTS v1.5: zero-shot voice cloning for only 100 dollars

SoproTTS, a side project, has released version 1.5 of its text-to-speech (TTS) model. The 135M-parameter model was trained for approximately $100 on a single GPU.
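As a rough back-of-envelope illustration of what 135M parameters means for deployment, the sketch below estimates the size of the weights alone at a few common numeric precisions. The precisions are assumptions: the source does not say which format the released weights use, and the estimate ignores activations, any audio codec, and runtime overhead. The function name `weight_footprint_mb` is hypothetical.

```python
# Bytes needed to store one parameter at each (assumed) precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_footprint_mb(n_params: int, dtype: str = "float32") -> float:
    """Approximate size of the model weights in megabytes (1 MB = 1e6 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e6


if __name__ == "__main__":
    n_params = 135_000_000  # parameter count stated in the article
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_footprint_mb(n_params, dtype):.0f} MB")
```

Under these assumptions the weights come to roughly 540 MB in float32, 270 MB in float16, and 135 MB in int8.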

Performance

SoproTTS v1.5 boasts the following features:

250 ms time to first audio (TTFA) when streaming
Real-time factor (RTF) of 0.05 on CPU, i.e. approximately 20× faster than real time
Zero-shot voice cloning
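The relationship between the two speed figures above can be made concrete with a short sketch. RTF is the time spent synthesizing divided by the duration of the audio produced, so an RTF of 0.05 corresponds to 1 / 0.05 = 20× real time. The code below also shows one way TTFA and RTF might be measured for any streaming generator. `fake_stream` is a hypothetical stand-in that yields silent chunks, not the SoproTTS API, and the 24 kHz sample rate is an assumption used only for illustration.

```python
import time
from typing import Iterable, Iterator, List, Sequence


def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF = time spent synthesizing / duration of the audio produced."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def speedup(rtf: float) -> float:
    """How many times faster than real time a given RTF corresponds to."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


def measure_stream(chunks: Iterable[Sequence[float]], sample_rate: int) -> dict:
    """Consume a stream of audio chunks and report TTFA, RTF, and duration."""
    start = time.perf_counter()
    ttfa = None
    total_samples = 0
    for chunk in chunks:
        if ttfa is None:
            # Time until the first chunk arrives is the time to first audio.
            ttfa = time.perf_counter() - start
        total_samples += len(chunk)
    elapsed = time.perf_counter() - start
    if ttfa is None:
        raise ValueError("stream produced no audio")
    audio_seconds = total_samples / sample_rate
    return {
        "ttfa_s": ttfa,
        "rtf": real_time_factor(elapsed, audio_seconds),
        "audio_s": audio_seconds,
    }


def fake_stream(n_chunks: int = 5, chunk_samples: int = 2400) -> Iterator[List[float]]:
    """Hypothetical placeholder for a streaming TTS call; yields silence."""
    for _ in range(n_chunks):
        yield [0.0] * chunk_samples


if __name__ == "__main__":
    print(f"RTF 0.05 is about {speedup(0.05):.0f}x real time")
    print(measure_stream(fake_stream(), sample_rate=24_000))
```

To benchmark a real model, `fake_stream()` would be replaced with the model's own streaming call. The numbers printed for the placeholder say nothing about SoproTTS itself.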

The model is not perfect, but it improves on previous versions: it is smaller, faster, and more stable. The training code is expected to be released at a later date.

For those evaluating on-premise deployments, there are trade-offs to consider. AI-RADAR offers analytical frameworks on /llm-onpremise to evaluate these aspects.

AI-Radar Takeaway

SoproTTS v1.5 is a 135M-parameter text-to-speech (TTS) model offering zero-shot voice cloning. Trained for approximately $100 on a single GPU, the model runs at around 20× real-time speed on the CPU of a base MacBook M3. Version 1.5 offers reduced latency and improved stability.
