Neutts-air – open-source, on-device TTS

https://news.ycombinator.com/rss Hits: 12
Summary

NeuTTS Air ☁️

HuggingFace 🤗: Model, Q8 GGUF, Q4 GGUF, Spaces

Created by Neuphonic - building faster, smaller, on-device voice AI

State-of-the-art Voice AI has been locked behind web APIs for too long. NeuTTS Air is the world’s first super-realistic, on-device, TTS speech language model with instant voice cloning. Built off a 0.5B LLM backbone, NeuTTS Air brings natural-sounding speech, real-time performance, built-in security and speaker cloning to your local device - unlocking a new category of embedded voice agents, assistants, toys, and compliance-safe apps.

Key Features

🗣 Best-in-class realism for its size - produces natural, ultra-realistic voices that sound human
📱 Optimised for on-device deployment - provided in GGML format, ready to run on phones, laptops, or even Raspberry Pis
👫 Instant voice cloning - create your own speaker with as little as 3 seconds of audio
🚄 Simple LM + codec architecture built off a 0.5B backbone - the sweet spot between speed, size, and quality for real-world applications

Model Details

NeuTTS Air is built off Qwen 0.5B - a lightweight yet capable language model optimised for text understanding and generation - as well as a powerful combination of technologies designed for efficiency and quality:

Supported Languages: English
Audio Codec: NeuCodec - our 50 Hz neural audio codec that achieves exceptional audio quality at low bitrates using a single codebook
Context Window: 2048 tokens, enough for processing ~30 seconds of audio (including prompt duration)
Format: Available in GGML format for efficient on-device inference
Responsibility: Watermarked outputs
Inference Speed: Real-time generation on mid-range devices

...
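The context window figure can be sanity-checked with a little arithmetic. The sketch below is not part of NeuTTS Air; it assumes that a single-codebook codec running at 50 Hz emits one token per frame (so 50 audio tokens per second), and that text shares the same 2048-token window. The `text_tokens=500` value and the helper name `audio_seconds` are hypothetical, chosen only for illustration.

```python
def audio_seconds(context_tokens: int, frame_rate_hz: int, text_tokens: int = 0) -> float:
    """Seconds of audio that fit in the context window after reserving text tokens.

    Assumes one audio token per codec frame (single codebook).
    """
    if text_tokens > context_tokens:
        raise ValueError("text_tokens cannot exceed context_tokens")
    return (context_tokens - text_tokens) / frame_rate_hz


CONTEXT_TOKENS = 2048  # context window stated in the summary
FRAME_RATE_HZ = 50     # NeuCodec frame rate stated in the summary

# Upper bound: every token in the window is an audio token.
upper_bound = audio_seconds(CONTEXT_TOKENS, FRAME_RATE_HZ)

# Hypothetical split: 500 tokens reserved for text (an assumed number).
with_text = audio_seconds(CONTEXT_TOKENS, FRAME_RATE_HZ, text_tokens=500)

# A 3-second reference clip for voice cloning counts against the same budget.
REFERENCE_SECONDS = 3
left_for_generation = with_text - REFERENCE_SECONDS

print(f"upper bound: {upper_bound:.2f} s")
print(f"with text reserved: {with_text:.2f} s")
print(f"after 3 s reference clip: {left_for_generation:.2f} s")
```

Under these assumptions the upper bound is about 41 seconds, so the quoted ~30 seconds is plausible once text tokens and the reference prompt are counted against the same window.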

First seen: 2025-10-09 21:21

Last seen: 2025-10-10 08:23