MOSI.AI and the OpenMOSS team released MOSS-TTS-Nano on April 10, 2026. It is a text-to-speech model with only 100 million parameters that runs in real time on CPU hardware, with no GPU acceleration required. The model supports 20 languages, including Chinese, English, and multiple European and Asian languages, and generates 48 kHz stereo audio.
Pure Autoregressive Architecture Eliminates Traditional Vocoding
MOSS-TTS-Nano uses a pure autoregressive Audio Tokenizer plus LLM pipeline instead of traditional neural vocoding approaches. The MOSS-Audio-Tokenizer-Nano component compresses audio into a 12.5 Hz token stream using residual vector quantization (RVQ) with 16 codebooks, supporting variable bitrates from 0.125 to 2 kbps. This unified discrete audio interface maintains compatibility across the entire MOSS-TTS model family.
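The quoted bitrate range follows from the frame rate and codebook count if each codebook entry costs 10 bits. The short sketch below works through that arithmetic. The codebook size of 1024 entries is an assumption inferred from the published figures, not something stated in the release, and `rvq_bitrate_kbps` is a hypothetical helper name.

```python
import math

FRAME_RATE_HZ = 12.5   # token frames per second (from the article)
NUM_CODEBOOKS = 16     # RVQ codebooks (from the article)
CODEBOOK_SIZE = 1024   # ASSUMPTION: entries per codebook, i.e. 10 bits per token


def rvq_bitrate_kbps(active_codebooks: int,
                     frame_rate_hz: float = FRAME_RATE_HZ,
                     codebook_size: int = CODEBOOK_SIZE) -> float:
    """Bitrate in kbps when only the first `active_codebooks` RVQ layers are kept."""
    if not 1 <= active_codebooks <= NUM_CODEBOOKS:
        raise ValueError("active_codebooks must be between 1 and 16")
    bits_per_token = math.log2(codebook_size)
    return active_codebooks * bits_per_token * frame_rate_hz / 1000.0


# Bitrate for a few codebook counts, from the lowest to the highest setting.
bitrates = {n: rvq_bitrate_kbps(n) for n in (1, 8, 16)}
print(bitrates)
```

Under that assumption, one codebook gives 0.125 kbps and all 16 give 2 kbps, which matches the stated range. Dropping trailing codebooks is the usual way an RVQ codec trades quality for bitrate.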
CPU-Only Deployment Targets Edge and Budget-Constrained Applications
The model's 0.1B parameter count enables deployment scenarios where GPU access is unavailable or cost-prohibitive. MOSS-TTS-Nano supports multiple deployment interfaces including Python scripts, FastAPI web applications, and command-line tools. Real-time streaming inference with low latency makes it suitable for local demonstrations, web serving, and lightweight product integration.
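"Real-time" is usually quantified as a real-time factor (RTF): synthesis time divided by the duration of the audio produced, where a value below 1 means the model generates speech faster than it plays back. The sketch below shows one way to measure it. The `fake_synthesize` function is a placeholder standing in for a real model call and is not part of MOSS-TTS-Nano; only the 48 kHz sample rate comes from the article.

```python
import time
from typing import Callable, Sequence

SAMPLE_RATE_HZ = 48_000  # output sample rate (from the article)


def real_time_factor(synthesize: Callable[[str], Sequence[float]],
                     text: str,
                     sample_rate_hz: int = SAMPLE_RATE_HZ) -> float:
    """Return wall-clock synthesis time divided by generated audio duration."""
    start = time.perf_counter()
    frames = synthesize(text)          # one entry per audio frame
    elapsed = time.perf_counter() - start
    audio_seconds = len(frames) / sample_rate_hz
    if audio_seconds == 0:
        raise ValueError("synthesizer returned no audio")
    return elapsed / audio_seconds


def fake_synthesize(text: str) -> list:
    """PLACEHOLDER for a real TTS call: returns one second of silence."""
    return [0.0] * SAMPLE_RATE_HZ


rtf = real_time_factor(fake_synthesize, "Hello from a CPU-only machine.")
print(f"real-time factor: {rtf:.4f}")
```

To benchmark an actual deployment, swap `fake_synthesize` for whatever function wraps the model on the target CPU and average the result over several inputs of different lengths.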
Voice Cloning Through Reference Audio Samples
MOSS-TTS-Nano includes voice cloning capabilities through reference audio samples, with automatic chunked processing for long-form text synthesis. Users can provide sample audio to clone specific voice characteristics without additional training. The model's architecture maintains voice consistency across extended generation tasks through its chunking mechanism.
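The release does not describe how the chunking works internally. As a rough illustration of the general idea, the sketch below splits long text at sentence boundaries and packs sentences greedily into chunks under a character budget. The function name `chunk_text`, the 200-character default, and the splitting rule are all assumptions made for this example.

```python
import re


def chunk_text(text: str, max_chars: int = 200) -> list:
    """Greedily pack whole sentences into chunks of at most `max_chars` characters.

    A single sentence longer than `max_chars` is kept intact as its own chunk.
    """
    # Split after Latin or CJK sentence-ending punctuation (requires Python 3.7+).
    # Note: this naive rule also splits inside decimals such as "3.5".
    sentences = [s for s in re.split(r"(?<=[.!?。！？])\s*", text.strip()) if s]

    chunks = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)   # budget exceeded: close the current chunk
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


example_chunks = chunk_text("One. Two. Three.", max_chars=10)
print(example_chunks)  # ['One. Two.', 'Three.']
```

In a cloning workflow, each chunk would presumably be synthesized with the same reference audio so the voice stays consistent, and the resulting audio segments would be concatenated in order.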
Key Takeaways
- MOSS-TTS-Nano contains only 100 million parameters and performs real-time speech synthesis on CPU, with no GPU required
- The model supports 20 languages and outputs 48 kHz stereo audio using a pure autoregressive pipeline
- Voice cloning functionality works through reference audio samples with automatic chunking for long-form text
- Deployment options include Python scripts, FastAPI web apps, and CLI tools for different integration scenarios
- The GitHub repository reached 208 stars within days of the April 10, 2026 release