We host the MetaVoice-1B application so that you can run it on our online workstations, either with Wine or directly.


Quick description of MetaVoice-1B:

MetaVoice, distributed as the source repository "metavoice-src", is a large-scale text-to-speech (TTS) model. The base model, MetaVoice-1B, has around 1.2 billion parameters and was reportedly trained on around 100,000 hours of speech data. The goal is human-like, expressive, and flexible TTS: natural-sounding speech that handles diverse inputs and likely generalizes over voice styles, intonation, prosody, and perhaps multiple languages or accents. With that scale and dataset volume, MetaVoice aims to push the boundary of what open-source TTS models can achieve: high fidelity, natural prosody, and robustness even in edge cases. As a foundation model, it can serve as the backbone for downstream tasks such as voice generation, voice cloning, speech generation for virtual agents, or audio production pipelines.

Features:
  • Large-scale TTS model (~1.2B parameters), reportedly trained on about 100,000 hours of speech
  • High-fidelity, expressive and human-like speech output
  • Foundation model for downstream voice tasks (voice cloning, voice generation, TTS services)
  • Likely supports multiple languages, styles or speaker variations (given scale)
  • Codebase includes training and inference infrastructure for customization
  • Potential for fine-tuning on new voice data or adaptation to specific voice profiles
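
As a rough illustration of what the ~1.2 billion parameter figure implies in practice, the sketch below estimates the memory needed just to hold the model weights at a few common numeric precisions. The helper name and the list of precisions are illustrative assumptions and are not taken from metavoice-src; the estimate also ignores activations, caches, and any auxiliary models, so real requirements will be higher.

```python
# Rough estimate of weight storage for a model with a given parameter count.
# Hypothetical helper for illustration only; not part of metavoice-src.

# Bytes used to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "float16") -> float:
    """Return the approximate weight size in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    params = 1.2e9  # approximate parameter count quoted in the description
    for dtype in BYTES_PER_PARAM:
        print(f"{dtype}: ~{weight_memory_gb(params, dtype):.1f} GB")
```

Under these assumptions, half-precision weights come to roughly half the size of single-precision ones, which is one reason reduced precision is often used when running models of this size.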


Programming Language: Python.
Categories:
Text to Speech

©2024. Winfy. All Rights Reserved.

By OD Group OU – Registry code: 1609791 – VAT number: EE102345621.