Voxygen TTS Cloud

Simplify your life and use Voxygen speech synthesis in SaaS mode: with Voxygen Cloud, you don't have to do any integration.

TTS Solution description

Using text-to-speech in SaaS mode

Voxygen Cloud is a SaaS service, available 24/7, that lets you design TTS voice applications quickly and easily. No software integration is required: you simply use our API to send requests containing the text to be vocalised. Voxygen Cloud then streams the audio back to your application, which can play it in real time. You can also use Voxygen Cloud to produce voice content on your own by retrieving the generated audio from a URL. Voxygen Cloud is the SaaS solution for deploying your automated voice applications simply, whatever your use case: voice assistants, publication of voice content, information or alert messages, e-learning applications, and many more.

Features

Secure access using login and password

Streaming with low latency

Downloading generated audio files

Language and voice selection

Controlling pauses, speech rate, intonation and timbre

Customized lexicons

Adding background music

Synchronisation information for video animation

Why Voxygen TTS

Easy integration, security and robustness, customisation

Easy integration

We make integrating our TTS solutions as easy as possible. With standardised APIs and user-friendly interfaces, our TTS technology integrates easily with your existing platforms and applications. You can deploy text-to-speech in your system quickly and efficiently, adding a new dimension to your customer communications and interactions.

Security and robustness

Voxygen provides you with a secure account using a unique identifier and password. Our infrastructure is hosted on a European sovereign cloud. We undertake not to store your interaction data unless you ask us to do so for support purposes. Our infrastructure is high-availability, guaranteeing you permanent access to our service.

Customisation

You can personalise your text-to-speech output in two ways: apply SSML parameters to the voices to adapt the audio rendering, and use lexicons to ensure the correct pronunciation of your business terms. You can also synchronise the audio with your videos by retrieving events linked to the text: sentence start and end marks, and words.

French Railways

Read success story

"Voxygen offers us reliable, customised solutions to cover all our needs."

Jean Philippe CHANTECAILLE

SNCF Audio Announcements Project Manager, Brand Identity and Design

BNP Paribas' brand voices deployed on IVR

Read success story

"Voxygen offers a complete range of solutions for operating our Telmi and HelloïZ branded voices on all our communication channels."

Marie Marquet

BNP Contact Administrator - Deployment Organiser

"The collaboration with Voxygen during this voice creation project was efficient, constructive and friendly."

Elsa Sibileau-Verdon

Marketing & Communication

Brand and Media

Integration

API Description

Input text format

Audio output

REST API

The Voxygen Cloud API is a REST API that allows a client application to send an HTTPS request containing all the information required for vocalisation (text to be vocalised, voice, audio format, etc.). HTTPS GET or POST requests are processed instantly and the audio produced can be played immediately by the client application.
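
As a rough illustration of the request flow described above, here is a minimal Python sketch. The endpoint address, the parameter names (`text`, `voice`, `format`) and the voice name are placeholders invented for this example, not the actual Voxygen Cloud parameters, and authentication is left out; refer to the technical documentation for the real values.

```python
from typing import Iterator
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint: the real address is supplied with your account.
API_URL = "https://tts.example.com/api/tts"


def build_tts_url(text: str, voice: str, audio_format: str = "mp3") -> str:
    """Build a GET URL carrying the vocalisation parameters.

    The parameter names used here are illustrative assumptions.
    """
    query = urlencode({"text": text, "voice": voice, "format": audio_format})
    return f"{API_URL}?{query}"


def stream_audio(url: str, chunk_size: int = 4096) -> Iterator[bytes]:
    """Yield audio chunks as they arrive, so playback can start early."""
    with urlopen(url) as response:
        while chunk := response.read(chunk_size):
            yield chunk
```

A client would typically pass each chunk from `stream_audio(build_tts_url("Hello world", "ExampleVoice"))` to its audio player as soon as it arrives, rather than waiting for the whole file.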

Technical documentation

URL and user account

A URL specifies the address of the Voxygen Cloud API.

To access Voxygen Cloud you need a user account defined by a login and a password.
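
One possible way to keep the URL, login and password out of your source code is to read them from environment variables, as in the sketch below. The variable names and the way the credentials are sent (as form fields in a POST body) are assumptions made for illustration only; the actual API may expect them differently.

```python
import os
from typing import Mapping
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical environment variable names chosen for this example.
REQUIRED_SETTINGS = ("VOXYGEN_URL", "VOXYGEN_LOGIN", "VOXYGEN_PASSWORD")


def load_account(env: Mapping[str, str] = os.environ) -> dict:
    """Read the API URL, login and password from the environment."""
    missing = [name for name in REQUIRED_SETTINGS if name not in env]
    if missing:
        raise KeyError(f"missing settings: {', '.join(missing)}")
    return {
        "url": env["VOXYGEN_URL"],
        "login": env["VOXYGEN_LOGIN"],
        "password": env["VOXYGEN_PASSWORD"],
    }


def build_post_request(account: dict, text: str) -> Request:
    """Prepare an HTTPS POST request (it is not sent here).

    Assumption: credentials travel as form fields named "login" and "password".
    """
    body = urlencode(
        {"login": account["login"], "password": account["password"], "text": text}
    ).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded; charset=utf-8"}
    return Request(account["url"], data=body, headers=headers, method="POST")
```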

Text formats

  • Plain text encoded in UTF-8
  • SSML document (versions 1.0 and 1.1)
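
To give an idea of what an SSML input can look like, the sketch below builds a small SSML 1.0 document with a pause and a slower speech rate, using only standard W3C SSML elements (`speak`, `break`, `prosody`). Which elements and attribute values a given voice honours is not stated here, so treat the defaults as assumptions.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_ssml(before: str, after: str, lang: str = "en-GB",
               pause: str = "500ms", rate: str = "slow") -> str:
    """Return an SSML string: text, a pause, then text spoken at another rate."""
    # Serialise SSML elements without a namespace prefix.
    ET.register_namespace("", SSML_NS)
    speak = ET.Element(
        f"{{{SSML_NS}}}speak",
        {"version": "1.0", f"{{{XML_NS}}}lang": lang},
    )
    speak.text = before
    # <break> inserts a silence of the requested duration.
    ET.SubElement(speak, f"{{{SSML_NS}}}break", {"time": pause})
    # <prosody> changes the speech rate for the enclosed text.
    prosody = ET.SubElement(speak, f"{{{SSML_NS}}}prosody", {"rate": rate})
    prosody.text = after
    return ET.tostring(speak, encoding="unicode")
```

For example, `build_ssml("Your train is delayed.", "Please wait on the platform.")` returns a document that can be sent in place of plain text.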

 

Lexicons

  • PLS format version 1.0
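
A lexicon for your business terms could be generated along these lines. The sketch uses the standard W3C PLS 1.0 elements (`lexicon`, `lexeme`, `grapheme`, `alias`); the sample entry is made up, and how the lexicon is then attached to a request is not covered here.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_lexicon(aliases: dict, lang: str = "en-GB") -> str:
    """Return a PLS 1.0 document mapping written forms to spoken forms."""
    # Serialise PLS elements without a namespace prefix.
    ET.register_namespace("", PLS_NS)
    lexicon = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", f"{{{XML_NS}}}lang": lang},
    )
    for written, spoken in aliases.items():
        lexeme = ET.SubElement(lexicon, f"{{{PLS_NS}}}lexeme")
        # <grapheme> is the text as written; <alias> is how it should be read.
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = written
        ET.SubElement(lexeme, f"{{{PLS_NS}}}alias").text = spoken
    return ET.tostring(lexicon, encoding="unicode")
```

Calling `build_lexicon({"IVR": "interactive voice response"})` produces a one-entry lexicon; phonetic transcriptions would use the PLS `phoneme` element instead of `alias`.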

Audio output

  • Sampling frequency from 6 kHz to 48 kHz
  • Formats

           - PCM (RAW, WAV and AU): 16-bit linear or G.711 (A-law, μ-law)

           - MP3: 16, 32, 64, 96, 128 or 160 kbit/s bitrates; quality from 0 to 9

           - OGG: quality from 0.0 to 1.0
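
If you request RAW 16-bit linear PCM, you may want to wrap it in a WAV container before saving it. The sketch below does this with Python's standard `wave` module. It assumes mono, little-endian samples, which is an assumption about the stream rather than a documented property, and it checks the sampling frequency against the 6 kHz to 48 kHz range listed above.

```python
import io
import wave


def pcm_to_wav(pcm: bytes, sample_rate: int = 16000, channels: int = 1) -> bytes:
    """Wrap raw 16-bit linear PCM samples in a WAV container."""
    if not 6000 <= sample_rate <= 48000:
        raise ValueError("sample rate must be between 6 kHz and 48 kHz")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 16-bit samples are 2 bytes wide
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()
```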

Synchronisation events

  • Visemes
  • Words
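
Word events can drive subtitles or on-screen highlighting. The sketch below turns a list of word events into timed cues. The event structure (`WordEvent` with a `text` and an `offset_ms` field) is purely hypothetical, since the actual event format is not described here.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class WordEvent:
    """Hypothetical word event: the word and its start time in the audio."""
    text: str
    offset_ms: int


def to_cues(events: List[WordEvent], audio_duration_ms: int) -> List[Tuple[int, int, str]]:
    """Return (start_ms, end_ms, word) cues; each word lasts until the next starts."""
    ordered = sorted(events, key=lambda event: event.offset_ms)
    cues = []
    for index, event in enumerate(ordered):
        is_last = index + 1 == len(ordered)
        end = audio_duration_ms if is_last else ordered[index + 1].offset_ms
        cues.append((event.offset_ms, end, event.text))
    return cues
```

The same approach would apply to viseme events, with a mouth shape identifier in place of the word.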

Convert text into speech instantly!

Discover our cutting-edge TTS solution, perfectly tailored to your needs and easy to integrate.

Customisable

Reliable

Scalable
