
Overview

The goal of text2speech is to harmonize various text-to-speech engines, including Amazon Polly, Coqui TTS, Google Cloud Text-to-Speech API, and Microsoft Cognitive Services Text to Speech REST API.

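With the exception of Coqui TTS, all these engines are accessible as R packages:

aws.polly is a client for Amazon Polly.
googleLanguageR is a client for the Google Cloud Text-to-Speech API.
conrad is a client for the Microsoft Cognitive Services Text to Speech REST API.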

You might notice that Coqui TTS doesn’t have its own R package. This is because, at this time, text2speech incorporates the functionality of Coqui TTS directly. An R wrapper for Coqui TTS is under development.

Installation

Install the released version from CRAN or the development version from GitHub:

# Install from CRAN
install.packages("text2speech")

# or the development version from GitHub
# install.packages("devtools")
devtools::install_github("jhudsl/text2speech")

Authentication

Check whether you are authenticated with each service. Authentication must be configured separately for every service that is not already authenticated.

library(text2speech)

# Amazon Polly
tts_auth("amazon")
#> [1] TRUE
# Coqui TTS
tts_auth("coqui")
#> [1] TRUE
# Google Cloud Text-to-Speech API 
tts_auth("google")
#> [1] TRUE
# Microsoft Cognitive Services Text to Speech REST API
tts_auth("microsoft")
#> [1] TRUE
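
Because each service is configured separately, it can help to collect the authentication status of all four in one pass. The following is a minimal sketch, not part of text2speech: available_services() is a hypothetical helper, and it assumes tts_auth() returns TRUE or FALSE as in the output above. An error from the check is treated as "not authenticated".

```r
# Hypothetical helper (not part of text2speech): return the services for
# which the authentication check succeeds.
# `auth_fn` defaults to text2speech::tts_auth. R evaluates the default lazily,
# so a stand-in function can be supplied instead.
available_services <- function(services = c("amazon", "coqui", "google", "microsoft"),
                               auth_fn = text2speech::tts_auth) {
  ok <- vapply(
    services,
    function(service) {
      # Treat an error or any non-TRUE result as "not authenticated".
      isTRUE(tryCatch(auth_fn(service), error = function(e) FALSE))
    },
    logical(1)
  )
  services[ok]
}

# Only query the real services when the package is installed.
if (requireNamespace("text2speech", quietly = TRUE)) {
  print(available_services())
}
```

The result is a character vector of service names that can be passed on to later calls, so a script can skip engines that have not been set up.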

Voices

List the voices available for each service.

# Amazon Polly
voices_amazon <- tts_amazon_voices()
head(voices_amazon)
#>   voice         language language_code gender service
#> 1 Zeina           Arabic           arb Female  amazon
#> 2 Zhiyu Chinese Mandarin        cmn-CN Female  amazon
#> 3  Naja           Danish         da-DK Female  amazon
#> 4  Mads           Danish         da-DK   Male  amazon
#> 5 Ruben            Dutch         nl-NL   Male  amazon
#> 6 Lotte            Dutch         nl-NL Female  amazon

# Coqui TTS
voices_coqui <- tts_coqui_voices()
#> ℹ Test out different voices on the CoquiTTS Demo (<https://huggingface.co/spaces/coqui/CoquiTTS>)
head(voices_coqui)
#> # A tibble: 6 × 5
#>   type       language     dataset       model_name service
#>   <chr>      <chr>        <chr>         <chr>      <chr>  
#> 1 tts_models multilingual multi-dataset your_tts   coqui  
#> 2 tts_models multilingual multi-dataset bark       coqui  
#> 3 tts_models bg           cv            vits       coqui  
#> 4 tts_models cs           cv            vits       coqui  
#> 5 tts_models da           cv            vits       coqui  
#> 6 tts_models et           cv            vits       coqui

# Google Cloud Text-to-Speech API 
voices_google <- tts_google_voices()
head(voices_google)
#>              voice language language_code gender service
#> 1 af-ZA-Standard-A     <NA>         af-ZA FEMALE  google
#> 2 af-ZA-Standard-A     <NA>         af-ZA FEMALE  google
#> 3  ar-XA-Wavenet-C   Arabic         ar-XA   MALE  google
#> 4 ar-XA-Standard-C   Arabic         ar-XA   MALE  google
#> 5 ar-XA-Standard-D   Arabic         ar-XA FEMALE  google
#> 6  ar-XA-Wavenet-A   Arabic         ar-XA FEMALE  google

# Microsoft Cognitive Services Text to Speech REST API
voices_microsoft <- tts_microsoft_voices()
head(voices_microsoft)
#>                                                                voice
#> 1   Microsoft Server Speech Text to Speech Voice (af-ZA, AdriNeural)
#> 2 Microsoft Server Speech Text to Speech Voice (af-ZA, WillemNeural)
#> 3 Microsoft Server Speech Text to Speech Voice (am-ET, MekdesNeural)
#> 4  Microsoft Server Speech Text to Speech Voice (am-ET, AmehaNeural)
#> 5 Microsoft Server Speech Text to Speech Voice (ar-AE, FatimaNeural)
#> 6 Microsoft Server Speech Text to Speech Voice (ar-AE, HamdanNeural)
#>                        language language_code gender   service
#> 1      Afrikaans (South Africa)         af-ZA Female microsoft
#> 2      Afrikaans (South Africa)         af-ZA   Male microsoft
#> 3            Amharic (Ethiopia)         am-ET Female microsoft
#> 4            Amharic (Ethiopia)         am-ET   Male microsoft
#> 5 Arabic (United Arab Emirates)         ar-AE Female microsoft
#> 6 Arabic (United Arab Emirates)         ar-AE   Male microsoft
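
The Amazon, Google, and Microsoft tables above share the columns language_code and gender, so a single filter can narrow any of them down. Note that gender is spelled "Female" in the Amazon and Microsoft output but "FEMALE" in the Google output, so the comparison below ignores case. This is a minimal sketch: pick_voices() is a hypothetical helper, and the sample data frame copies four rows from the Amazon Polly output above instead of calling the service.

```r
# Sample rows copied from the Amazon Polly output above, so the example
# runs without credentials.
voices <- data.frame(
  voice = c("Naja", "Mads", "Ruben", "Lotte"),
  language = c("Danish", "Danish", "Dutch", "Dutch"),
  language_code = c("da-DK", "da-DK", "nl-NL", "nl-NL"),
  gender = c("Female", "Male", "Male", "Female"),
  service = "amazon",
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep voices matching a language code and,
# optionally, a gender (compared case-insensitively).
pick_voices <- function(voices, language_code, gender = NULL) {
  keep <- voices$language_code == language_code
  if (!is.null(gender)) {
    keep <- keep & tolower(voices$gender) == tolower(gender)
  }
  # which() drops any NA comparisons instead of returning NA rows.
  voices[which(keep), , drop = FALSE]
}

pick_voices(voices, "da-DK", "female")$voice
#> [1] "Naja"
```

The same function should work on the result of tts_amazon_voices(), tts_google_voices(), or tts_microsoft_voices(), assuming those tables keep the column names shown above.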

Convert text to speech

Synthesize speech with tts(text = "TEXT", service = "ENGINE"), where ENGINE is one of "amazon", "coqui", "google", or "microsoft":

# Amazon Polly
tts("Hello world!", service = "amazon")

# Coqui TTS
tts("Hello world!", service = "coqui")

# Google Cloud Text-to-Speech API 
tts("Hello world!", service = "google")

# Microsoft Cognitive Services Text to Speech REST API
tts("Hello world!", service = "microsoft")
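
Since every engine is reached through the same tts(text, service) call, a script can try several services in turn and keep the first result. The following is a minimal sketch under that assumption: tts_first_available() is a hypothetical helper, not part of text2speech, and it treats any error raised by tts() as a signal to move on to the next service.

```r
# Hypothetical helper (not part of text2speech): try each service in order
# and return the first successful result.
# `tts_fn` defaults to text2speech::tts and is only evaluated when used.
tts_first_available <- function(text,
                                services = c("amazon", "coqui", "google", "microsoft"),
                                tts_fn = text2speech::tts) {
  for (service in services) {
    result <- tryCatch(
      tts_fn(text, service = service),
      error = function(e) {
        # Report the failure and fall through to the next service.
        message("Service '", service, "' failed: ", conditionMessage(e))
        NULL
      }
    )
    if (!is.null(result)) {
      return(result)
    }
  }
  stop("No text-to-speech service succeeded.")
}

# Usage (requires at least one configured service):
# res <- tts_first_available("Hello world!")
```

The order of the services argument sets the preference, so placing "coqui" first, for example, would favor that engine before the other three are tried.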

Whichever service is used, the resulting output is a standardized tibble.
