author: eric-urban
ms.service: azure-ai-speech
ms.topic: include
ms.date: 2/18/2025
ms.author: eur
ms.custom: references_regions

February 2025 release

Version 1.1 of HD voices (Public preview)

Updated 13 existing HD voices to version 1.1 (the latest version), which adds multilingual support.

  • Specifying the "latest" model defaults to v1.1
  • v1.1 supports multilingual output
  • To call an older version, specify the version in the voice name:
    • Version 1.0: en-US-Ava:DragonHDV1.0Neural
    • Latest (currently V1.1): en-US-Ava:DragonHDLatestNeural
Locale (BCP-47) Voice name
de-DE de-DE-Seraphina:DragonHDLatestNeural (Female)
en-US en-US-Brian:DragonHDLatestNeural (Male)
en-US en-US-Davis:DragonHDLatestNeural (Male)
en-US en-US-Ava:DragonHDLatestNeural (Female)
en-US en-US-Andrew:DragonHDLatestNeural (Male)
en-US en-US-Andrew2:DragonHDLatestNeural (Male) - optimized for free-talking
en-US en-US-Emma:DragonHDLatestNeural (Female)
en-US en-US-Emma2:DragonHDLatestNeural (Female) - optimized for free-talking
en-US en-US-Steffan:DragonHDLatestNeural (Male)
en-US en-US-Aria:DragonHDLatestNeural (Female)
en-US en-US-Jenny:DragonHDLatestNeural (Female)
ja-JP ja-JP-Masaru:DragonHDLatestNeural (Male)
zh-CN zh-CN-Xiaochen:DragonHDLatestNeural (Female)
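
The version naming pattern listed above can be sketched as a small helper. This is only an illustration derived from the two example names in these notes (`en-US-Ava:DragonHDV1.0Neural` and `en-US-Ava:DragonHDLatestNeural`); the function name `hd_voice_name` is hypothetical and not part of any SDK, and the sketch does not check that a given voice actually exists in a given version.

```python
from typing import Optional


def hd_voice_name(locale: str, persona: str, version: Optional[str] = None) -> str:
    """Build an HD voice name from its parts (hypothetical helper).

    version=None selects the "Latest" alias; otherwise pass a version
    string such as "1.0", which becomes the "V1.0" tag in the name.
    """
    tag = "Latest" if version is None else f"V{version}"
    return f"{locale}-{persona}:DragonHD{tag}Neural"


print(hd_voice_name("en-US", "Ava", "1.0"))  # en-US-Ava:DragonHDV1.0Neural
print(hd_voice_name("en-US", "Ava"))         # en-US-Ava:DragonHDLatestNeural
```

Pinning a version in the name keeps output stable when the "Latest" alias later moves to a newer model.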

Public preview of new HD voices

Added 14 more HD voices that are only available in version 1.1.

Locale (BCP-47) Voice name
de-DE de-DE-Florian:DragonHDLatestNeural (Male)
en-US en-US-Adam:DragonHDLatestNeural (Male)
en-US en-US-Brain:DragonHDLatestNeural (Male)
en-US en-US-Davis:DragonHDLatestNeural (Male)
en-US en-US-Phoebe:DragonHDLatestNeural (Female)
en-US en-US-Serena:DragonHDLatestNeural (Female)
en-US en-US-Alloy:DragonHDLatestNeural (Male)
en-US en-US-Nova:DragonHDLatestNeural (Female)
es-ES es-ES-Ximena:DragonHDLatestNeural (Female)
es-ES es-ES-Tristan:DragonHDLatestNeural (Male)
fr-FR fr-FR-Vivienne:DragonHDLatestNeural (Female)
fr-FR fr-FR-Remy:DragonHDLatestNeural (Male)
ja-JP ja-JP-Nanami:DragonHDLatestNeural (Female)
zh-CN zh-CN-Yunfan:DragonHDLatestNeural (Male)

Azure OpenAI Service turbo voices

These turbo voices are now generally available:

Locale (BCP-47) Voice name
en-US en-US-AlloyTurboMultilingualNeural (Male)
en-US en-US-EchoTurboMultilingualNeural (Male)
en-US en-US-FableTurboMultilingualNeural (Neutral)
en-US en-US-NovaTurboMultilingualNeural (Female)
en-US en-US-OnyxTurboMultilingualNeural (Male)
en-US en-US-ShimmerTurboMultilingualNeural (Female)

Voice quality improvements

Improved the quality of the following voices with the latest recipe:

Locale (BCP-47) Voice name
ar-EG ar-EG-ShakirNeural (Male)
bg-BG bg-BG-KalinaNeural (Female)
ca-ES ca-ES-EnricNeural (Male)
ca-ES ca-ES-JoanaNeural (Female)
da-DK da-DK-JeppeNeural (Male)
el-GR el-GR-NestorasNeural (Male)
en-IE en-IE-EmilyNeural (Female)
fi-FI fi-FI-HarriNeural (Male)
fi-FI fi-FI-SelmaNeural (Female)
fr-CH fr-CH-FabriceNeural (Male)
fr-CH fr-CH-ArianeNeural (Female)
he-IL he-IL-HilaNeural (Female)
he-IL he-IL-AvriNeural (Male)
hr-HR hr-HR-GabrijelaNeural (Female)
id-ID id-ID-ArdiNeural (Male)
ms-MY ms-MY-YasminNeural (Female)
nb-NO nb-NO-PernilleNeural (Female)
nb-NO nb-NO-FinnNeural (Male)
nl-NL nl-NL-MaartenNeural (Male)
pt-PT pt-PT-RaquelNeural (Female)
ro-RO ro-RO-AlinaNeural (Female)
ro-RO ro-RO-EmilNeural (Male)
ru-RU ru-RU-SvetlanaNeural (Female)
sv-SE sv-SE-MattiasNeural (Male)
sv-SE sv-SE-SofieNeural (Female)
vi-VN vi-VN-HoaiMyNeural (Female)
vi-VN vi-VN-NamMinhNeural (Male)
zh-HK zh-HK-HiuMaanNeural (Female)
zh-HK zh-HK-WanLungNeural (Male)

GA - Multi-style embedded Jenny

Added style support for en-US-JennyNeural in embedded speech, matching the styles supported in the cloud: angry, assistant, chat, cheerful, customerservice, excited, friendly, hopeful, newscast, sad, shouting, terrified, unfriendly, and whispering.
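
As a rough sketch of how one of these styles might be requested, the snippet below builds an SSML document with the `mstts:express-as` element and checks that it is well formed. The helper name `jenny_style_ssml`, the sample text, and the namespace URIs are assumptions for illustration; consult the SSML documentation for the authoritative syntax.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Styles listed in this release note for en-US-JennyNeural.
JENNY_STYLES = (
    "angry", "assistant", "chat", "cheerful", "customerservice",
    "excited", "friendly", "hopeful", "newscast", "sad",
    "shouting", "terrified", "unfriendly", "whispering",
)


def jenny_style_ssml(style: str, text: str) -> str:
    """Return SSML that asks en-US-JennyNeural to speak `text` in `style`."""
    if style not in JENNY_STYLES:
        raise ValueError(f"unsupported style: {style}")
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        '<voice name="en-US-JennyNeural">'
        f'<mstts:express-as style="{style}">{escape(text)}</mstts:express-as>'
        '</voice></speak>'
    )


ssml = jenny_style_ssml("cheerful", "Your order has shipped!")
ET.fromstring(ssml)  # raises ParseError if the SSML is not well-formed XML
print(ssml)
```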

January 2025 release

Custom avatar training

You can now train custom avatars in Speech Studio. Previously, you had to wait for Microsoft to train your custom avatar.

For more details about how to create a custom avatar, see create a custom text to speech avatar.

October 2024 release

Prebuilt neural voice

Introduced four turbo versions of Azure OpenAI voices in public preview: en-US-EchoTurboMultilingualNeural, en-US-FableTurboMultilingualNeural, en-US-OnyxTurboMultilingualNeural, and en-US-ShimmerTurboMultilingualNeural. The turbo versions have a voice persona similar to the Azure OpenAI voices but support extra features. Like other Azure AI Speech voices, turbo voices support the full set of SSML elements and more features such as word boundary. See the full language and voice list for more information.

These voices are now generally available:

Locale (BCP-47) Voice name
de-DE SeraphinaMultilingualNeural
de-DE FlorianMultilingualNeural
en-GB AdaMultilingualNeural
en-GB OllieMultilingualNeural
en-US LunaNeural
en-US KaiNeural
en-US CoraMultilingualNeural
en-US ChristopherMultilingualNeural
en-US BrandonMultilingualNeural
es-ES IsidoraMultilingualNeural
es-ES ArabellaMultilingualNeural
es-ES TristanMultilingualNeural
es-ES XimenaMultilingualNeural
fr-FR LucienMultilingualNeural
fr-FR VivienneMultilingualNeural
fr-FR RemyMultilingualNeural
it-IT IsabellaMultilingualNeural
it-IT MarcelloMultilingualNeural
it-IT AlessioMultilingualNeural
it-IT GiuseppeMultilingualNeural
ko-KR HyunsuMultilingualNeural
pt-BR ThalitaMultilingualNeural
pt-BR MacerioMultilingualNeural

Prebuilt high definition (HD) neural voice

Azure AI Speech high definition (HD) voices are available in public preview. HD voices can understand the content, automatically detect emotions in the input text, and adjust the speaking tone in real time to match the sentiment. They maintain a voice persona consistent with their neural (non-HD) counterparts and deliver more value through enhanced features. For more information, see What are Azure AI Speech high definition (HD) voices?.

Custom neural voice

  • Previously, some locales were only supported with V3 for the training recipe. These locales now also support V9, enabling improved training quality and expanded features. For these locales, refer to the following table:

    Locale (BCP-47) Language
    ar-EG Arabic (Egypt)
    ar-SA Arabic (Saudi Arabia)
    ca-ES Catalan
    cs-CZ Czech (Czechia)
    da-DK Danish (Denmark)
    de-AT German (Austria)
    de-CH German (Switzerland)
    el-GR Greek (Greece)
    en-IN English (India)
    fi-FI Finnish (Finland)
    fr-CH French (Switzerland)
    he-IL Hebrew (Israel)
    hi-IN Hindi (India)
    hu-HU Hungarian (Hungary)
    ms-MY Malay (Malaysia)
    nb-NO Norwegian Bokmål (Norway)
    nl-NL Dutch (Netherlands)
    pl-PL Polish (Poland)
    pt-PT Portuguese (Portugal)
    ro-RO Romanian (Romania)
    ru-RU Russian (Russia)
    sk-SK Slovak (Slovakia)
    sv-SE Swedish (Sweden)
    th-TH Thai (Thailand)
    tr-TR Turkish (Türkiye)
    vi-VN Vietnamese (Vietnam)
    zh-HK Chinese (Cantonese, Traditional)
    zh-TW Chinese (Taiwanese Mandarin, Traditional)
  • Custom neural voice Pro now supports the following new locales:

    • en-NZ: English (New Zealand)
    • es-CL: Spanish (Chile)
    • es-US: Spanish (United States)
    • ta-MY: Tamil (Malaysia)

    See the language list for Custom neural voice for the full list of supported locales.

  • The cross-lingual feature now supports the following new locales as source locales:

    Locale (BCP-47) Language
    da-DK Danish (Denmark)
    de-AT German (Austria)
    de-CH German (Switzerland)
    de-DE German (Germany)
    en-CA English (Canada)
    fi-FI Finnish (Finland)
    fr-CH French (Switzerland)
    hu-HU Hungarian (Hungary)
    ms-MY Malay (Malaysia)
    nb-NO Norwegian Bokmål (Norway)
    pt-PT Portuguese (Portugal)
    sv-SE Swedish (Sweden)
    tr-TR Turkish (Türkiye)
    ta-IN Tamil (India)
    zh-HK Chinese (Cantonese, Traditional)

    See the language list for Custom neural voice for the full list of supported locales.

  • The multi-style voice feature now supports the following new locales:

    Locale (BCP-47) Language
    ar-EG Arabic (Egypt)
    ar-SA Arabic (Saudi Arabia)
    ca-ES Catalan
    cs-CZ Czech (Czechia)
    da-DK Danish (Denmark)
    de-AT German (Austria)
    de-CH German (Switzerland)
    de-DE German (Germany)
    el-GR Greek (Greece)
    en-AU English (Australia)
    en-CA English (Canada)
    en-GB English (United Kingdom)
    en-IN English (India)
    es-ES Spanish (Spain)
    es-MX Spanish (Mexico)
    fi-FI Finnish (Finland)
    fr-CA French (Canada)
    fr-CH French (Switzerland)
    fr-FR French (France)
    he-IL Hebrew (Israel)
    hi-IN Hindi (India)
    hu-HU Hungarian (Hungary)
    it-IT Italian (Italy)
    ko-KR Korean (Korea)
    ms-MY Malay (Malaysia)
    nb-NO Norwegian Bokmål (Norway)
    nl-BE Dutch (Belgium)
    nl-NL Dutch (Netherlands)
    pl-PL Polish (Poland)
    pt-BR Portuguese (Brazil)
    pt-PT Portuguese (Portugal)
    ro-RO Romanian (Romania)
    ru-RU Russian (Russia)
    sk-SK Slovak (Slovakia)
    sv-SE Swedish (Sweden)
    th-TH Thai (Thailand)
    tr-TR Turkish (Türkiye)
    vi-VN Vietnamese (Vietnam)
    zh-HK Chinese (Cantonese, Traditional)
    zh-TW Chinese (Taiwanese Mandarin, Traditional)

    See the language list for Custom neural voice for the full list of supported locales.

September 2024 release

Prebuilt neural voice

Added support and general availability for new voices in the following locales:

Locale (BCP-47) Language Text to speech voices
as-IN Assamese (India) as-IN-YashicaNeural (Female)
as-IN Assamese (India) as-IN-PriyomNeural (Male)
or-IN Odia (India) or-IN-SubhasiniNeural (Female)
or-IN Odia (India) or-IN-SukantNeural (Male)
pa-IN Punjabi (India) pa-IN-OjasNeural (Male)
pa-IN Punjabi (India) pa-IN-VaaniNeural (Female)

The one voice in this table is generally available and supports only the "en-IN" locale.

Locale (BCP-47) Language Text to speech voices
en-IN English (India) en-IN-AashiNeural (Female)

The five voices in this table are generally available and support both "en-IN" and "hi-IN" locales.

Locale (BCP-47) Language Text to speech voices
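en-IN English (India) en-IN-AaravNeural (Male)
en-IN English (India) en-IN-AnanyaNeural (Female)
en-IN English (India) en-IN-KavyaNeural (Female)
en-IN English (India) en-IN-KunalNeural (Male)
en-IN English (India) en-IN-RehaanNeural (Male)
hi-IN Hindi (India) hi-IN-AaravNeural (Male)
hi-IN Hindi (India) hi-IN-AnanyaNeural (Female)
hi-IN Hindi (India) hi-IN-KavyaNeural (Female)
hi-IN Hindi (India) hi-IN-KunalNeural (Male)
hi-IN Hindi (India) hi-IN-RehaanNeural (Male)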

Voice styles and roles

Added support for the newscast, cheerful, and empathetic styles to the en-IN-NeerjaNeural and hi-IN-SwaraNeural voices.

Added new styles for the following voices:

  • es-MX-DaliaNeural: whispering, sad, cheerful
  • fr-FR-DeniseNeural: whispering, sad, excited
  • it-IT-IsabellaNeural: whispering, sad, excited, cheerful
  • pt-PT-RaquelNeural: whispering, sad
  • de-DE-ConradNeural: sad, cheerful
  • en-GB-RyanNeural: whispering, sad
  • es-MX-JorgeNeural: whispering, sad, excited, cheerful
  • fr-FR-HenriNeural: whispering, sad, excited
  • it-IT-DiegoNeural: sad, excited, cheerful
  • es-ES-AlvaroNeural: cheerful, sad
  • ko-KR-InjoonNeural: sad

See Voice styles and roles for more information.

August 2024 release

Prebuilt neural voice

  • Introduced new multilingual voices in public preview. See the full language and voice list for more information.

    Brand new multilingual voices

    Locale Language Gender Voice name
    en-US English (United States) Male en-US-AdamMultilingualNeural
    en-US English (United States) Female en-US-AmandaMultilingualNeural
    en-US English (United States) Male en-US-DerekMultilingualNeural
    en-US English (United States) Male en-US-LewisMultilingualNeural
    en-US English (United States) Female en-US-LolaMultilingualNeural
    en-US English (United States) Female en-US-PhoebeMultilingualNeural
    en-US English (United States) Male en-US-SamuelMultilingualNeural
    en-US English (United States) Female en-US-SerenaMultilingualNeural
    en-US English (United States) Male en-US-DustinMultilingualNeural
    en-US English (United States) Female en-US-EvelynMultilingualNeural
    es-ES Spanish (Spain) Male es-ES-TristanMultilingualNeural
    fr-FR French (France) Male fr-FR-LucienMultilingualNeural
    pt-BR Portuguese (Brazil) Male pt-BR-MacerioMultilingualNeural
    zh-CN Chinese (Mandarin, Simplified) Male zh-CN-YunfanMultilingualNeural
    zh-CN Chinese (Mandarin, Simplified) Male zh-CN-YunxiaoMultilingualNeural
    zh-CN Chinese (Mandarin, Simplified) Male zh-CN-YunyiMultilingualNeural

    Monolingual models updated to multilingual voices with improvements in naturalness

    Locale Language Gender Voice name
    en-US English (United States) Female en-US-NancyMultilingualNeural
    en-US English (United States) Male en-US-BrandonMultilingualNeural
    en-US English (United States) Male en-US-ChristopherMultilingualNeural
    en-US English (United States) Female en-US-CoraMultilingualNeural
    en-US English (United States) Male en-US-DavisMultilingualNeural
    en-US English (United States) Male en-US-SteffanMultilingualNeural
    es-ES Spanish (Spain) Female es-ES-XimenaMultilingualNeural
    it-IT Italian (Italy) Male it-IT-GiuseppeMultilingualNeural
    ko-KR Korean (Korea) Male ko-KR-HyunsuMultilingualNeural
  • Improved the quality of the following existing multilingual voices.

    Locale Language Gender Voice name
    en-US English (United States) Male en-US-AndrewMultilingualNeural
    en-US English (United States) Female en-US-AvaMultilingualNeural
  • Three multilingual voices now support styles. See Voice styles and roles for more information.

    • en-US-SerenaMultilingualNeural: empathetic, excited, friendly, shy, serious, relieved, and sad.
    • en-US-AndrewMultilingualNeural: empathetic and relieved.
    • zh-CN-XiaoxiaoMultilingualNeural: affectionate, cheerful, empathetic, excited, poetry-reading, sorry, and story.

July 2024 release

Text to speech avatar (GA)

Text to speech avatar is now generally available. For more information, see text to speech avatar.

Prebuilt neural voice

  • Introduced two turbo versions of Azure OpenAI voices in public preview: en-US-AlloyTurboMultilingualNeural and en-US-NovaTurboMultilingualNeural. The turbo versions have a voice persona similar to the Azure OpenAI voices but support extra features. Like other Azure AI Speech voices, turbo voices support the full set of SSML elements and more features such as word boundary. See the full language and voice list for more information.

  • Introduced two new multilingual voices in public preview: zh-CN-YunfanMultilingualNeural and zh-CN-YunxiaoMultilingualNeural. See the full language and voice list for more information.

Embedded neural voice

  • The en-US-JennyMultilingual voice is released in production, supporting up to 24 locales for on-device experiences. For the supported locales, see the following table.

    Locale Language
    da-DK Danish (Denmark)
    de-DE German (Germany)
    en-AU English (Australia)
    en-GB English (United Kingdom)
    en-IN English (India)
    en-US English (United States)
    es-ES Spanish (Spain)
    es-MX Spanish (Mexico)
    fr-CA French (Canada)
    fr-FR French (France)
    he-IL Hebrew (Israel)
    it-IT Italian (Italy)
    ja-JP Japanese (Japan)
    ko-KR Korean (Korea)
    nb-NO Norwegian Bokmål (Norway)
    nl-NL Dutch (Netherlands)
    pl-PL Polish (Poland)
    pt-PT Portuguese (Portugal)
    sv-SE Swedish (Sweden)
    th-TH Thai (Thailand)
    tr-TR Turkish (Türkiye)
    zh-CN Chinese (Mandarin, Simplified)
    zh-HK Chinese (Cantonese, Traditional)
    zh-TW Chinese (Taiwanese Mandarin, Traditional)

June 2024 release

Prebuilt neural voice

  • Introducing 6 new voices in public preview available in specific regions: East Asia, Southeast Asia, East US, West US, and Central India.

    Locale Language Text to speech voices
    or-IN Odia (India) or-IN-SubhasiniNeural (Female)
    or-IN Odia (India) or-IN-SukantNeural (Male)
    pa-IN Punjabi (India) pa-IN-VaaniNeural (Female)
    pa-IN Punjabi (India) pa-IN-OjasNeural (Male)
    as-IN Assamese (India) as-IN-YashicaNeural (Female)
    as-IN Assamese (India) as-IN-PriyomNeural (Male)

    See the full language and voice list for more information.

Text to speech avatar

  • Text to speech avatar now supports the following regions: Southeast Asia, North Europe, West Europe, Sweden Central, South Central US, and West US 2. For more information, see Speech service regions.

May 2024 release

Personal voice (GA)

Personal voice is now generally available. With personal voice, you can get an AI-generated replication of your voice (or the voices of your application's users) in a few seconds. You provide a one-minute speech sample as the audio prompt, and then use it to generate speech in any of the more than 90 languages supported across more than 100 locales. For more information, see the personal voice overview.

Prebuilt neural voice

  • Introduced eight new multilingual voices in public preview: en-GB-AdaMultilingualNeural, en-GB-OllieMultilingualNeural, es-ES-ArabellaMultilingualNeural, es-ES-IsidoraMultilingualNeural, it-IT-AlessioMultilingualNeural, it-IT-IsabellaMultilingualNeural, it-IT-MarcelloMultilingualNeural, and pt-BR-ThalitaMultilingualNeural. See the full language and voice list for more information.

  • Introduced two new en-US voices optimized for call center scenarios in public preview: en-US-LunaNeural and en-US-KaiNeural. See the full language and voice list for more information.

April 2024 release

Text to speech avatar

  • You can now set a static background image for your avatars. To use this feature, set the avatarConfig.backgroundImage property to a URL that points to the desired image. For details, refer to How to edit the background.
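
A minimal sketch of the request fragment that carries this property follows. Only the `avatarConfig.backgroundImage` property name comes from these notes; the helper name `avatar_config_with_background` and the example URL are hypothetical, and a real synthesis request needs other fields that are not shown here.

```python
import json


def avatar_config_with_background(image_url: str) -> dict:
    """Return a request fragment that sets a static avatar background.

    Hypothetical helper: it only builds the avatarConfig.backgroundImage
    part of the payload and does a basic sanity check on the URL scheme.
    """
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("backgroundImage expects a URL")
    return {"avatarConfig": {"backgroundImage": image_url}}


# The URL below is a placeholder, not a real image.
payload = avatar_config_with_background("https://example.com/background.png")
print(json.dumps(payload, indent=2))
```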

March 2024 release

Prebuilt neural voice

  • 9 multilingual voices are generally available in all regions: en-US-AvaMultilingualNeural, en-US-AndrewMultilingualNeural, en-US-EmmaMultilingualNeural, en-US-BrianMultilingualNeural, de-DE-FlorianMultilingualNeural, de-DE-SeraphinaMultilingualNeural, fr-FR-RemyMultilingualNeural, fr-FR-VivienneMultilingualNeural, and zh-CN-XiaoxiaoMultilingualNeural. See the full language and voice list for more information.

  • Introducing a new multilingual voice for public preview: ja-JP-MasaruMultilingualNeural. See the full language and voice list for more information.

  • Additional updates:

    • en-US-RyanMultilingualNeural is generally available in all regions.
    • en-US-JennyMultilingualV2Neural is generally available in all regions, merged with en-US-JennyMultilingualNeural.
    • Preview available for the updated en-IN-NeerjaNeural and hi-IN-SwaraNeural with 3 new styles in East US, West Europe, and Southeast Asia.
    • Preview available for new female voices in Central India: en-IN-KavyaNeural, en-IN-AnanyaNeural, en-IN-AashiNeural, hi-IN-KavyaNeural, and hi-IN-AnanyaNeural.

Text to speech avatar

February 2024 release

OpenAI voices

  • The Azure AI Speech service supports OpenAI text to speech voices in the following regions: North Central US and Sweden Central. Like Azure AI Speech voices, OpenAI text to speech voices deliver high-quality speech synthesis to convert written text into natural sounding spoken audio. This unlocks a wide range of possibilities for immersive and interactive user experiences. For more information, see What are OpenAI text to speech voices?.

    [!NOTE] OpenAI text to speech voices are also available in Azure OpenAI Service.

  • With this update, we have adjusted the pricing of prebuilt neural voices with Azure AI Speech. Check the updated pricing here.

Personal voice

The personal voice feature now supports DragonLatestNeural and PhoenixLatestNeural models. These new models enhance the naturalness of synthesized voices, better resembling the speech characteristics of the voice in the prompt. For more details, refer to Integrate personal voice in your application.
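
The sketch below shows one way the model name might appear in SSML for personal voice. The two model names come from this note; the `mstts:ttsembedding` element with a `speakerProfileId` attribute, the namespace URIs, the helper name, and the placeholder profile ID are assumptions based on commonly published samples, so verify them against Integrate personal voice in your application before relying on them.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

# Model names taken from this release note.
PERSONAL_VOICE_MODELS = ("DragonLatestNeural", "PhoenixLatestNeural")


def personal_voice_ssml(model: str, speaker_profile_id: str, text: str) -> str:
    """Build SSML that selects a personal voice model (illustrative only)."""
    if model not in PERSONAL_VOICE_MODELS:
        raise ValueError(f"unknown personal voice model: {model}")
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        f'<voice name="{model}">'
        f'<mstts:ttsembedding speakerProfileId={quoteattr(speaker_profile_id)}>'
        f'{escape(text)}'
        '</mstts:ttsembedding></voice></speak>'
    )


# Placeholder profile ID; a real one is issued when the personal voice is created.
ssml = personal_voice_ssml(
    "DragonLatestNeural", "00000000-0000-0000-0000-000000000000", "Hello there."
)
ET.fromstring(ssml)  # raises ParseError if the SSML is not well-formed XML
print(ssml)
```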

December 2023 release

Custom voice API

The custom voice API is available for creating and managing professional and personal custom neural voice models.

Custom neural voice

Newly trained voice models now support a 48 kHz sample rate, regardless of the model version. For previously trained voice models, upgrade the engine version to at least 2023.11.13.0 to raise the sample rate to 48 kHz.
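
The engine version check described above can be sketched as a simple comparison. This assumes engine versions are dot-separated integers in the same shape as 2023.11.13.0; the function names and the older example version are hypothetical.

```python
from typing import Tuple

# Minimum engine version for 48 kHz output, as stated in this release note.
MIN_48KHZ_ENGINE = (2023, 11, 13, 0)


def parse_engine_version(version: str) -> Tuple[int, ...]:
    """Split a dotted engine version such as "2023.11.13.0" into integers."""
    return tuple(int(part) for part in version.split("."))


def needs_upgrade_for_48khz(engine_version: str) -> bool:
    """True if a previously trained model's engine is older than the minimum."""
    return parse_engine_version(engine_version) < MIN_48KHZ_ENGINE


# "2023.07.05.0" is a made-up older version used only for illustration.
print(needs_upgrade_for_48khz("2023.07.05.0"))  # True
print(needs_upgrade_for_48khz("2023.11.13.0"))  # False
```

Comparing integer tuples rather than raw strings avoids mistakes when a component has a different number of digits.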

Prebuilt neural voice

  • Introducing new multilingual voices for public preview:
Locale (BCP-47) Language Text to speech voices
de-DE German (Germany) de-DE-FlorianMultilingualNeural (Male)
de-DE German (Germany) de-DE-SeraphinaMultilingualNeural (Female)
en-US English (United States) en-US-AvaMultilingualNeural (Female)
en-US English (United States) en-US-EmmaMultilingualNeural (Female)
fr-FR French (France) fr-FR-RemyMultilingualNeural (Male)
en-US English (United States) en-US-BrianMultilingualNeural (Male)
en-US English (United States) en-US-AndrewMultilingualNeural (Male)
fr-FR French (France) fr-FR-VivienneMultilingualNeural (Female)
zh-CN Chinese (Mandarin, Simplified) zh-CN-XiaoxiaoMultilingualNeural (Female)
zh-CN Chinese (Mandarin, Simplified) zh-CN-XiaochenMultilingualNeural (Female)
zh-CN Chinese (Mandarin, Simplified) zh-CN-YunyiMultilingualNeural (Male)
  • Introducing the new zh-CN-XiaoxiaoDialectsNeural voice for public preview, which supports several Chinese dialects and accents:
Voice name Secondary language Dialect/Accent
zh-CN-XiaoxiaoDialectsNeural zh-CN-shaanxi Chinese (Zhongyuan Mandarin Shaanxi, Simplified)
zh-CN-XiaoxiaoDialectsNeural zh-CN-sichuan Chinese (Southwestern Mandarin, Simplified)
zh-CN-XiaoxiaoDialectsNeural zh-CN-shanxi Chinese (Shanxi Accent Mandarin, Simplified)
zh-CN-XiaoxiaoDialectsNeural nan-CN Chinese (Southern Min, Simplified)
zh-CN-XiaoxiaoDialectsNeural zh-CN-anhui Chinese (Jianghuai Mandarin Anhui, Simplified)
zh-CN-XiaoxiaoDialectsNeural zh-CN-hunan Chinese (Hunan Accent Mandarin, Simplified)
zh-CN-XiaoxiaoDialectsNeural zh-CN-gansu Chinese (Lanyin Mandarin Gansu, Simplified)
zh-CN-XiaoxiaoDialectsNeural zh-CN-shandong Chinese (Jilu Mandarin, Simplified)
zh-CN-XiaoxiaoDialectsNeural zh-CN-henan Chinese (Zhongyuan Mandarin Henan, Simplified)
zh-CN-XiaoxiaoDialectsNeural zh-CN-liaoning Chinese (Northeastern Mandarin, Simplified)
zh-CN-XiaoxiaoDialectsNeural zh-TW Chinese (Taiwanese Mandarin, Traditional)

November 2023 release

Personal voice

Personal voice is available in preview in the following regions: West Europe, East US, and Southeast Asia. With personal voice (preview), you can get an AI-generated replication of your voice (or the voices of your application's users) in a few seconds. You provide a one-minute speech sample as the audio prompt, and then use it to generate speech in any of the more than 90 languages supported across more than 100 locales.

For more information, see personal voice.

Text to speech avatar

Text to speech avatar is available in preview in the following regions: West US 2, West Europe, and Southeast Asia.

Text to speech avatar converts text into a digital video of a photorealistic human (either a prebuilt avatar or a custom text to speech avatar) speaking with a natural-sounding voice. The text to speech avatar video can be synthesized asynchronously or in real time. Developers can build applications integrated with text to speech avatar through an API, or use a content creation tool on Speech Studio to create video content without coding.

For more information, see text to speech avatar, transparency notes, and disclosure for voice and avatar talent.

Custom neural voice

Added support for 24 new locales for cross-lingual voice. See the full language list for more information.

Prebuilt neural voice

Introducing new voices for public preview:

Locale (BCP-47) Language Text to speech voices
de-DE German (Germany) SeraphinaNeural (Female)
es-ES Spanish (Spain) XimenaNeural (Female)
fr-CA French (Canada) ThierryNeural (Male)
fr-FR French (France) VivienneNeural (Female)
it-IT Italian (Italy) GiuseppeNeural (Male)
ko-KR Korean (Korea) HyunsuNeural (Male)
pt-BR Portuguese (Brazil) ThalitaNeural (Female)

Updated the following models with bug fixes and quality improvements:

Locale (BCP-47) Language Text to speech voices
es-ES Spanish (Spain) AlvaroNeural (Male)
en-GB English (United Kingdom) RyanNeural (Male)
ko-KR Korean (Korea) InjoonNeural (Male)

See the full language and voice list for more information.

October 2023 release

Custom neural voice

  • Added support for 12 new locales with custom neural voice Pro. See the full language list for more information.

September 2023 release

Prebuilt neural voice

  • Introducing new voices for public preview:
Locale (BCP-47) Language Text to speech voices
en-US English (United States) en-US-EmmaNeural (Female)
en-US English (United States) en-US-AndrewNeural (Male)
en-US English (United States) en-US-BrianNeural (Male)

See the full language and voice list for more information.

Embedded neural voice

  • All 147 locales here (except fa-IR, Persian (Iran)) are available out of the box with one selected female voice, one selected male voice, or both.

August 2023 release

Custom neural voice

  • The latest CNV Lite training recipe version is now released. This release brings several quality enhancements to your language models. Try out Speech Studio.

July 2023 release

Custom neural voice

Prebuilt Neural TTS Voices

Introducing a new en-US gender-neutral voice for public preview:

Locale (BCP-47) Language Text to speech voices
en-US English (United States) en-US-BlueNeural (Neutral)

Introducing new multilingual voices for public preview:

Locale (BCP-47) Language Text to speech voices
en-US English (United States) en-US-JennyMultilingualV2Neural (Female)
en-US English (United States) en-US-RyanMultilingualNeural (Male)

The multilingual voices en-US-JennyMultilingualV2Neural and en-US-RyanMultilingualNeural auto-detect the language of the input text. However, you can still use the <lang> element to adjust the speaking language for these voices.

These new multilingual voices can speak in 41 languages and accents: Arabic (Egypt), Arabic (Saudi Arabia), Catalan, Czech (Czechia), Danish (Denmark), German (Austria), German (Switzerland), German (Germany), English (Australia), English (Canada), English (United Kingdom), English (Hong Kong SAR), English (Ireland), English (India), English (United States), Spanish (Spain), Spanish (Mexico), Finnish (Finland), French (Belgium), French (Canada), French (Switzerland), French (France), Hindi (India), Hungarian (Hungary), Indonesian (Indonesia), Italian (Italy), Japanese (Japan), Korean (Korea), Norwegian Bokmål (Norway), Dutch (Belgium), Dutch (Netherlands), Polish (Poland), Portuguese (Brazil), Portuguese (Portugal), Russian (Russia), Swedish (Sweden), Thai (Thailand), Turkish (Türkiye), Chinese (Mandarin, Simplified), Chinese (Cantonese, Traditional), Chinese (Taiwanese Mandarin, Traditional).

These multilingual voices don't fully support certain SSML elements, such as break, emphasis, silence, and sub.

Important

The en-US-JennyMultilingualV2Neural voice is provided temporarily in public preview solely for evaluation purposes. It will be removed in the future.

In order to speak in a language other than English, the current implementation of the en-US-JennyMultilingualNeural voice requires that you set the <lang xml:lang> element. We anticipate that during Q4 calendar year 2023, the en-US-JennyMultilingualNeural voice will be updated to speak in the language of the input text without the <lang xml:lang> element. This will be in parity with the en-US-JennyMultilingualV2Neural voice.
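
As a rough illustration of the `<lang xml:lang>` element mentioned above, the snippet below wraps text in a `lang` element inside a multilingual voice and checks that the result is well-formed XML. The helper name `lang_ssml`, the target locale, and the sample sentence are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def lang_ssml(voice: str, lang: str, text: str) -> str:
    """Build SSML that asks `voice` to speak `text` in the locale `lang`."""
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xml:lang="en-US">'
        f'<voice name="{voice}">'
        f'<lang xml:lang="{lang}">{escape(text)}</lang>'
        '</voice></speak>'
    )


ssml = lang_ssml("en-US-JennyMultilingualNeural", "de-DE", "Guten Morgen!")
ET.fromstring(ssml)  # raises ParseError if the SSML is not well-formed XML
print(ssml)
```

For voices that auto-detect the input language, the same element can still be used to override the detected language for a span of text.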

Introducing new features in public preview for the following voices:

  • Added Latin input for Serbian (Serbia) sr-RS voices: sr-latn-RS-SophieNeural and sr-latn-RS-NicholasNeural.
  • Added English pronunciation support for Albanian (Albania) sq-AL voices: sq-AL-AnilaNeural and sq-AL-IlirNeural.

May 2023 release

Audio Content Creation

  • All prebuilt voices with speaking styles and multi-style custom voices support style degree adjustment.
  • Now you can fix the pronunciation of a word by speaking the word and recording it. The phonemes can be automatically recognized from your recording. The Recognize by speaking feature is now in public preview.

April 2023 release

Prebuilt Neural TTS Voices

  • The following features of these voices moved from public preview to GA:
Style Text to speech voices
style="chat" en-GB-RyanNeural, es-MX-JorgeNeural, and it-IT-IsabellaNeural
style="cheerful" en-GB-RyanNeural, en-GB-SoniaNeural, es-MX-JorgeNeural, fr-FR-DeniseNeural, fr-FR-HenriNeural, and it-IT-IsabellaNeural
style="sad" en-GB-SoniaNeural, fr-FR-DeniseNeural and fr-FR-HenriNeural
  • Improved the English pronunciation for hi-IN, ta-IN, and te-IN voices. This update is now flighting in public preview regions.

For more information, see the language and voice list.

March 2023 release

New features

Speech Synthesis Markup Language (SSML) is updated to support audio effect processor elements that optimize the quality of the synthesized speech output for specific scenarios on devices. Learn more at speech synthesis markup.

Custom neural voice

Added support for the nl-BE locale with Custom neural voice Pro. See the full language and voice list for more information.

Prebuilt Neural TTS Voices

The following voices are now generally available. See the full language and voice list for more information.

Locale (BCP-47) Language Text to speech voices
en-AU English (Australia) en-AU-AnnetteNeural (Female)
en-AU-CarlyNeural (Female)
en-AU-DarrenNeural (Male)
en-AU-DuncanNeural (Male)
en-AU-ElsieNeural (Female)
en-AU-FreyaNeural (Female)
en-AU-JoanneNeural (Female)
en-AU-KenNeural (Male)
en-AU-KimNeural (Female)
en-AU-NeilNeural (Male)
en-AU-TimNeural (Male)
en-AU-TinaNeural (Female)
en-AU-WilliamNeural (Male)
en-GB English (United Kingdom) en-GB-RyanNeural (Male)
en-GB-SoniaNeural (Female)
es-ES Spanish (Spain) es-ES-AbrilNeural (Female)
es-ES-ArnauNeural (Male)
es-ES-DarioNeural (Male)
es-ES-EliasNeural (Male)
es-ES-EstrellaNeural (Female)
es-ES-IreneNeural (Female)
es-ES-LaiaNeural (Female)
es-ES-LiaNeural (Female)
es-ES-NilNeural (Male)
es-ES-SaulNeural (Male)
es-ES-TeoNeural (Male)
es-ES-TrianaNeural (Female)
es-ES-VeraNeural (Female)
es-MX Spanish (Mexico) es-MX-JorgeNeural (Male)
fr-FR French (France) fr-FR-HenriNeural (Male)
it-IT Italian (Italy) it-IT-IsabellaNeural (Female)
ja-JP Japanese (Japan) ja-JP-AoiNeural (Female)
ja-JP-DaichiNeural (Male)
ja-JP-MayuNeural (Female)
ja-JP-NaokiNeural (Male)
ja-JP-ShioriNeural (Female)

Added support for the cheerful style with the de-DE-ConradNeural voice.

February 2023 release

Prebuilt Neural TTS Voices

The following voices are now generally available. See the full language and voice list for more information.

Locale (BCP-47) Language Text to speech voices
zh-CN Chinese (Mandarin, Simplified) zh-CN-XiaomengNeural (Female)
zh-CN-XiaoyiNeural (Female)
zh-CN-XiaozhenNeural (Female)
zh-CN-YunfengNeural (Male)
zh-CN-YunhaoNeural (Male)
zh-CN-YunjianNeural (Male)
zh-CN-YunxiaNeural (Male)
zh-CN-YunzeNeural (Male)
zh-CN-henan Chinese (Zhongyuan Mandarin Henan, Simplified) zh-CN-henan-YundengNeural (Male)

December 2022 release

Batch synthesis REST API (Preview)

The Batch synthesis API is currently in public preview. Once it's generally available, the Long Audio API will be deprecated. For more information, see Migrate to batch synthesis API.

November 2022 release

Prebuilt Neural TTS Voices (GA)

The following voices are now generally available. See the full language and voice list for more information.

Locale (BCP-47) Language Text to speech voices
es-MX Spanish (Mexico) es-MX-BeatrizNeural (Female)
es-MX-CandelaNeural (Female)
es-MX-CarlotaNeural (Female)
es-MX-CecilioNeural (Male)
es-MX-GerardoNeural (Male)
es-MX-LarissaNeural (Female)
es-MX-LibertoNeural (Male)
es-MX-LucianoNeural (Male)
es-MX-MarinaNeural (Female)
es-MX-NuriaNeural (Female)
es-MX-PelayoNeural (Male)
es-MX-RenataNeural (Female)
es-MX-YagoNeural (Male)
it-IT Italian (Italy) it-IT-BenignoNeural (Male)
it-IT-CalimeroNeural (Male)
it-IT-CataldoNeural (Male)
it-IT-FabiolaNeural (Female)
it-IT-FiammaNeural (Female)
it-IT-GianniNeural (Male)
it-IT-ImeldaNeural (Female)
it-IT-IrmaNeural (Female)
it-IT-LisandroNeural (Male)
it-IT-PalmiraNeural (Female)
it-IT-PierinaNeural (Female)
it-IT-RinaldoNeural (Male)
pt-BR Portuguese (Brazil) pt-BR-BrendaNeural (Female)
pt-BR-DonatoNeural (Male)
pt-BR-ElzaNeural (Female)
pt-BR-FabioNeural (Male)
pt-BR-GiovannaNeural (Female)
pt-BR-HumbertoNeural (Male)
pt-BR-JulioNeural (Male)
pt-BR-LeilaNeural (Female)
pt-BR-LeticiaNeural (Female)
pt-BR-ManuelaNeural (Female)
pt-BR-NicolauNeural (Male)
pt-BR-ValerioNeural (Male)
pt-BR-YaraNeural (Female)

Custom neural voice

The following locale support is added for Custom neural voice. See the full language and voice list for more information.

  • Added support for the fr-BE locale with custom neural voice Pro.
  • Added support for the es-ES locale with custom neural voice lite.

October 2022 release

Prebuilt Neural TTS Voices (GA)

The following voices are now generally available. See the full language and voice list for more information.

Locale (BCP-47) Language Text to speech voices
eu-ES Basque eu-ES-AinhoaNeural (Female)
eu-ES-AnderNeural (Male)
hy-AM Armenian (Armenia) hy-AM-AnahitNeural (Female)
hy-AM-HaykNeural (Male)

Prebuilt Neural TTS Voices (Preview)

The following voices are now available in public preview. See the full language and voice list for more information.

Locale (BCP-47) Language Text to speech voices
en-AU English (Australia) en-AU-AnnetteNeural (Female)
en-AU-CarlyNeural (Female)
en-AU-DarrenNeural (Male)
en-AU-DuncanNeural (Male)
en-AU-ElsieNeural (Female)
en-AU-FreyaNeural (Female)
en-AU-JoanneNeural (Female)
en-AU-KenNeural (Male)
en-AU-KimNeural (Female)
en-AU-NeilNeural (Male)
en-AU-TimNeural (Male)
en-AU-TinaNeural (Female)
es-ES Spanish (Spain) es-ES-AbrilNeural (Female)
es-ES-AlvaroNeural (Male)
es-ES-ArnauNeural (Male)
es-ES-DarioNeural (Male)
es-ES-EliasNeural (Male)
es-ES-EstrellaNeural (Female)
es-ES-IreneNeural (Female)
es-ES-LaiaNeural (Female)
es-ES-LiaNeural (Female)
es-ES-NilNeural (Male)
es-ES-SaulNeural (Male)
es-ES-TeoNeural (Male)
es-ES-TrianaNeural (Female)
es-ES-VeraNeural (Female)
ja-JP Japanese (Japan) ja-JP-AoiNeural (Female)
ja-JP-DaichiNeural (Male)
ja-JP-MayuNeural (Female)
ja-JP-NaokiNeural (Male)
ja-JP-ShioriNeural (Female)
ko-KR Korean (Korea) ko-KR-BongJinNeural (Male)
ko-KR-GookMinNeural (Male)
ko-KR-JiMinNeural (Female)
ko-KR-SeoHyeonNeural (Female)
ko-KR-SoonBokNeural (Female)
ko-KR-YuJinNeural (Female)
wuu-CN Chinese (Wu, Simplified) wuu-CN-XiaotongNeural (Female)
wuu-CN-YunzheNeural (Male)
yue-CN Chinese (Cantonese, Simplified) yue-CN-XiaoMinNeural (Female)
yue-CN-YunSongNeural (Male)

General TTS voice updates

  • Improved quality for the fil-PH-AngeloNeural and fil-PH-BlessicaNeural voices.
  • Text Normalization rules are updated for voices with the es-CL Spanish (Chile) and uz-UZ Uzbek (Uzbekistan) locales.
  • Added English letters spelling for voices with the sq-AL Albanian (Albania) and az-AZ Azerbaijani (Azerbaijan) locales.
  • Improved English pronunciation for the zh-HK-WanLungNeural voice.
  • Improved question tone for the nl-NL-MaartenNeural and pt-BR-AntonioNeural voices.
  • Added support for the <lang xml:lang="en-US"> tag for better English pronunciation with the following voices: de-DE-ConradNeural, de-DE-KatjaNeural, es-ES-AlvaroNeural, es-MX-DaliaNeural, es-MX-JorgeNeural, fr-CA-SylvieNeural, fr-FR-DeniseNeural, fr-FR-HenriNeural, it-IT-DiegoNeural, and it-IT-IsabellaNeural.
  • Added support for the style="chat" tag with the following voices: en-GB-RyanNeural, es-MX-JorgeNeural, and it-IT-IsabellaNeural.
  • Added support for the style="cheerful" tag with the following voices: en-GB-RyanNeural, en-GB-SoniaNeural, es-MX-JorgeNeural, fr-FR-DeniseNeural, fr-FR-HenriNeural, and it-IT-IsabellaNeural.
  • Added support for the style="sad" tag with the following voices: en-GB-SoniaNeural, fr-FR-DeniseNeural, and fr-FR-HenriNeural.
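
As an illustration of the lang and style tags listed above, the following minimal sketch builds an SSML document that speaks a French sentence in the cheerful style and then switches to English with the lang element. The helper name build_ssml and the sample sentences are assumptions made for this example; the voice and style come from the lists above.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

SSML_NS = "http://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"


def build_ssml(voice: str, locale: str, style: str, text: str, english_phrase: str) -> str:
    """Return SSML that speaks `text` in `style`, then an English phrase via <lang>.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" '
        f'xmlns:mstts="{MSTTS_NS}" xml:lang="{locale}">'
        f'<voice name="{voice}">'
        f'<mstts:express-as style="{style}">{escape(text)}</mstts:express-as>'
        f'<lang xml:lang="en-US">{escape(english_phrase)}</lang>'
        f"</voice></speak>"
    )


ssml = build_ssml(
    "fr-FR-DeniseNeural",
    "fr-FR",
    "cheerful",
    "Bonjour tout le monde !",
    "Welcome to the release notes.",
)
ET.fromstring(ssml)  # raises ParseError if the markup is not well-formed
print(ssml)
```

The resulting string can be passed to whichever synthesis call you already use for SSML input; check the voice styles and roles list to confirm that a given voice supports the style you request.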

September 2022 release

Prebuilt Neural TTS Voice

  • All prebuilt neural voices have been upgraded to high-fidelity voices with a 48 kHz sample rate.
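
To put the 48 kHz sample rate in perspective, the sketch below computes the raw data rate of uncompressed PCM audio. The 16-bit mono layout and the 24 kHz comparison rate are assumptions chosen for the example, not figures from this release note.

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int = 16, channels: int = 1) -> int:
    """Raw PCM data rate: samples per second x bytes per sample x channels."""
    return sample_rate_hz * (bits_per_sample // 8) * channels


# Assumed comparison point: 24 kHz, 16-bit mono output.
rate_24k = pcm_bytes_per_second(24_000)
# 48 kHz, 16-bit mono output.
rate_48k = pcm_bytes_per_second(48_000)

print(f"24 kHz: {rate_24k} bytes/s, 48 kHz: {rate_48k} bytes/s")
print(f"One minute at 48 kHz: {rate_48k * 60} bytes")
```

Doubling the sample rate doubles the size of uncompressed output, which is worth keeping in mind when you choose between PCM and compressed output formats.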

August 2022 release

Prebuilt Neural TTS Voice

Released new voices in public preview:

  • Voices for English (United States): en-US-AIGenerate1Neural and en-US-AIGenerate2Neural.
  • Voices for Chinese regional languages: zh-CN-henan-YundengNeural, zh-CN-shaanxi-XiaoniNeural, and zh-CN-shandong-YunxiangNeural.

For more information, see the language and voice list.

July 2022 release

Prebuilt Neural TTS Voice

  • Added 5 new voices of zh-CN Chinese (Mandarin, Simplified) and 1 new voice of en-US English (United States) in Public Preview. See full language and voice list.
Language Locale Gender Voice name Style support
Chinese (Mandarin, Simplified) zh-CN Female zh-CN-XiaomengNeural New General, multiple styles available using SSML
Chinese (Mandarin, Simplified) zh-CN Female zh-CN-XiaoyiNeural New General, multiple styles available using SSML
Chinese (Mandarin, Simplified) zh-CN Female zh-CN-XiaozhenNeural New General, multiple styles available using SSML
Chinese (Mandarin, Simplified) zh-CN Male zh-CN-YunxiaNeural New General, multiple styles available using SSML
Chinese (Mandarin, Simplified) zh-CN Male zh-CN-YunzeNeural New General, multiple styles available using SSML
English (United States) en-US Male en-US-RogerNeural New General
  • Supported styles and roles for the added neural voices.
Voice Styles Style degree Roles
zh-CN-XiaomengNeural Public preview chat Supported
zh-CN-XiaoyiNeural Public preview affectionate, angry, cheerful, disgruntled, embarrassed, fearful, gentle, sad, serious Supported
zh-CN-XiaozhenNeural Public preview angry, cheerful, disgruntled, fearful, sad, serious Supported
zh-CN-YunxiaNeural Public preview angry, calm, cheerful, fearful, sad Supported
zh-CN-YunzeNeural Public preview angry, calm, cheerful, depressed, disgruntled, documentary-narration, fearful, sad, serious Supported Supported
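
The styles, style degree, and roles in the table above are all set through attributes of the mstts:express-as element in SSML. The sketch below is one way to assemble such an element. The helper name express_as_ssml is hypothetical, the sample text is a placeholder, and the role value SeniorMale is an assumption; confirm the roles a voice supports in the voice styles and roles list. The check on the degree range (0.01 to 2) reflects the SSML documentation as I understand it.

```python
import xml.etree.ElementTree as ET
from typing import Optional
from xml.sax.saxutils import escape

SSML_NS = "http://www.w3.org/2001/10/synthesis"
MSTTS_NS = "https://www.w3.org/2001/mstts"


def express_as_ssml(
    voice: str,
    locale: str,
    text: str,
    style: str,
    style_degree: Optional[float] = None,
    role: Optional[str] = None,
) -> str:
    """Build SSML with optional styledegree and role attributes (illustrative helper)."""
    attrs = [f'style="{style}"']
    if style_degree is not None:
        # Assumed valid range for styledegree: 0.01 (subtle) to 2 (intense).
        if not 0.01 <= style_degree <= 2:
            raise ValueError("style_degree must be between 0.01 and 2")
        attrs.append(f'styledegree="{style_degree}"')
    if role is not None:
        attrs.append(f'role="{role}"')
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" '
        f'xmlns:mstts="{MSTTS_NS}" xml:lang="{locale}">'
        f'<voice name="{voice}">'
        f'<mstts:express-as {" ".join(attrs)}>{escape(text)}</mstts:express-as>'
        f"</voice></speak>"
    )


# "SeniorMale" is an assumed role value used only for illustration.
yunze_ssml = express_as_ssml(
    "zh-CN-YunzeNeural", "zh-CN", "你好", "calm", style_degree=1.5, role="SeniorMale"
)
ET.fromstring(yunze_ssml)  # raises ParseError if the markup is not well-formed
print(yunze_ssml)
```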

Get facial position with viseme

June 2022 release

Prebuilt Neural TTS Voice

  • Added 9 new languages and variants for Neural text to speech:
Language Locale Gender Voice name Style support
Arabic (Lebanon) ar-LB Female ar-LB-LaylaNeural New General
Arabic (Lebanon) ar-LB Male ar-LB-RamiNeural New General
Arabic (Oman) ar-OM Female ar-OM-AyshaNeural New General
Arabic (Oman) ar-OM Male ar-OM-AbdullahNeural New General
Azerbaijani (Azerbaijan) az-AZ Female az-AZ-BanuNeural New General
Azerbaijani (Azerbaijan) az-AZ Male az-AZ-BabekNeural New General
Bosnian (Bosnia and Herzegovina) bs-BA Female bs-BA-VesnaNeural New General
Bosnian (Bosnia and Herzegovina) bs-BA Male bs-BA-GoranNeural New General
Georgian (Georgia) ka-GE Female ka-GE-EkaNeural New General
Georgian (Georgia) ka-GE Male ka-GE-GiorgiNeural New General
Mongolian (Mongolia) mn-MN Female mn-MN-YesuiNeural New General
Mongolian (Mongolia) mn-MN Male mn-MN-BataaNeural New General
Nepali (Nepal) ne-NP Female ne-NP-HemkalaNeural New General
Nepali (Nepal) ne-NP Male ne-NP-SagarNeural New General
Albanian (Albania) sq-AL Female sq-AL-AnilaNeural New General
Albanian (Albania) sq-AL Male sq-AL-IlirNeural New General
Tamil (Malaysia) ta-MY Female ta-MY-KaniNeural New General
Tamil (Malaysia) ta-MY Male ta-MY-SuryaNeural New General
  • Moved 36 voices from public preview to general availability for en-GB English (United Kingdom), fr-FR French (France), and de-DE German (Germany):
Language Locale Gender Voice name Style support
English (United Kingdom) en-GB Female en-GB-AbbiNeural General
English (United Kingdom) en-GB Female en-GB-BellaNeural General
English (United Kingdom) en-GB Female en-GB-HollieNeural General
English (United Kingdom) en-GB Female en-GB-MaisieNeural General, child voice
English (United Kingdom) en-GB Female en-GB-OliviaNeural General
English (United Kingdom) en-GB Female en-GB-SoniaNeural General
English (United Kingdom) en-GB Male en-GB-AlfieNeural General
English (United Kingdom) en-GB Male en-GB-ElliotNeural General
English (United Kingdom) en-GB Male en-GB-EthanNeural General
English (United Kingdom) en-GB Male en-GB-NoahNeural General
English (United Kingdom) en-GB Male en-GB-OliverNeural General
English (United Kingdom) en-GB Male en-GB-ThomasNeural General
French (France) fr-FR Female fr-FR-BrigitteNeural General
French (France) fr-FR Female fr-FR-CelesteNeural General
French (France) fr-FR Female fr-FR-CoralieNeural General
French (France) fr-FR Female fr-FR-EloiseNeural General, child voice
French (France) fr-FR Female fr-FR-JacquelineNeural General
French (France) fr-FR Female fr-FR-JosephineNeural General
French (France) fr-FR Female fr-FR-YvetteNeural General
French (France) fr-FR Male fr-FR-AlainNeural General
French (France) fr-FR Male fr-FR-ClaudeNeural General
French (France) fr-FR Male fr-FR-JeromeNeural General
French (France) fr-FR Male fr-FR-MauriceNeural General
French (France) fr-FR Male fr-FR-YvesNeural General
German (Germany) de-DE Female de-DE-AmalaNeural General
German (Germany) de-DE Female de-DE-ElkeNeural General
German (Germany) de-DE Female de-DE-GiselaNeural General, child voice
German (Germany) de-DE Female de-DE-KlarissaNeural General
German (Germany) de-DE Female de-DE-LouisaNeural General
German (Germany) de-DE Female de-DE-MajaNeural General
German (Germany) de-DE Female de-DE-TanjaNeural General
German (Germany) de-DE Male de-DE-BerndNeural General
German (Germany) de-DE Male de-DE-ChristophNeural General
German (Germany) de-DE Male de-DE-KasperNeural General
German (Germany) de-DE Male de-DE-KillianNeural General
German (Germany) de-DE Male de-DE-KlausNeural General
German (Germany) de-DE Male de-DE-RalfNeural General
  • Added 40 new voices of es-MX Spanish (Mexico), it-IT Italian (Italy), pt-BR Portuguese (Brazil) and 2 accents for zh-CN Chinese (Mandarin, Simplified) in Public Preview:
Language Locale Gender Voice name Style support
Spanish (Mexico) es-MX Female es-MX-BeatrizNeural New General
Spanish (Mexico) es-MX Female es-MX-CarlotaNeural New General
Spanish (Mexico) es-MX Female es-MX-NuriaNeural New General
Spanish (Mexico) es-MX Female es-MX-RenataNeural New General
Spanish (Mexico) es-MX Female es-MX-LarissaNeural New General
Spanish (Mexico) es-MX Female es-MX-CandelaNeural New General
Spanish (Mexico) es-MX Female es-MX-MarinaNeural New General
Italian (Italy) it-IT Female it-IT-FiammaNeural New General
Italian (Italy) it-IT Female it-IT-IrmaNeural New General
Italian (Italy) it-IT Female it-IT-FabiolaNeural New General
Italian (Italy) it-IT Female it-IT-PalmiraNeural New General
Italian (Italy) it-IT Female it-IT-ImeldaNeural New General
Italian (Italy) it-IT Female it-IT-PierinaNeural New General
Portuguese (Brazil) pt-BR Female pt-BR-ElzaNeural New General
Portuguese (Brazil) pt-BR Female pt-BR-ManuelaNeural New General
Portuguese (Brazil) pt-BR Female pt-BR-BrendaNeural New General
Portuguese (Brazil) pt-BR Female pt-BR-LeilaNeural New General
Portuguese (Brazil) pt-BR Female pt-BR-YaraNeural New General
Portuguese (Brazil) pt-BR Female pt-BR-GiovannaNeural New General
Portuguese (Brazil) pt-BR Female pt-BR-LeticiaNeural New General
Spanish (Mexico) es-MX Male es-MX-CecilioNeural New General
Spanish (Mexico) es-MX Male es-MX-LibertoNeural New General
Spanish (Mexico) es-MX Male es-MX-LucianoNeural New General
Spanish (Mexico) es-MX Male es-MX-PelayoNeural New General
Spanish (Mexico) es-MX Male es-MX-YagoNeural New General
Spanish (Mexico) es-MX Male es-MX-GerardoNeural New General
Italian (Italy) it-IT Male it-IT-BenignoNeural New General
Italian (Italy) it-IT Male it-IT-CataldoNeural New General
Italian (Italy) it-IT Male it-IT-LisandroNeural New General
Italian (Italy) it-IT Male it-IT-CalimeroNeural New General
Italian (Italy) it-IT Male it-IT-RinaldoNeural New General
Italian (Italy) it-IT Male it-IT-GianniNeural New General
Portuguese (Brazil) pt-BR Male pt-BR-DonatoNeural New General
Portuguese (Brazil) pt-BR Male pt-BR-HumbertoNeural New General
Portuguese (Brazil) pt-BR Male pt-BR-FabioNeural New General
Portuguese (Brazil) pt-BR Male pt-BR-JulioNeural New General
Portuguese (Brazil) pt-BR Male pt-BR-ValerioNeural New General
Portuguese (Brazil) pt-BR Male pt-BR-NicolauNeural New General
Chinese (Mandarin, Simplified) zh-CN-sichuan Male zh-CN-sichuan-YunxiSichuanNeural New General, Sichuan accent
Chinese (Mandarin, Simplified) zh-CN-liaoning Female zh-CN-liaoning-XiaobeiNeural New General, Liaoning accent
  • Improved quality for en-SG-LunaNeural and en-SG-WayneNeural.
  • Added 48 kHz output support in public preview for en-US-JennyNeural, en-US-AriaNeural, and zh-CN-XiaoxiaoNeural.

Custom neural voice

Audio Content Creation tool

  • Supported pagination.
  • Enabled global sorting by name, file type, and update time on the work file page.

May 2022 release

Prebuilt Neural TTS Voice

  • Released 5 new voices in public preview with multiple styles to enrich the variety in American English. See full language and voice list.
  • Added support in public preview for these new styles with en-US-AriaNeural: Angry, Excited, Friendly, Hopeful, Sad, Shouting, Unfriendly, Terrified, and Whispering.
  • Added support in public preview for these new styles with en-US-GuyNeural and en-US-JennyNeural: Angry, Cheerful, Excited, Friendly, Hopeful, Sad, Shouting, Unfriendly, Terrified, and Whispering.
  • Added support in public preview for these new styles with en-US-SaraNeural: Excited, Friendly, Hopeful, Shouting, Unfriendly, Terrified, and Whispering. See voice styles and roles.
  • Released new voices zh-CN-YunjianNeural, zh-CN-YunhaoNeural, and zh-CN-YunfengNeural in public preview. See full language and voice list.
  • Added support in public preview for two new styles with zh-CN-YunjianNeural: sports-commentary and sports-commentary-excited. See voice styles and roles.
  • Added support in public preview for one new style with zh-CN-YunhaoNeural: advertisement-upbeat. See voice styles and roles.
  • The cheerful and sad styles for fr-FR-DeniseNeural are generally available in all regions.
  • SSML updated to support MathML elements for en-US and en-AU voices. Learn more at speech synthesis markup.
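
To illustrate the MathML support mentioned in the last item, here is a minimal sketch of an SSML document that embeds a MathML expression (a² + b² = c²) inside a voice element. The voice name and the expression are chosen for the example; see the speech synthesis markup documentation for the MathML elements that are actually supported.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"
MATHML_NS = "http://www.w3.org/1998/Math/MathML"

# SSML with an embedded MathML block; the math element carries its own namespace.
MATH_SSML = f"""<speak version="1.0" xmlns="{SSML_NS}" xml:lang="en-US">
  <voice name="en-US-JennyNeural">
    <math xmlns="{MATHML_NS}">
      <msup><mi>a</mi><mn>2</mn></msup>
      <mo>+</mo>
      <msup><mi>b</mi><mn>2</mn></msup>
      <mo>=</mo>
      <msup><mi>c</mi><mn>2</mn></msup>
    </math>
  </voice>
</speak>"""

root = ET.fromstring(MATH_SSML)  # raises ParseError if the markup is not well-formed
math = root.find(f".//{{{MATHML_NS}}}math")
print("MathML child elements:", [child.tag.split("}")[1] for child in math])
```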

Custom neural voice

Audio Content Creation tool

  • Enabled trying out the Audio Content Creation tool without signing in.
  • Improved layout for adjusting phonemes.
  • Enhanced performance: Specified the maximum number (200) of files to be uploaded at one time.
  • Enhanced performance: Specified the maximum directory depth level (5 levels).

March 2022 release

Prebuilt Neural TTS Voice

Custom neural voice

Audio Content Creation tool

  • Updated the file size and concurrency limit for free-tier (F0) resources to make the experience consistent with the Speech SDK and APIs. See speech service quotas and limits.

February 2022 release

Custom neural voice

Audio Content Creation tool

  • Removed the output length limit for downloading audio.

January 2022 release

New languages and voices

Added 10 new languages and variants for Neural text to speech:

Language Locale Gender Voice name Style support
Bengali (India) bn-IN Female bn-IN-TanishaaNeural New General
Bengali (India) bn-IN Male bn-IN-BashkarNeural New General
Icelandic (Iceland) is-IS Female is-IS-GudrunNeural New General
Icelandic (Iceland) is-IS Male is-IS-GunnarNeural New General
Kannada (India) kn-IN Female kn-IN-SapnaNeural New General
Kannada (India) kn-IN Male kn-IN-GaganNeural New General
Kazakh (Kazakhstan) kk-KZ Female kk-KZ-AigulNeural New General
Kazakh (Kazakhstan) kk-KZ Male kk-KZ-DauletNeural New General
Lao (Laos) lo-LA Female lo-LA-KeomanyNeural New General
Lao (Laos) lo-LA Male lo-LA-ChanthavongNeural New General
Macedonian (Republic of North Macedonia) mk-MK Female mk-MK-MarijaNeural New General
Macedonian (Republic of North Macedonia) mk-MK Male mk-MK-AleksandarNeural New General
Malayalam (India) ml-IN Female ml-IN-SobhanaNeural New General
Malayalam (India) ml-IN Male ml-IN-MidhunNeural New General
Pashto (Afghanistan) ps-AF Female ps-AF-LatifaNeural New General
Pashto (Afghanistan) ps-AF Male ps-AF-GulNawazNeural New General
Serbian (Serbia, Cyrillic) sr-RS Female sr-RS-SophieNeural New General
Serbian (Serbia, Cyrillic) sr-RS Male sr-RS-NicholasNeural New General
Sinhala (Sri Lanka) si-LK Female si-LK-ThiliniNeural New General
Sinhala (Sri Lanka) si-LK Male si-LK-SameeraNeural New General

For the full list of available voices, see Language support.

New voices in preview

Added new voices for en-GB, fr-FR and de-DE in preview:

Language Locale Gender Voice name Style support
English (United Kingdom) en-GB Female en-GB-AbbiNeural New General
English (United Kingdom) en-GB Female en-GB-BellaNeural New General
English (United Kingdom) en-GB Female en-GB-HollieNeural New General
English (United Kingdom) en-GB Female en-GB-OliviaNeural New General
English (United Kingdom) en-GB Girl en-GB-MaisieNeural New General
English (United Kingdom) en-GB Male en-GB-AlfieNeural New General
English (United Kingdom) en-GB Male en-GB-ElliotNeural New General
English (United Kingdom) en-GB Male en-GB-EthanNeural New General
English (United Kingdom) en-GB Male en-GB-NoahNeural New General
English (United Kingdom) en-GB Male en-GB-OliverNeural New General
English (United Kingdom) en-GB Male en-GB-ThomasNeural New General
French (France) fr-FR Female fr-FR-BrigitteNeural New General
French (France) fr-FR Female fr-FR-CelesteNeural New General
French (France) fr-FR Female fr-FR-CoralieNeural New General
French (France) fr-FR Female fr-FR-JacquelineNeural New General
French (France) fr-FR Female fr-FR-JosephineNeural New General
French (France) fr-FR Female fr-FR-YvetteNeural New General
French (France) fr-FR Girl fr-FR-EloiseNeural New General
French (France) fr-FR Male fr-FR-AlainNeural New General
French (France) fr-FR Male fr-FR-ClaudeNeural New General
French (France) fr-FR Male fr-FR-JeromeNeural New General
French (France) fr-FR Male fr-FR-MauriceNeural New General
French (France) fr-FR Male fr-FR-YvesNeural New General
German (Germany) de-DE Female de-DE-AmalaNeural New General
German (Germany) de-DE Female de-DE-ElkeNeural New General
German (Germany) de-DE Female de-DE-KlarissaNeural New General
German (Germany) de-DE Female de-DE-LouisaNeural New General
German (Germany) de-DE Female de-DE-MajaNeural New General
German (Germany) de-DE Female de-DE-TanjaNeural New General
German (Germany) de-DE Girl de-DE-GiselaNeural New General
German (Germany) de-DE Male de-DE-BerndNeural New General
German (Germany) de-DE Male de-DE-ChristophNeural New General
German (Germany) de-DE Male de-DE-KasperNeural New General
German (Germany) de-DE Male de-DE-KillianNeural New General
German (Germany) de-DE Male de-DE-KlausNeural New General
German (Germany) de-DE Male de-DE-RalfNeural New General

For the full list of available voices, see Language support.

Pronunciation accuracy

  • Improved English word pronunciation for all he-IL voices.
  • Improved word-level pronunciation accuracy for cs-CZ and da-DK.
  • Improved Arabic diacritics and Hebrew Nikud handling.
  • Improved entity reading for ja-JP.

Speech Studio

  • Custom neural voice: enabled additional model testing using the batch API (long audio API)
  • Audio Content Creation: enabled more output formats

October 2021 release

New languages and voices

Added 49 new languages and 98 voices for Neural text to speech:

Adri in af-ZA Afrikaans (South Africa), Willem in af-ZA Afrikaans (South Africa), Mekdes in am-ET Amharic (Ethiopia), Ameha in am-ET Amharic (Ethiopia), Fatima in ar-AE Arabic (United Arab Emirates), Hamdan in ar-AE Arabic (United Arab Emirates), Laila in ar-BH Arabic (Bahrain), Ali in ar-BH Arabic (Bahrain), Amina in ar-DZ Arabic (Algeria), Ismael in ar-DZ Arabic (Algeria), Rana in ar-IQ Arabic (Iraq), Bassel in ar-IQ Arabic (Iraq), Sana in ar-JO Arabic (Jordan), Taim in ar-JO Arabic (Jordan), Noura in ar-KW Arabic (Kuwait), Fahed in ar-KW Arabic (Kuwait), Iman in ar-LY Arabic (Libya), Omar in ar-LY Arabic (Libya), Mouna in ar-MA Arabic (Morocco), Jamal in ar-MA Arabic (Morocco), Amal in ar-QA Arabic (Qatar), Moaz in ar-QA Arabic (Qatar), Amany in ar-SY Arabic (Syria), Laith in ar-SY Arabic (Syria), Reem in ar-TN Arabic (Tunisia), Hedi in ar-TN Arabic (Tunisia), Maryam in ar-YE Arabic (Yemen), Saleh in ar-YE Arabic (Yemen), Nabanita in bn-BD Bangla (Bangladesh), Pradeep in bn-BD Bangla (Bangladesh), Asilia in en-KE English (Kenya), Chilemba in en-KE English (Kenya), Ezinne in en-NG English (Nigeria), Abeo in en-NG English (Nigeria), Imani in en-TZ English (Tanzania), Elimu in en-TZ English (Tanzania), Sofia in es-BO Spanish (Bolivia), Marcelo in es-BO Spanish (Bolivia), Catalina in es-CL Spanish (Chile), Lorenzo in es-CL Spanish (Chile), Maria in es-CR Spanish (Costa Rica), Juan in es-CR Spanish (Costa Rica), Belkys in es-CU Spanish (Cuba), Manuel in es-CU Spanish (Cuba), Ramona in es-DO Spanish (Dominican Republic), Emilio in es-DO Spanish (Dominican Republic), Andrea in es-EC Spanish (Ecuador), Luis in es-EC Spanish (Ecuador), Teresa in es-GQ Spanish (Equatorial Guinea), Javier in es-GQ Spanish (Equatorial Guinea), Marta in es-GT Spanish (Guatemala), Andres in es-GT Spanish (Guatemala), Karla in es-HN Spanish (Honduras), Carlos in es-HN Spanish (Honduras), Yolanda in es-NI Spanish (Nicaragua), Federico in es-NI Spanish (Nicaragua), Margarita in es-PA Spanish 
(Panama), Roberto in es-PA Spanish (Panama), Camila in es-PE Spanish (Peru), Alex in es-PE Spanish (Peru), Karina in es-PR Spanish (Puerto Rico), Victor in es-PR Spanish (Puerto Rico), Tania in es-PY Spanish (Paraguay), Mario in es-PY Spanish (Paraguay), Lorena in es-SV Spanish (El Salvador), Rodrigo in es-SV Spanish (El Salvador), Valentina in es-UY Spanish (Uruguay), Mateo in es-UY Spanish (Uruguay), Paola in es-VE Spanish (Venezuela), Sebastian in es-VE Spanish (Venezuela), Dilara in fa-IR Persian (Iran), Farid in fa-IR Persian (Iran), Blessica in fil-PH Filipino (Philippines), Angelo in fil-PH Filipino (Philippines), Sabela in gl-ES Galician, Roi in gl-ES Galician, Siti in jv-ID Javanese (Indonesia), Dimas in jv-ID Javanese (Indonesia), Sreymom in km-KH Khmer (Cambodia), Piseth in km-KH Khmer (Cambodia), Nilar in my-MM Burmese (Myanmar), Thiha in my-MM Burmese (Myanmar), Ubax in so-SO Somali (Somalia), Muuse in so-SO Somali (Somalia), Tuti in su-ID Sundanese (Indonesia), Jajang in su-ID Sundanese (Indonesia), Rehema in sw-TZ Swahili (Tanzania), Daudi in sw-TZ Swahili (Tanzania), Saranya in ta-LK Tamil (Sri Lanka), Kumar in ta-LK Tamil (Sri Lanka), Venba in ta-SG Tamil (Singapore), Anbu in ta-SG Tamil (Singapore), Gul in ur-IN Urdu (India), Salman in ur-IN Urdu (India), Madina in uz-UZ Uzbek (Uzbekistan), Sardor in uz-UZ Uzbek (Uzbekistan), Thando in zu-ZA Zulu (South Africa), Themba in zu-ZA Zulu (South Africa).

September 2021 release

  • New chatbot voice in en-US English (US): Sara is a young adult female voice that speaks more casually and is best suited to chatbot scenarios.
  • New styles added for ja-JP Japanese voice Nanami: Three new styles are now available with Nanami: chat, customer service, and cheerful.
  • Overall pronunciation improvement: Ardi in id-ID, Premwadee in th-TH, Christel in da-DK, HoaiMy and NamMinh in vi-VN.
  • Two new voices in zh-CN Chinese (Mandarin, China) in preview: Xiaochen & Xiaoyan, optimized for spontaneous speech and customer service scenarios.

July 2021 release

Neural text to speech updates

  • Reduced pronunciation errors in Hebrew by 20%.

Speech Studio updates

  • Custom neural voice: Updated the training pipeline to UniTTSv3, which improves model quality and reduces training time for acoustic models by 50%.
  • Audio Content Creation: Fixed the "Export" performance issue and the bug on custom neural voice selection.

June 2021 release

Speech Studio updates

  • Custom neural voice: Extended custom neural voice training to support South East Asia. Released new features for checking data upload status.
  • Audio Content Creation: Released a new feature to support custom lexicon. With this feature, users can easily create their lexicon files and define the customized pronunciation for their audio output.
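
For readers who want to see what a custom lexicon file can look like, the sketch below generates a small XML lexicon in the W3C Pronunciation Lexicon Specification (PLS) format. The helper name build_lexicon and the sample entries are assumptions made for this example; the Audio Content Creation tool lets you create equivalent files through its own interface.

```python
import xml.etree.ElementTree as ET

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
XML_NS = "http://www.w3.org/XML/1998/namespace"


def build_lexicon(entries: dict[str, str], locale: str = "en-US") -> str:
    """Return a PLS lexicon that maps each grapheme to a spoken alias.

    Hypothetical helper for illustration; entries are sample data.
    """
    ET.register_namespace("", PLS_NS)  # serialize PLS as the default namespace
    lexicon = ET.Element(
        f"{{{PLS_NS}}}lexicon",
        {"version": "1.0", "alphabet": "ipa", f"{{{XML_NS}}}lang": locale},
    )
    for grapheme, alias in entries.items():
        lexeme = ET.SubElement(lexicon, f"{{{PLS_NS}}}lexeme")
        ET.SubElement(lexeme, f"{{{PLS_NS}}}grapheme").text = grapheme
        ET.SubElement(lexeme, f"{{{PLS_NS}}}alias").text = alias
    return ET.tostring(lexicon, encoding="unicode")


lexicon_xml = build_lexicon({"TTS": "text to speech", "SSML": "speech synthesis markup language"})
print(lexicon_xml)
```

Each lexeme pairs the written form (grapheme) with the words to speak instead (alias); a phoneme element can be used in place of alias when you need an exact phonetic pronunciation.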

May 2021 release

New languages and voices added for neural TTS

  • Ten new languages introduced - 20 new voices in 10 new locales were added to the neural TTS language list: Yan in en-HK English (Hong Kong), Sam in en-HK English (Hong Kong), Molly in en-NZ English (New Zealand), Mitchell in en-NZ English (New Zealand), Luna in en-SG English (Singapore), Wayne in en-SG English (Singapore), Leah in en-ZA English (South Africa), Luke in en-ZA English (South Africa), Dhwani in gu-IN Gujarati (India), Niranjan in gu-IN Gujarati (India), Aarohi in mr-IN Marathi (India), Manohar in mr-IN Marathi (India), Elena in es-AR Spanish (Argentina), Tomas in es-AR Spanish (Argentina), Salome in es-CO Spanish (Colombia), Gonzalo in es-CO Spanish (Colombia), Paloma in es-US Spanish (US), Alonso in es-US Spanish (US), Zuri in sw-KE Swahili (Kenya), Rafiki in sw-KE Swahili (Kenya).

  • Eleven new en-US voices in preview - 11 new American English voices were added in preview: Ashley, Amber, Ana, Brandon, Christopher, Cora, Elizabeth, Eric, Michelle, Monica, and Jacob.

  • Five zh-CN Chinese (Mandarin, Simplified) voices are generally available - 5 Chinese (Mandarin, Simplified) voices moved from preview to general availability: Yunxi, Xiaomo, Xiaoman, Xiaoxuan, and Xiaorui. These voices are now available in all regions. Yunxi gains a new 'assistant' style, which is suitable for chatbots and voice agents. Xiaomo's voice styles were refined to sound more natural and distinctive.

April 2021 release

Neural text to speech is available across 21 regions

  • Twelve new regions added - Neural text to speech is now available in these 12 new regions: Japan East, Japan West, Korea Central, North Central US, North Europe, South Central US, Southeast Asia, UK South, West Central US, West Europe, West US, and West US 2. See the full list of 21 supported regions.

March 2021 release

New languages and voices added for neural TTS

  • Six new languages introduced - 12 new voices in 6 new locales are added into the neural TTS language list: Nia in cy-GB Welsh (United Kingdom), Aled in cy-GB Welsh (United Kingdom), Rosa in en-PH English (Philippines), James in en-PH English (Philippines), Charline in fr-BE French (Belgium), Gerard in fr-BE French (Belgium), Dena in nl-BE Dutch (Belgium), Arnaud in nl-BE Dutch (Belgium), Polina in uk-UA Ukrainian (Ukraine), Ostap in uk-UA Ukrainian (Ukraine), Uzma in ur-PK Urdu (Pakistan), Asad in ur-PK Urdu (Pakistan).

  • Five languages from preview to GA - 10 voices in 5 locales introduced in November are now GA: Kert in et-EE Estonian (Estonia), Colm in ga-IE Irish (Ireland), Nils in lv-LV Latvian (Latvia), Leonas in lt-LT Lithuanian (Lithuania), Joseph in mt-MT Maltese (Malta).

  • New male voice added for French (Canada) - A new voice Antoine is available for fr-CA French (Canada).

  • Quality improvement - Pronunciation error rate reduction on hu-HU Hungarian - 48.17%, nb-NO Norwegian - 52.76%, nl-NL Dutch (Netherlands) - 22.11%.

With this release, we now support a total of 142 neural voices across 60 languages/locales. In addition, over 70 standard voices are available in 49 languages/locales. Visit Language support for the full list.

Get facial pose events to animate characters

Neural Text to speech now includes the viseme event. Viseme events allow users to get a sequence of facial poses along with synthesized speech. Visemes can be used to control the movement of 2D and 3D avatar models, matching mouth movements to synthesized speech. Viseme events are only available for en-US-AriaNeural voice at this time.
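
As a rough sketch of how viseme events might drive an animation, the code below converts a list of events into time-ordered keyframes. It assumes each event carries an audio offset in ticks (100-nanosecond units, as I understand the Speech SDK reports them) and a numeric viseme ID; the VisemeEvent class, the to_keyframes helper, and the sample values are hypothetical and stand in for whatever your event handler collects.

```python
from dataclasses import dataclass

TICKS_PER_MS = 10_000  # 1 tick = 100 ns, so 10,000 ticks = 1 ms


@dataclass
class VisemeEvent:
    """Hypothetical container for the data delivered with a viseme event."""
    audio_offset_ticks: int
    viseme_id: int


def to_keyframes(events: list[VisemeEvent]) -> list[tuple[float, int]]:
    """Return (time in ms, viseme ID) pairs sorted by time."""
    ordered = sorted(events, key=lambda e: e.audio_offset_ticks)
    return [(e.audio_offset_ticks / TICKS_PER_MS, e.viseme_id) for e in ordered]


# Sample events, deliberately out of order; the IDs are placeholders.
sample_events = [
    VisemeEvent(audio_offset_ticks=1_250_000, viseme_id=6),
    VisemeEvent(audio_offset_ticks=500_000, viseme_id=0),
    VisemeEvent(audio_offset_ticks=2_000_000, viseme_id=19),
]

keyframes = to_keyframes(sample_events)
for time_ms, viseme_id in keyframes:
    print(f"{time_ms:8.1f} ms -> viseme {viseme_id}")
```

An avatar renderer can then hold each mouth pose from its keyframe time until the next keyframe begins, which keeps lip movement aligned with the audio stream.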

Add the bookmark element in Speech Synthesis Markup Language (SSML)

The bookmark element allows you to insert custom markers in SSML to get the offset of each marker in the audio stream. It can be used to reference a specific location in the text or tag sequence.
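
A minimal sketch of the bookmark element in use: the helper below places a bookmark before each text segment so that the offset of every marker can later be matched to a position in the audio. The function name bookmarked_ssml, the mark names, and the sample sentences are assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def bookmarked_ssml(voice: str, locale: str, segments: list[tuple[str, str]]) -> str:
    """Build SSML where each (mark, text) pair becomes <bookmark mark="..."/> plus text.

    Hypothetical helper for illustration; it is not part of any SDK.
    """
    body = "".join(
        f"<bookmark mark={quoteattr(mark)}/>{escape(text)} " for mark, text in segments
    )
    return (
        f'<speak version="1.0" xmlns="{SSML_NS}" xml:lang="{locale}">'
        f'<voice name="{voice}">{body.rstrip()}</voice></speak>'
    )


bookmark_ssml = bookmarked_ssml(
    "en-US-AriaNeural",
    "en-US",
    [("intro", "Welcome to the tour."), ("exhibit_1", "This is the first exhibit.")],
)
ET.fromstring(bookmark_ssml)  # raises ParseError if the markup is not well-formed
print(bookmark_ssml)
```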

February 2021 release

Custom neural voice GA

Custom neural voice became generally available in February in 13 languages: Chinese (Mandarin, Simplified), English (Australia), English (India), English (United Kingdom), English (United States), French (Canada), French (France), German (Germany), Italian (Italy), Japanese (Japan), Korean (Korea), Portuguese (Brazil), Spanish (Mexico), and Spanish (Spain). Learn more about what is custom neural voice and how to use it responsibly. The custom neural voice feature requires registration, and Microsoft may limit access based on Microsoft's eligibility criteria. Learn more about the limited access.

December 2020 release

New neural voices in GA and preview

Released 51 new voices for a total of 129 neural voices across 54 languages/locales:

  • 46 new voices in GA locales: Shakir in ar-EG Arabic (Egypt), Hamed in ar-SA Arabic (Saudi Arabia), Borislav in bg-BG Bulgarian (Bulgaria), Joana in ca-ES Catalan, Antonin in cs-CZ Czech (Czech Republic), Jeppe in da-DK Danish (Denmark), Jonas in de-AT German (Austria), Jan in de-CH German (Switzerland), Nestoras in el-GR Greek (Greece), Liam in en-CA English (Canada), Connor in en-IE English (Ireland), Madhur in hi-IN Hindi (India), Mohan in te-IN Telugu (India), Prabhat in en-IN English (India), Valluvar in ta-IN Tamil (India), Enric in ca-ES Catalan, Kert in et-EE Estonian (Estonia), Harri in fi-FI Finnish (Finland), Selma in fi-FI Finnish (Finland), Fabrice in fr-CH French (Switzerland), Colm in ga-IE Irish (Ireland), Avri in he-IL Hebrew (Israel), Srecko in hr-HR Croatian (Croatia), Tamas in hu-HU Hungarian (Hungary), Gadis in id-ID Indonesian (Indonesia), Leonas in lt-LT Lithuanian (Lithuania), Nils in lv-LV Latvian (Latvia), Osman in ms-MY Malay (Malaysia), Joseph in mt-MT Maltese (Malta), Finn in nb-NO Norwegian, Bokmål (Norway), Pernille in nb-NO Norwegian, Bokmål (Norway), Fenna in nl-NL Dutch (Netherlands), Maarten in nl-NL Dutch (Netherlands), Agnieszka in pl-PL Polish (Poland), Marek in pl-PL Polish (Poland), Duarte in pt-BR Portuguese (Brazil), Raquel in pt-PT Portuguese (Portugal), Emil in ro-RO Romanian (Romania), Dmitry in ru-RU Russian (Russia), Svetlana in ru-RU Russian (Russia), Lukas in sk-SK Slovak (Slovakia), Rok in sl-SI Slovenian (Slovenia), Mattias in sv-SE Swedish (Sweden), Sofie in sv-SE Swedish (Sweden), Niwat in th-TH Thai (Thailand), Ahmet in tr-TR Turkish (Türkiye), NamMinh in vi-VN Vietnamese (Vietnam), HsiaoChen in zh-TW Taiwanese Mandarin (Taiwan), YunJhe in zh-TW Taiwanese Mandarin (Taiwan), HiuMaan in zh-HK Chinese Cantonese (Hong Kong SAR), WanLung in zh-HK Chinese Cantonese (Hong Kong SAR).

  • 5 new voices in preview locales: Kert in et-EE Estonian (Estonia), Colm in ga-IE Irish (Ireland), Nils in lv-LV Latvian (Latvia), Leonas in lt-LT Lithuanian (Lithuania), Joseph in mt-MT Maltese (Malta).

With this release, we now support a total of 129 neural voices across 54 languages/locales. In addition, over 70 standard voices are available in 49 languages/locales. Visit Language support for the full list.

Updates for Audio Content Creation

  • Improved voice selection UI with voice categories and detailed voice descriptions.
  • Enabled intonation tuning for all neural voices across different languages.
  • Automated the UI localization based on the language of the browser.
  • Enabled StyleDegree controls for all zh-CN Neural voices. Visit the Audio Content Creation tool to check out the new features.

Updates for zh-CN voices

  • Updated all zh-CN neural voices to support English speaking.
  • Enabled all zh-CN neural voices to support intonation adjustment. SSML or Audio Content Creation tool can be used to adjust for the best intonation.
  • Updated all zh-CN multi-style neural voices to support StyleDegree control. Emotion intensity (soft or strong) is adjustable.
  • Updated zh-CN-YunyeNeural to support multiple styles which can perform different emotions.

November 2020 release

New locales and voices in preview

  • Introduced five new voices and languages to the Neural text to speech portfolio: Grace in Maltese (Malta), Ona in Lithuanian (Lithuania), Anu in Estonian (Estonia), Orla in Irish (Ireland), and Everita in Latvian (Latvia).
  • Five new zh-CN voices with multiple styles and roles support: Xiaohan, Xiaomo, Xiaorui, Xiaoxuan and Yunxi.

These voices are available in public preview in three Azure regions: EastUS, SouthEastAsia and WestEurope.

Neural text to speech Container GA

  • With Neural text to speech Container, developers can run speech synthesis with the most natural digital voices in their own environment for specific security and data governance requirements. Check how to install Speech Containers.

New features

  • Custom voice: enabled users to copy a voice model from one region to another, and added support for suspending and resuming endpoints. Go to the Azure portal to try these features.
  • Added support for the SSML silence tag.
  • General TTS voice quality improvements: improved word-level pronunciation accuracy in nb-NO, reducing pronunciation errors by 53%.
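
To make the silence tag concrete, here is a minimal sketch that places an `mstts:silence` element inside a voice. The element name and the `type` and `value` attributes follow the Speech service SSML schema. The helper name `build_silence_ssml`, the set of accepted types, and the default voice are assumptions for illustration.

```python
from xml.sax.saxutils import escape, quoteattr

# Silence positions assumed from the SSML documentation.
SILENCE_TYPES = {"Leading", "Tailing", "Sentenceboundary"}


def build_silence_ssml(text, silence_type="Sentenceboundary", millis=200,
                       voice="en-US-AriaNeural"):
    """Return SSML that adds `millis` ms of silence at the given position.

    Hypothetical helper: it only builds the string and does not call the service.
    """
    if silence_type not in SILENCE_TYPES:
        raise ValueError(f"unknown silence type: {silence_type}")
    if millis < 0:
        raise ValueError("millis must be non-negative")
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        f"<voice name={quoteattr(voice)}>"
        f"<mstts:silence type={quoteattr(silence_type)} "
        f'value={quoteattr(f"{millis}ms")}/>'
        f"{escape(text)}"
        "</voice></speak>"
    )


print(build_silence_ssml("First sentence. Second sentence.", millis=500))
```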

Read more at this tech blog.

October 2020 release

New features

General TTS voice quality improvements

  • Improved word-level pronunciation accuracy in pl-PL (error rate reduction: 51%) and fi-FI (error rate reduction: 58%).
  • Improved ja-JP single word reading for the dictionary scenario. Reduced pronunciation error by 80%.
  • zh-CN-XiaoxiaoNeural: Improved sentiment/CustomerService/Newscast/Cheerful/Angry style voice quality.
  • zh-CN: Improved Erhua pronunciation and light tone and refined space prosody, which greatly improves intelligibility.

September 2020 release

New features

  • Neural text to speech

    • Extended to support 18 new languages/locales. They are Bulgarian, Czech, German (Austria), German (Switzerland), Greek, English (Ireland), French (Switzerland), Hebrew, Croatian, Hungarian, Indonesian, Malay, Romanian, Slovak, Slovenian, Tamil, Telugu and Vietnamese.
    • Released 14 new voices to enrich the variety in the existing languages. See full language and voice list.
    • New speaking styles for en-US and zh-CN voices. Jenny, the new voice in English (US), supports chatbot, customer service, and assistant styles. Ten new speaking styles are available with our zh-CN voice, Xiaoxiao. In addition, the Xiaoxiao neural voice supports StyleDegree tuning. See how to use the speaking styles in SSML.
  • Containers: Neural text to speech Container released in public preview with 16 voices available in 14 languages. Learn more on how to deploy Speech Containers for Neural text to speech.
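
The speaking styles mentioned above are selected in SSML with the `mstts:express-as` element. The sketch below maps the three Jenny styles named in this release to SSML style values and builds the markup. The helper name `build_style_ssml` and the exact style values (`chat`, `customerservice`, `assistant`) are assumptions here, so confirm them against the SSML documentation or the Voice List API before relying on them.

```python
from xml.sax.saxutils import escape, quoteattr

# Assumed mapping from the style names in the release note to SSML values.
JENNY_STYLES = {
    "chatbot": "chat",
    "customer service": "customerservice",
    "assistant": "assistant",
}


def build_style_ssml(text, style_name, voice="en-US-JennyNeural"):
    """Return SSML that speaks `text` in one of the assumed Jenny styles.

    Hypothetical helper: it only builds the string and does not call the service.
    """
    try:
        style = JENNY_STYLES[style_name]
    except KeyError:
        raise ValueError(f"unsupported style: {style_name}") from None
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" xml:lang="en-US">'
        f"<voice name={quoteattr(voice)}>"
        f"<mstts:express-as style={quoteattr(style)}>"
        f"{escape(text)}"
        "</mstts:express-as></voice></speak>"
    )


print(build_style_ssml("Thanks for calling. How can I help?", "customer service"))
```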

Read the full announcement of the TTS updates for Ignite 2020

August 2020 release

New features

  • Neural text to speech: new speaking style for en-US Aria voice. AriaNeural can sound like a newscaster when reading news. The 'newscast-formal' style sounds more serious, while the 'newscast-casual' style is more relaxed and informal. See how to use the speaking styles in SSML.

  • Custom voice: a new feature is released to automatically check training data quality. When you upload your data, the system will examine various aspects of your audio and transcript data, and automatically fix or filter issues to improve the quality of the voice model. This covers the volume of your audio, the noise level, the pronunciation accuracy of speech, the alignment of speech with the normalized text, silence in the audio, in addition to the audio and script format.

  • Audio Content Creation: a set of new features to enable more powerful voice tuning and audio management capabilities.

    • Pronunciation: the pronunciation tuning feature is updated to the latest phoneme set. You can pick the right phoneme element from the library and refine the pronunciation of the words you have selected.

    • Download: The audio "Download"/"Export" feature is enhanced to support generating audio by paragraph. You can edit content in the same file/SSML, while generating multiple audio outputs. The file structure of "Download" is refined as well. Now, you can easily get all audio files in one folder.

    • Task status: The multi-file export experience is improved. Previously, when you exported multiple files, the entire task failed if any one file failed. Now, all other files are still exported successfully. The task report is enriched with more detailed and structured information, so you can check the logs for all failed files and sentences.

    • SSML documentation: added a link to the SSML documentation so you can check the rules for using each tuning feature.

  • The Voice List API is updated to include a user-friendly display name and the speaking styles supported for neural voices.
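
A minimal sketch of how the updated Voice List response could be used to find voices that support a given speaking style. The endpoint path, the `Ocp-Apim-Subscription-Key` header, and the `DisplayName`, `ShortName`, and `StyleList` fields follow the text to speech REST API. The helper names, the region value, and the sample payload are hypothetical and only stand in for a real response.

```python
import json
import urllib.request


def voices_list_request(region, key):
    """Build (but do not send) a GET request for the voice list endpoint."""
    url = f"https://{region}.tts.speech.microsoft.com/cognitiveservices/voices/list"
    return urllib.request.Request(url, headers={"Ocp-Apim-Subscription-Key": key})


def voices_with_style(voices, style):
    """Return display names of voices whose StyleList contains `style`."""
    # Voices without speaking styles may omit StyleList, so default to empty.
    return [v["DisplayName"] for v in voices if style in v.get("StyleList", [])]


# Hypothetical, abbreviated payload shaped like a Voice List response.
SAMPLE_VOICES = json.loads("""
[
  {"ShortName": "en-US-AriaNeural", "DisplayName": "Aria",
   "StyleList": ["newscast-formal", "newscast-casual"]},
  {"ShortName": "en-GB-MiaNeural", "DisplayName": "Mia"}
]
""")

print(voices_with_style(SAMPLE_VOICES, "newscast-formal"))
```

To query the live service, send the request with `urllib.request.urlopen(voices_list_request(region, key))` and pass the decoded JSON list to `voices_with_style`.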

General TTS voice quality improvements

  • Reduced word-level pronunciation errors for ru-RU (by 56%) and sv-SE (by 49%).

  • Improved the reading of polyphonic words (words with more than one pronunciation) on en-US neural voices by 40%. Examples include "read", "live", "content", "record", and "object".

  • Improved the naturalness of the question tone in fr-FR. MOS (Mean Opinion Score) gain: +0.28

  • Updated the vocoders for the following voices, with fidelity improvements and an overall performance speed-up of 40%.

    Locale Voice
    en-GB Mia
    es-MX Dalia
    fr-CA Sylvie
    fr-FR Denise
    ja-JP Nanami
    ko-KR Sun-Hi

Bug fixes

  • Fixed a number of bugs with the Audio Content Creation tool:
    • Fixed an issue with auto refreshing.
    • Fixed issues with voice styles in zh-CN in the South East Asia region.
    • Fixed stability issues, including an export error with the 'break' tag and errors in punctuation.