HMM-based Speech Synthesis: Fundamentals and Its Recent Advances

Author: Microsoft Research

Uploaded: 2016-07-27

Views: 7621

Description: The task of speech synthesis is to convert normal language text into speech. In recent years, the hidden Markov model (HMM) has been successfully applied to acoustic modeling for speech synthesis, and HMM-based parametric speech synthesis has become a mainstream speech synthesis method. This method can synthesize highly intelligible and smooth speech. Another significant advantage of this model-based parametric approach is that it makes speech synthesis far more flexible than the conventional unit selection and waveform concatenation approach. This talk will first introduce the overall HMM synthesis system architecture developed at USTC. Then some key techniques will be described, including the vocoder, acoustic modeling, the parameter generation algorithm, MSD-HMM for F0 modeling, and context-dependent model training. Our method will be compared with the unit selection approach, and its flexibility in controlling voice characteristics will also be presented.

The second part of this talk will describe some recent advances in HMM-based speech synthesis at the USTC speech group. The methods to be described include: 1) articulatory control of HMM-based speech synthesis, which further improves the flexibility of HMM-based speech synthesis by integrating phonetic knowledge; 2) LPS-GV and minimum-KLD based parameter generation, which alleviate the over-smoothing of generated spectral features and improve the naturalness of synthetic speech; and 3) a hybrid HMM-based/unit-selection approach, which has achieved excellent performance in the Blizzard Challenge speech synthesis evaluations of recent years.
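The description names a parameter generation algorithm without giving details. As an illustration only, the sketch below shows the textbook maximum-likelihood parameter generation (MLPG) idea for a single feature dimension: given per-frame Gaussian means and variances for a static feature and its delta, it solves (WᵀPW)c = WᵀPμ for the smooth static trajectory c. The function names, the delta window 0.5·(c[t+1] − c[t−1]), and the zero-padded boundary handling are assumptions made for this example, not details taken from the talk or the USTC system.

```python
def build_window_matrix(T):
    """Build the (2T x T) matrix W mapping static features to [static, delta] rows."""
    W = [[0.0] * T for _ in range(2 * T)]
    for t in range(T):
        W[2 * t][t] = 1.0                  # static row: c[t]
        if t > 0:
            W[2 * t + 1][t - 1] = -0.5     # delta row: 0.5 * (c[t+1] - c[t-1])
        if t < T - 1:
            W[2 * t + 1][t + 1] = 0.5
    return W


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for k in range(col, n + 1):
                M[r][k] -= f * M[col][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = M[i][n] - sum(M[i][k] * x[k] for k in range(i + 1, n))
        x[i] = s / M[i][i]
    return x


def mlpg(means, variances):
    """Return the static trajectory maximizing the Gaussian likelihood.

    means, variances: one (static, delta) pair per frame.
    """
    T = len(means)
    W = build_window_matrix(T)
    mu = [v for pair in means for v in pair]
    prec = [1.0 / v for pair in variances for v in pair]  # diagonal precisions
    rows = range(2 * T)
    A = [[sum(W[r][i] * prec[r] * W[r][j] for r in rows) for j in range(T)]
         for i in range(T)]
    b = [sum(W[r][i] * prec[r] * mu[r] for r in rows) for i in range(T)]
    return solve(A, b)


if __name__ == "__main__":
    # A step in the static means with zero delta means yields a smoothed trajectory.
    step_means = [(0.0, 0.0)] * 3 + [(1.0, 0.0)] * 3
    print(mlpg(step_means, [(1.0, 1.0)] * 6))
```

Because the delta constraints tie neighbouring frames together, the generated trajectory is smoother than the piecewise-constant state means. This same smoothing is commonly associated with the over-smoothing that the description says GV-style methods aim to alleviate.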
