Connectionist Temporal Classification (CTC) From Scratch
Author: Priyam Mazumdar
Uploaded: 2025-08-01
Views: 2053
Description:
Code: https://github.com/priyammaz/PyTorch-...
Great Article about CTC: https://distill.pub/2017/ctc/
All Credit for Code: https://github.com/vadimkantorov/ctc/...
Today we will implement a crucial part of many automatic speech recognizers: the CTC loss. The difficulty in speech recognition is that there is no one-to-one correspondence between the audio frames and the characters of the transcript, so we have to learn how the text aligns to the audio, and this is exactly what CTC does. It uses dynamic programming (the forward algorithm) to compute the total probability of all valid alignments, and training maximizes that total, so no frame-level labels are needed. We explore a little of the theory behind CTC and then build a full PyTorch implementation of it!
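To make "the total probability of all possible alignments" concrete, here is a minimal brute-force sketch in plain Python. It is not the code from the video. The names `collapse` and `brute_force_ctc_prob` are hypothetical, and the blank symbol is assumed to sit at index 0. It enumerates every frame-level path and sums the ones that collapse to the target, which is only feasible for tiny inputs and is the reason dynamic programming is used instead.

```python
import itertools
import math

BLANK = 0  # assumption: index 0 is the blank symbol


def collapse(path):
    """Apply the CTC collapse rule: merge repeated symbols, then drop blanks."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != BLANK:
            out.append(s)
        prev = s
    return out


def brute_force_ctc_prob(probs, target):
    """Sum the probability of every path that collapses to `target`.

    probs:  list of T rows, each a list of C per-frame probabilities.
    target: list of label ids without blanks.
    Cost grows as C**T, so this is for illustration only.
    """
    T, C = len(probs), len(probs[0])
    total = 0.0
    for path in itertools.product(range(C), repeat=T):
        if collapse(path) == list(target):
            total += math.prod(probs[t][s] for t, s in enumerate(path))
    return total


# Two frames, two classes (blank and one label), uniform probabilities.
uniform = [[0.5, 0.5], [0.5, 0.5]]
p = brute_force_ctc_prob(uniform, [1])
```

With two frames and a single label, three of the four paths collapse to the target (label-label, blank-label, label-blank), so `p` is 3 × 0.25 = 0.75.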
Timestamps:
00:00:00 - Introduction
00:01:08 - Why do we use CTC?
00:11:06 - Dynamic Programming for Efficiency
00:14:43 - Some Rules for Transitions
00:25:35 - Start CTC Implementation
00:32:29 - Setup of t_a_r_g_e_t_s
00:37:11 - Check for Valid Transitions
00:44:19 - Gather Log Probs for Targets
00:51:39 - Initialize Log Alphas
01:02:34 - Dynamic Programming
01:15:30 - Aggregate Valid End Tokens
01:24:25 - Compare to PyTorch CTC
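The implementation steps in the timestamps above can be sketched for a single utterance as follows. This is a plain-Python illustration under stated assumptions, not the batched PyTorch code from the video: the function names `logsumexp` and `ctc_loss` are hypothetical, the blank index defaults to 0, and there is no batching or gradient computation.

```python
import math

NEG_INF = float("-inf")


def logsumexp(*xs):
    """Numerically stable log(sum(exp(x))) that tolerates -inf inputs."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def ctc_loss(log_probs, target, blank=0):
    """Negative log-likelihood of `target` under CTC for one utterance.

    log_probs: list of T rows, each a list of C log-probabilities.
    target:    list of label ids without blanks.
    """
    T = len(log_probs)

    # Setup of targets: interleave blanks, e.g. [a, b] -> [_, a, _, b, _].
    ext = [blank]
    for s in target:
        ext += [s, blank]
    S = len(ext)

    # Valid transitions: skipping from s-2 to s is allowed only when
    # ext[s] is a label that differs from the previous label ext[s-2].
    can_skip = [s >= 2 and ext[s] != blank and ext[s] != ext[s - 2]
                for s in range(S)]

    # Initialize log alphas: a path may start at the first blank or first label.
    alpha = [NEG_INF] * S
    alpha[0] = log_probs[0][ext[0]]
    if S > 1:
        alpha[1] = log_probs[0][ext[1]]

    # Dynamic programming over time (the forward algorithm).
    for t in range(1, T):
        new = [NEG_INF] * S
        for s in range(S):
            stay = alpha[s]
            step = alpha[s - 1] if s >= 1 else NEG_INF
            skip = alpha[s - 2] if can_skip[s] else NEG_INF
            # Gather the log prob of the extended-target symbol at frame t.
            new[s] = logsumexp(stay, step, skip) + log_probs[t][ext[s]]
        alpha = new

    # Aggregate valid end tokens: the final blank or the final label.
    end = logsumexp(alpha[S - 1], alpha[S - 2]) if S > 1 else alpha[S - 1]
    return -end


# Uniform distribution over {blank, label 1} for two frames.
lp = [[math.log(0.5)] * 2 for _ in range(2)]
loss = ctc_loss(lp, [1])
```

Here `math.exp(-loss)` is 0.75, the same total that enumerating all three valid alignments by hand gives. A target with a repeated label, such as `[1, 1]`, has no valid alignment in two frames because a blank must separate the repeats, so the loss is infinite. The per-utterance value should correspond to the unreduced loss of a library implementation, but that comparison is not shown here.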
Socials!
X / data_adventurer
Instagram / nixielights
Linkedin / priyammaz
Discord / discord
🚀 Github: https://github.com/priyammaz
🌐 Website: https://www.priyammazumdar.com/