# Connectionist Temporal Classification (CTC)


Why we need CTC: RNNs work well on sequential data when the input and target are aligned frame by frame. In speech and handwriting recognition, such alignments are usually not available.

Creating aligned data by hand is tedious work, which is why we have the CTC loss.

CTC is alignment-free. It works by summing the probabilities of all possible alignments between the input and the label.

How it works:

1.) At every time step, the CTC network outputs a probability distribution over the characters plus a special blank token; decoding picks the most likely token (possibly the blank) per step.

2.) Repeated tokens without a blank token between them get merged.

3.) Lastly, all blank tokens are removed.
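The collapse rule in steps 2 and 3 can be sketched in a few lines of Python (the `"-"` blank symbol and the example path are assumptions for illustration):

```python
def ctc_collapse(path):
    """Collapse a per-time-step path into a label:
    merge repeats not separated by a blank, then drop blanks."""
    out = []
    prev = None
    for tok in path:
        if tok != prev:          # step 2: merge adjacent repeats
            out.append(tok)
        prev = tok
    return [t for t in out if t != "-"]  # step 3: remove blanks

# "hheel-llo" collapses to "hello": repeats merge, but the blank
# between the two l's keeps them from merging into one.
print("".join(ctc_collapse(list("hheel-llo"))))  # → hello
```

Note that the blank is what lets CTC output genuinely doubled characters like the "ll" in "hello".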

The CTC network can then give you the probability of a label for a given input by summing the probabilities of every alignment that collapses to that label, where each alignment's probability is the product of the per-time-step token probabilities.


References:

- https://distill.pub/2017/ctc/
- https://machinelearning-blog.com/2018/09/05/753/