How to Identify “Sentences” in Transcripts? 📝

Angelina Yang
3 min read · Feb 8, 2024

Have you ever noticed that YouTube transcripts don’t have the conventional concept of “sentences”?

Transcripts generated from live video or audio have no access to the speaker’s intended script or its punctuation. As a result, there are no “sentences” automatically delineated by punctuation marks.
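To make this concrete, here is a minimal sketch of what happens when a naive, punctuation-based splitter meets unpunctuated text. The helper `split_on_punctuation` and both sample strings are made up for illustration; they are assumptions, not part of any transcript tool.

```python
import re
from typing import List


def split_on_punctuation(text: str) -> List[str]:
    """Naive splitter (hypothetical helper): break after '.', '!' or '?'
    when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


# Made-up sample text: the same words, with and without punctuation.
punctuated = "Hello everyone. Welcome back to the channel! Shall we begin?"
raw_transcript = "hello everyone welcome back to the channel shall we begin"

print(len(split_on_punctuation(punctuated)))      # 3
print(len(split_on_punctuation(raw_transcript)))  # 1
```

With nothing to split on, the whole raw transcript comes back as a single “sentence”, no matter how long it is.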

The challenge arises when dealing with lengthy transcripts. Imagine downloading a transcript from a video or audio file without any punctuation. Wouldn’t reading a file like that make you dizzy?