Closed captioning formats

An online discussion led me to learn about Udemy’s support for closed captioning and the formats available for it. Since I hadn’t heard about these formats before, I’m guessing a lot of other people haven’t either. They can be useful not only for accessibility but for preservation, since they provide a textual version of the spoken words in a video. These are just some notes on what I’ve found in a cursory investigation. In general, sites that support closed captioning expect a text file in one of several formats. Whatever the format, the file has to carry at least the text of each caption, its starting time, and its duration or ending time.
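As a rough illustration of that minimum, here’s a small Python sketch. The Cue class and its field names are my own invention for this post, not part of any captioning standard; the only point is that a duration and an ending time are interchangeable once you know the start.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One caption: its text plus when it appears and disappears (in seconds)."""
    text: str
    start: float
    end: float

    @classmethod
    def from_duration(cls, text: str, start: float, duration: float) -> "Cue":
        # Some formats store a duration instead of an ending time.
        return cls(text=text, start=start, end=start + duration)

    @property
    def duration(self) -> float:
        return self.end - self.start


cue = Cue.from_duration("Hello, and welcome.", start=1.5, duration=2.0)
print(cue.start, cue.end, cue.duration)
```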

YouTube supports several formats, including SubRip, SubViewer, MPSub, LRC, Videotron Lambda, WebVTT, TTML, DFXP, Scenarist Closed Caption, EBU-STL, Caption Center, Captions Inc., Cheetah, and NCI.

SubRip and SubViewer are similar formats that let you specify start and stop times in a way that’s easily entered by hand. A closely related name is WebSRT, which as far as I can tell was the earlier name of what became WebVTT.
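To show how hand-friendly SubRip is, here’s a minimal Python sketch that writes it out. It assumes the usual SubRip layout: a running number, a timing line of the form HH:MM:SS,mmm --> HH:MM:SS,mmm, the caption text, and a blank line between captions. The function names and the (start, end, text) tuples are just my choices for the example.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as SubRip's HH:MM:SS,mmm (note the comma before milliseconds)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues) -> str:
    """cues: iterable of (start_seconds, end_seconds, text) tuples."""
    blocks = []
    for number, (start, end, text) in enumerate(cues, start=1):
        timing = f"{srt_timestamp(start)} --> {srt_timestamp(end)}"
        blocks.append(f"{number}\n{timing}\n{text}\n")
    # Joining with a newline leaves one blank line between captions.
    return "\n".join(blocks)


print(to_srt([(1.5, 3.5, "Hello, and welcome."),
              (4.0, 6.25, "Today: caption formats.")]))
```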

WebVTT, short for Web Video Text Tracks, lives on the fringes of W3C. It’s “not a W3C Standard nor is it on the W3C Standards Track,” but it’s widely used. My impression from a quick reading is that it doesn’t have a terribly consistent syntax. It’s the only format Udemy supports.
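For plain captions, a WebVTT file looks a lot like a SubRip one with a WEBVTT header line on top and a period instead of a comma before the milliseconds. Here’s a hedged Python sketch of that conversion. The function name srt_to_vtt is mine, and it only handles the simple case; it ignores styling, positioning, and anything else the two formats do differently.

```python
import re

# Matches a SubRip timing line such as "00:00:01,500 --> 00:00:03,500".
TIMING = re.compile(
    r"^(\d{2,}:\d{2}:\d{2}),(\d{3}) --> (\d{2,}:\d{2}:\d{2}),(\d{3})\s*$"
)


def srt_to_vtt(srt_text: str) -> str:
    """Convert simple SubRip text to WebVTT text."""
    lines = ["WEBVTT", ""]  # header, then a blank line before the first cue
    for line in srt_text.splitlines():
        match = TIMING.match(line)
        if match:
            # Only timing lines are rewritten, so commas in caption text survive.
            line = f"{match[1]}.{match[2]} --> {match[3]}.{match[4]}"
        lines.append(line)
    return "\n".join(lines) + "\n"


sample = "1\n00:00:01,500 --> 00:00:03,500\nHello, and welcome.\n"
print(srt_to_vtt(sample))
```

The SubRip caption numbers are left in place, since WebVTT allows an optional identifier line before each cue’s timing.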

The Timed Text Markup Language (TTML), known in earlier versions as DFXP or TTAF, looks more durable than WebVTT or WebSRT, being a W3C Recommendation. Not all sources of captioned video support it, though. It’s XML-based, which makes it more verbose and harder to read than WebVTT, but also more consistent. It may be a good choice for archives.
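To give a feel for the verbosity, here’s a minimal Python sketch that builds a bare-bones TTML document with the standard library. It assumes the simplest structure I could find (tt, body, div, and one p element per caption with begin and end attributes in seconds) and leaves out styling, layout, and metadata. The to_ttml function and its tuple input are made up for this example.

```python
import xml.etree.ElementTree as ET

TTML_NS = "http://www.w3.org/ns/ttml"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Serialize TTML elements without a prefix (as the default namespace).
ET.register_namespace("", TTML_NS)


def to_ttml(cues, lang="en") -> str:
    """cues: iterable of (start_seconds, end_seconds, text) tuples."""
    tt = ET.Element(f"{{{TTML_NS}}}tt", {f"{{{XML_NS}}}lang": lang})
    body = ET.SubElement(tt, f"{{{TTML_NS}}}body")
    div = ET.SubElement(body, f"{{{TTML_NS}}}div")
    for start, end, text in cues:
        # Times are written as offsets in seconds, e.g. "1.500s".
        p = ET.SubElement(
            div,
            f"{{{TTML_NS}}}p",
            {"begin": f"{start:.3f}s", "end": f"{end:.3f}s"},
        )
        p.text = text
    return ET.tostring(tt, encoding="unicode")


print(to_ttml([(1.5, 3.5, "Hello, and welcome.")]))
```

Even for one caption, the result is several nested elements, which is the trade-off the XML approach makes for consistency.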

SMPTE-TT is described as “a profile of TTML” which “defines some standard metadata terms to be used, and some extension features not found in TTML.”

A number of formats are intended for use with EIA-608, which is used for closed captioning on North American TV broadcasts. EIA-608 deals with how the captions are encoded in the broadcast signal, not with the format in a data file.

Captioning formats are an area I’ve barely looked at, so don’t take anything in this post as authoritative. Maybe someone will find these notes useful as a starting point for research.

