Audio and video in HTML5

I’ve been studying up on streaming audio and video and related issues, so lately I’ve been playing with the <audio> and <video> tags in HTML5. It’s possible to put them to good use, but there are more issues than their proponents will readily admit.

A good piece of news is that both tags do exactly the same thing except for their appearance. You can play video with the audio tag and vice versa, and they implement the same DOM model. (Of course, you won’t see anything interesting if you use <audio> for video.)

The main limitation is that these tags support only progressive streaming, which differs from “true” streaming in some important ways. Progressive streaming means downloading a file and starting to play it almost immediately, rather than after it’s finished downloading. Its disadvantages are that the bit rate can’t be adjusted while playing, you can’t keep the file from being grabbed in its entirety with a simple HTTP call, and the download continues to completion even if the user pauses the player. These aren’t always significant problems, but they mean that the new HTML5 tags aren’t the full replacement for Flash which they’re sometimes claimed to be.

There’s enough interest in true streaming that various parties have developed protocols to do it over HTTP. These include HTTP Live Streaming from Apple, HTTP Dynamic Streaming from Adobe, Smooth Streaming from Microsoft, and Dynamic Streaming Over HTTP from MPEG (which its proponents insist isn’t a protocol). There are more details on streaming on my website.

The other problem with the HTTP tags is that there’s no one encoding that all major browsers support. This problem is well known on the video side, but I was surprised to discover it’s even true for audio. The current version of Firefox doesn’t natively support MP3 in the audio tag, and the QuickTime plugin isn’t used in this case (or at least I can’t get it to work). The reason for this is software patents. There’s a good discussion of the state of MP3 with Firefox on Stack Overflow.

You can specify several <source> elements within an audio or video element, and the browser will try each one in turn till it finds one it can play. Two formats or at most three will cover all major browsers. For audio, including both an MP3 and an Ogg Vorbis version should cover all the bases; for video, MP4/H.264 and Ogg Theora should do it, though you might want to add WebM.

Specifying the type attribute as the MIME type of the file (e.g., <source src="song.mp3"
helps the page to load faster, since the browser can determine without examining the file if it can play the file in principle. Make sure, however, to use only the canonical MIME types. From experimentation with various browsers, these include:

  • audio/mp4
  • audio/mpeg
  • audio/ogg
  • video/mp4
  • video/ogg
  • video/webm

If you specify application/mp3 rather than audio/mpeg for an MP3 source, the browser may decide it can’t play it even though it really can.

Another issue is the AV API for HTML5. There’s a pretty decent DOM API to go with the audio and video tags, allowing you to override the player controls and dynamically change content. Some implementations (e.g., Mozilla’s version) have added private extensions. Some people want more power, so there are third-party plugins and JavaScript libraries such as MediaElement.js that extend the API.

It’s a minefield, except that the mishaps come from the absence of an earth-shattering kaboom. Still, using the HTML5 tags is much simpler than Flash or HTTP streaming.

Comments are closed.