Sometimes when you click on a link to a PDF, it comes up in the browser. Other times, the browser downloads the file. Everyone must wonder why, but few have wondered enough to find out. Here’s a quick explanation.
It usually has nothing to do with the PDF version, the content of the file, or the link. It’s the HTTP response headers that make the difference. Specifically, a header called “Content-Disposition” is the determining factor. If it’s absent, the browser will normally try to display the file itself. If it’s present, the value it specifies determines how you get the file.
IETF RFC 6266 specifies the header’s behavior. “Content-Disposition” not only directs how to handle the file, it also lets the server suggest a filename that differs from the file’s name on the server.
In the simplest form, it gives just the disposition type, which can be either “attachment” or “inline.” “Inline” is the default behavior. Specifying “attachment” forces a download:
Content-Disposition: attachment
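To see the server side of this, here is a minimal sketch using Python's standard http.server module. It is not how any particular web server does it. The /download/ URL layout, the pdf_headers helper, and the placeholder PDF bytes are all assumptions made up for the illustration.

```python
from http.server import BaseHTTPRequestHandler

# Placeholder body; a real server would read an actual PDF from disk.
PDF_BYTES = b"%PDF-1.4\n%%EOF\n"


def pdf_headers(path: str) -> list:
    """Pick response headers for a request path (hypothetical URL layout)."""
    # Assumption for this sketch: URLs under /download/ should be saved,
    # everything else should be shown in the browser.
    disposition = "attachment" if path.startswith("/download/") else "inline"
    return [
        ("Content-Type", "application/pdf"),
        ("Content-Disposition", disposition),
        ("Content-Length", str(len(PDF_BYTES))),
    ]


class PdfHandler(BaseHTTPRequestHandler):
    """Serves the same PDF bytes for every GET; only the headers vary."""

    def do_GET(self):
        self.send_response(200)
        for name, value in pdf_headers(self.path):
            self.send_header(name, value)
        self.end_headers()
        self.wfile.write(PDF_BYTES)
```

To try it, pass PdfHandler to http.server.HTTPServer and request the same file under both kinds of URL. The bytes are identical; only the header changes what the browser does.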
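The header name, the disposition type, and the parameter names are all case insensitive. The header can also suggest a name for the file, for example:
Content-Disposition: attachment; filename="example.pdf"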
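On the receiving end, a client has to pull the disposition type and the filename out of the header value. The sketch below leans on Python's email.message.Message class, which understands this parameter syntax. The parse_content_disposition name is made up for the example, and treating a missing header as “inline” is this sketch's reading of the default described above.

```python
from email.message import Message
from typing import Optional, Tuple


def parse_content_disposition(value: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (disposition type, suggested filename or None)."""
    if value is None:
        # No header at all: fall back to the default behavior.
        return "inline", None
    msg = Message()
    msg["Content-Disposition"] = value
    # get_content_disposition() lowercases the type; get_filename()
    # matches the parameter name case-insensitively and strips quotes.
    return msg.get_content_disposition(), msg.get_filename()
```

For instance, parse_content_disposition('ATTACHMENT; FILENAME="example.pdf"') gives the same result as the lowercase form, because the type and the parameter name are matched without regard to case.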
Client implementers have to be careful when handling the filename parameter. If the filename contains path separators, the server is clearly up to no good.
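One common defense is to keep only the last path segment of whatever the server suggests. Here is a rough sketch of that idea. The safe_filename name and the “download” fallback are inventions for the example, and a real client would likely apply more rules than this, such as reserved names on the local operating system.

```python
def safe_filename(suggested: str, fallback: str = "download") -> str:
    """Reduce a server-suggested name to a bare filename."""
    # Treat both "/" and "\" as separators, whatever the local OS uses,
    # and keep only what follows the last one.
    last_segment = suggested.replace("\\", "/").rsplit("/", 1)[-1]
    # Drop control characters and surrounding whitespace.
    cleaned = "".join(ch for ch in last_segment if ch.isprintable()).strip()
    # An empty name, "." or ".." is not usable as a filename.
    if cleaned in ("", ".", ".."):
        return fallback
    return cleaned
```

With this, a suggested name of "../../etc/passwd" is reduced to "passwd", which then lands in the normal download directory like any other file.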
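The specification requires that clients never write to a location they aren’t entitled to; in practice, that means ignoring any directory information in the name and saving to the usual download location. Clients should also be wary of the file extension when downloading the file, especially if it disagrees with the MIME type in the headers.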
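A client could check the extension against the declared type along these lines. This is only a sketch using Python's mimetypes module, whose extension table can vary somewhat between systems. The extension_agrees name is hypothetical, and treating an unknown extension as a disagreement is a choice made for this example.

```python
import mimetypes


def extension_agrees(filename: str, content_type: str) -> bool:
    """True if the filename's extension maps to the declared MIME type."""
    # Ignore any parameters after the type, such as "; charset=...".
    declared = content_type.split(";", 1)[0].strip().lower()
    guessed, _encoding = mimetypes.guess_type(filename)
    # An extension the table doesn't know counts as a mismatch here.
    return guessed is not None and guessed.lower() == declared
```

What to do on a mismatch is up to the client: it might warn the user, or rename the file to match the declared type.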
By default, the filename is encoded in ISO 8859-1. Specifying characters outside that set is a little more complicated. The server can use the “filename*” parameter to specify UTF-8 or ISO 8859-1 encoding, as described in RFC 8187. Not all clients support this parameter, so the header will usually include a plain “filename” parameter as a fallback. Section 4.3 of RFC 6266 discusses precautions to take with names.
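Here is roughly what building such a header might look like. It is a sketch under a few assumptions: the content_disposition function name is made up, the fallback is limited to ASCII with “?” standing in for anything else, and the caller is trusted not to pass names containing line breaks.

```python
from urllib.parse import quote


def content_disposition(filename: str, disposition: str = "attachment") -> str:
    """Build a header value with a plain fallback and, if needed, filename*."""
    # ASCII-only fallback for clients that ignore filename*.
    ascii_name = filename.encode("ascii", "replace").decode("ascii")
    # Escape backslashes and double quotes inside the quoted string.
    quoted = ascii_name.replace("\\", "\\\\").replace('"', '\\"')
    header = f'{disposition}; filename="{quoted}"'
    if ascii_name != filename:
        # Percent-encode the UTF-8 bytes; the language tag between the
        # two apostrophes is left empty.
        encoded = quote(filename, safe="")
        header += f"; filename*=UTF-8''{encoded}"
    return header
```

A name that is already plain ASCII gets only the “filename” parameter. A name such as “€ rates.pdf” gets both, with the euro sign percent-encoded as %E2%82%AC in the “filename*” value.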
If you were just wondering why you sometimes have to download a PDF, that may be more information than you wanted, but this wouldn’t be Mad File Format Science without TMI. In any case, now you know.