Tag Archives: email

The curse of HTML mail

It’s been most of a year since I last posted here, but I wanted to rant about HTML mail, and this is the right blog for it. People complain about the intrusiveness of Web tracking, but email tracking is even worse. I’ve noticed this especially after subscribing to a couple of Substack newsletters. They’re sent as HTML, and whenever possible, I click the link to the equivalent Web page, which is less intrusive. Every link in a Substack newsletter is a tracking link, with the odd exception of the link to the Substack page.

The links in a Substack newsletter don’t go to the target page but to a Substack redirection URL. Their purpose is to let Substack know about everything you click on. There are no terms or privacy policy in the email telling you what Substack uses the information for.
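
One rough way to see this for yourself is to pull every link out of a message's HTML body and compare the host in the `href` with what the link text claims to point at. The sketch below uses only Python's standard library. The sample markup and the `redirect.example.com` host are made up for illustration and are not Substack's actual URL format.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # Anchors without an href leave _href as None and are ignored.
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def link_hosts(html_body):
    """Return (host the link really goes to, visible link text) pairs."""
    parser = LinkCollector()
    parser.feed(html_body)
    parser.close()
    return [(urlparse(href).hostname, text) for href, text in parser.links]


# Hypothetical sample: the text names one site, the href goes to another.
sample = (
    '<p>Read <a href="https://redirect.example.com/r/abc123">'
    "example.org/article</a></p>"
)
hosts = link_hosts(sample)
```

When the host in the first column is the same redirect service for every link, whatever the visible text says, the links are being routed through a tracker.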


Technical issues with the Hunter Biden email

The PDF Association has published an analysis of the file that the New York Post uploaded to Scribd, which purports to show a message from Vadim Pozharskyi to Hunter Biden and Devon Archer. Discussions of what it signifies politically and whether Twitter was justified in blocking the link are for another place. The issue for this blog is what the file says about the authenticity of the email. The answer is: Nothing at all.


HTML mail is a terrible idea — but at least please do it right

Originally email consisted just of text messages. They were straightforward to read. It was very hard to send malware in a convincing way, since the recipient would have to extract any malicious attachment and run it by hand. There was a hoax in 1994 warning of the alleged “Goodtimes virus”, which caused a lot of merriment among the computer-literate. The only “virus” was the hoax email itself, which the less computer-literate forwarded to all their friends.

Then came HTML mail, a huge advance in email insecurity. Now malicious URLs could hide behind links or even be opened automatically. Messages could include JavaScript to exploit client weaknesses and trick recipients. Today, almost everyone recognizes these advantages, and malware and phishing by email are multi-billion-dollar businesses.

Doing it right, or not doing it at all

Even so, there are good and bad ways to create HTML mail.

Apple hides attachments in malformed multipart mail

Recently I got a PDF of a filk songbook which I had contributed to. More precisely, the email said I was getting it, but there was no sign of an attachment. I wrote back to the editor who’d sent it, and she insisted it was there. Digging it out of the message revealed to me a whole new way of messing up email formats.

A quick look at the message source showed that there really was an attachment with Content-Type of “application/pdf” which took up well over 90% of the message. The question was why Thunderbird didn’t show it to me.
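
For anyone who wants to do the same kind of digging without reading raw source by eye, here is a minimal sketch using Python's standard `email` package. It walks every MIME part and reports its content type, filename, and decoded size, regardless of how the parts are nested. The demo message is built in the code purely for illustration and is well-formed, so it is not the malformed message described here.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def list_parts(raw_bytes):
    """Return (content type, filename, decoded size) for each leaf MIME part."""
    msg = message_from_bytes(raw_bytes, policy=policy.default)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold other parts; only leaves carry data
        payload = part.get_payload(decode=True) or b""
        parts.append((part.get_content_type(), part.get_filename(), len(payload)))
    return parts


# Illustrative message: a short text body plus a stand-in "PDF" attachment.
demo = EmailMessage()
demo["Subject"] = "Songbook"
demo.set_content("The PDF is attached.")
demo.add_attachment(
    b"%PDF-1.4 placeholder bytes",
    maintype="application",
    subtype="pdf",
    filename="songbook.pdf",
)

parts = list_parts(demo.as_bytes())
for content_type, filename, size in parts:
    print(content_type, filename, size)
```

To run it against a saved message, read the file in binary mode and pass the bytes to `list_parts`.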

The email jungle

In researching tomorrow’s post on email preservation on Files That Last, I came to appreciate more thoroughly how messy email formats are. RFC 4155, which defines “the ‘default’ mbox database format” (their quotes around “default”) and application/mbox MIME type, tells us that “The mbox database format is not documented in an authoritative specification, but instead exists as a well-known output format that is anecdotally documented, or which is only authoritatively documented for a specific platform or tool.”

Some versions may have eight-bit character data with the character encoding not explicitly specified, and possibly varying from one file creator to another. The format of email addresses isn’t specified. A short page on qmail.org, referenced from RFC 4155, discusses some of the variants, including mboxo, mboxrd, mboxcl, and mboxcl2. The differences may appear minor, but they’re sufficient that a parser that assumes one of the variants can fail when it encounters the others.
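
To make the difference concrete, here is a small sketch of the "From " quoting rules as they are commonly described for two of the variants. In mboxo, a body line that starts with "From " gets a ">" prepended. In mboxrd, any line that starts with zero or more ">" characters followed by "From " gets one more ">", which makes the quoting reversible. The function names are mine, and this is a simplification: it ignores the Content-Length headers that the other two variants rely on.

```python
import re


def quote_mboxo(body):
    """mboxo: prefix '>' to body lines that start with 'From '."""
    return re.sub(r"^From ", ">From ", body, flags=re.MULTILINE)


def unquote_mboxo(body):
    """mboxo reader: strip one '>' from lines that start with '>From '."""
    return re.sub(r"^>From ", "From ", body, flags=re.MULTILINE)


def quote_mboxrd(body):
    """mboxrd: add one '>' to lines matching '>*From ' so quoting can be undone."""
    return re.sub(r"^(>*From )", r">\1", body, flags=re.MULTILINE)


def unquote_mboxrd(body):
    """mboxrd reader: strip one '>' from lines matching '>+From '."""
    return re.sub(r"^>(>*From )", r"\1", body, flags=re.MULTILINE)


# A body with one line that needs quoting and one that already looks quoted.
original = "From here on it gets messy.\n>From a quoted reply\n"

rd_round_trip = unquote_mboxrd(quote_mboxrd(original))  # matches original
o_round_trip = unquote_mboxo(quote_mboxo(original))     # loses a '>'
mismatch = unquote_mboxo(quote_mboxrd(original))        # leaves an extra '>'
```

The mboxrd round trip restores the text, the mboxo round trip silently alters the second line, and reading mboxrd data with mboxo rules leaves a stray ">" behind.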

Then there’s the encoding issue. Most of the world has settled on MIME by now, but older archives (and perhaps some recent ones) may contain messages encoded with uuencode, BinHex, or AppleSingle. The last two are found mostly with mail that was sent from Macintosh clients, but uuencode was once widely used, and poorly standardized.

An alternative email archiving format is the CERP XML schema. At a glance it appears to provide better structure than mbox, but it isn’t as widely supported.

Update: The FTL post is now available at “You HAD mail.”