The Library of Congress has reorganized its site on file format sustainability and given it a new URL. (The old one redirects there.) A blog entry discusses the change. Relationships among formats are a big part of the site. It’s significant, for instance, that the MP3 encoding and the de facto MP3 file format get separate entries.
My reactions are mixed. When you click “Format Descriptions” on the main page, you get a page titled “Format Description Categories.” The nesting description at the top says you’re in “Format Descriptions as XML.” Eight categories are listed, and two formats plus “All xxx format descriptions” are listed under each category. It isn’t obvious why those two formats get special prominence, or what the page has to do with XML.
My venture into the Techno-Liberty blog didn’t work so well. In fact, I’m getting more views on this blog, in spite of not having posted in months, than I got on my best days on the other blog. So … I’m back.
JHOVE is still doing well too, thanks to excellent work by Carl Wilson and others at the Open Preservation Foundation. There will be an online hack day for JHOVE on April 27. The aim is to find ways to improve JHOVE by improving error reporting, collecting example files, and documenting the preservation impact of JHOVE validation issues. (I think that last one means “Why does McGath’s PDF module suck?” :)
The time listed is 8 AM-8 PM. I asked what time zone that is, and was told it means any and all, from New Zealand the long way around to Hawaii.
Last time I said I’d drop in and didn’t really manage to. This time I won’t make promises, but I’ll try to be around in some form. If nothing else, people can ask me questions about JHOVE in the comments.
You may have noticed this blog has been less active for a while. It has been several years since I was actively involved in digital preservation, apart from a PNG module for JHOVE. File formats are still a special love of mine, but I’m moving on to a new blog, reflecting more urgent concerns. This blog is called Techno-Liberty. It’s about the tools for staying free through open communication, privacy, and new technologies.
This blog will stay around as long as WordPress doesn’t purge it, but new posts may be rare. I want to put as much effort as I can into making Techno-Liberty an interesting blog with a steady stream of substantial content. I hope many of you will find it worth following.
Not every report of malware in an image file is spurious. A report of malware smuggled through a PNG file looks plausible to me. It claims that for two years, criminals got malware undetected onto respectable sites with this technique. Ironically, the ads included ones for a so-called “Browser Defence.”
Unlike Check Point’s “Imagegate,” this report doesn’t claim the image alone can do anything, and it describes the technique in considerable detail. Check Point said it would give specifics about “Imagegate,” like what format is affected, “only after the remediation of the vulnerability in the major affected websites.” It’s still waiting, apparently.
The PNG exploit is impressively sneaky. A script which doesn’t trigger alarms checks the host browser’s defenses. If it finds a vulnerable target, it loads a PNG file whose alpha channel encodes the malware script, then decodes the script and runs it. The actual malware takes advantage of — wouldn’t you know it? — Flash vulnerabilities. The user doesn’t have to do anything except view the page to be victimized.
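To make the alpha-channel idea concrete, here is a minimal sketch of how bytes can ride along in pixel transparency values. The scheme is an assumption for illustration (one payload byte per pixel, stored directly as the alpha value). It is not taken from the report, and the function name `extract_alpha_bytes` is hypothetical. The payload here is a harmless text string, and the pixels are plain tuples rather than a decoded PNG.

```python
def extract_alpha_bytes(pixels, length):
    """Collect the alpha value of the first `length` pixels as bytes.

    `pixels` is a list of (red, green, blue, alpha) tuples with values 0-255.
    Assumed scheme: each alpha value is one byte of the hidden message.
    """
    return bytes(alpha for (_red, _green, _blue, alpha) in pixels[:length])


# Build a tiny fake "image": the colour is constant, only alpha varies.
message = b"hello"
pixels = [(200, 30, 30, byte) for byte in message]
pixels += [(200, 30, 30, 255)] * 3  # ordinary opaque pixels after the message

recovered = extract_alpha_bytes(pixels, len(message))
print(recovered)
```

The colour channels never change, so a scanner that only examines what the image looks like has nothing to notice. The data only becomes a script when separate code reads the alpha values back out.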
Attacks like this are why ad blockers have become so popular.
Websites that allow third-party posting should disallow or filter SVG content. WordPress disallows SVG uploads by default.
SVG is a designed-in danger in HTML5.
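Filtering SVG content could start with something like the following sketch, which flags the most obvious kinds of active content in an uploaded file. The function name `find_active_content` and the list of blocked elements are my own assumptions, not a complete or authoritative policy. It also uses the standard library's XML parser for brevity, which is not hardened against hostile XML, so a real site would want a hardened parser and a maintained sanitizer.

```python
import xml.etree.ElementTree as ET

# Assumed, non-exhaustive list of elements that can carry active content.
BLOCKED_ELEMENTS = {"script", "foreignobject", "iframe", "embed", "object"}


def local_name(qualified):
    """Strip an ElementTree '{namespace}' prefix and lowercase the name."""
    return qualified.rsplit("}", 1)[-1].lower()


def find_active_content(svg_text):
    """Return a list of human-readable findings; empty means none were seen."""
    findings = []
    root = ET.fromstring(svg_text)
    for elem in root.iter():
        name = local_name(elem.tag)
        if name in BLOCKED_ELEMENTS:
            findings.append(f"<{name}> element")
        for attr, value in elem.attrib.items():
            attr_name = local_name(attr)
            if attr_name.startswith("on"):
                # Event handlers such as onload or onclick run script.
                findings.append(f"{attr_name} handler on <{name}>")
            elif attr_name == "href" and value.strip().lower().startswith("javascript:"):
                findings.append(f"javascript: link on <{name}>")
    return findings


suspicious = (
    '<svg xmlns="http://www.w3.org/2000/svg" onload="run()">'
    "<script>run()</script></svg>"
)
harmless = '<svg xmlns="http://www.w3.org/2000/svg"><circle r="1"/></svg>'

print(find_active_content(suspicious))
print(find_active_content(harmless))
```

An empty result only means this particular check found nothing. Rejecting any upload with findings is the "disallow" option, while stripping the flagged nodes would be the "filter" option.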
Search for “imagegate,” and you’ll find lots of articles claiming there’s a malware risk in JPEG files. Look more closely, and you’ll notice they don’t provide any support for the claim. They all take an article from Check Point as their source, but there are two little problems with that: (1) The article doesn’t blame JPEG files, and (2) as I noted in my last post, it’s inept reporting.
Looking very closely at the video which accompanies the Check Point article, I see that it shows a file called “payload.jpg” being uploaded. This must be what all these sites are going by. You have to look really closely to see the blurry file name coming up, and I give these sites credit for examining it more closely than I did.
If this was Check Point’s way of tipping people off that the weakness is in JPEG, it’s a strange way to do it. Did they think that ordinary users would catch it, but malware authors would be tricked by the article’s reference to non-image formats like JS and HTA as “images”?
None of the articles I’ve seen question why Check Point tipped them off in this way or note the inconsistencies in the article and the video. None of them ask why Check Point refuses to give any information until “the major affected websites” fix the problem, when a format vulnerability impacts any software that reads the files.
Doesn’t anyone know how to do journalism any more?
Update: Facebook says that “Imagegate” is bull.
Lately this blog hasn’t been showing up on Google. It’s unfortunately necessary to convince Google I’m real, so I’ve added a confirmation meta tag and linked to this blog from a Google Page. As an extra advantage, you’ll be able to read my posts from Google, if you’re so inclined.
This is also a test post to see if that’s working. I’ll post about something more interesting soon.
Reporting carries responsibility. When you tell the public about a risk, you need to tell them what the risk is, not just scare them. An article from Check Point Software Technologies, titled “ImageGate,” shows how bad even tech sites can get at clickbait reporting. According to Wikipedia, Check Point is a business with thousands of employees, not a hole-in-the-wall IT company that hires ghostwriters to write filler.
The article claims:
the attackers have built a new capability to embed malicious code into an image file and successfully upload it to the social media website. The attackers exploit a misconfiguration on the social media infrastructure to deliberately force their victims to download the image file. This results in infection of the users’ device as soon as the end-user clicks on the downloaded file.
Malicious SVG images sent over Facebook Messenger are being used to deliver Locky ransomware.
An SVG file can contain a
HTML 5.1 is now a W3C proposed recommendation, and the comment period has closed. If no major issues have turned up, it may become a recommendation soon, superseding HTML 5.0.
Browsers already support a large part of what it includes, so a discussion of its “new” features will cover ones that people already thought were a part of HTML5. The implementations of HTML are usually ahead of the official documents, with heavy reliance on working drafts in spite of all the disclaimers. Things like the picture element are already familiar, even though they aren’t in the 5.0 specification.