JHOVE online hack week

The Open Preservation Foundation has scheduled an online hack week for JHOVE. The focus for this one will be on development. Another hack week is planned for September, focusing on documentation. JHOVE just keeps going and going, and this is a chance for volunteer Java developers to reduce its issue list.

PDF/A-4

It looks as if I’ll have a little input into the upcoming PDF/A-4 standardization process; earlier this month I got an email from the 3D PDF Consortium inviting me to participate, and I responded affirmatively. While waiting for whatever happens next, I should figure out what PDF/A-4 is all about.

ISO has a placeholder for it, where it’s also called “PDF/A-NEXT.” There’s some substantive information on PDFlib. What’s interesting right at the start is that it will build on PDF/A-2, not PDF/A-3. A lot of people in the library and archiving communities thought A-3 jumped the shark when it allowed any kind of attachment without limitation. It’s impossible to establish a document’s archival suitability if it has opaque content.

Files that Last: 50% off!

I’m participating in Smashwords’ “Read an Ebook Week Sale,” from March 3 to March 9, 2019. During that time, Files that Last will be available for 50% off! Don’t miss your chance to learn about “digital preservation for everygeek” at a low price.

Path traversal bugs in archive formats

Malware has shown up that takes advantage of a path traversal bug in the WinRAR archiving utility. The bug, which reportedly existed for 19 years, is fixed in the latest version. The problem stems from an old, buggy DLL that WinRAR used. It allowed an archive to contain a file that would be extracted to an absolute path rather than into the destination folder. In this case, the path was the system startup folder, so the next time the computer was rebooted, it would run the malware file.
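
To make the idea concrete, here is a minimal sketch of the kind of check an extraction tool can run before writing a file. It is not WinRAR's fix, just an illustration; the function name `is_safe_member` and the example paths are my own assumptions. It resolves where an archive entry would land and refuses anything that ends up outside the destination folder, which covers both absolute paths and "../" tricks.

```python
import os


def is_safe_member(dest_dir: str, member_name: str) -> bool:
    """Return True if extracting member_name under dest_dir stays inside dest_dir.

    Hypothetical helper for illustration; not taken from any archiving tool.
    """
    dest = os.path.realpath(dest_dir)
    # os.path.join discards dest when member_name is absolute, so an
    # absolute entry resolves to itself and fails the containment test.
    target = os.path.realpath(os.path.join(dest, member_name))
    try:
        return os.path.commonpath([dest, target]) == dest
    except ValueError:
        # Raised on Windows when the paths are on different drives.
        return False


if __name__ == "__main__":
    # Example entry names are made up for the demonstration.
    for name in ["docs/readme.txt", "../../etc/passwd", "/etc/passwd"]:
        verdict = "ok" if is_safe_member("/tmp/out", name) else "REJECT"
        print(f"{verdict:6} {name}")
```

One caveat: the check uses the path rules of the system it runs on, so a Windows-style absolute path is only recognized as absolute when the code runs on Windows.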

Nothing to add

Today’s XKCD is so relevant to this blog that all I can do is link to it without further comment.


.NORM Normal File Format

What part of “No Flash” doesn’t Microsoft understand?

If you disable Flash on Microsoft Edge, Microsoft ignores your setting — but only for Facebook’s domains. It sounds too conspiratorial to be true, but a number of generally reliable websites confirm it.

Bleeping Computer: “Microsoft’s Edge web browser comes with a hidden whitelist file designed to allow Facebook to circumvent the built-in click-to-play security policy to autorun Flash content without having to ask for user consent.”

ZDNet: “Microsoft’s Edge browser contains a secret whitelist that lets Facebook run Adobe Flash code behind users’ backs. The whitelist allows Facebook Flash content to bypass Edge security features such as the click-to-play policy that normally prevents websites from running Flash code without user approval beforehand.”

Preserving Google+ content

I recently got an email reminding me that my Google+ account will go away on April 2. My first reaction was a yawn. Google has made the service steadily less attractive over the years. I just checked my feed for the first time in months, and it consists entirely of posts by people I don’t follow, on topics I don’t care about. Posts from this blog and my writing blog get links automatically posted to Google+, but otherwise I haven’t posted in a long time.

One of my posts got two comments from people I know, so it’s not totally dead, but it’s close. Google made the service as unattractive as they could. Posts by strangers keep showing up. Comments appear and disappear as you’re trying to read them. But there was a time when Google+ was somewhat useful. You might have material there which you want to save. Fortunately, Google provides a way to do this.

A screen capture tip using Grab on the Mac

macOS provides a few different ways to do screen captures. My personal favorite is Grab, which is found in the Applications/Utilities folder. It lets me capture a selection, a window, or the whole screen without having to remember any magic key combinations. I keep it in the Dock for quick access.

Grab has one deficiency, though. It can save screenshots only as TIFF files. If Apple had to pick just one format, that’s hardly the most useful one. But there’s an easy workaround.

After you’ve got your screenshot, press Command-C or choose “Copy” from the Edit menu. Open the Preview application. Press Command-N or select “New from Clipboard” from the File menu. You now have the screenshot in Preview.

In Preview, press Command-S or choose “Save…” from the File menu. You’ll get a dialog to save the file, with a choice of formats: JPEG, JPEG2000, OpenEXR, PDF, PNG, or TIFF. Pick whichever one you like. If you’re going to put the image into a Web page, PNG is usually the best choice. Preview will remember your choice for next time. Then save the file.

If you prefer, you can do the equivalent in Photoshop, GIMP, or any other image-processing application, but Preview has the advantage of launching quickly and keeping the process simple.

That’s it. You can now use Grab to save screenshots in a Web-friendly format.

The police body camera data problem

The Washington Post reports that some police departments are dropping body camera programs because of the expense. I’ll admit that my first gut reaction on seeing the story was that it’s just an excuse. In some cases it probably is. But it’s a fact that while the cameras are cheap, storing and managing large amounts of video data isn’t. The question needs objective examination.

Canvas fingerprinting in Web pages

The array of sneaky tricks to get past Internet users’ veil of privacy is astonishing. At least it would be, if we weren’t all past the capacity for astonishment. One which has been around for years is Canvas fingerprinting. It lets servers narrow your profile down to a small number of clients. Combined with other measures, it can uniquely identify you.

How Canvas works

Canvas wasn’t designed to spy on you. It’s a way to draw graphics very efficiently in a browser. It supports animation and interaction. In order to get fast performance, it allows hardware acceleration and doesn’t mandate the exact set of pixels to be drawn. A script on the page can then read those pixels back using getImageData() or toDataURL() in the Canvas API and send them to the server.
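
What happens to the pixels after that is simple. As a rough sketch of the receiving side, assume the page has sent back the string that toDataURL() produced; the server only has to hash it to get a compact identifier. The function name `canvas_fingerprint`, the choice of SHA-256, and the two sample strings are my own assumptions for illustration, and the samples are placeholders rather than real image data.

```python
import hashlib


def canvas_fingerprint(data_url: str) -> str:
    """Reduce a canvas data URL reported by a client to a short hex identifier.

    Hypothetical helper: real trackers may hash differently or mix in
    other signals.
    """
    digest = hashlib.sha256(data_url.encode("utf-8")).hexdigest()
    return digest[:16]  # truncated only to keep the identifier short


if __name__ == "__main__":
    # Placeholder strings standing in for two clients whose hardware
    # rendered the same drawing with slightly different pixels.
    client_a = "data:image/png;base64,AAAA"
    client_b = "data:image/png;base64,AAAB"

    print(canvas_fingerprint(client_a))
    print(canvas_fingerprint(client_b))
    # The same client sending the same pixels always maps to the same value.
    print(canvas_fingerprint(client_a) == canvas_fingerprint(client_a))
```

Because the hash is deterministic, a client that renders the same pixels on every visit gets the same identifier each time, with no cookie involved.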