Tag Archives: Harvard


Secrets of the online Harvard libraries

Here’s a new video on viewing publicly available information in the Harvard Library’s Digital Collections, Harvard Geospatial Library (HGL), and Visual Information Access (VIA).

FITS website

Last spring, I attended a Hackathon at the University of Leeds, which resulted in my getting a SPRUCE Grant for a month’s work enhancing FITS, a tool which at the time was technically open source but which the Harvard Library treated a bit possessively. After I finished, it seemed for a while that nothing was happening with my work, but it was just a matter of being patient enough. Collaboration between Harvard and the Open Planets Foundation has resulted in a more genuinely open FITS, which now has its own website. There’s also a GitHub repository with five contributors, none of whom is me, since my work was on an earlier repository that was incorporated into this one.

It really makes me happy to see my work reach this kind of fruition, even if I’m so busy on other things now that I don’t have time to participate.

The FITS Blitz

Back in May, after an enjoyable trip to the University of Leeds, I worked for a month on improving the Harvard Library’s FITS tool for combining the results of several file format identification and validation tools. The results were well received and the Harvard Library incorporated some of my work in the main line of FITS. Still, there were a lot of loose ends left and more work to be done.

Things are picking up again with a “FITS Blitz” that’s starting this week. Paul Wheatley writes that “in partnership with Harvard and the Open Planets Foundation (with support from Creative Pragmatics), SPRUCE is supporting a two week project to get the technical infrastructure in place to make FITS genuinely maintainable by the community. ‘FITS Blitz’ will merge the existing code branches and establish a comprehensive testing setup so that further code developments only find their way in when there is confidence that other bits of functionality haven’t been damaged by the changes.”

I’ve moved on to other things, so I won’t be able to participate, but I wish them every success.

Who’s using FITS?

It would be helpful for me to have at least a partial list of institutions that are using Harvard’s FITS (File Information Tool Set). If you can help me build this list, could you reply here or contact me by other usual channels? Thanks.

JHOVE 1.7, finally!

After well over a year, a new version of JHOVE is finally available. Really, not very much has changed since 1.6 as far as the software itself goes. However, I’m leaving Harvard at the end of August and asked for and got custody of JHOVE, so this version marks its transition from a Harvard-supported project (which, in practice, it hasn’t been for a long time) to a separate open-source project. The JHOVE web pages are now hosted on SourceForge, and all support and discussion will go through SourceForge. The jhove-support and jhove-users mailing lists hosted by Harvard will shut down in the near future.

This doesn’t mean JHOVE is dead. I may actually have more opportunities to work on it than before, now that I’m going into independent consulting. I need to stay visible to the library and preservation world, and this is one way to do it.

Meanwhile, I’m looking for contract opportunities. Please take a look at my new business site or my LinkedIn profile.

JHOVE web pages moved

The web pages for JHOVE are now on SourceForge. They’ll remain on the Harvard site for some period of time but won’t be further updated.

There’s at least a chance this means there will be a release of JHOVE soon. Yes, I know, I’ve been promising that for a long time.

Correcting Harvard Library rumors

In spite of rumors that have shown up in the #hlth feed on Twitter, no one at the Harvard Library was laid off yesterday, let alone “everybody.” We were told, however, that there will be cutbacks.

We were told that we should all fill out “employee profiles” online to aid in determining what future career we’d have, if any, at Harvard. An official pronouncement quoted in Library Journal has denied that we will all have to “reapply” for our positions, but many of us find the distinction subtle even if it’s technically true.

Take a look at this post for a good summary.

Further update: Here’s a transcript of yesterday’s presentation at Harvard. There is one significant discrepancy between the transcript and what I and others recall: Helen Shenton did not say at the 9 AM meeting that the deadline for employee profiles was February 29. The deadline was initially earlier — mid-February, I think — and was changed to February 29 by the end of the meeting, following numerous expressions of concern from the audience. (She may have said February 29 at the later meetings.)

Closed access at Harvard

Sorry about the off-topic post, but this is the best channel I have for reaching the academic world.

Whatever Robert Frost may have said, something there is that really loves a wall. Specifically, fear does. The fear that looks askance at every foreign-looking person, that puts fortifications on our borders, that sees only the danger in contact from others.

Locked gate at Harvard Yard


A small, non-violent (with perhaps an exception or two) mob assailed Harvard Yard last Thursday night, and Harvard gave in to fear. The gates were shut or put under guard for the night, which may well have been necessary. They’ve remained that way ever since. To get into Harvard Yard, you must show an ID or have an invitation. Today employees received an email giving the weekday and weekend schedules for the gates, suggesting this won’t go away quickly.

This is inconvenient for Harvard people and more so for others who have reason to visit. The tours of Harvard Yard are on hiatus. If you have an appointment or a conference, your host has to provide a list of the people attending so they can be allowed in. Lamont Library contains a repository of government records which is open to the public without an ID — but you can’t get to Lamont.

I don’t know how long this will go on. When vague fears drive a policy and no risk is small enough to ignore, there’s no reason ever to stop.

HTML5 security

Yesterday, February 24, Ming Chow gave a talk to the ABCD security group at Harvard on HTML5 security. As far as I can tell he hasn’t made any of the content publicly available online, but here are some high points:

  • HTML5 has a lot of new features, giving it a bigger “attack surface.”
  • There’s no effective security for local and session storage, so writing sensitive information there is a bad idea.
  • The database feature raises all the standard concerns about injection of malicious SQL code into fields.
  • Application caches can be written by any website. It may be possible to spoof pages this way.
  • There is now a function, XDomainRequest, in JavaScript, which allows communication between different sites. The receiver of the request must specify Access-Control-Allow-Origin to indicate whose requests are allowed. Wild-carding this allows anyone at all to send data to a page, which may be dangerous. Implementers of a receiver should always verify the sender’s identity.
  • With the audio, video, and canvas tags, the codecs can be vulnerable. Opera has been hit with a heap buffer overflow exploit in HTML5.
  • The noscript tag is no longer supported. Users who try to make themselves safer by disabling JavaScript are more screwed than ever.
  • The problems are new, but the approach to safety is the same: common sense, input validation, being careful with unsecured connections, etc.
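
To make the Access-Control-Allow-Origin point concrete, here’s a minimal sketch of how a receiver might echo back only the origins it trusts instead of wild-carding. This is my own illustration, not something from the talk: the function name cors_headers and the origins in ALLOWED_ORIGINS are hypothetical placeholders, and a real server would hook this into whatever framework it uses.

```python
from typing import Dict, Optional

# Hypothetical allowlist of trusted origins; these values are placeholders
# for illustration and do not come from the talk.
ALLOWED_ORIGINS = frozenset({
    "https://app.example.org",
    "https://admin.example.org",
})


def cors_headers(request_origin: Optional[str]) -> Dict[str, str]:
    """Build CORS response headers from the request's Origin header value.

    A trusted origin is echoed back exactly; anything else gets no CORS
    headers at all, so the browser refuses the cross-site exchange.
    """
    if request_origin in ALLOWED_ORIGINS:
        return {
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response depends on the Origin header.
            "Vary": "Origin",
        }
    return {}
```

The choice here is to name each trusted origin rather than use "*". An exact-match check like this also avoids the usual mistakes with substring or prefix matching on origin strings.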

FITS user guide

There’s now a user guide online for Harvard University Libraries’ File Information Tool Set (FITS). FITS extracts technical metadata using several different tools, including JHOVE, ExifTool, NLNZ Metadata Extractor, DROID, FFIdent, and File Utility.

Does anyone reading this know if FFIdent is still alive somewhere on the Web? A web search for it turns up nothing useful, and the number 1 hit is the FITS site itself.