Lose yourself for days following links in Charles W. Bailey’s Digital Curation and Preservation Bibliography.
Thanks to Jill Hurst-Wahl for the link.
The JHOVE2 team has announced a beta release:
This beta code release supports all the major technical objectives of the project, including a more sophisticated, modular architecture; signature-based file identification; policy-based assessment of objects; recursive characterization of objects comprising aggregate files and files arbitrarily nested in containers; and extensive configuration and reporting options. The release also continues to fill out the roster of supported formats, with modules for ICC color profiles, SGML, Shapefile, TIFF, UTF-8, WAVE, and XML.
The source code page provides the source as a Mercurial repository, or as a single download. The gzip download expands into a file called main-14e8a6102f63, and it isn’t at all obvious what to do with it. Making it executable with chmod and running it doesn’t work. I’ve asked what this is supposed to be; I’ll update this post when I get a response.
Update: That’s a tarball. Adding the .tar extension and using tar -xvf works nicely.
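For anyone who runs into a similar extension-less download, here is a minimal Python sketch of the same fix. It uses only the standard tarfile module, which identifies a tarball by its contents rather than its name, so no renaming is needed. The function name extract_if_tar and the destination directory are my own inventions for illustration; nothing here comes from the JHOVE2 project.

```python
import tarfile
from pathlib import Path


def extract_if_tar(archive: Path, dest: Path) -> list[str]:
    """Extract `archive` into `dest` if it is a tar file, whatever its name.

    Returns the member names, or an empty list if the file is not a tarball.
    (Hypothetical helper; only use it on archives you trust.)
    """
    # is_tarfile() inspects the file's contents, not its extension.
    if not tarfile.is_tarfile(archive):
        return []
    dest.mkdir(parents=True, exist_ok=True)
    # Mode "r:*" lets tarfile detect plain, gzip, bzip2, or xz compression.
    with tarfile.open(archive, mode="r:*") as tar:
        names = tar.getnames()
        tar.extractall(dest)
    return names


if __name__ == "__main__":
    # The file name is the one from the post; the output directory is arbitrary.
    source = Path("main-14e8a6102f63")
    if source.exists():
        for name in extract_if_tar(source, Path("jhove2-src")):
            print(name)
```

Checking the contents first means the same call works whether or not the file was already gunzipped, and it returns an empty list instead of failing on something that isn’t a tarball at all.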
PDF/A-2, according to a news item from Luratech, has been finalized and will be published as a standard in early 2011. (But see the comments.) Some more information (PDF) is available from the PDF/A Competence Center. PDF/A-1, which is based on PDF 1.4, will remain a valid standard. PDF/A-2 is based on ISO 32000-1, aka PDF 1.7.
It’s a basic premise of the digital preservation community that preservation will require ongoing effort over the years. Let an archive lie neglected for twenty or thirty years, and you might as well throw it away. No one will know how to plug in that piece of hardware. If they do, it’ll have stopped working. If it still works, its files will be in some long-forgotten format.
The trouble is, this is an untenable requirement over the long run. Institutions disappear. Wars happen. Governments are replaced. Budgets get cut. Projects get dropped. Organizational interests change. The contents of an archive may be deemed heretical or politically inconvenient. The expectation that over a period of centuries, institutions will actively preserve any given archive is a shaky one.
Information from past centuries has survived not by active maintenance, but by luck and durability. Much of the oldest information we have was carved into stone walls and tablets. It lay forgotten for centuries, till someone went digging for it. There were issues with the data format, to be sure; people worked for decades to figure out hieroglyphics and cuneiform, and no one’s cracked Linear A yet. But at least we have the data.
Preservation of digital data over a comparable time span requires storage with similar longevity. This is a very difficult problem. If it’s hard to figure out writing from three thousand years ago, how will people three thousand years from now make any sense of a 21st century storage device? But we have advantages. Global communication means that information doesn’t stay hidden in one corner of the world, where it can be wiped out. Today’s major languages aren’t likely to be totally forgotten. As long as enough information is passed down through each generation to allow deciphering of our stone tablets, people in future centuries will be able to extract their information.
What we don’t have is the tablets. Our best digital media are intended to last for decades, not centuries. Archivists should be looking into technologies that can really last, that will be standardized so that the knowledge of how to read them stands a good chance of surviving.
My RSS feed for the Ten Thousand Year Blog went invalid, so I went searching and found it had moved to Blogspot. I really don’t know why anyone would voluntarily switch to Blogspot, but it’s David’s choice, and I suspect there’s a fair amount of overlap in the readership of our blogs (and if you’ve never read it, I suggest giving it a try), so spreading the word about the change of address is a reasonable thing to do.
Percy Willett has announced:
The JHOVE2 project team is holding a full day tutorial on the use of JHOVE2 on September 19, 2010, in conjunction with the iPRES 2010 conference in Vienna, Austria.
The main topics covered during the tutorial will be:
- The role of characterization in digital curation and preservation workflows.
- An overview of the JHOVE2 project: requirements, methodology, and deliverables.
- Demonstration of the JHOVE2 application.
- Architectural review of the JHOVE2 framework and Java APIs.
- Integration of JHOVE2 technology into existing or planned systems, services, and workflows.
- Third-party development of conformant JHOVE2 modules.
- Building and sustaining the JHOVE2 user community.
This tutorial is an updated and expanded version of the workshop presented at iPRES 2009 in San Francisco. This tutorial will closely follow the production release of JHOVE2 and will incorporate significant new material arising from the second year of project work.
The targeted audience for the tutorial includes digital curation, preservation, and repository managers, analysts, tool users and developers, and other practitioners and technologists whose work is dependent on an understanding of the format and pertinent characteristics of digital assets.
For more information on JHOVE2, see the project wiki at: http://jhove2.org
For more information on iPRES 2010, and to register for the workshop and conference, see the conference website: http://www.ifs.tuwien.ac.at/dp/ipres2010/
Seeing Standards, the result of a project supported by the Indiana University Libraries, provides a visual arrangement of metadata standards used with cultural heritage work. There are lots of relevant standards! Jenn Riley, Metadata Librarian in the Indiana University Digital Library Program, developed the content. Design work was performed by Devin Becker of the Indiana University School of Library and Information Science.
There is an online poll for letting the developers of JHOVE2 know what plans you have for it. It takes just a couple of minutes to fill out and doesn’t even require JavaScript.
LOC irony
The Library of Congress Digital Preservation Newsletter (latest issue, subscription page) has some very nice content, but it’s ironic that the newsletter is delivered with the nondescript file name of 201101.pdf and that (if JHOVE is right) it doesn’t conform to PDF/A. A PDF/A document can’t have external links, so its lack is excusable; it’s the meaningless file name that actually bugs me more from a preservation standpoint. I can’t find an editorial contact address on the newsletter to mention this to.