- Gary McGath, Freelance Technical Writer
Identifying files by programming language
Most of today’s programming languages look vaguely similar. Their syntax is derived from C, with similar ways of expressing assignments, arithmetic, conditionals, nested expressions, and groups of statements. If the files have their original extension and it’s accurate, format identification software should be able to classify them correctly.
The software should do some basic checks to make sure it wasn’t handed a binary file with a false extension, which could be dangerous. A code file should be a text file, regardless of the language. (This isn’t strictly true, but non-text languages like Piet and Velato are just obscure for the sake of obscurity.) The UK National Archives recognizes XML and JSON (which is a subset of JavaScript) but doesn’t talk about programming languages as file formats. ExifTool identifies lots of formats but makes no attempt to discern programming languages.
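To make the idea of a basic check concrete, here is a minimal Python sketch of classifying a file by extension and then confirming that it looks like text. Everything in it is an assumption for illustration: the `EXTENSION_LANGUAGES` table, the function names, and the heuristic itself (no NUL bytes, and the sample decodes as UTF-8) are hypothetical and not taken from any existing identification tool.

```python
import codecs
from pathlib import Path
from typing import Optional

# Hypothetical extension table, for illustration only.
# A real tool would need a far larger and more careful mapping.
EXTENSION_LANGUAGES = {
    ".c": "C",
    ".java": "Java",
    ".js": "JavaScript",
    ".py": "Python",
}


def looks_like_text(sample: bytes) -> bool:
    """Heuristic check: the sample has no NUL bytes and is valid UTF-8.

    Assumption: source files are UTF-8 (or plain ASCII). UTF-16 text
    contains NUL bytes and would be rejected by this sketch.
    """
    if b"\x00" in sample:
        return False
    # An incremental decoder with final=False tolerates a multi-byte
    # character that was cut off at the end of the sample.
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def identify_by_extension(path: str, sample_size: int = 4096) -> Optional[str]:
    """Return a language name only if the extension is known and the
    start of the file looks like text; otherwise return None."""
    language = EXTENSION_LANGUAGES.get(Path(path).suffix.lower())
    if language is None:
        return None
    with open(path, "rb") as f:
        sample = f.read(sample_size)
    return language if looks_like_text(sample) else None
```

Calling `identify_by_extension("hello.c")` on an ordinary C source file would return `"C"`, while a binary file renamed to `hello.c` would yield `None`. The check only guards against a false extension; it does nothing to confirm that the text is actually written in the language the extension claims.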
Posted in commentary
Tagged archiving, file identification, software