Data Transfer Project: New models for interoperability

In spite of improved file standardization, data interoperability is often a challenge. Say you’ve got a collection of pictures on Photobucket and want to move them to a different site. You’ve got a lot of manual work ahead. It would be great if there were a tool to do it all for you. The Data Transfer Project aims to make that possible, and some big names are behind it: Facebook, Google, Microsoft, and Twitter. The basic approach is straightforward:

The DTP is powered by an ecosystem of adapters (Adapters) that convert a range of proprietary formats into a small number of canonical formats (Data Models) useful for transferring data. This allows data transfer between any two providers using the provider’s existing authorization mechanism, and allows each provider to maintain control over the security of their service.

The domain name datatransferproject.dev may have caught your attention. What’s the .dev TLD? It turns out to be one that Google has reserved exclusively for its own use. I don’t know what Google expects to gain from using an obscure and dubious-looking TLD.

Data models and adapters

The heart of the project is a body of open-source code on GitHub. The DTP relies as much as possible on existing standards, using OAuth for authorization and REST for its APIs.

For each type of content that can be transferred, there’s a Data Model, which defines a canonical representation of the data along with the metadata needed to import it. A Data Adapter translates a provider’s APIs into Data Models; each provider should supply both an import adapter and an export adapter.
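
To make that division of labor concrete, here is a minimal sketch of how the two sides of an adapter could fit together. The interface and method names are illustrative assumptions for this post, not the actual classes in the DTP codebase.

    // A hypothetical sketch of the adapter arrangement described above;
    // the names are illustrative, not the real DTP interfaces.
    import java.util.List;

    // Marker for any canonical Data Model (photos, contacts, calendars, ...).
    interface DataModel {}

    // The export side: pull data out of a provider's API into Data Models.
    interface Exporter<T extends DataModel> {
        // The auth token would come from the provider's existing OAuth flow.
        List<T> export(String authToken);
    }

    // The import side: push Data Models into another provider through its API.
    interface Importer<T extends DataModel> {
        void importItems(String authToken, List<T> items);
    }

    // A transfer is then just: export from one provider, import into another.
    class Transfer {
        static <T extends DataModel> void run(Exporter<T> source, String sourceToken,
                                              Importer<T> destination, String destToken) {
            destination.importItems(destToken, source.export(sourceToken));
        }
    }

Each provider would implement something like these interfaces for the content types it supports, authorized by its own OAuth credentials on each side of the transfer.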

The package org.datatransferproject.types.transfer.models gives an idea of the data models being pursued. They include:

  • CalendarModel
  • CalendarAttendeeModel
  • CalendarEventModel
  • ContactsModelWrapper
  • MailContainerModel
  • MailMessageModel
  • PhotoAlbum
  • PhotoModel
  • TaskModel
  • TaskListModel
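
As a rough idea of what one of these models carries, here is a simplified, hypothetical sketch of the photo-related models. The field names are plausible guesses for illustration only, not the actual class definitions in that package.

    // Simplified, hypothetical versions of the photo models; the fields
    // are guesses for illustration, not the real classes in the DTP package.
    import java.util.List;

    // An album groups photos and carries the metadata needed to recreate it.
    record PhotoAlbum(String id, String name, String description) {}

    // One photo: where to fetch the bytes and how to file it on the new site.
    record PhotoModel(String title,
                      String fetchableUrl,   // URL the importer can download from
                      String description,
                      String mediaType,      // e.g. "image/jpeg"
                      String albumId) {}     // ties the photo to its album

    // What an export of a photo collection would hand to an importer.
    record PhotosContainer(List<PhotoAlbum> albums, List<PhotoModel> photos) {}

An export adapter would fill structures like these from the source site’s API; an import adapter would read them and recreate the albums and photos on the destination.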

Reciprocity

There’s an obvious problem here. Providers don’t like customers to leave, so they have an incentive to provide an import adapter but not an export adapter. The DTP recognizes this, but finding a solution isn’t easy. Providers are encouraged to take a “data portability pledge,” but living up to it is another matter. An independent agency that certified providers’ portability could be a big help. Even then, when people are choosing a provider to hold their information, they usually aren’t thinking hard about the eventuality of leaving.

The broader the participation, the better the chances of success. Apple’s absence from the consortium is conspicuous, and many smaller providers will have to join before DTP-based interoperability becomes the norm.

Offering users a backup option could be one reason to support export adapters. For backup purposes, though, a provider-specific format is more useful: any conversion to a generic format is likely to lose information along the way, so backups restored from it would be imperfect.

Data models as open standards

The best chance of success will come if the data models take on the status of open standards. If that happened, there would be a market for software that created collections directly in a data model, without exporting them from any provider. People could then maintain their collections themselves and move them to any site that supported an import adapter.

This may be too utopian. People generally place a greater value on convenience than on control, and working directly and exclusively with a website is unquestionably easier.

I expect that sites offering DTP export capability will do the bare minimum and call themselves compliant. There isn’t much incentive for them to do more.

Perhaps the project will succeed anyway. I hope it does.
