File corruption and political corruption

When people who don’t understand file formats manipulate files in order to cover their tracks, they generally fail miserably. Slate magazine gives an entertaining case in point from the Trump scandals. The article says:

There are two types of people in this world: those who know how to convert PDFs into Word documents and those who are indicted for money laundering. Former Trump campaign chairman Paul Manafort is the second kind of person.

The PDF Association chimes in with additional technical details.

According to the indictment:

  • In October 2016, Manafort emailed Rick Gates a PDF file with a profit and loss statement for Davis Manafort Inc. It showed a loss of over $600,000.
  • Gates converted the PDF to a Word document and emailed the latter to Manafort. The quoted passage doesn’t say which version of the Word format was used.
  • Manafort altered the Word document to show a profit of over $3.5 million and sent it to Gates.
  • Gates converted the Word file to PDF and sent it (with the falsified information) to a lender.

It was the email records, not the clumsiness of the conversion, that gave the scheme away to prosecutors. But Manafort apparently couldn’t figure out how to alter a PDF file himself. As the PDF Association notes, going back and forth between Word and PDF is bound to alter the appearance of the file. Whether it would change it enough to make the tampering obvious is another question.

It wasn’t even necessary to do the back-and-forth; lots of tools are available to change the text in a PDF. But then, even converting a Word file to PDF was apparently beyond Manafort’s technical skills.

Hiding PDF alterations isn’t easy

The problems run deeper. A PDF file normally contains metadata indicating when it was created and last modified. It isn’t too hard to alter these, but given the clumsiness of the process, it’s unlikely that either Manafort or Gates knew how. The date on the PDF would have been after the supposed release date of the P&L statement, and an expert witness should be able to catch this in a file analysis.
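To make the timestamp point concrete, here is a minimal sketch of the kind of check an analyst might start with. It assumes the dates sit in an uncompressed document information dictionary as literal strings such as `(D:20161021143000-04'00')`. Many real files keep them in compressed object streams or in XMP metadata instead, and then a proper PDF library is needed. The function names and the sample bytes are made up for illustration.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches literal date strings such as /ModDate (D:20161021143000-04'00').
# Assumption: full 14-digit timestamps; shorter forms are not handled here.
DATE_RE = re.compile(
    rb"/(CreationDate|ModDate)\s*\(D:(\d{14})([+\-Z])?(\d{2})?'?(\d{2})?'?\)"
)

def pdf_dates(data: bytes) -> dict:
    """Return the CreationDate and ModDate found in raw PDF bytes.

    If a key appears more than once, the last occurrence wins.
    Dates without a time zone are treated as UTC (an assumption).
    """
    found = {}
    for key, stamp, sign, hh, mm in DATE_RE.findall(data):
        when = datetime.strptime(stamp.decode("ascii"), "%Y%m%d%H%M%S")
        offset = timedelta(hours=int(hh or 0), minutes=int(mm or 0))
        if sign == b"-":
            offset = -offset
        found[key.decode("ascii")] = when.replace(tzinfo=timezone(offset))
    return found

def created_after(data: bytes, claimed: datetime) -> bool:
    """True if the file's CreationDate is later than the claimed date."""
    created = pdf_dates(data).get("CreationDate")
    return created is not None and created > claimed

# Invented sample values, not taken from any real document.
sample = b"<< /CreationDate (D:20161021143000-04'00') /ModDate (D:20161021143000-04'00') >>"
claimed_date = datetime(2016, 1, 31, tzinfo=timezone.utc)
print(created_after(sample, claimed_date))
```

A mismatch like this proves nothing by itself, since re-exporting a legitimate document also resets the dates, but it is a reason to look closer.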

There’s a long tradition of clumsy alteration and redaction of PDF documents. A favorite trick is to superimpose a black square over the text to be hidden. The square can be removed as easily as it was added, and the “hidden” material is still there. Even dumber is changing the color of the text to match the background. To read it, you don’t even need any file analysis tools; just select the “invisible” text with the mouse.
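Here is a rough illustration of why the black-square trick fails. The page content below is a hand-written, hypothetical content stream: it shows a line of text with the `Tj` operator and then paints a filled black rectangle over the same spot. Real content streams are usually compressed, and fonts may use custom encodings, so this regular expression is only a sketch and not a general text extractor.

```python
import re

# Hypothetical, uncompressed page content: draw text, then cover it
# with a filled black rectangle ("0 g" sets black fill, "re f" fills a box).
CONTENT = b"""BT /F1 12 Tf 72 700 Td (Net loss: 600,000) Tj ET
0 g 70 695 200 16 re f
"""

# A literal string operand followed by the Tj (show text) operator.
SHOW_TEXT_RE = re.compile(rb"\(((?:[^()\\]|\\.)*)\)\s*Tj")

def shown_text(stream: bytes) -> list:
    """Return every literal string passed to Tj, covered or not."""
    return [m.decode("latin-1") for m in SHOW_TEXT_RE.findall(stream)]

print(shown_text(CONTENT))
```

The rectangle is just one more drawing instruction after the text; nothing about it removes the string from the file.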

Even “proper” editing of PDF documents doesn’t necessarily delete the information. Changes to a PDF file may consist of appending to the file to replace existing objects, without deleting the earlier version. Changing a text object in place to make it longer requires pushing everything after it down in the file, which also invalidates the byte offsets recorded in the file’s cross-reference table. Appending avoids that work. The PDF 1.7 spec says:

Applications may allow users to modify PDF documents. Users should not have to wait for the entire file — which can contain hundreds of pages or more — to be rewritten each time modifications to the document are saved. PDF allows modifications to be appended to a file, leaving the original data intact.

If the software modifies the file this way, the original information, such as the $600,000 loss, is still in the file, and software tools can not only find it but pinpoint the alteration.
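A simple way to look for appended updates is to count the `%%EOF` markers, since each saved revision ends with one. The sketch below treats every prefix of the file that ends at an earlier marker as a candidate earlier version. The helper names are my own, and the sample bytes are a toy stand-in, not a valid PDF. Some files, such as linearized ones, can contain more than one marker without having been edited, so a count above one is a hint rather than proof.

```python
EOF_MARKER = b"%%EOF"

def revision_ends(data: bytes) -> list:
    """Byte offsets just past each %%EOF marker in the file."""
    ends = []
    pos = data.find(EOF_MARKER)
    while pos != -1:
        ends.append(pos + len(EOF_MARKER))
        pos = data.find(EOF_MARKER, pos + 1)
    return ends

def earlier_revisions(data: bytes) -> list:
    """Each earlier version is a prefix of the file ending at a marker."""
    return [data[:end] for end in revision_ends(data)[:-1]]

# Toy stand-in for a file saved once and then updated by appending.
original = b"%PDF-1.7\n1 0 obj (loss) endobj\ntrailer\n%%EOF\n"
updated = original + b"1 0 obj (profit) endobj\ntrailer\n%%EOF\n"

for old in earlier_revisions(updated):
    print(len(old), b"(loss)" in old)
```

Saving each prefix to its own file and opening it in a viewer is often enough to see what the document said before the edit.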

Ensuring PDF integrity

Normally, though, investigators don’t look at the innards of a file unless they have reason for suspicion. The lender accepted the modified statement, apparently without question, even though its internal timestamps may have been a giveaway.

It would have been better if Manafort’s accountant had created a digitally signed PDF of the P&L statement. That would have made it impossible to alter without invalidating the signature. I used digital signing the last time I bought a home; it shouldn’t be out of the reach of people who are dealing in millions of dollars.
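The tamper-evidence part of a signature rests on a cryptographic digest: change one byte and the digest no longer matches. The sketch below shows only that half, using SHA-256 from Python’s standard library. A real PDF signature goes further: the signer’s private key signs the digest, and the signature covers specific byte ranges of the file, so this is an illustration of the principle and not a signing implementation. The function names and sample bytes are invented.

```python
import hashlib

def digest(data: bytes) -> str:
    """SHA-256 digest of the file contents, as hex."""
    return hashlib.sha256(data).hexdigest()

def unchanged(data: bytes, recorded_digest: str) -> bool:
    """True if the data still matches the digest recorded earlier."""
    return digest(data) == recorded_digest

# Invented stand-ins for the statement before and after editing.
statement = b"Net income: -600,000"
recorded = digest(statement)

altered = b"Net income: 3,500,000"
print(unchanged(statement, recorded), unchanged(altered, recorded))
```

A bare digest only helps if the recorded value comes from a trusted source. Signing the digest is what lets the recipient check it against the signer’s public key without having to trust whoever forwarded the file.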

When you can’t trust the integrity of people, sometimes you can still trust the integrity of a file.
