The presentation suggested this would work for basic content protection, but not for every scheme. For example, some documents are protected by Adobe’s “LiveCycle Rights Management.” That protection would need to be removed or the PDF will expire after a certain number of days, so a person might want to follow this removal guide to spoof the server and remove content protection.
Watermark excision: Briss was suggested as a way to crop pages to remove watermarks, but “Briss performs what is called a non-destructive crop,” meaning the watermark is resized but not deleted, so a forensic expert could still retrieve it. Therefore the next step is to use PDF Creator to print the page with new margins that do not include the resized watermark. Slide 29 describes other potential watermarks and how to neutralize them.
Metadata spoofing: Metadata for a PDF might include the author’s name, a timestamp when the document was created, time zone and “the ever-mysterious UUID” (Universally Unique Identifier). Harding said, “Adobe has a built-in metadata scrub tool, but don’t you trust it!” Adobe’s checkbox to “discard document information and metadata” was referred to as a “lie.”
After running Adobe’s tool, the next step is to open the PDF in some flavor of hex editor to “modify the timestamps and the Document/InstanceID UUID fields.” Harding delved into additional complexity, but “if the aim isn’t spoofing but simple removal, automated tools” such as Metadata Anonymization Toolkit “are readily available.”
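For readers who would rather check than trust a scrub tool, the hex-editor inspection can be roughly approximated with a short script. The following is a minimal sketch, not Harding's method: the function name `find_residual_metadata` and the list of patterns are assumptions made for illustration. It only looks for a few common fields (author, timestamps, and the XMP DocumentID and InstanceID elements) stored as plain, uncompressed text, so a clean result does not prove a file is free of metadata.

```python
import re
import sys

# Fields that often remain readable in the raw bytes of a PDF.
# Assumption: values are stored uncompressed as simple literal strings or
# XMP elements; compressed streams, hex strings, escaped parentheses and
# XMP attribute syntax are not covered by this scan.
PATTERNS = {
    "Author": re.compile(rb"/Author\s*\(([^)]*)\)"),
    "CreationDate": re.compile(rb"/CreationDate\s*\(([^)]*)\)"),
    "ModDate": re.compile(rb"/ModDate\s*\(([^)]*)\)"),
    "DocumentID": re.compile(rb"<xmpMM:DocumentID>([^<]*)</xmpMM:DocumentID>"),
    "InstanceID": re.compile(rb"<xmpMM:InstanceID>([^<]*)</xmpMM:InstanceID>"),
}


def find_residual_metadata(data: bytes) -> dict[str, list[str]]:
    """Return every matching metadata value found in the raw PDF bytes."""
    found = {}
    for name, pattern in PATTERNS.items():
        values = [match.decode("latin-1") for match in pattern.findall(data)]
        if values:
            found[name] = values
    return found


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python scan_pdf.py document.pdf
    with open(sys.argv[1], "rb") as handle:
        report = find_residual_metadata(handle.read())
    for field, values in report.items():
        for value in values:
            print(f"{field}: {value}")
```

Running the script against a PDF before and after a scrub gives a quick view of which of these fields, if any, are still sitting in the file as readable text.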
Time to share the knowledge!
Whether or not you agree with Harding about jumping paywalls to score academic documents, his presentation seemed fascinating. Harding said, “Stay safe. Eat the publishers before they eat us. Remember...we are at war.”