PDF documents remain one of the most common malicious document delivery vectors in phishing campaigns and targeted attacks. Modern PDF features (embedded JavaScript, action triggers, URI launches, embedded files, form actions) give a malicious document many attack surfaces to exploit. The Sherlock PDF Editor surfaces these embedded elements for forensic analysis without executing the document. This guide covers what suspect-PDF forensic analysis reveals, common malicious PDF patterns and the analysis workflow for incident response investigation.
Why PDF remains a preferred attack delivery format
PDF documents have characteristics that make them attractive to attackers:
Universal recipient trust. Users open PDF documents from email senders, websites plus messaging platforms without the suspicion that accompanies executable file types. Email gateway warnings rarely flag PDF attachments specifically.
Rich feature surface. Modern PDF includes JavaScript execution, embedded files, action triggers, form actions, URI launches plus document signing. Each feature creates an attack surface a malicious document can exploit.
Differences between Adobe Reader and alternative readers. Different PDF readers handle features differently. A PDF that appears benign in one reader may execute malicious behavior in another. Attackers target the readers their victims use.
Persistence across deletion attempts. Users sometimes save PDF attachments locally before opening. The saved file persists in download folders plus email client cache directories. Even after the user deletes the email, the file survives on disk.
Common in business correspondence. Invoices, contracts, statements, reports plus other business documents are routinely sent as PDFs. The volume of legitimate PDF traffic provides cover for malicious documents.
Common malicious PDF patterns
Suspect PDF documents typically exhibit one or more of these technical patterns:
Embedded JavaScript. PDFs can contain JavaScript that executes when the document opens. Malicious JavaScript may attempt to exploit reader vulnerabilities, download additional payloads, send form data or environment details to a remote server, or modify document content based on the viewer environment.
OpenAction triggers. The PDF specification allows actions to fire when the document opens. Malicious OpenAction handlers may launch URI calls, execute embedded JavaScript or open embedded files.
URI launch actions. Malicious PDFs may launch arbitrary URIs when the user clicks specific elements or when the document opens. URI launches can target malicious websites for credential phishing or exploit kit landing pages.
Embedded files. PDFs can embed other files within the document. Malicious embeddings may include executable payloads that drop to disk via embedded-file actions.
Form actions. PDF forms can include submit actions that POST to attacker-controlled servers. Malicious forms may collect credentials or other input under the guise of legitimate document interaction.
Reader CVE exploitation. Some documents carry exploits for specific known vulnerabilities (CVEs) in Adobe Reader or alternative readers. The malicious document carries both the exploit payload and the trigger.
Steganographic content. Some malicious PDFs hide payloads in document streams that appear to contain legitimate content but actually carry the malicious data in alternate encoding.
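A first-pass triage for the patterns above can be sketched as a raw-byte scan for the PDF name tokens that usually accompany them. This is a minimal sketch, not how the Sherlock PDF Editor works internally: the token list and the function names `decode_name_escapes` and `count_suspicious_names` are assumptions chosen for illustration. The scan only reads bytes and never renders or executes the document.

```python
import re

# Name tokens that commonly accompany the patterns listed above.
# Illustrative list (an assumption), not an exhaustive signature set.
SUSPICIOUS_NAMES = [
    b"/JavaScript", b"/JS", b"/OpenAction", b"/AA", b"/Launch",
    b"/URI", b"/SubmitForm", b"/EmbeddedFile",
]


def decode_name_escapes(data: bytes) -> bytes:
    """Undo #xx hex escapes in PDF names, e.g. /J#61vaScript -> /JavaScript."""
    return re.sub(
        rb"#([0-9A-Fa-f]{2})",
        lambda m: bytes([int(m.group(1), 16)]),
        data,
    )


def count_suspicious_names(data: bytes) -> dict[str, int]:
    """Count each token, rejecting matches that continue into a longer name."""
    normalized = decode_name_escapes(data)
    counts = {}
    for name in SUSPICIOUS_NAMES:
        pattern = re.escape(name) + rb"(?![A-Za-z0-9])"
        counts[name.decode()] = len(re.findall(pattern, normalized))
    return counts


# Synthetic sample: an OpenAction pointing at an obfuscated JavaScript action.
sample = (
    b"%PDF-1.7\n"
    b"1 0 obj << /Type /Catalog /OpenAction 2 0 R >> endobj\n"
    b"2 0 obj << /S /J#61vaScript /JS (app.alert\\(1\\)) >> endobj\n"
)
print(count_suspicious_names(sample))
```

A non-zero count is a prompt for deeper review, not a verdict. Tokens stored inside compressed object streams are invisible to a raw scan like this one, so a clean result does not clear the document.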
Forensic PDF analysis with the Sherlock PDF Editor
The Sherlock PDF Editor exposes the technical structure of PDF documents for forensic analysis without executing the document. Specific extraction capabilities:
JavaScript extraction: all embedded JavaScript fragments are extracted as plain text for analyst review. The JavaScript can be analyzed offline for malicious patterns without execution risk.
Action handler enumeration: OpenAction, URI launches, form submit actions plus other action triggers are surfaced with their target URIs or operations.
Embedded file extraction: files embedded within the PDF are extracted as separate files for individual analysis. Each embedded file gets independent forensic treatment.
Stream object enumeration: the internal PDF stream structure is exposed for analysis of compression, encoding plus content fingerprinting.
Document metadata extraction: creation date, modification date, producer, author and other metadata fields are surfaced. Metadata sometimes reveals the actual production tool the attacker used.
Signature analysis: PDF digital signatures are validated. Forged or self-signed signatures get flagged.
Document and per-stream hashes: the document hash and per-stream hashes are recorded for the forensic timeline and chain of custody.
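The stream enumeration and hashing capabilities above can be illustrated with a short standard-library sketch. It is an approximation under stated assumptions: it locates streams with a regular expression rather than a real PDF parser, it only attempts plain Flate (zlib) decoding, and the function name `fingerprint` and the record layout are hypothetical.

```python
import hashlib
import re
import zlib

# Matches "stream ... endstream" bodies; the lookbehind avoids matching the
# tail of a preceding "endstream" keyword.
STREAM_RE = re.compile(rb"(?<!end)stream\r?\n(.*?)\r?\n?endstream", re.DOTALL)


def fingerprint(data: bytes) -> dict:
    """Return the whole-document SHA-256 plus one record per stream body."""
    streams = []
    for index, match in enumerate(STREAM_RE.finditer(data)):
        raw = match.group(1)
        try:
            inflated_len = len(zlib.decompress(raw))  # plain FlateDecode only
        except zlib.error:
            inflated_len = None  # not Flate data, or other filters are in use
        streams.append({
            "index": index,
            "offset": match.start(1),
            "raw_sha256": hashlib.sha256(raw).hexdigest(),
            "raw_len": len(raw),
            "inflated_len": inflated_len,
        })
    return {
        "document_sha256": hashlib.sha256(data).hexdigest(),
        "streams": streams,
    }


# Synthetic sample with one Flate-compressed stream and one uncompressed stream.
content = b"BT /F1 12 Tf (hello) Tj ET"
body = zlib.compress(content)
sample = (
    b"%PDF-1.7\n1 0 obj << /Length " + str(len(body)).encode()
    + b" /Filter /FlateDecode >>\nstream\n" + body + b"\nendstream\nendobj\n"
    b"2 0 obj << /Length 5 >>\nstream\nplain\nendstream\nendobj\n"
)

report = fingerprint(sample)
for record in report["streams"]:
    print(record["index"], record["raw_sha256"][:16], record["inflated_len"])
```

Hashing the raw stream bytes rather than the decoded bytes keeps the record tied to what is actually on disk, which is what a chain-of-custody entry needs. A production tool would honor the /Length entry and the full /Filter chain instead of relying on pattern matching.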
The forensic analysis workflow
The workflow for analyzing a suspect PDF document in incident response or phishing investigation:
Preserve the suspect document. The original PDF is preserved with hash verification plus chain of custody documentation. The Sherlock Disk Imager handles source preservation if the document is part of a broader workstation acquisition.
Acquire metadata first. Before any deeper analysis, extract document metadata. The metadata sometimes immediately reveals the document's origin (e.g., a producer field showing the actual creation tool).
Enumerate action handlers. Identify OpenAction, URI launches plus form actions. Each action target gets independent threat-intel review (is this URI a known malicious indicator, does it match an active phishing campaign).
Extract JavaScript for offline analysis. Embedded JavaScript is extracted and reviewed without execution. Patterns to look for: app.launchURL calls, util.printd calls, file system access attempts, network access attempts and attempts to detect the running environment.
Extract embedded files individually. Embedded files are extracted plus subjected to independent analysis. An embedded executable inside a PDF is a high-confidence malicious indicator regardless of the JavaScript content.
Cross-reference with email artifacts. If the suspect PDF arrived via email, the Sherlock PST Viewer extracts the email headers, sender authentication results and delivery path for additional forensic context.
Cross-reference with workstation forensics. If the recipient opened the PDF, the workstation forensic timeline shows what happened next. The Sherlock Universal Events Viewer reads Windows event logs for process creation, file system access and network connections that follow the PDF open event.
Examiner report. The forensic examiner produces a written report covering the analysis method, the artifacts surfaced, the threat-intel correlation and the findings. The report is signed and accompanies the incident response documentation or litigation evidence.
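The JavaScript review step in this workflow can be partly automated once the script has been extracted as plain text. The sketch below is one possible approach, assuming the extracted JavaScript is already available as a string. The indicator table `JS_INDICATORS`, its labels and the function `scan_javascript` are hypothetical names. The matched identifiers (app.launchURL, util.printd, app.viewerVersion, exportDataObject, eval, unescape) are real JavaScript or Acrobat JavaScript names, but the selection is illustrative rather than complete.

```python
import re

# Hypothetical indicator table: label -> regular expression.
JS_INDICATORS = {
    "url_launch": r"\bapp\.launchURL\s*\(",
    "printd_call": r"\butil\.printd\s*\(",
    "dynamic_eval": r"\beval\s*\(",
    "unescape_decode": r"\bunescape\s*\(",
    "env_probe": r"\bapp\.(viewerVersion|viewerType|platform)\b",
    "embedded_file_export": r"\bexportDataObject\s*\(",
}


def scan_javascript(source: str) -> dict[str, list[int]]:
    """Map each indicator label to the 1-based line numbers where it matches."""
    hits = {}
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in JS_INDICATORS.items():
            if re.search(pattern, line):
                hits.setdefault(label, []).append(lineno)
    return hits


# Synthetic extracted script; the text is only matched, never executed.
extracted = """var v = app.viewerVersion;
if (v < 9) { eval(unescape("%61%6c%65%72%74")); }
app.launchURL("http://example.invalid/login", true);
"""

for label, lines in scan_javascript(extracted).items():
    print(label, lines)
```

Pattern matching of this kind only flags lines for analyst attention. Heavily obfuscated scripts can hide every one of these names, so an empty result still calls for manual reading of the extracted code.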
Common phishing PDF patterns
Phishing campaigns deliver malicious PDFs in recognizable patterns. The forensic analysis often surfaces:
The credential harvest landing page redirect. The PDF contains a URI launch that opens a fake login page (Microsoft 365, Google Workspace, banking portal). The user enters credentials thinking they are authenticating to the legitimate service.
The fake document preview. The PDF appears to be a partial document (invoice, contract) but requires the user to click a link to view the full document. The link leads to a credential harvest or malware download.
The embedded executable dropper. The PDF embeds an executable file and uses an action handler to drop it to disk and optionally execute it. Modern PDF readers usually require user interaction to launch embedded executables, but social engineering text in the PDF can convince the user to authorize the launch.
The CVE exploit attempt. Less common in modern attacks, but some PDFs still carry exploit payloads targeting known CVEs in Adobe Reader or alternative readers. Older readers without recent patches remain vulnerable.
The information-gathering ping. Some malicious PDFs do nothing visibly destructive but contact attacker-controlled servers with environment information (IP address, OS fingerprint, reader version, user agent). The information feeds future targeting.
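Several of the patterns above come down to a URI the document wants the reader to contact. A minimal sketch of pulling those targets out for threat-intel review follows. It assumes URI actions are stored uncompressed in the file body and handles only the two basic PDF string forms, literal `(...)` and hexadecimal `<...>`. The function names `extract_uris` and `hosts_for_review` are hypothetical.

```python
import re
from urllib.parse import urlsplit

# /URI followed by a literal string or a hex string. The "/URI" in "/S /URI"
# is skipped because no string follows it directly.
LITERAL_URI = re.compile(rb"/URI\s*\(((?:\\.|[^\\()])*)\)", re.DOTALL)
HEX_URI = re.compile(rb"/URI\s*<([0-9A-Fa-f\s]+)>")


def extract_uris(data: bytes) -> list[str]:
    """Return URI action targets found in the raw bytes, literal forms first."""
    uris = []
    for m in LITERAL_URI.finditer(data):
        # Simplified unescape: drops the backslash from \( \) and \\ only.
        raw = re.sub(rb"\\(.)", rb"\1", m.group(1), flags=re.DOTALL)
        uris.append(raw.decode("latin-1"))
    for m in HEX_URI.finditer(data):
        digits = re.sub(rb"\s", b"", m.group(1))
        if len(digits) % 2:
            digits += b"0"  # PDF treats a missing final hex digit as 0
        uris.append(bytes.fromhex(digits.decode("ascii")).decode("latin-1"))
    return uris


def hosts_for_review(uris: list[str]) -> list[str]:
    """Unique hostnames to check against threat-intel sources."""
    return sorted({urlsplit(u).hostname for u in uris if urlsplit(u).hostname})


# Synthetic sample: one literal-string URI and one hex-string URI.
sample = (
    b"3 0 obj << /S /URI /URI (http://login.example.invalid/m365\\(1\\)) >> endobj\n"
    b"4 0 obj << /S /URI /URI <"
    + b"http://track.example.invalid/p".hex().encode()
    + b"> >> endobj\n"
)

found = extract_uris(sample)
print(found)
print(hosts_for_review(found))
```

Listing hostnames separately from full URIs makes it easier to correlate a sample with known campaign infrastructure even when the path or query string changes between documents.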
What this means for incident response planning
A common mistake incident response teams make is treating suspect PDFs as either fully benign or fully malicious based on file-hash reputation alone. Many malicious PDFs evade hash-based detection through small variations that change the file hash without changing the document's behavior. Forensic content analysis reveals the actual behavior the PDF would exhibit if opened.
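The point about small variations can be made concrete with a short sketch. Two synthetic documents that differ only by an appended comment line produce different SHA-256 values, while a structural view of their action-related names stays identical. The function `structural_profile`, the token list and the sample bytes are assumptions made for illustration.

```python
import hashlib
import re

ACTION_NAMES = re.compile(
    rb"/(?:OpenAction|JavaScript|JS|Launch|URI|SubmitForm|EmbeddedFile)"
    rb"(?![A-Za-z0-9])"
)


def structural_profile(data: bytes) -> tuple:
    """Sorted set of action-related name tokens present in the raw bytes."""
    return tuple(sorted(set(ACTION_NAMES.findall(data))))


# Synthetic base document and a variant with one extra comment line appended.
base = (
    b"%PDF-1.7\n"
    b"1 0 obj << /OpenAction << /S /URI /URI (http://example.invalid/) >> >> endobj\n"
)
variant = base + b"% campaign build 0042\n"

same_hash = hashlib.sha256(base).hexdigest() == hashlib.sha256(variant).hexdigest()
same_structure = structural_profile(base) == structural_profile(variant)

print("hash match:", same_hash)             # False: the appended bytes change the digest
print("structure match:", same_structure)   # True: the same action names are present
```

A hash reputation lookup would treat the variant as a never-before-seen file, while the structural comparison shows it carries the same OpenAction and URI behavior as the original.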
The practical posture is to develop PDF forensic analysis as a routine incident response capability. Suspect emails reported by users, samples from threat intel feeds and artifacts recovered from compromised workstations all benefit from structural PDF analysis. The combination of metadata extraction, JavaScript extraction and action handler enumeration produces high-confidence verdicts that hash-based detection often misses.
The Sherlock Forensics services practice handles phishing investigation, incident response plus court-defensible forensic examination. The forensic toolchain includes the Sherlock PDF Editor for malicious PDF structural analysis, the Sherlock PST Viewer for delivery email forensics, the Sherlock Universal Events Viewer for post-opening workstation timeline reconstruction, plus the supporting forensic examination services.
Talk to our team about phishing investigation, malicious document analysis or incident response engagement.