Snapshotting and Screenshotting
DPULSE provides multiple methods to capture and preserve target website content for analysis, documentation, and offline review.
Overview
| Method | Type | Output | Interactive |
|---|---|---|---|
| Screenshot | Visual capture | Image file | ❌ No |
| HTML Snapshot | Single-page copy | HTML file | ✅ Yes |
| PageCopy | Complete copy | HTML + assets | ✅ Yes |
| Wayback Snapshot | Historical versions | Multiple HTML files | ✅ Yes |
Note: You will be prompted to select a snapshotting mode during pre-scan configuration.
Screenshot vs Snapshot
| Feature | Screenshot | Snapshot |
|---|---|---|
| What it captures | Visual appearance at a moment | Full page structure and content |
| File format | Image (PNG/JPG) | HTML + assets |
| Interactive | ❌ View only | ✅ Navigate, click, inspect |
| Offline use | ✅ Yes | ✅ Yes |
| Use case | Documentation, reports | Analysis, forensics |
📸 Screenshot Mode
Captures an image of the domain's main page using a headless browser.
How It Works
- DPULSE launches a browser instance using Selenium
- Navigates to the target domain
- Captures a full-page screenshot
- Saves the image to the report folder
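The steps above can be sketched in Python roughly as follows. This is a minimal illustration, not DPULSE's actual implementation: it assumes Selenium 4 with a local Chrome install, and the names `screenshot_path` and `capture_screenshot` are hypothetical. Resizing the window to the document height is one common way to approximate a full-page capture in headless Chrome.

```python
from pathlib import Path


def screenshot_path(report_root: str, domain: str, scan_date: str) -> Path:
    """Build the output path, e.g. reports/example.com_2024-01-15/screenshot.png."""
    return Path(report_root) / f"{domain}_{scan_date}" / "screenshot.png"


def capture_screenshot(url: str, out_file: Path, width: int = 1920) -> bool:
    """Open `url` in headless Chrome and save a full-height PNG to `out_file`."""
    # Imported lazily so the path helper above works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.get(url)
        # Resize the window to the document height so the whole page is captured.
        height = driver.execute_script(
            "return document.documentElement.scrollHeight"
        )
        driver.set_window_size(width, height)
        out_file.parent.mkdir(parents=True, exist_ok=True)
        return driver.save_screenshot(str(out_file))
    finally:
        # Always close the browser, even if navigation fails.
        driver.quit()
```

A call such as `capture_screenshot("https://example.com", screenshot_path("reports", "example.com", "2024-01-15"))` would then write the image into the scan's report folder.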
Configuration
Important: Proper configuration is required for screenshotting to work correctly. See Configuration File for details.
Output
reports/
└── example.com_2024-01-15/
    └── screenshot.png
📄 HTML Snapshot (Web Copy)
Saves the webpage as an HTML file, preserving structure and interactivity.
What It Captures
| Element | Preserved |
|---|---|
| HTML code | ✅ Yes |
| DOM structure | ✅ Yes |
| Text content | ✅ Yes |
| Links | ✅ Yes |
| Forms | ✅ Yes |
| Inline styles | ✅ Yes |
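As a rough illustration of what saving a single-file snapshot involves, the sketch below downloads the raw HTML and writes it to `snapshot.html`. It assumes a plain HTTP fetch with the standard library, so content rendered by JavaScript would not be included; DPULSE may obtain the page differently. The function names are hypothetical.

```python
from pathlib import Path
from urllib.request import Request, urlopen


def fetch_html(url: str, timeout: float = 15.0) -> str:
    """Download the raw HTML of `url` (no JavaScript is executed)."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def save_snapshot(html: str, report_dir: Path) -> Path:
    """Write `html` to <report_dir>/snapshot.html and return the file path."""
    report_dir.mkdir(parents=True, exist_ok=True)
    out_file = report_dir / "snapshot.html"
    out_file.write_text(html, encoding="utf-8")
    return out_file
```

Because only the HTML document is stored, external stylesheets, scripts, and images still point at the live site when the file is opened offline.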
Output
reports/
└── example.com_2024-01-15/
    └── snapshot.html
📦 PageCopy Mode
Creates a complete offline copy of the page including all assets.
What It Captures
| Asset Type | Included |
|---|---|
| HTML | ✅ Yes |
| CSS stylesheets | ✅ Yes |
| JavaScript files | ✅ Yes |
| Images | ✅ Yes |
| Fonts | ✅ Yes |
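To show how the assets in the table might be discovered before they are downloaded, here is a small sketch that scans a page for stylesheet, script, and image references and resolves them against the page URL. It uses only the Python standard library; `AssetCollector` and `find_assets` are hypothetical names and do not reflect DPULSE's internals.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class AssetCollector(HTMLParser):
    """Collect absolute URLs of stylesheets, scripts, and images in a page."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.assets: list[str] = []

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "link" and "stylesheet" in (attr.get("rel") or "").lower():
            ref = attr.get("href")
        elif tag in ("script", "img"):
            ref = attr.get("src")
        else:
            ref = None
        if ref:
            # Resolve relative references against the page URL.
            self.assets.append(urljoin(self.base_url, ref))


def find_assets(html: str, base_url: str) -> list[str]:
    """Return asset URLs in document order."""
    collector = AssetCollector(base_url)
    collector.feed(html)
    return collector.assets
```

Fonts are usually referenced from inside CSS files via `url(...)`, so a complete copy would also need to parse the downloaded stylesheets; that step is left out here.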
Output
reports/
└── example.com_2024-01-15/
    └── pagecopy/
        ├── index.html
        ├── styles.css
        ├── script.js
        └── images/
🕰️ Wayback Machine Snapshot
Retrieves historical versions of the target domain from the Internet Archive.
How It Works
- Connects to the Wayback Machine API
- Queries for snapshots available within the specified time period
- Downloads the selected historical versions
- Saves all versions to the report folder
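The query step can be illustrated with the Internet Archive's public CDX search endpoint. The sketch below is an assumption about how such a lookup could be written, not a description of DPULSE's own code; the helper names are hypothetical, and dates are passed as `YYYYMMDD` strings.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

CDX_ENDPOINT = "https://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain: str, start: str, end: str) -> str:
    """Return a CDX query URL for captures of `domain` between two dates."""
    params = {
        "url": domain,
        "from": start,
        "to": end,
        "output": "json",
        "filter": "statuscode:200",  # skip redirects and error pages
        "collapse": "digest",        # skip consecutive identical captures
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def snapshot_filename(timestamp: str) -> str:
    """Turn a 14-digit Wayback timestamp into e.g. 2023-01-15_snapshot.html."""
    return f"{timestamp[0:4]}-{timestamp[4:6]}-{timestamp[6:8]}_snapshot.html"


def list_snapshots(domain: str, start: str, end: str) -> list[tuple[str, str]]:
    """Query the CDX API and return (timestamp, archived_url) pairs."""
    with urlopen(build_cdx_query(domain, start, end), timeout=30) as response:
        body = response.read().decode("utf-8")
    rows = json.loads(body) if body.strip() else []
    if not rows:
        return []
    # With output=json the first row is a header naming each column.
    header, captures = rows[0], rows[1:]
    ts, original = header.index("timestamp"), header.index("original")
    return [
        (row[ts], f"https://web.archive.org/web/{row[ts]}id_/{row[original]}")
        for row in captures
    ]
```

The `id_` suffix in the archived URL requests the original page body without the Wayback toolbar. Note that two captures from the same day would map to the same filename under this naming scheme, so a real implementation would need to disambiguate them.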
Configuration
You specify the time period for snapshot retrieval during scan setup.
Output
reports/
└── example.com_2024-01-15/
    └── wayback_snapshots/
        ├── 2023-01-15_snapshot.html
        ├── 2023-06-20_snapshot.html
        └── 2024-01-01_snapshot.html
When to Use Each Method
| Scenario | Recommended Method |
|---|---|
| Quick visual documentation | 📸 Screenshot |
| Preserve page for analysis | 📄 HTML Snapshot |
| Full offline copy with assets | 📦 PageCopy |
| Track changes over time | 🕰️ Wayback Snapshot |
| Legal/forensic evidence | 📸 Screenshot + 📄 HTML Snapshot |
| Website no longer exists | 🕰️ Wayback Snapshot |
Output Location
All captures are saved in the scan report folder:
reports/
└── example.com_2024-01-15/
    ├── report.html
    ├── screenshot.png
    ├── snapshot.html
    ├── pagecopy/
    └── wayback_snapshots/