When to use
- Accessing unavailable or deleted pages
- Preserving web content before it changes or disappears
- Creating legal evidence archives with chain of custody
- Building redundant archival workflows for research
Archive service comparison
| Service | Best for |
|---|---|
| Wayback Machine | Historical research |
| Archive.today | Paywall bypass, quick saves |
| Perma.cc | Legal citations |
| ArchiveBox | Self-hosted, privacy |
| Conifer | Interactive content |
What's included
Wayback Machine API
Python code for checking availability, saving pages, and retrieving historical snapshots via CDX API.
Multi-archive redundancy
Archive to Wayback, Archive.today, and Perma.cc simultaneously for maximum preservation.
Legal evidence preservation
Chain of custody documentation, content hashing, and timestamped capture records.
ArchiveBox integration
Self-hosted archiving setup, Python integration, and scheduled archiving workflows.
Archive retrieval cascade
Try services in this order for maximum coverage:
Wayback Machine (archive.org)
916B+ pages, historical depth, API access
Archive.today (archive.is/archive.ph)
On-demand snapshots, paywall bypass
Google Cache
Recent pages, search: cache:url
Bing Cache
Click dropdown arrow in search results
Memento Time Travel (aggregator)
Searches multiple archives simultaneously
Installation
# Clone the repository
git clone https://github.com/jamditis/claude-skills-journalism.git
# Copy the skill to your Claude config
cp -r claude-skills-journalism/web-archiving ~/.claude/skills/
Or download just this skill from the GitHub repository.
Related skills
Archive before it disappears
Wayback Machine API, multi-archive redundancy, and legal evidence preservation in one skill.
View on GitHub