GatherHub
Your Personal Internet Archiver
Save important content before it disappears forever. GatherHub automatically ingests and archives web pages, videos, documents, repositories, and just about anything else into your own searchable library. Whether for research, backup, or the end of the world - you'll have access to what matters most.
See GatherHub in Action
Explore the features that make GatherHub the perfect personal archiving solution
Real-time Dashboard
Monitor your archiving activity with a comprehensive dashboard. Track download progress, view statistics, and get real-time insights into your archive's growth. The dashboard provides an at-a-glance overview of system health and key performance indicators.
- Live job tracking
- Archive statistics
- System health monitoring
- Quick actions
Job Management
Efficiently manage all your archiving jobs in one centralized location. Search, filter, and organize jobs with powerful tools. Retry failed downloads, perform bulk operations, and track the complete history of every item in your archive.
- Advanced search and filtering
- Bulk operations support
- Retry failed downloads
- Complete job history
Full-Text Search
Search across all your archived content with intelligent indexing and highlighting. Find information instantly in web pages, documents, videos, and more. Faceted filtering lets you narrow results by media type, source, tags, and date.
- Full-text search across all content
- Content highlighting
- Faceted filtering
- Advanced search operators
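To make the idea of full-text search with highlighting concrete, here is a minimal sketch of an inverted index in Python. It is an illustration only, not GatherHub's actual indexing code; the function names (`build_index`, `search`, `highlight`) and the bracket markers are assumptions made for this example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each token to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every query token (AND semantics)."""
    tokens = tokenize(query)
    if not tokens:
        return set()
    return set.intersection(*(index.get(t, set()) for t in tokens))


def highlight(text, query, before="[", after="]"):
    """Wrap each whole-token, case-insensitive match of a query token in markers."""
    tokens = tokenize(query)
    if not tokens:
        return text
    alternatives = "|".join(re.escape(t) for t in tokens)
    pattern = r"(?<![A-Za-z0-9])(" + alternatives + r")(?![A-Za-z0-9])"
    return re.sub(pattern, lambda m: before + m.group(0) + after, text,
                  flags=re.IGNORECASE)
```

A query such as `search(index, "offline pages")` returns only the documents that contain both words, and `highlight` marks where those words occur in a result's text. A production index would also handle stemming, ranking, and persistence, which this sketch leaves out.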
Multi-Source Integration
Configure multiple content sources including browsers, databases, RSS feeds, and bookmark managers. Test connections and manage source settings through an intuitive interface. Automatic scanning keeps your archive up-to-date with minimal effort.
- Browser bookmark integration
- Database connections
- RSS feed monitoring
- Bookmark manager sync
Built-in Documentation
Comprehensive documentation with its own search engine, all accessible offline. From installation guides to advanced configuration, everything you need is right at your fingertips. The entire UI, docs, search, uploading, and tagging all work without an internet connection.
- Offline documentation access
- Searchable content
- Step-by-step guides
- API reference
Your Personal Google
Search across everything you've ever saved - from web pages, spreadsheets & docs, PDFs & eBooks, to videos and images
Full-Text Search
Find any word or phrase across all your archived content instantly
OCR Magic
Extract and search text from images, screenshots, and scanned documents
Smart Filtering
Filter by file type, source, tags, date, and more with faceted search
Content Highlighting
See exactly where your search terms appear with intelligent highlighting
Why GatherHub?
Transform your scattered digital content into an organized, searchable personal archive that works offline.
Multi-Source Archiving
Import from browsers, RSS feeds, databases, and bookmark managers
Full-Text Search
Search across all content with intelligent indexing and faceted filtering
Unlimited Media Types
Handle any content type with custom tools and specialized processing
Modern Web Interface
Dashboard with dark mode, bulk operations, and comprehensive tracking
REST API
Full programmatic access with authentication and bulk operations
Intelligent Processing
Automatic extraction, metadata preservation, and event hooks
What Makes GatherHub Unique
Features you won't find anywhere else - designed for the ultimate archiving experience
Ingest Anything
Not just downloads - upload local files, add links manually, or bring in third-party tools. Completely configurable.
Event Hooks
Write custom hooks in Python, Bash, Node.js. Unlimited automation and chaining possibilities.
Offline-First
Everything works offline: search, docs, uploading, tagging, entire UI. Only downloads need internet.
Integration Hub
Not a replacement - use tools you love and point GatherHub at them for seamless integration.
Extensible Architecture
Custom tooling, APIs, event hooks. Integrate or chain with other tools endlessly.
Text Extraction
Extract from images, scanned docs, PDFs, eBooks. Extend for unsupported formats.
Small Footprint
Single Go binary. Install on Raspberry Pi, laptop, NAS, or dedicated server.
Web Crawling
Grab single pages or crawl and archive entire sites with powerful scraping tools.
Works Alongside Tools
Integrates with Readeck, Linkding, Wallabag, LinkWarden. Keep using what you love.
Connect Your Existing Tools
GatherHub integrates with your current workflow - no need to change how you work
Browser Bookmarks
Scan and import from Firefox, Chrome, Brave, Vivaldi, and Chromium automatically
Bookmark Managers
Connect to Readeck, Linkding, Wallabag, LinkWarden - keep using what you love
Databases
Import from SQLite, MySQL, PostgreSQL and other existing data stores
RSS & Feeds
Monitor RSS feeds, WROLPI, and specialized data sources automatically
Manual Ingest
Add content via URLs, file uploads, and bulk imports alongside automated sources
Integration Philosophy
Not a replacement - GatherHub enhances your tools by archiving content they point to
Out-of-the-Box Media Types
Specialized handling for every type of content you encounter
Streaming Videos
YouTube, Vimeo, Twitch, TikTok with yt-dlp, metadata extraction, sponsorblock
Web Pages
Full HTML archiving with JavaScript support via monolith or SingleFile
Documents
PDF, DOCX, TXT with automatic text extraction and indexing
E-books
EPUB, MOBI, AZW with metadata preservation and content extraction
Git Repositories
Clone repos and create optional ZIP archives for complete project preservation
Archives
ZIP, RAR, 7z, Tar, Bz and other compressed formats
Media Files
MP3, MP4, images, and other media formats
Torrents & Magnets
Download torrents and magnet links
Maps & ZIMs
Offline maps, Wikipedia ZIM files, compressed knowledge archives
Mobile Apps & APKs
Archive mobile applications, APK files, and software packages
Custom Types
Define your own media types with custom tools - completely configurable
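As a rough illustration of how media types can be matched to tools, the sketch below picks a handler from a URL using pattern rules. The rule table, the `classify` function, and the `download` tool name are hypothetical and exist only for this example; GatherHub's real configuration format is described in its documentation and may look quite different.

```python
import re

# Hypothetical rules: (media type, URL pattern, tool name). First match wins.
# "download" is a placeholder name for a plain file fetch, not a real tool.
RULES = [
    ("git", re.compile(r"^https?://(github|gitlab)\.com/[^/]+/[^/]+/?$"), "git"),
    ("video", re.compile(r"^https?://(www\.)?(youtube\.com|vimeo\.com)/"), "yt-dlp"),
    ("document", re.compile(r"\.(pdf|docx|txt)$", re.IGNORECASE), "download"),
]

# Fallback when no rule matches: treat the URL as an ordinary web page.
DEFAULT = ("webpage", "monolith")


def classify(url):
    """Return (media type, tool name) for the first rule that matches the URL."""
    for media_type, pattern, tool in RULES:
        if pattern.search(url):
            return media_type, tool
    return DEFAULT
```

Adding a custom type in this sketch means adding one more tuple to `RULES`; ordering matters because the first matching rule is used.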
Technical Capabilities
Built for power users and developers who need advanced functionality
REST API
Complete programmatic access with authentication, job management, bulk operations
Event Hooks
Run custom scripts on download events for automated processing and notifications
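For a sense of what a hook script might look like, here is a small Python sketch that turns a download event into a one-line notification. The event shape is an assumption made for this example: a single JSON object on standard input with `status`, `url`, `path`, and `error` fields. Check the GatherHub documentation for how events are actually passed to hooks.

```python
import json
import sys


def format_event(event):
    """Turn a (hypothetical) download event dict into a one-line notification."""
    status = event.get("status", "unknown")
    url = event.get("url", "<no url>")
    if status == "failed":
        return f"FAILED {url}: {event.get('error', 'no error given')}"
    return f"{status.upper()} {url} -> {event.get('path', '<no path>')}"


def main(stdin=sys.stdin, stdout=sys.stdout):
    """Read one event and write its notification line; returns an exit code."""
    # Assumption: the event arrives as a single JSON object on stdin.
    event = json.load(stdin)
    stdout.write(format_event(event) + "\n")
    return 0

# A real hook script would end with: raise SystemExit(main())
```

Keeping the formatting in `format_event` separate from the input handling in `main` makes the hook easy to test without running a download, and the output line can be piped to a log file or a notification tool.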
Content Extraction
Pluggable architecture with OCR support, internal/external and chained processing
Advanced Search
Faceted search with highlighting, field-specific queries, comprehensive indexing
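Faceted search boils down to two operations: counting how many results fall under each facet value, and narrowing results to the values a user selects. The Python sketch below shows both under simple assumptions; the item fields (`media_type`, `tags`) and function names are hypothetical and are not taken from GatherHub's API.

```python
from collections import Counter


def _values(item, field):
    """Return the item's values for a field as a list (single values are wrapped)."""
    value = item.get(field)
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def facet_counts(items, field):
    """Count how many items carry each value of the given facet field."""
    counts = Counter()
    for item in items:
        for value in _values(item, field):
            counts[value] += 1
    return dict(counts)


def apply_filters(items, filters):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(wanted in _values(item, field) for field, wanted in filters.items())
    ]
```

Calling `facet_counts(results, "media_type")` gives the numbers shown next to each filter option, and `apply_filters(results, {"media_type": "video"})` narrows the list once an option is chosen.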
Speed & Portability
Single Go binary compiled for multiple architectures - lightweight and self-contained
Get Started with GatherHub
Free, open source, and ready to transform your content management
Quick Installation
# Download GatherHub
wget https://github.com/optionalsoftware/gatherhub/releases/latest
# Setup services
./gatherhub
cd deploy
./setup-services.sh
View Documentation
Comprehensive guides for installation, configuration, and advanced usage.
Read Documentation
Ready to Organize Your Digital Life?
Join thousands of users who have transformed their scattered bookmarks and content into a powerful, searchable personal archive.