Offline-First · Multi-Platform · Extensible

GatherHub
Your Personal Internet Archiver

Save important content before it disappears forever. GatherHub automatically ingests and archives web pages, videos, documents, repositories, and just about anything else into your own searchable library. Whether for research, backup, or the end of the world, you'll have access to what matters most.

Full-Text Search
OCR Extraction
Auto-Archive

See GatherHub in Action

Explore the features that make GatherHub the perfect personal archiving solution

Real-time Dashboard

Monitor your archiving activity with a comprehensive dashboard. Track download progress, view statistics, and get real-time insights into your archive's growth. The dashboard provides an at-a-glance overview of system health and key performance indicators.

  • Live job tracking
  • Archive statistics
  • System health monitoring
  • Quick actions

Job Management

Efficiently manage all your archiving jobs in one centralized location. Search, filter, and organize jobs with powerful tools. Retry failed downloads, perform bulk operations, and track the complete history of every item in your archive.

  • Advanced search and filtering
  • Bulk operations support
  • Retry failed downloads
  • Complete job history

Full-Text Search

Search across all your archived content with intelligent indexing and highlighting. Find information instantly in web pages, documents, videos, and more. Faceted filtering lets you narrow results by media type, source, tags, and date.

  • Full-text search across all content
  • Content highlighting
  • Faceted filtering
  • Advanced search operators

Multi-Source Integration

Configure multiple content sources including browsers, databases, RSS feeds, and bookmark managers. Test connections and manage source settings through an intuitive interface. Automatic scanning keeps your archive up-to-date with minimal effort.

  • Browser bookmark integration
  • Database connections
  • RSS feed monitoring
  • Bookmark manager sync

Built-in Documentation

Comprehensive documentation with its own search engine, all accessible offline. From installation guides to advanced configuration, everything you need is right at your fingertips. The entire UI, docs, search, uploading, and tagging all work without an internet connection.

  • Offline documentation access
  • Searchable content
  • Step-by-step guides
  • API reference

Your Personal Google

Search across everything you've ever saved - from web pages, spreadsheets & docs, PDFs & eBooks, to videos and images

Full-Text Search

Find any word or phrase across all your archived content instantly

OCR Magic

Extract and search text from images, screenshots, and scanned documents

Smart Filtering

Filter by file type, source, tags, date, and more with faceted search

Content Highlighting

See exactly where your search terms appear with intelligent highlighting

Search Your Archive
🔍 machine learning algorithms
📄 Deep Learning Research Paper
...advanced machine learning algorithms for neural networks...
🎥 AI Tutorial Video
...introduction to machine learning algorithms and their applications...
📸 Conference Slide (OCR)
...comparison of machine learning algorithms performance metrics...

Why GatherHub?

Transform your scattered digital content into an organized, searchable personal archive that works offline.

Multi-Source Archiving

Import from browsers, RSS feeds, databases, and bookmark managers

Full-Text Search

Search across all content with intelligent indexing and faceted filtering

Unlimited Media Types

Handle any content type with custom tools and specialized processing

Modern Web Interface

Dashboard with dark mode, bulk operations, and comprehensive tracking

REST API

Full programmatic access with authentication and bulk operations

Intelligent Processing

Automatic extraction, metadata preservation, and event hooks

What Makes GatherHub Unique

Features you won't find anywhere else - designed for the ultimate archiving experience

Ingest Anything

Not just downloads - upload local files, add links manually, or bring in third-party tools. Completely configurable.

Event Hooks

Write custom hooks in Python, Bash, or Node.js, and chain them to automate follow-up processing.

Offline-First

Everything works offline: search, docs, uploading, tagging, entire UI. Only downloads need internet.

Integration Hub

Not a replacement - use tools you love and point GatherHub at them for seamless integration.

Extensible Architecture

Custom tooling, APIs, event hooks. Integrate or chain with other tools endlessly.

Text Extraction

Extract from images, scanned docs, PDFs, eBooks. Extend for unsupported formats.

Small Footprint

Single Go binary. Install on a Raspberry Pi, laptop, NAS, or dedicated server.

Web Crawling

Grab single pages or crawl and archive entire sites with powerful scraping tools.

Works Alongside Tools

Integrates with Readeck, Linkding, Wallabag, LinkWarden. Keep using what you love.

Connect Your Existing Tools

GatherHub integrates with your current workflow - no need to change how you work

Browser Bookmarks

Scan and import from Firefox, Chrome, Brave, Vivaldi, and Chromium automatically

Bookmark Managers

Connect to Readeck, Linkding, Wallabag, LinkWarden - keep using what you love

Databases

Import from SQLite, MySQL, PostgreSQL and other existing data stores

RSS & Feeds

Monitor RSS feeds, WROLPI, and specialized data sources automatically

Manual Ingest

Add content via URLs, file uploads, and bulk imports alongside automated sources

Integration Philosophy

Not a replacement - GatherHub enhances your tools by archiving content they point to

Out-of-the-Box Media Types

Specialized handling for every type of content you encounter

Streaming Videos

YouTube, Vimeo, Twitch, and TikTok via yt-dlp, with metadata extraction and SponsorBlock support

Web Pages

Full HTML archiving with JavaScript support via monolith or SingleFile

Documents

PDF, DOCX, TXT with automatic text extraction and indexing

E-books

EPUB, MOBI, AZW with metadata preservation and content extraction

Git Repositories

Clone repos and create optional ZIP archives for complete project preservation

Archives

ZIP, RAR, 7z, Tar, Bz and other compressed formats

Media Files

MP3, MP4, images, and other media formats

Torrents & Magnets

Download torrents and magnet links

Maps & ZIMs

Offline maps, Wikipedia ZIM files, compressed knowledge archives

Mobile Apps & APKs

Archive mobile applications, APK files, and software packages

Custom Types

Define your own media types with custom tools - completely configurable

Technical Capabilities

Built for power users and developers who need advanced functionality

REST API

Complete programmatic access with authentication, job management, bulk operations
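As a rough illustration of what authenticated programmatic access could look like, the sketch below builds (but does not send) a request that submits one URL as an archiving job. The base URL, the `/api/jobs` path, the bearer-token header, and the JSON field names are all hypothetical placeholders rather than GatherHub's documented API; the built-in API reference is the place to find the real endpoints.

```python
import json
import urllib.request

# All names below are illustrative assumptions, not GatherHub's documented API.
BASE_URL = "http://localhost:8080"   # hypothetical address of a local instance
API_TOKEN = "replace-me"             # hypothetical API token


def build_job_request(url_to_archive: str, tags: list[str]) -> urllib.request.Request:
    """Build (without sending) a POST request that would submit one archiving job."""
    payload = json.dumps({"url": url_to_archive, "tags": tags}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/jobs",                      # hypothetical endpoint path
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {API_TOKEN}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_job_request("https://example.com/article", ["research"])
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes the sketch easy to adapt once the real paths and auth scheme are known.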

Event Hooks

Run custom scripts on download events for automated processing and notifications
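A hook can be as small as a script that turns an event into a log line or notification. The sketch below is one way such a script might look in Python. How GatherHub actually hands event data to a hook is not described here, so the JSON payload and its `event`, `url`, and `path` fields are assumptions made purely for illustration.

```python
#!/usr/bin/env python3
import json


def handle_event(raw: str) -> str:
    """Turn a (hypothetical) JSON download event into a one-line log message."""
    try:
        event = json.loads(raw) if raw.strip() else {}
    except json.JSONDecodeError:
        return "hook: could not parse event payload"
    name = event.get("event", "unknown")   # assumed field, e.g. "download_complete"
    url = event.get("url", "?")            # assumed field: the archived URL
    path = event.get("path", "?")          # assumed field: where the file was saved
    return f"[{name}] {url} -> {path}"


if __name__ == "__main__":
    # Sample payload for demonstration only; a real hook would use whatever
    # GatherHub provides (for example sys.stdin.read(), if events arrive on stdin).
    sample = '{"event": "download_complete", "url": "https://example.com", "path": "/archive/example.html"}'
    print(handle_event(sample))
```

From here the message could be appended to a log, sent to a chat webhook, or passed on to the next tool in a chain.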

Content Extraction

Pluggable architecture with OCR support and internal, external, or chained processing

Advanced Search

Faceted search with highlighting, field-specific queries, comprehensive indexing
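To make "faceted search with highlighting" concrete, here is a toy illustration over a small in-memory list. It is not how GatherHub's index works internally; the record fields and function names are invented for this example, and a real index would not scan every item.

```python
import re
from collections import Counter

# Toy records standing in for archived items; the field names are illustrative only.
ITEMS = [
    {"title": "Deep Learning Research Paper", "type": "document",
     "text": "advanced machine learning algorithms for neural networks"},
    {"title": "AI Tutorial Video", "type": "video",
     "text": "introduction to machine learning algorithms and their applications"},
    {"title": "Gardening Notes", "type": "document",
     "text": "when to plant tomatoes"},
]


def search(items, phrase, media_type=None):
    """Return items whose text contains the phrase, optionally narrowed to one media type."""
    needle = phrase.lower()
    hits = [i for i in items if needle in i["text"].lower()]
    if media_type is not None:
        hits = [i for i in hits if i["type"] == media_type]
    return hits


def facet_counts(hits):
    """Count hits per media type, the numbers a facet sidebar would display."""
    return Counter(i["type"] for i in hits)


def highlight(text, phrase):
    """Wrap each case-insensitive occurrence of the phrase in ** markers."""
    return re.sub(re.escape(phrase), lambda m: f"**{m.group(0)}**", text, flags=re.IGNORECASE)


if __name__ == "__main__":
    hits = search(ITEMS, "machine learning")
    print(dict(facet_counts(hits)))
    for item in search(ITEMS, "machine learning", media_type="video"):
        print(item["title"], "|", highlight(item["text"], "machine learning"))
```

The facet counts are computed from the unfiltered hits, so the sidebar can show how many results each media type would yield before the user narrows the search.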

Speed & Portability

Single Go binary compiled for multiple architectures - lightweight and self-contained

Get Started with GatherHub

Free, open source, and ready to transform your content management

Quick Installation

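# Download GatherHub
# Note: this URL is the latest-release page, not a binary;
# pick the build for your platform from that page.
wget https://github.com/optionalsoftware/gatherhub/releases/latest

# Run GatherHub
./gatherhub

# Set up services
cd deploy
./setup-services.sh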

Download GatherHub

Get the latest version and start archiving your content today.

Download Now

View Documentation

Comprehensive guides for installation, configuration, and advanced usage.

Read Documentation

Ready to Organize Your Digital Life?

Join thousands of users who have transformed their scattered bookmarks and content into a powerful, searchable personal archive.