Related Tools
Note: GatherHub can be complementary to, and used alongside, most or all of these other solutions.
This is not a comprehensive list; these are self-hosted tools that are loosely similar or provide some of the same functionality.
Links to these other solutions are provided in the hope that one may suit your needs better if GatherHub does not.
Several of these inspired the creation of GatherHub:
- Self-hosting and archiving in general (ArchiveBox).
- Golang for speed and ease of installation (single binary à la Readeck).
- Fully functional offline solution providing a library of your media (WROLPi).
Site | Description | Language |
WROLPi | Self-hosted OS solution + Archiver: An offline library of media, web archives, maps, and other resources, accessible via a Wi-Fi network without relying on external internet access | Python |
ArchiveBox | Self-hosted Web Archive: An internet archiving solution to collect, save, and view websites offline | Python |
SOSSE | Self-hosted Crawler & Archiver: A web archiving software, crawler, and search engine | Python |
Paperless-ngx | Searchable online Archiver: A document management system that transforms your physical documents into a searchable online archive | Python |
Readeck | Self-hosted Web Archiver: Save the precious readable content of web pages you like and want to keep forever | Golang |
LinkAce | Self-hosted Web Archiver: A self-hosted web application for managing a bookmark archive | PHP |
How does GatherHub Differ?
The information in the table below may be out of date; please check each tool's current feature set.
Notable features that are likely unique to GatherHub
- Upload and download just about anything
- What you download, and the tool used to download it, are completely configurable
- Text extractors for many formats, including OCR for CBZ files
- Event hooks/triggers can be written in any language to do whatever you want
- Your own personal Google: a dedicated search engine is built in. Similar but separate tech is used for the Documentation search
- Everything is designed to work offline: search, docs, uploading, tagging, and the UI. Only downloading from the internet requires a connection.
- Ingestion sources are configurable and support more than just URLs (repos, bookmark managers, databases, RSS feeds, etc.)
- Designed not to be a bookmark manager; use the tools you love and just point GatherHub at them
- Highly extensible: with the custom tooling configuration, APIs, and event hooks, you can integrate or chain GatherHub with other tools to do even more
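To give a feel for what an event hook could look like, here is a minimal sketch in Python. It is not GatherHub's documented hook interface: the JSON payload, its `event`, `url`, and `status` keys, and the function names `summarize` and `handle` are all assumptions made for illustration. Check the GatherHub documentation for how hooks actually receive their data.

```python
"""Hypothetical event hook: append a one-line summary of each event to a log.

ASSUMPTION: the event is a JSON object with "event", "url", and "status"
keys. These names are illustrative, not GatherHub's documented format.
"""
import json
from pathlib import Path


def summarize(event: dict) -> str:
    """Build a tab-separated summary line from a (hypothetical) event payload."""
    return "\t".join(
        str(event.get(key, "unknown")) for key in ("event", "url", "status")
    )


def handle(raw: str, log_path: Path) -> str:
    """Parse one JSON event, append its summary to log_path, return the summary."""
    line = summarize(json.loads(raw))
    with log_path.open("a", encoding="utf-8") as log:
        log.write(line + "\n")
    return line
```

If the payload were delivered on standard input, the script's entry point could simply call `handle(sys.stdin.read(), Path("hook.log"))`. The same pattern carries over to any language that can read text and parse JSON.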
Feature | GatherHub | WROLPi | ArchiveBox | SOSSE | Paperless-ngx | Readeck | LinkAce |
Multi-source ingestion | ✅ | ❌ | ✅ | ✅ | ✅ | ❌ | ✅ |
Full webpage snapshot | ✅ | ✅ | ✅ | ✅ | ❌ | ✅ (Readable only) | ❌ |
Domain Crawling | ✅ | ❌ | ⚠️ Limited | ✅ | ❌ | ❌ | ❌ |
Readable text extraction | ✅ | ❌ | ✅ | ✅ | ❌ | ✅ | ❌ |
OCR Support | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ |
Archive anything (RSS, repos, torrents) | ✅ | ⚠️ No repos | ⚠️ No repos, EPUB, complex formats | ⚠️ No repos, EPUB, limited files | ❌ Docs only | ❌ Web text only | ❌ URLs only |
Upload: any files | ✅ | ✅ | ❌ | ❌ | ✅ | ❌ | ❌ |
Tooling flexibility | ✅ Configurable | ⚠️ Predefined | ⚠️ Limited (preset tools) | ❌ | ❌ | ❌ | ❌ |
Event hooks / triggers | ✅ Full support | ❌ | ⚠️ CLI wrapping only | ❌ | ⚠️ Basic OCR triggers | ❌ | ❌ |
Content manager (UI + tagging) | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Dedicated search engine | ✅ | ❌ SQL Query | ⚠️ Optional | ✅ | ✅ | ❌ SQL Query | ❌ SQL Query |
REST API | ✅ | ❌ | ✅ | ❌ | ✅ | ❌ | ✅ |
Job tracking & logging | ✅ Full DB | ❌ | ⚠️ Logs only | ⚠️ Basic logs only | ✅ OCR-focused logs | ❌ | ❌ |
Offline LAN access | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
Extensibility / Modularity | ✅ | ❌ | ⚠️ Some plugin/tool config | ❌ | ❌ | ❌ | ❌ |
Related Links