GatherHub requires several dependencies to function properly. This guide will help you set up all the necessary components.
./gatherhub-linux-{arch} ; cd deploy; ./setup-services.sh
Requirement | Version | Notes |
---|---|---|
SQLite3 | Latest | Required for database operations |
External tools | Various | See external tools section below |
./gatherhub -web
this will do a first-run check of dependencies and create a ./install_deps.sh
script for anything missing that you can run.
The below is for reference.
# Install SQLite
sudo apt-get update
sudo apt-get install -y sqlite3
# Install yt-dlp
sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp
sudo chmod a+rx /usr/local/bin/yt-dlp
# Install git
sudo apt-get install -y git
# Install aria2
sudo apt-get install -y aria2
# Install FFmpeg
sudo apt-get install -y ffmpeg
# Install jq (required for hook scripts)
sudo apt-get install -y jq
# Install zip (required for creating ZIP archives)
sudo apt-get install -y zip
# Ensure a browser is installed for SingleFile
sudo apt-get install -y chromium-browser
# Install MS Docx content extractor
sudo apt install -y pandoc
# Install PDF content extractor
sudo apt install poppler-utils
# Install DOC2XTEXT content extractor
sudo apt install docx2text
# Install Tesseract OCR (optional, for comic book text extraction)
sudo apt-get install -y tesseract-ocr
# Optional Monolith for HTML file processing
sudo curl -L https://github.com/Y2Z/monolith/releases/download/v2.10.1/monolith-gnu-linux-aarch64 -o /usr/local/bin/monolith
sudo chmod +x /usr/local/bin/monolith
# Single-File-CLI for HTML file processing
sudo curl -L https://github.com/gildas-lormeau/single-file-cli/releases/download/v2.0.75/single-file-aarch64-linux -o /usr/local/bin/single-file
sudo chmod +x /usr/local/bin/single-file
# Needed for Monolith on Pi OS
sudo apt install -y libssl1.1
# Install SQLite
brew install sqlite
# Install yt-dlp
brew install yt-dlp
# Install git
brew install git
# Install aria2
brew install aria2
# Install FFmpeg
brew install ffmpeg
# Install jq (required for hook scripts)
brew install jq
# Install zip (required for creating ZIP archives)
brew install zip
# Install Node.js (required for SingleFile)
brew install node
# Install SingleFile CLI
npm install -g single-file-cli
# Ensure a browser is installed for SingleFile
brew install --cask google-chrome
# Install pandoc (required for extracting text)
brew install pandoc
# Install pdftotext (required for extracting text)
brew install pdftoipe
# Install doc2text (required for extracting text)
brew install doc2text
# Install Tesseract OCR (optional, for comic book text extraction)
brew install tesseract
GatherHub is written in golang and compiled for Windows. Technically it should mostly work.
I don't have any Windows machines and have made zero effort to test. You built in extractors will
likely not work as they're not ported to Windows. You can however modify config.toml to use tools
that are available to you.
# Install sqlite
https://www.sqlite.org/2025/sqlite-dll-win-x64-3490200.zip
# Install single-file cli
https://github.com/gildas-lormeau/single-file-cli/releases/download/v2.0.75/single-file.exe
# Install monolith (alternative to single-file)
https://github.com/Y2Z/monolith/releases/download/v2.10.1/monolith.exe
# Install yt-dlp
https://github.com/yt-dlp/yt-dlp/releases/download/2025.05.22/yt-dlp.exe
# Install aria2
https://github.com/aria2/aria2/releases/download/release-1.37.0/aria2-1.37.0-win-64bit-build1.zip
# Install git
https://github.com/git-for-windows/git/releases/download/v2.49.0.windows.1/Git-2.49.0-64-bit.exe
# Install jq (required for hook scripts)
Use winget to install jq with winget install jqlang.jq.
GatherHub relies on several external tools for different media types. Each tool needs to be installed and available in your system's PATH.
For Debian users, use the install_dep.sh
script to install dependencies. This file is created automatically with any missing
dependencies when you first run
./gatherhub -web
.
Tool | Used For | Required |
---|---|---|
yt-dlp | YouTube videos and playlists | Yes (for streaming video content) |
git | Git repositories | Yes (for Git content) |
aria2c | General file downloads | Yes (for most file types) |
SingleFile | HTML archiving with JavaScript support | Yes (for HTML content) |
Monolith | HTML archiving alternative to SingleFile | No (for HTML content) |
FFmpeg | Media processing (used by yt-dlp) | Recommended |
jq | JSON parsing in hook scripts | Required for hook scripts |
zip | Creating ZIP archives | Required for Git repository archives |
pandoc | Content Extractor | Required |
poppler-utils | Content Extractor | Required |
docx2text | Content Extractor | Required |
tesseract-ocr | OCR text extraction from comic books (CBZ/CBR) | Optional (enhances comic book extraction) |
After installing all dependencies, you can verify that everything is working correctly by running:
./gatherhub --help
You should see the help message listing all available commands and options. If you encounter any errors, check that all dependencies are properly installed and available in your PATH. If this is the first time executing the gatherhub binary it will create directories and files needed automatically.
If you choose not to use the deploy/setup-services.sh
then you will need to execute commands manually.
Make sure you execute gatherhub from the directory it's in otherwise you will have issues.
./gatherhub --api --web --daemon
After installation, you'll need to configure GatherHub to match your environment. See the Configuration page for details.
tool_path
values in your configuration to point to the correct locations for each tool.