Installation & Requirements

GatherHub requires several dependencies to function properly. This guide will help you set up all the necessary components.

Installation (Debian-based Systems)

Download the binary for your architecture.
Open a terminal, cd to the directory that contains the download, and run: ./gatherhub-linux-{arch} ; cd deploy; ./setup-services.sh
Go to http://localhost:8060
A system check will run. If any third-party tools are missing, a script called install_deps.sh is created to install the missing dependencies.
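
The run step above can be sketched as a small shell helper that prints the commands for the current machine so they can be reviewed before running. The mapping from `uname -m` output to the `{arch}` suffix (`amd64`, `arm64`) is an assumption, since the release file names are not listed here, and the `chmod +x` line is an extra step that a freshly downloaded file usually needs. Adjust both to match the file you actually downloaded.

```shell
#!/bin/sh
# Map the kernel's machine name to a release suffix.
# ASSUMPTION: the suffixes "amd64" and "arm64" are illustrative,
# not taken from the GatherHub release page.
gatherhub_arch() {
    case "${1:-$(uname -m)}" in
        x86_64|amd64)  echo "amd64" ;;
        aarch64|arm64) echo "arm64" ;;
        *)             echo "unknown" ;;
    esac
}

# Print the install commands for this machine (or for the machine
# name passed as $1) instead of executing them.
print_install_steps() {
    arch="$(gatherhub_arch "$1")"
    echo "chmod +x ./gatherhub-linux-${arch}"
    echo "./gatherhub-linux-${arch}"
    echo "cd deploy && ./setup-services.sh"
}

print_install_steps
```

If the helper prints `unknown` as the suffix, pick the binary name by hand from the downloads you have.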

System Requirements

Requirement	Version	Notes
SQLite3	Latest	Required for database operations
External tools	Various	See external tools section below

3rd Party Installation Instructions

Ubuntu/Debian/Raspberry Pi Installation

Run ./gatherhub -web. On first run it checks the dependencies and creates an ./install_deps.sh script that installs anything missing. The commands below are for reference.


                        # Install SQLite
                        sudo apt-get update
                        sudo apt-get install -y sqlite3

                        # Install yt-dlp
                        sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp
                        sudo chmod a+rx /usr/local/bin/yt-dlp

                        # Install git
                        sudo apt-get install -y git

                        # Install aria2
                        sudo apt-get install -y aria2

                        # Install FFmpeg
                        sudo apt-get install -y ffmpeg

                        # Install jq (required for hook scripts)
                        sudo apt-get install -y jq

                        # Install zip (required for creating ZIP archives)
                        sudo apt-get install -y zip

                        # Ensure a browser is installed for SingleFile
                        sudo apt-get install -y chromium-browser

                        # Install pandoc (MS Docx content extractor)
                        sudo apt-get install -y pandoc

                        # Install poppler-utils (PDF content extractor, provides pdftotext)
                        sudo apt-get install -y poppler-utils

                        # Install docx2txt (DOCX text extractor)
                        sudo apt-get install -y docx2txt

                        # Install Tesseract OCR (optional, for comic book text extraction)
                        sudo apt-get install -y tesseract-ocr

Raspberry Pi Specific Instructions


                        # Optional Monolith for HTML file processing
                        sudo curl -L https://github.com/Y2Z/monolith/releases/download/v2.10.1/monolith-gnu-linux-aarch64 -o /usr/local/bin/monolith
                        sudo chmod +x /usr/local/bin/monolith
                        
                        # Single-File-CLI for HTML file processing
                        sudo curl -L https://github.com/gildas-lormeau/single-file-cli/releases/download/v2.0.75/single-file-aarch64-linux -o /usr/local/bin/single-file 
                        sudo chmod +x /usr/local/bin/single-file 

                        # Needed for Monolith on Pi OS 
                        sudo apt install -y libssl1.1
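
Both download URLs above point to aarch64 builds, so they only apply to a 64-bit Raspberry Pi OS. A 32-bit install reports a machine name such as armv7l instead. A minimal sketch of a guard to run before downloading follows. The function name `is_aarch64` is illustrative and not part of GatherHub.

```shell
#!/bin/sh
# Succeed only when the machine name is aarch64 (64-bit ARM Linux).
# An optional machine name can be passed for testing; the default is `uname -m`.
is_aarch64() {
    [ "${1:-$(uname -m)}" = "aarch64" ]
}

if is_aarch64; then
    echo "64-bit ARM detected: the aarch64 binaries should match this system."
else
    echo "Machine is $(uname -m): the aarch64 binaries will not run here."
fi
```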

macOS Installation


                        # Install SQLite
                        brew install sqlite

                        # Install yt-dlp
                        brew install yt-dlp

                        # Install git
                        brew install git

                        # Install aria2
                        brew install aria2

                        # Install FFmpeg
                        brew install ffmpeg

                        # Install jq (required for hook scripts)
                        brew install jq

                        # Install zip (required for creating ZIP archives)
                        brew install zip

                        # Install Node.js (required for SingleFile)
                        brew install node

                        # Install SingleFile CLI
                        npm install -g single-file-cli

                        # Ensure a browser is installed for SingleFile
                        brew install --cask google-chrome

                        # Install pandoc (required for extracting text)
                        brew install pandoc

                        # Install pdftotext (required for extracting text; provided by poppler)
                        brew install poppler

                        # Install docx2txt (required for extracting text)
                        brew install docx2txt

                        # Install Tesseract OCR (optional, for comic book text extraction)
                        brew install tesseract


Windows Installation

GatherHub is written in Go and is also compiled for Windows, so in principle it should mostly work. However, I don't have any Windows machines and have made zero effort to test it. The built-in extractors will likely not work because they have not been ported to Windows. You can, however, modify config.toml to use tools that are available to you.

                            
                                # Install sqlite 
                                https://www.sqlite.org/2025/sqlite-dll-win-x64-3490200.zip

                                # Install single-file cli 
                                https://github.com/gildas-lormeau/single-file-cli/releases/download/v2.0.75/single-file.exe

                                # Install monolith (alternative to single-file)
                                https://github.com/Y2Z/monolith/releases/download/v2.10.1/monolith.exe

                                # Install yt-dlp
                                https://github.com/yt-dlp/yt-dlp/releases/download/2025.05.22/yt-dlp.exe
                                
                                # Install aria2 
                                https://github.com/aria2/aria2/releases/download/release-1.37.0/aria2-1.37.0-win-64bit-build1.zip

                                # Install git 
                                https://github.com/git-for-windows/git/releases/download/v2.49.0.windows.1/Git-2.49.0-64-bit.exe

                                # Install jq (required for hook scripts)
                                winget install jqlang.jq

External Tools

GatherHub relies on several external tools for different media types. Each tool must be installed and available in your system's PATH. On Debian-based systems, use the install_deps.sh script to install dependencies. This file is created automatically, listing any missing dependencies, when you first run ./gatherhub -web.

Tool	Used For	Required
yt-dlp	YouTube videos and playlists	Yes (for streaming video content)
git	Git repositories	Yes (for Git content)
aria2c	General file downloads	Yes (for most file types)
SingleFile	HTML archiving with JavaScript support	Yes (for HTML content)
Monolith	HTML archiving alternative to SingleFile	No (for HTML content)
FFmpeg	Media processing (used by yt-dlp)	Recommended
jq	JSON parsing in hook scripts	Required for hook scripts
zip	Creating ZIP archives	Required for Git repository archives
pandoc	MS Docx content extraction	Required
poppler-utils	PDF content extraction (pdftotext)	Required
docx2txt	DOCX text extraction	Required
tesseract-ocr	OCR text extraction from comic books (CBZ/CBR)	Optional (enhances comic book extraction)
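
Since each tool must be on your PATH, a quick check with `command -v` can show what is still missing. This is a minimal sketch, separate from GatherHub's own system check. The executable names are assumptions based on the install commands earlier in this guide (for example, poppler-utils provides `pdftotext` and tesseract-ocr provides `tesseract`). The DOCX text extractor is left out because its executable name is not confirmed here.

```shell
#!/bin/sh
# Print each given command name that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# ASSUMPTION: these executable names correspond to the tools in the table.
missing="$(missing_tools yt-dlp git aria2c single-file monolith ffmpeg jq zip pandoc pdftotext tesseract)"

if [ -n "$missing" ]; then
    echo "Not found on PATH:"
    echo "$missing"
else
    echo "All listed tools were found on PATH."
fi
```

Monolith and Tesseract are optional, so seeing them in the output does not necessarily indicate a problem.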

Post-Installation Verification

After installing all dependencies, you can verify that everything is working correctly by running:

./gatherhub --help

You should see the help message listing all available commands and options. If you encounter any errors, check that all dependencies are properly installed and available in your PATH. The first time you run the gatherhub binary, it automatically creates the directories and files it needs.

If you choose not to use deploy/setup-services.sh, you will need to start GatherHub manually. Make sure you run gatherhub from the directory it is in; otherwise you will have issues.

./gatherhub --api --web --daemon
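
Because the binary has to be started from its own directory, a small wrapper can make that hard to get wrong. This is a minimal sketch. The function name `start_gatherhub` and the path in the usage note are hypothetical, and the flags are the ones shown above.

```shell
#!/bin/sh
# Change into the directory that contains the binary, then start it
# with the flags from the manual command.
# $1 is the path to the gatherhub binary (default: ./gatherhub).
start_gatherhub() {
    bin="${1:-./gatherhub}"
    cd "$(dirname "$bin")" || return 1
    "./$(basename "$bin")" --api --web --daemon
}
```

For example, `start_gatherhub /opt/gatherhub/gatherhub` would change into /opt/gatherhub first. Note that the function changes the working directory of the calling shell, so run it in a subshell if that matters.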

Configuration

After installation, you'll need to configure GatherHub to match your environment. See the Configuration page for details.

Note: Depending on your system and how you installed the tools, you may need to adjust the tool_path values in your configuration to point to the correct locations for each tool.
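
To find the values to put in those settings, you can ask the shell where each tool lives. This is a minimal sketch that only prints locations. It does not read or write the configuration file, whose exact layout is described on the Configuration page rather than here. The tool names passed at the end are examples.

```shell
#!/bin/sh
# Print "name<TAB>location" for each tool, using "(not found)" when
# the tool is not on PATH.
tool_locations() {
    for tool in "$@"; do
        location="$(command -v "$tool" 2>/dev/null)" || location="(not found)"
        printf '%s\t%s\n' "$tool" "$location"
    done
}

tool_locations yt-dlp aria2c single-file ffmpeg pandoc
```

For ordinary executables `command -v` prints an absolute path. For shell aliases, functions, or builtins it prints the name or definition instead, so check the output before copying it into the configuration.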