Installation & Requirements

GatherHub requires several dependencies to function properly. This guide will help you set up all the necessary components.

System Requirements

Requirement     Version  Notes
SQLite3         Latest   Required for database operations
External tools  Various  See the External Tools section below

3rd Party Installation Instructions

Ubuntu/Debian/Raspberry Pi Installation

Run ./gatherhub -web first. It performs a first-run dependency check and creates an ./install_deps.sh script that you can run to install anything missing. The commands below are for reference.

                        # Install SQLite
                        sudo apt-get update
                        sudo apt-get install -y sqlite3

                        # Install yt-dlp
                        sudo curl -L https://github.com/yt-dlp/yt-dlp/releases/latest/download/yt-dlp -o /usr/local/bin/yt-dlp
                        sudo chmod a+rx /usr/local/bin/yt-dlp

                        # Install git
                        sudo apt-get install -y git

                        # Install aria2
                        sudo apt-get install -y aria2

                        # Install FFmpeg
                        sudo apt-get install -y ffmpeg

                        # Install jq (required for hook scripts)
                        sudo apt-get install -y jq

                        # Install zip (required for creating ZIP archives)
                        sudo apt-get install -y zip

                        # Ensure a browser is installed for SingleFile
                        sudo apt-get install -y chromium-browser

                        # Install pandoc (MS Docx content extractor)
                        sudo apt-get install -y pandoc

                        # Install poppler-utils (PDF content extractor, provides pdftotext)
                        sudo apt-get install -y poppler-utils

                        # Install docx2txt (DOCX-to-text content extractor)
                        sudo apt-get install -y docx2txt

                        # Install Tesseract OCR (optional, for comic book text extraction)
                        sudo apt-get install -y tesseract-ocr 

                        

Raspberry Pi Specific Instructions


                        # Optional Monolith for HTML file processing
                        sudo curl -L https://github.com/Y2Z/monolith/releases/download/v2.10.1/monolith-gnu-linux-aarch64 -o /usr/local/bin/monolith
                        sudo chmod +x /usr/local/bin/monolith
                        
                        # Single-File-CLI for HTML file processing
                        sudo curl -L https://github.com/gildas-lormeau/single-file-cli/releases/download/v2.0.75/single-file-aarch64-linux -o /usr/local/bin/single-file 
                        sudo chmod +x /usr/local/bin/single-file 

                        # Needed for Monolith on Pi OS 
                        sudo apt install -y libssl1.1 
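
The Monolith and Single-File-CLI downloads above are aarch64 builds, so they only apply to a 64-bit system. As a rough pre-check, and assuming `uname -m` reflects the userland you are actually running, a sketch like the following can confirm the machine type before downloading. The helper name is_aarch64 is illustrative and not part of GatherHub.

```shell
#!/bin/sh
# Succeeds when the given machine string (as printed by `uname -m`)
# names a 64-bit ARM system. is_aarch64 is a hypothetical helper name.
is_aarch64() {
  case "$1" in
    aarch64|arm64) return 0 ;;
    *) return 1 ;;
  esac
}

machine=$(uname -m)
if is_aarch64 "$machine"; then
  echo "64-bit ARM detected ($machine): the aarch64 downloads should match."
else
  echo "Machine type is $machine: the aarch64 binaries will likely not run here." >&2
fi
```

A 32-bit Raspberry Pi OS install typically reports armv7l or armv6l, in which case these prebuilt binaries are not the right ones.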

                        

macOS Installation


                        # Install SQLite
                        brew install sqlite

                        # Install yt-dlp
                        brew install yt-dlp

                        # Install git
                        brew install git

                        # Install aria2
                        brew install aria2

                        # Install FFmpeg
                        brew install ffmpeg

                        # Install jq (required for hook scripts)
                        brew install jq

                        # Install zip (required for creating ZIP archives)
                        brew install zip

                        # Install Node.js (required for SingleFile)
                        brew install node

                        # Install SingleFile CLI
                        npm install -g single-file-cli

                        # Ensure a browser is installed for SingleFile
                        brew install --cask google-chrome

                        # Install pandoc (required for extracting text)
                        brew install pandoc 

                        # Install poppler, which provides pdftotext (required for extracting text)
                        brew install poppler

                        # Install docx2txt (required for extracting text)
                        brew install docx2txt

                        # Install Tesseract OCR (optional, for comic book text extraction)
                        brew install tesseract 
                        

Windows Installation

GatherHub is written in Go and compiled for Windows, so it should mostly work. I don't have any Windows machines, however, and have not tested it. The built-in extractors will likely not work because they have not been ported to Windows. You can, however, modify config.toml to use tools that are available to you.
                        
                            
                                # Install sqlite 
                                https://www.sqlite.org/2025/sqlite-dll-win-x64-3490200.zip

                                # Install single-file cli 
                                https://github.com/gildas-lormeau/single-file-cli/releases/download/v2.0.75/single-file.exe

                                # Install monolith (alternative to single-file)
                                https://github.com/Y2Z/monolith/releases/download/v2.10.1/monolith.exe

                                # Install yt-dlp
                                https://github.com/yt-dlp/yt-dlp/releases/download/2025.05.22/yt-dlp.exe
                                
                                # Install aria2 
                                https://github.com/aria2/aria2/releases/download/release-1.37.0/aria2-1.37.0-win-64bit-build1.zip

                                # Install git 
                                https://github.com/git-for-windows/git/releases/download/v2.49.0.windows.1/Git-2.49.0-64-bit.exe

                                # Install jq with winget (required for hook scripts)
                                winget install jqlang.jq

                            
                        

External Tools

GatherHub relies on several external tools for different media types. Each tool must be installed and available in your system's PATH. On Debian-based systems, run the install_deps.sh script to install missing dependencies. GatherHub creates this file automatically, listing whatever is missing, the first time you run ./gatherhub -web.

Tool           Used For                                        Required
yt-dlp         YouTube videos and playlists                    Yes (for streaming video content)
git            Git repositories                                Yes (for Git content)
aria2c         General file downloads                          Yes (for most file types)
SingleFile     HTML archiving with JavaScript support          Yes (for HTML content)
Monolith       HTML archiving alternative to SingleFile        No (for HTML content)
FFmpeg         Media processing (used by yt-dlp)               Recommended
jq             JSON parsing in hook scripts                    Required for hook scripts
zip            Creating ZIP archives                           Required for Git repository archives
pandoc         Content extractor                               Required
poppler-utils  Content extractor                               Required
docx2txt       Content extractor                               Required
tesseract-ocr  OCR text extraction from comic books (CBZ/CBR)  Optional (enhances comic book extraction)
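
Since every tool has to be on the PATH, a quick check along these lines can show what is still missing. This is only a sketch: the function name missing_tools is made up for illustration, and the command names passed to it (for example pdftotext for poppler-utils and single-file for SingleFile) are assumptions about how each package exposes its binary on your system.

```shell
#!/bin/sh
# Print each given command that cannot be found on PATH, one per line.
# Returns 1 if at least one command is missing, 0 otherwise.
# missing_tools is a hypothetical helper, not part of GatherHub.
missing_tools() {
  mt_status=0
  for mt_tool in "$@"; do
    if ! command -v "$mt_tool" >/dev/null 2>&1; then
      printf '%s\n' "$mt_tool"
      mt_status=1
    fi
  done
  return "$mt_status"
}

# Assumed command names; adjust to match your installation.
missing_tools sqlite3 yt-dlp git aria2c single-file ffmpeg jq zip pandoc pdftotext tesseract \
  || echo "The tools listed above were not found on PATH." >&2
```

The first-run check in ./gatherhub -web remains the authoritative test; this only helps when you want a quick look without starting GatherHub.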

Post-Installation Verification

After installing all dependencies, you can verify that everything is working correctly by running:

./gatherhub --help

You should see the help message listing all available commands and options. If you encounter any errors, check that all dependencies are installed and available in your PATH. The first time the gatherhub binary runs, it automatically creates the directories and files it needs.

If you choose not to use deploy/setup-services.sh, you will need to start GatherHub manually. Make sure you run gatherhub from the directory that contains the binary, otherwise you will have issues.

./gatherhub --api --web --daemon
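
Because gatherhub should be started from its own directory, one option is a small wrapper script stored next to the binary. The following is a minimal sketch, not something shipped with GatherHub: the script name run-gatherhub.sh and the helper script_dir are hypothetical.

```shell
#!/bin/sh
# run-gatherhub.sh (hypothetical name): place it next to the gatherhub binary.

# Print the absolute directory that contains the given file path.
script_dir() {
  (CDPATH= cd -- "$(dirname -- "$1")" && pwd)
}

# Switch to the script's own directory so gatherhub starts from where it lives.
cd "$(script_dir "$0")" || exit 1

if [ -x ./gatherhub ]; then
  # Same flags as the manual command above.
  exec ./gatherhub --api --web --daemon
else
  echo "gatherhub binary not found in $(pwd)" >&2
fi
```

With this in place, the wrapper can be called from any working directory, for example from a cron entry or a login script.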

Configuration

After installation, you'll need to configure GatherHub to match your environment. See the Configuration page for details.

Note: Depending on your system and how you installed the tools, you may need to adjust the tool_path values in your configuration to point to the correct locations for each tool.
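
To find the values to use, it can help to print where each tool actually resolves on your PATH. The sketch below assumes a POSIX shell; tool_paths is an illustrative helper name, and the exact keys in config.toml should be taken from the Configuration page rather than from this example.

```shell
#!/bin/sh
# Print "name<TAB>resolved location" for each given tool that is found on PATH.
# Tools that are not installed are skipped silently.
# tool_paths is a hypothetical helper, not part of GatherHub.
tool_paths() {
  for tp_tool in "$@"; do
    tp_path=$(command -v "$tp_tool" 2>/dev/null) \
      && printf '%s\t%s\n' "$tp_tool" "$tp_path"
  done
  return 0
}

# Assumed command names; adjust to match your installation.
tool_paths yt-dlp aria2c single-file monolith ffmpeg pandoc pdftotext
```

Each printed location is a candidate for the matching tool_path value.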