Configuration

GatherHub uses a TOML configuration file to control its behavior. This page documents the available configuration options and how to customize them.

Configuration File Location

By default, GatherHub looks for a configuration file at:

./data/config/config.toml

You can specify a different location using the --config flag:

./gatherhub --config /path/to/custom-config.toml

Configuration Structure

The configuration file is organized into sections, each controlling a different aspect of GatherHub's behavior. Here's an overview of the main sections:

Section	Purpose
`[storage]`	File storage paths and organization
`[web_interface]`	Web interface settings
`[api]`	API server settings
`[concurrency]`	Download worker settings
`[scheduling]`	Automatic scanning and downloading
`[logging]`	Log file paths and levels
`[event_hooks]`	Custom script execution
`[auto_clean]`	Automatic job cleanup
`[[sources]]`	Data source definitions (can have multiple)
`[[media_types]]`	Media type definitions

Example Configuration

If you delete data/config/config.toml, GatherHub generates a default configuration the next time you run ./gatherhub -web. This is handy if your config is broken beyond repair and you want to start from scratch.

Section Details

Storage Settings

The [storage] section controls where downloaded content is stored:

[storage]
base_path = "./downloads"  # Base directory for all downloads

[storage.by_type]
html = "html"              # Subdirectory for HTML content
youtube = "youtube"        # Subdirectory for YouTube videos
# ... other media types

See the Storage Settings page for detailed information.
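
As an illustration of how base_path and [storage.by_type] combine, the sketch below resolves the download directory for a media type. It assumes the subdirectory is simply joined onto base_path, and that a media type without an entry falls back to its own name; both are assumptions for illustration, not documented GatherHub behavior.

```python
from pathlib import PurePosixPath


def download_dir(storage, media_type):
    """Join base_path with the subdirectory configured for media_type."""
    by_type = storage.get("by_type", {})
    # Assumption: an unlisted media type falls back to a folder named after it.
    subdir = by_type.get(media_type, media_type)
    return PurePosixPath(storage["base_path"]) / subdir


storage = {
    "base_path": "./downloads",
    "by_type": {"html": "html", "youtube": "youtube"},
}
youtube_dir = download_dir(storage, "youtube")
```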

Web Interface Settings

The [web_interface] section controls the web UI:

[web_interface]
enabled = true             # Enable/disable the web interface
host = "0.0.0.0"           # Interface to listen on (0.0.0.0 for all)
port = 8060                # Port to listen on
session_timeout_minutes = 1440  # Session timeout in minutes (24 hours)
allow_iframe = false       # Whether to allow embedding in iframes

# User authentication (one [[web_interface.users]] entry per user)
[[web_interface.users]]
username = "admin"
password_hash = "scrypt:32768:8:1$salt$hash"  # Scrypt-hashed password

Note: Passwords are stored as scrypt hashes, not as plain text. Do not edit password_hash values by hand in the config file; use the web interface to change passwords.

API Settings

The [api] section controls the REST API:

[api]
enabled = true             # Enable/disable the API
host = "127.0.0.1"         # Interface to listen on (127.0.0.1 for localhost only)
port = 5000                # Port to listen on
debug = false              # Enable debug mode

[api.auth]
enabled = true             # Enable/disable API authentication
api_secret = "your-secret-key"  # Secret for API keys (replace the placeholder)
apiKey_expiry_hours = 24    # API key expiration time in hours

See the API Reference page for more details.

Concurrency Settings

The [concurrency] section controls download workers:

[concurrency]
max_workers = 3           # Maximum number of simultaneous downloads
timeout_seconds = 600     # Download timeout in seconds (10 minutes)
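
The two settings can be pictured as a bounded worker pool plus a per-download time limit. The sketch below is one way to express that in Python, assuming each download is an external command; run_one and run_all are hypothetical names, and GatherHub's actual worker implementation may differ.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_one(cmd, timeout_seconds):
    """Run one command; return its exit code, or None if it timed out."""
    try:
        return subprocess.run(cmd, timeout=timeout_seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return None


def run_all(commands, max_workers=3, timeout_seconds=600):
    """Run commands with at most max_workers running at the same time."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input commands.
        return list(pool.map(lambda c: run_one(c, timeout_seconds), commands))
```

With max_workers = 3, a fourth download waits until one of the first three finishes or hits the timeout.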

Scheduling Settings

The [scheduling] section controls automatic operations:

[scheduling]
enabled = true             # Enable/disable scheduled operations
interval_minutes = 60      # How often to run scheduled tasks

See the Scheduling page for more details.

Logging Settings

The [logging] section controls log files and verbosity:

[logging]
app_log_path = "./data/logs/app.log"            # Application log
activity_log_path = "./data/logs/activity.log"  # Activity log
error_log_path = "./data/logs/error.log"        # Error log
level = "info"             # Log level (debug, info, warn, error)
console_logging = true     # Also log to console
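
For readers who think in terms of Python's logging module, the level and console_logging options map roughly as sketched below. This is only an analogy under stated assumptions: build_logger is a hypothetical helper, it handles only app_log_path, and it says nothing about how GatherHub itself writes its logs.

```python
import logging

# The four level names listed in the config comment above.
LEVELS = {
    "debug": logging.DEBUG,
    "info": logging.INFO,
    "warn": logging.WARNING,
    "error": logging.ERROR,
}


def build_logger(cfg, name="gatherhub-example"):
    """Create a logger that writes to app_log_path and optionally the console."""
    logger = logging.getLogger(name)
    logger.setLevel(LEVELS[cfg.get("level", "info")])
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    logger.addHandler(logging.FileHandler(cfg["app_log_path"]))
    if cfg.get("console_logging", False):
        logger.addHandler(logging.StreamHandler())
    return logger
```

The directory that holds the log file must already exist, since FileHandler does not create parent directories.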

Event Hooks Settings

The [event_hooks] section controls custom scripts:

[event_hooks]
enabled = true             # Enable/disable event hooks
hooks_dir = "./data/hooks" # Directory containing hook scripts

# Individual hook definitions
[[event_hooks.hooks]]
event = "post_download"    # When the hook triggers
script = "notify.py"       # Script filename
enabled = true             # Enable/disable this specific hook
description = "Notifies when downloads complete"  # Optional description

See the Event Hooks page for more details.
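
The interaction between the section-wide enabled flag, the per-hook enabled flag, and hooks_dir can be sketched as follows. The function name hooks_for_event is hypothetical, and treating a hook without an enabled key as enabled is an assumption.

```python
from pathlib import PurePosixPath


def hooks_for_event(event_hooks, event):
    """Return script paths of the enabled hooks registered for an event."""
    if not event_hooks.get("enabled", False):
        return []  # the section-wide switch disables every hook
    hooks_dir = PurePosixPath(event_hooks["hooks_dir"])
    return [
        hooks_dir / hook["script"]
        for hook in event_hooks.get("hooks", [])
        # Assumption: a hook with no "enabled" key counts as enabled.
        if hook["event"] == event and hook.get("enabled", True)
    ]


example = {
    "enabled": True,
    "hooks_dir": "./data/hooks",
    "hooks": [
        {"event": "post_download", "script": "notify.py", "enabled": True},
    ],
}
scripts = hooks_for_event(example, "post_download")
```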

Auto-Clean Settings

The [auto_clean] section controls automatic job cleanup:

[auto_clean]
enabled = true             # Enable/disable auto-clean functionality
retry_failed = true        # Attempt to retry failed jobs before cleaning
max_retries = 3            # Maximum retry attempts for failed jobs
clean_after_days = 30      # Remove completed jobs older than this many days

The auto-clean functionality has two independent components:

Failed Job Cleanup: Removes failed jobs that have exceeded the maximum retry attempts (controlled by max_retries).
Completed Job Cleanup: Removes completed jobs older than the specified number of days (controlled by clean_after_days).

Tip: To clean up failed jobs while keeping all completed jobs, set enabled = true and clean_after_days = 0. A clean_after_days value of 0 disables cleanup of completed jobs; failed jobs are still cleaned up.
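
The two cleanup rules can be expressed as a small selection function. In this sketch the job fields (status, retries, finished_at) are hypothetical, "exceeded the maximum retry attempts" is read as retries >= max_retries, and the retry_failed option is not modeled.

```python
from datetime import datetime, timedelta


def jobs_to_clean(jobs, cfg, now):
    """Return the jobs that the auto-clean rules above would remove."""
    if not cfg["enabled"]:
        return []
    selected = []
    for job in jobs:
        if job["status"] == "failed":
            # Assumption: a job is out of retries once retries >= max_retries.
            if job["retries"] >= cfg["max_retries"]:
                selected.append(job)
        elif job["status"] == "completed":
            days = cfg["clean_after_days"]
            # A value of 0 disables cleanup of completed jobs.
            if days > 0 and now - job["finished_at"] > timedelta(days=days):
                selected.append(job)
    return selected
```

Because the two rules are independent, setting clean_after_days to 0 changes only the completed-job branch and leaves the failed-job branch untouched.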

Media Types Settings

The [[media_types]] section defines how different content types are handled:

[[media_types]]
name = "youtube"           # Media type name
tool = "yt-dlp"            # External tool to use
tool_path = ""             # Optional path to tool executable
arguments = "-f 'bestvideo[height<=1080]+bestaudio/best[height<=1080]'"  # Tool arguments
patterns = ["youtube.com/watch", "youtu.be/"]  # URL patterns for auto-detection
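
A rough sketch of how such an entry could drive URL detection and tool invocation is shown below. It assumes patterns are plain substrings of the URL, that an empty tool_path means the tool is found on PATH, that arguments are split shell-style, and that the URL is appended last. These are illustrative assumptions, and the function names are hypothetical.

```python
import shlex


def detect_media_type(url, media_types):
    """Return the first media type whose pattern occurs in the URL, else None."""
    for media_type in media_types:
        if any(pattern in url for pattern in media_type["patterns"]):
            return media_type
    return None


def build_command(media_type, url):
    """Build an argument list for the external tool (assumed layout)."""
    # Assumption: tool_path, when set, is the full path to the executable.
    executable = media_type.get("tool_path") or media_type["tool"]
    # shlex.split honors the single quotes around the format selector.
    return [executable, *shlex.split(media_type["arguments"]), url]
```

Passing the command as a list rather than a single shell string keeps characters such as [ and < in the format selector from being interpreted by a shell.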

Editing Configuration

You can edit the configuration in several ways:

Text Editor

Edit config.toml directly with any text editor. GatherHub picks up the changes without a restart.

Web Interface

Many configuration options can be changed through the Settings page in the web interface, which will automatically update the config file.

Warning: Some advanced settings are only available by editing the configuration file directly.

Configuration Validation

GatherHub validates the configuration at startup and reports any errors to the console or log files. Where possible, it falls back to default values for settings that fail validation.
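
The validate-then-fall-back pattern can be sketched for the [concurrency] section as follows. The validation rule (positive integers only) and the error message format are assumptions for illustration; the defaults are taken from the example values on this page.

```python
# Example values from the [concurrency] section above, used here as defaults.
DEFAULTS = {"max_workers": 3, "timeout_seconds": 600}


def validate_concurrency(section):
    """Return (settings, errors); invalid or missing values use defaults."""
    settings, errors = {}, []
    for key, default in DEFAULTS.items():
        value = section.get(key, default)
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, int) or value <= 0:
            errors.append(
                f"[concurrency] {key}: expected a positive integer, "
                f"got {value!r}; using {default}"
            )
            value = default
        settings[key] = value
    return settings, errors
```

Collecting every error before returning, rather than stopping at the first one, lets all problems be reported in a single startup pass.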