This quick start guide will help you get up and running with GatherHub in just a few minutes. Follow these steps to start archiving content from various online sources.
GatherHub can run in several modes depending on your needs:
Mode | Command | Description |
---|---|---|
Web Interface | ./gatherhub --web | Starts the web interface for managing downloads through your browser |
API Server | ./gatherhub --api | Starts the API server for programmatic access |
Background Service | ./gatherhub --daemon | Runs in the background to automatically process downloads |
All-in-one | ./gatherhub --web --api --daemon | Starts all components (recommended for most users) |
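Because the mode flags can be combined, a small wrapper can make the choice explicit in scripts. The following is only a sketch: `gatherhub_flags` is a hypothetical helper, not part of GatherHub, and it uses nothing beyond the three flags listed in the table above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with GatherHub): translate friendly
# mode names into the command-line flags listed in the table above.
gatherhub_flags() {
    flags=""
    for mode in "$@"; do
        case "$mode" in
            web)    flags="$flags --web" ;;
            api)    flags="$flags --api" ;;
            daemon) flags="$flags --daemon" ;;
            all)    flags="$flags --web --api --daemon" ;;
            *)      echo "unknown mode: $mode" >&2; return 1 ;;
        esac
    done
    # Strip the leading space before printing the flag list.
    echo "${flags# }"
}
```

With the function defined, `./gatherhub $(gatherhub_flags web api)` would start the web interface and the API server together.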
When running GatherHub for the first time, it's recommended to access the web interface to configure settings:
Start the web interface with ./gatherhub --web, then open http://localhost:8060 in your browser.
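After starting `./gatherhub --web`, it can be useful to confirm that the interface is answering before opening the browser. The sketch below assumes `curl` is installed; `wait_for_ui` is a hypothetical helper name, and the default URL is simply the address given in this guide.

```shell
#!/bin/sh
# Hypothetical helper: poll a URL until it responds or the attempts run out.
# Assumes curl is available; the default URL is the one from this guide.
wait_for_ui() {
    url="${1:-http://localhost:8060}"
    attempts="${2:-10}"
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # -s hides progress output, -o discards the body,
        # --max-time bounds how long each attempt may take.
        if curl -s -o /dev/null --max-time 2 "$url"; then
            echo "Web interface is reachable at $url"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "No response from $url after $attempts attempts" >&2
    return 1
}
```

Calling `wait_for_ui` with no arguments checks http://localhost:8060 up to ten times, one second apart. Any HTTP response counts as success here, so this confirms only that something is listening, not that configuration is complete.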
GatherHub can import content from various sources:
If you have the daemon running, it will automatically scan for new content at the frequency specified in the settings. Otherwise, to manually scan your configured sources for content:
If you have the daemon running, it will automatically process download jobs at the frequency specified in the settings. Otherwise, to manually start processing downloads:
You can monitor download progress in several ways:
Some media types require additional configuration:
To download YouTube videos (especially age-restricted or private content), you need to configure cookies:
Git repositories are cloned and also made available as ZIP archives (via event hook) for download through the web interface. No special configuration is needed for public repositories.
The Web Archive media type is never detected automatically and must be selected manually. You cannot mix media types when you enter more than one URL: if Web Archive is selected, every URL you entered will be crawled and archived.
After getting started, consider exploring these features: