GatherHub supports an event hook system that allows you to execute custom scripts when certain events occur. This enables integration with other systems and customization of the download workflow.
Event hooks are executable scripts located in the hooks directory (configured as event_hooks.hooks_dir in your config.toml).
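When an event occurs, GatherHub runs each enabled hook configured for that event, passes the event data to the script as JSON on standard input, and sets the GATHERHUB_* environment variables. A non-zero exit code is logged, and processing continues.
The following events are supported: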
Event | Description | Trigger Point |
---|---|---|
pre_download | Called before a download starts | After job is selected for download but before download begins |
post_download | Called after a download completes successfully | After download finishes and job status is set to "completed" |
on_error | Called when a download fails | After job status is set to "failed" |
on_status_change | Called when a job's status changes | After any job status update |
on_source_scan | Called when a source is scanned | After a scan operation completes |
on_startup | Called when the application starts | During application initialization |
on_shutdown | Called when the application shuts down | During graceful shutdown |
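If one script is registered for more than one event, it can branch on the event name. The following is a minimal sketch under stated assumptions: the handler functions and their messages are hypothetical, and only the post_download payload is documented here, so the `id` field used for on_error is an assumption.

```python
#!/usr/bin/env python3
"""Sketch: one hook script that dispatches on the triggering event."""
import json
import os
import sys


def on_post_download(data):
    # Hypothetical handler: summarize a finished download.
    return f"downloaded {data.get('url')} to {data.get('file_path')}"


def on_error(data):
    # Hypothetical handler; assumes the on_error payload also carries "id".
    return f"job {data.get('id')} failed"


# Map event names (from the table above) to handlers; other events are ignored.
HANDLERS = {
    "post_download": on_post_download,
    "on_error": on_error,
}


def dispatch(event, data):
    """Return the handler's message, or None if the event has no handler."""
    handler = HANDLERS.get(event)
    return handler(data) if handler else None


def main(stdin=sys.stdin, environ=os.environ):
    raw = stdin.read()
    data = json.loads(raw) if raw.strip() else {}  # tolerate an empty payload
    message = dispatch(environ.get("GATHERHUB_EVENT"), data)
    if message:
        print(message)
    return 0
```

To use this as a hook, end the script with `sys.exit(main())`.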
Hooks are configured in the config.toml file:
[event_hooks]
enabled = true
hooks_dir = './data/hooks'
# Hook 1: Generic notification hook
[[event_hooks.hooks]]
event = 'post_download'
script = 'notify.py'
enabled = true
# Hook 2: Git ZIP archive creation
[[event_hooks.hooks]]
event = 'post_download'
script = 'gatherhub_zip_git_repos.sh'
enabled = true
description = "System hook: Creates ZIP archives of git repositories for web download"
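A configuration like the one above can be sanity-checked before starting GatherHub. The following is a minimal sketch and not part of GatherHub: `check_hooks`, `check_file`, and `KNOWN_EVENTS` are hypothetical names, and resolving a relative hooks_dir against the current working directory is an assumption.

```python
"""Sketch: sanity-check the [event_hooks] section of config.toml."""
import os

# Event names as listed in the events table of this document.
KNOWN_EVENTS = {
    "pre_download", "post_download", "on_error", "on_status_change",
    "on_source_scan", "on_startup", "on_shutdown",
}


def check_hooks(config, base_dir="."):
    """Return a list of human-readable problems found in the hook config."""
    section = config.get("event_hooks", {})
    # Assumption: a relative hooks_dir is resolved against base_dir.
    hooks_dir = os.path.join(base_dir, section.get("hooks_dir", ""))
    problems = []
    for i, hook in enumerate(section.get("hooks", []), start=1):
        event, script = hook.get("event"), hook.get("script")
        if event not in KNOWN_EVENTS:
            problems.append(f"hook {i}: unknown event {event!r}")
        if not script:
            problems.append(f"hook {i}: missing script")
            continue
        path = os.path.join(hooks_dir, script)
        if not os.path.isfile(path):
            problems.append(f"hook {i}: {path} does not exist")
        elif not os.access(path, os.X_OK):
            problems.append(f"hook {i}: {path} is not executable")
    return problems


def check_file(config_path="config.toml"):
    """Load a TOML file and check its hook definitions."""
    import tomllib  # standard library since Python 3.11; imported lazily

    with open(config_path, "rb") as f:
        return check_hooks(tomllib.load(f))
```

Calling `check_file()` from the directory that contains config.toml returns an empty list when every configured script exists and is executable.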
Each hook definition needs:
event
: The event that triggers this hook
script
: The script filename (relative to hooks_dir)
enabled
: Whether this hook is active
description
: (Optional) A human-readable description of the hook
Hooks receive data in two ways:
Event-specific data is passed as JSON to the hook's standard input. For example, a post_download hook receives:
{
  "id": 42,                           // Job ID
  "url": "https://example.com",       // Original URL
  "media_type": "html",               // Media type that was downloaded
  "file_path": "/path/to/file.html",  // Path where the file was saved
  "status": "completed"               // Job status
}
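Inside a Python hook, the payload can be turned into a typed object so that a missing field fails early. This is a minimal sketch: the `DownloadEvent` class is hypothetical, and it assumes the `//` comments in the example are annotations only and the real payload is plain JSON.

```python
"""Sketch: parse the post_download payload into a typed object."""
import json
from dataclasses import dataclass, fields


@dataclass
class DownloadEvent:
    id: int
    url: str
    media_type: str
    file_path: str
    status: str

    @classmethod
    def from_json(cls, text):
        raw = json.loads(text)
        # Pick only the known fields: extra keys are ignored,
        # and a missing key raises KeyError.
        return cls(**{f.name: raw[f.name] for f in fields(cls)})
```

A hook would typically call `DownloadEvent.from_json(sys.stdin.read())` and then use attributes such as `event.file_path`.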
These environment variables are set for all hook executions:
Variable | Description | Example |
---|---|---|
GATHERHUB_EVENT | The name of the event that triggered the hook | post_download |
GATHERHUB_APP_LOG | Path to the application log file | ./data/logs/app.log |
GATHERHUB_ACTIVITY_LOG | Path to the activity log file | ./data/logs/activity.log |
GATHERHUB_ERROR_LOG | Path to the error log file | ./data/logs/error.log |
Hooks can be written in any language that can read from standard input. Here's a simple example in Python:
#!/usr/bin/env python3
# hooks/notify.py
import json
import sys
import os
# Read JSON data from stdin
data = json.load(sys.stdin)
# Get the event type
event = os.environ.get('GATHERHUB_EVENT')
# Log the event
with open(os.environ.get('GATHERHUB_ACTIVITY_LOG', './data/logs/activity.log'), 'a') as f:
    f.write(f"Hook received event: {event}\n")
    f.write(f"Event data: {json.dumps(data, indent=2)}\n")
    f.write("-" * 50 + "\n")
# Exit with success
sys.exit(0)
Make sure the hook script is executable (chmod +x your_hook.py).
Here's a simple Bash hook example:
#!/bin/bash
# hooks/example.sh
# Read JSON data from stdin
json_data=$(cat)
# Extract a value from the JSON (requires jq)
url=$(echo "$json_data" | jq -r '.url')
# Log to the activity log
echo "[$(date)] Hook received URL: $url" >> "$GATHERHUB_ACTIVITY_LOG"
# Exit with success
exit 0
Your hook script should return appropriate exit codes:
0
: Success - hook executed successfully
Non-zero
: Failure - the hook encountered an error
If a hook returns a non-zero exit code, GatherHub will log the error but continue processing. Hook failures don't affect the job status.
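One way to keep exit codes consistent in a Python hook is to wrap the hook body so that any exception becomes a non-zero exit. This is a minimal sketch: `run` and its `handler` argument are hypothetical names, and whether GatherHub captures the hook's standard error is not stated here.

```python
"""Sketch: turn unexpected errors into a non-zero exit code."""
import json
import sys
import traceback


def run(handler, stdin=sys.stdin, stderr=sys.stderr):
    """Call handler(data) and return 0 on success, 1 on any failure."""
    try:
        handler(json.load(stdin))
    except Exception:
        # Covers malformed JSON as well as errors raised by the handler.
        traceback.print_exc(file=stderr)
        return 1
    return 0
```

A hook script would then end with `sys.exit(run(my_handler))`, where `my_handler` is the function that does the actual work.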
GatherHub includes built-in system hooks:
The gatherhub_zip_git_repos.sh hook creates ZIP archives of downloaded Git repositories. It requires the zip and jq commands to be installed on your system.
The notify.py hook provides basic activity logging for events.
Problem | Possible Cause | Solution |
---|---|---|
Hook not executing | Script not executable | Run chmod +x your_hook.sh |
Permission denied | Insufficient permissions | Check file ownership and permissions |
Command not found | Missing dependencies | Install required tools (e.g., jq for JSON parsing) |
JSON parsing error | Malformed JSON or incorrect parsing | Verify your JSON parsing code and input format |
Use set -x in Bash scripts to see executed commands.
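A hook can also be exercised by hand, without running GatherHub, by imitating the documented interface: JSON on standard input plus the GATHERHUB_* variables. This is a minimal sketch: `run_hook` is a hypothetical helper, and the log paths simply reuse the example values from the environment variable table.

```python
"""Sketch: invoke a hook by hand, mimicking the documented interface."""
import json
import os
import subprocess


def run_hook(command, event, payload, log_dir="./data/logs"):
    """Run command (an argv list) with payload as JSON on stdin."""
    env = dict(os.environ)
    env.update({
        "GATHERHUB_EVENT": event,
        "GATHERHUB_APP_LOG": os.path.join(log_dir, "app.log"),
        "GATHERHUB_ACTIVITY_LOG": os.path.join(log_dir, "activity.log"),
        "GATHERHUB_ERROR_LOG": os.path.join(log_dir, "error.log"),
    })
    # The returned CompletedProcess exposes returncode, stdout and stderr.
    return subprocess.run(
        command,
        input=json.dumps(payload),
        env=env,
        capture_output=True,
        text=True,
    )
```

For example, `run_hook(["./data/hooks/notify.py"], "post_download", {"id": 42, "url": "https://example.com"})` returns a result whose `returncode` and `stderr` show whether the script ran and why it failed.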