Event Hooks

GatherHub supports an event hook system that allows you to execute custom scripts when certain events occur. This enables integration with other systems and customization of the download workflow.

How Event Hooks Work

Event hooks are executable scripts located in the hooks directory (configured as event_hooks.hooks_dir in your config.toml). When an event occurs, GatherHub will:

  1. Check if any hooks are registered for that event
  2. Execute each hook script in sequence
  3. Provide event data to the script via stdin as JSON
  4. Set environment variables with context about the event
  5. Log the hook execution result
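The steps above can be sketched in Python as follows. This illustrates the described flow and is not GatherHub's actual implementation: the names run_hook and dispatch, the 30-second timeout, and the -1 result for hooks that cannot be started are all assumptions made for the example. Only GATHERHUB_EVENT is set here; the log path variables are omitted for brevity.

```python
import json
import os
import subprocess
from pathlib import Path


def run_hook(command: list[str], event: str, payload: dict, timeout: float = 30.0) -> int:
    """Run one hook command and return its exit code (hypothetical helper)."""
    # Event context is layered on top of the parent environment.
    env = dict(os.environ, GATHERHUB_EVENT=event)
    # Event data is written to the hook's stdin as a single JSON document.
    result = subprocess.run(
        command,
        input=json.dumps(payload),
        env=env,
        capture_output=True,
        text=True,
        timeout=timeout,  # assumed limit, not a documented GatherHub setting
    )
    return result.returncode


def dispatch(hooks_dir: Path, scripts: list[str], event: str, payload: dict) -> dict[str, int]:
    """Run each script in order; a failing hook does not stop the others."""
    results: dict[str, int] = {}
    for name in scripts:
        try:
            results[name] = run_hook([str(hooks_dir / name)], event, payload)
        except (OSError, subprocess.TimeoutExpired) as exc:
            # Raised when the script is missing, not executable, or too slow.
            print(f"hook {name} could not run: {exc}")
            results[name] = -1
    return results
```

Calling run_hook directly with a sample payload is also a convenient way to exercise a hook outside GatherHub.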

Supported Events

Event            | Description                                     | Trigger Point
-----------------|-------------------------------------------------|------------------------------------------------------------------
pre_download     | Called before a download starts                 | After job is selected for download but before download begins
post_download    | Called after a download completes successfully  | After download finishes and job status is set to "completed"
on_error         | Called when a download fails                    | After job status is set to "failed"
on_status_change | Called when a job's status changes              | After any job status update
on_source_scan   | Called when a source is scanned                 | After a scan operation completes
on_startup       | Called when the application starts              | During application initialization
on_shutdown      | Called when the application shuts down          | During graceful shutdown
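Because every hook receives the event name in GATHERHUB_EVENT, one script can be registered for several of the events above and branch on that value. The following is a minimal sketch under stated assumptions: the handler names are invented for the example, the payload fields of events other than post_download are not documented here (so each field is read with a fallback), and empty stdin is tolerated in case some events carry no data.

```python
import json
import os
import sys


def on_post_download(data: dict) -> str:
    return f"downloaded {data.get('url', '?')} to {data.get('file_path', '?')}"


def on_error(data: dict) -> str:
    # Assumes the failure payload also carries the job "id".
    return f"job {data.get('id', '?')} failed"


# Map event names to handler functions; unlisted events are ignored.
HANDLERS = {
    "post_download": on_post_download,
    "on_error": on_error,
}


def handle(event: str, data: dict) -> str:
    handler = HANDLERS.get(event)
    if handler is None:
        return f"ignored event {event!r}"
    return handler(data)


def main(stream=sys.stdin, environ=os.environ) -> int:
    raw = stream.read()
    payload = json.loads(raw) if raw.strip() else {}
    # Diagnostics go to stderr so they are not mistaken for hook output.
    print(handle(environ.get("GATHERHUB_EVENT", ""), payload), file=sys.stderr)
    return 0

# In a real hook file, finish with: sys.exit(main())
```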

Hook Configuration

Hooks are configured in the config.toml file:

[event_hooks]
enabled = true
hooks_dir = './data/hooks'

# Hook 1: Generic notification hook
[[event_hooks.hooks]]
event = 'post_download'
script = 'notify.py'
enabled = true

# Hook 2: Git ZIP archive creation
[[event_hooks.hooks]]
event = 'post_download'
script = 'gatherhub_zip_git_repos.sh'
enabled = true
description = "System hook: Creates ZIP archives of git repositories for web download"

Each hook definition uses the following fields:

  • event: The event that triggers this hook
  • script: The script filename (relative to hooks_dir)
  • enabled: Whether this hook is active
  • description: (Optional) A human-readable description of the hook

Hook Data and Environment

Hooks receive data in two ways:

1. Standard Input (JSON)

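Event-specific data is passed as JSON to the hook's standard input. For example, a post_download hook receives a payload like the following (the // comments are annotations for the reader; JSON itself does not allow comments):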

{
  "id": 42,                           // Job ID
  "url": "https://example.com",       // Original URL
  "media_type": "html",               // Media type that was downloaded
  "file_path": "/path/to/file.html",  // Path where the file was saved
  "status": "completed"               // Job status
}

2. Environment Variables

These environment variables are set for all hook executions:

Variable               | Description                                    | Example
-----------------------|------------------------------------------------|--------------------------
GATHERHUB_EVENT        | The name of the event that triggered the hook  | post_download
GATHERHUB_APP_LOG      | Path to the application log file               | ./data/logs/app.log
GATHERHUB_ACTIVITY_LOG | Path to the activity log file                  | ./data/logs/activity.log
GATHERHUB_ERROR_LOG    | Path to the error log file                     | ./data/logs/error.log

Creating Custom Hooks

Basic Hook Structure

Hooks can be written in any language that can read from standard input. Here's a simple example in Python:

#!/usr/bin/env python3
# hooks/notify.py
import json
import sys
import os

# Read JSON data from stdin
data = json.load(sys.stdin)

# Get the event type
event = os.environ.get('GATHERHUB_EVENT')

# Log the event
with open(os.environ.get('GATHERHUB_ACTIVITY_LOG', './data/logs/activity.log'), 'a') as f:
    f.write(f"Hook received event: {event}\n")
    f.write(f"Event data: {json.dumps(data, indent=2)}\n")
    f.write("-" * 50 + "\n")

# Exit with success
sys.exit(0)

Important: Remember to make your hook scripts executable (chmod +x your_hook.py).

Bash Example

Here's a simple Bash hook example:

#!/bin/bash
# hooks/example.sh

# Read JSON data from stdin
json_data=$(cat)

# Extract a value from the JSON (requires jq)
url=$(echo "$json_data" | jq -r '.url')

# Log to the activity log
echo "[$(date)] Hook received URL: $url" >> "${GATHERHUB_ACTIVITY_LOG:-./data/logs/activity.log}"

# Exit with success
exit 0

Exit Codes

Your hook script should return appropriate exit codes:

  • 0: Success - hook executed successfully
  • Non-zero: Failure - hook encountered an error

If a hook returns a non-zero exit code, GatherHub will log the error but continue processing. Hook failures don't affect the job status.
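A hook can make its failures visible in the logs by returning distinct non-zero codes for distinct problems. The sketch below is one way to do that in Python; the specific codes 2 and 3 and the required url field are arbitrary choices for the example, not values GatherHub defines.

```python
import json
import sys


def main(stream=sys.stdin) -> int:
    """Return 0 on success and a non-zero code on failure."""
    try:
        data = json.load(stream)
    except json.JSONDecodeError as exc:
        print(f"invalid JSON on stdin: {exc}", file=sys.stderr)
        return 2  # arbitrary code for unreadable input
    if not isinstance(data, dict) or "url" not in data:
        print("payload has no 'url' field", file=sys.stderr)
        return 3  # arbitrary code for an unexpected payload shape
    # ... the hook's real work would go here ...
    return 0

# In a real hook file, finish with: sys.exit(main())
```

Returning the code from main and passing it to sys.exit in one place keeps the exit status easy to test without spawning a process.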

System Hooks

GatherHub includes built-in system hooks:

Git ZIP Repository Hook

The gatherhub_zip_git_repos.sh hook creates ZIP archives of downloaded Git repositories. It:

  • Runs after Git repository downloads complete
  • Creates a ZIP archive excluding hidden files like .git
  • Places the ZIP file alongside the original repository
  • Makes the repository available for download through the web interface

Warning: This hook requires the zip and jq commands to be installed on your system.

Default Notification Hook

The notify.py hook provides basic activity logging for events. It:

  • Logs all events to the activity log
  • Includes complete event data in JSON format
  • Serves as a useful template for custom notification hooks

Hook Security Considerations

  • Hooks run with the same permissions as the GatherHub process
  • Be careful about what commands your hooks execute
  • Validate and sanitize any data from hook input before using it in commands
  • Consider the security implications of hook actions
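As one illustration of validating hook input, the sketch below checks that a file_path value stays inside an expected directory and then builds the command as an argument list, so the value is never interpreted by a shell. The helper names and the choice of ls as the command are assumptions for the example; substitute whatever your hook actually runs.

```python
import subprocess
from pathlib import Path


def safe_path(raw: str, allowed_root: Path) -> Path:
    """Resolve raw and raise ValueError if it escapes allowed_root."""
    root = allowed_root.resolve()
    candidate = Path(raw).resolve()
    if not candidate.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"{raw!r} is outside {root}")
    return candidate


def build_listing_command(raw_path: str, allowed_root: Path) -> list[str]:
    path = safe_path(raw_path, allowed_root)
    # "--" ends option parsing, so a path starting with "-" is not read as a flag.
    return ["ls", "-l", "--", str(path)]


def run_listing(raw_path: str, allowed_root: Path) -> str:
    # List-form arguments without shell=True: the path is passed as one argv
    # entry, so characters such as ";" or "$(...)" have no special meaning.
    argv = build_listing_command(raw_path, allowed_root)
    return subprocess.run(argv, capture_output=True, text=True, check=True).stdout
```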

Troubleshooting Hooks

Common Issues

Problem            | Possible Cause                      | Solution
-------------------|-------------------------------------|---------------------------------------------------
Hook not executing | Script not executable               | Run chmod +x your_hook.sh
Permission denied  | Insufficient permissions            | Check file ownership and permissions
Command not found  | Missing dependencies                | Install required tools (e.g., jq for JSON parsing)
JSON parsing error | Malformed JSON or incorrect parsing | Verify your JSON parsing code and input format

Debugging Tips

  1. Add logging to your hook scripts
  2. Check the GatherHub error log for hook execution errors
  3. Test your hook script manually by providing JSON input
  4. Use set -x in Bash scripts to see executed commands