Job Management

GatherHub processes every download request as a "job": a tracked item in a queue that can be monitored, prioritized, and managed.

Job Lifecycle

Each job in GatherHub follows a specific lifecycle through various states:

Pending (status: pending)

The job has been created but is not yet being processed. Pending jobs are waiting in the queue for the downloader to pick them up. You can add tags, adjust priority, or delete jobs while they are in the pending state.

Downloading (status: downloading)

The downloader is actively fetching the content for this job, and progress information is shown where available. Downloading jobs cannot be modified until they complete or fail. If a job appears stuck in this state, it can be reset using the "Reset Stuck Jobs" function in the web interface or API.

Completed (status: completed)

The job has successfully finished downloading. The content is now available in the appropriate directory based on its media type. Completed jobs can be viewed and tagged, but they are not redownloaded unless explicitly reset.

Failed (status: failed)

The job encountered an error during download. Error details are stored with the job. Failed jobs can be retried, edited, or deleted. Common causes of failure include temporary network issues, removed content, or permission restrictions.

Paused (status: paused)

The job has been manually paused by the user. Paused jobs are skipped during regular processing until they are resumed. This is useful for temporarily delaying large downloads or non-urgent content.

Queued (status: queued)

The job is actively in the download queue and will be processed soon. This is an intermediate state between pending and downloading, indicating that the job is next in line for processing.
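
The lifecycle above can be read as a small state machine. The following is a minimal sketch, not GatherHub's actual implementation: the status values come from the list above, but the transition table, the assumption that a reset or retry returns a job to pending, and the names JobStatus, ALLOWED, and can_transition are all hypothetical.

```python
from enum import Enum


class JobStatus(Enum):
    PENDING = "pending"
    QUEUED = "queued"
    DOWNLOADING = "downloading"
    COMPLETED = "completed"
    FAILED = "failed"
    PAUSED = "paused"


# Assumed transition table, inferred from the state descriptions.
# The real application may allow more or fewer transitions.
ALLOWED = {
    JobStatus.PENDING: {JobStatus.QUEUED, JobStatus.PAUSED},
    JobStatus.QUEUED: {JobStatus.DOWNLOADING, JobStatus.PAUSED},
    # PENDING here models "Reset Stuck Jobs" (assumption).
    JobStatus.DOWNLOADING: {JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.PENDING},
    JobStatus.COMPLETED: {JobStatus.PENDING},  # explicit reset (assumption)
    JobStatus.FAILED: {JobStatus.PENDING},     # retry (assumption)
    JobStatus.PAUSED: {JobStatus.PENDING},     # resume (assumption)
}


def can_transition(current, target):
    """Return True if the assumed table allows moving from current to target."""
    return target in ALLOWED[current]
```

For example, can_transition(JobStatus.FAILED, JobStatus.PENDING) is True under these assumptions, while can_transition(JobStatus.COMPLETED, JobStatus.DOWNLOADING) is False because a completed job has to be reset first.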

Job Properties
  • ID: Unique identifier
  • URL: Source URL
  • Media Type: Content format
  • Status: Current state
  • Progress: Download percentage
  • File Path: Storage location
  • Created At: Job creation time
  • Updated At: Last status change
  • Error: Error message if failed
  • Retry Count: Number of retry attempts
  • Priority: Download priority (1-10)
  • Title: Content title
  • File Name: Name of saved file
  • Source ID: Unique identifier from source
  • Source Name: Origin (browser, RSS, etc.)
  • Last Visit Date: When content was last visited
  • Tags: User-defined labels
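
To make the property list concrete, here is one way a job record might be modeled. This is an illustrative sketch only: the field names, types, and default values are assumptions and do not reflect GatherHub's real schema. Only the 1-10 priority range is taken from the list above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


def _now():
    return datetime.now(timezone.utc)


@dataclass
class Job:
    # Field names and types are hypothetical; they mirror the property list.
    id: int
    url: str
    media_type: str
    status: str = "pending"
    progress: float = 0.0              # download percentage, 0-100
    file_path: Optional[str] = None
    created_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)
    error: Optional[str] = None
    retry_count: int = 0
    priority: int = 5                  # default of 5 is an assumption
    title: Optional[str] = None
    file_name: Optional[str] = None
    source_id: Optional[str] = None
    source_name: Optional[str] = None
    last_visit_date: Optional[datetime] = None
    tags: List[str] = field(default_factory=list)

    def __post_init__(self):
        # The documented priority range is 1-10.
        if not 1 <= self.priority <= 10:
            raise ValueError("priority must be between 1 and 10")
```

Creating Job(id=1, url="https://example.com/a", media_type="video") yields a pending job with no tags, and a priority outside 1-10 raises ValueError.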

Tagging System

GatherHub provides a flexible tagging system to help you organize and find your archived content. Tags are user-defined labels that can be attached to any job.

Working with Tags

To add a tag to a job:

  1. Navigate to the job details page
  2. Enter the tag name in the "Add Tag" field
  3. Click "Add Tag" or press Enter

To remove a tag from a job:

  1. Navigate to the job details page
  2. Click the X icon next to the tag you want to remove

To filter jobs by tag:

  1. Go to the Jobs page
  2. Click on a tag in the tags filter section
  3. The job list will update to show only jobs with that tag

Tag Features
  • Auto-tagging: GatherHub can automatically apply tags based on URL patterns and content type.
  • Bulk tagging: Apply or remove tags from multiple jobs at once from the Jobs page.
  • Tag hierarchy: Use slashes to create hierarchical tags (e.g., "videos/tutorials").
  • Tag colors: Customize tag colors for visual organization in the Settings page.
  • Tag statistics: View statistics on the most common tags and content distribution.
  • Tag search: Search for specific tags or combinations of tags.
  • Tag normalization: Tags are automatically converted to lowercase and trimmed for consistency.
  • Tag intersections: Search for content that contains multiple specific tags simultaneously.
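
Normalization, hierarchy, and intersections can be illustrated with a few small helpers. This is a hedged sketch: the function names are hypothetical, and treating a hierarchical tag as also implying its parent paths is an assumption about how slashes might be interpreted, not documented behavior.

```python
def normalize_tag(tag):
    """Trim surrounding whitespace and convert to lowercase."""
    return tag.strip().lower()


def tag_ancestors(tag):
    """Expand a hierarchical tag into itself plus its parent paths.

    "videos/tutorials" becomes ["videos", "videos/tutorials"].
    """
    parts = [p for p in normalize_tag(tag).split("/") if p]
    return ["/".join(parts[:i]) for i in range(1, len(parts) + 1)]


def has_all_tags(job_tags, wanted):
    """Tag intersection: True only if the job carries every wanted tag."""
    have = {normalize_tag(t) for t in job_tags}
    return {normalize_tag(t) for t in wanted} <= have
```

Because both sides are normalized before comparison, has_all_tags(["News", "tech"], ["tech", "news"]) is True regardless of case or ordering.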

Search Capabilities

GatherHub provides powerful search functionality to help you find and manage your jobs effectively.

Search Criteria
  • Text search: Search in URLs and titles
  • Status filtering: Filter jobs by their current status
  • Media type filtering: Find jobs of a specific content type
  • Source filtering: Filter by content origin
  • Tag filtering: Search for jobs with specific tags
  • Combined search: Use multiple criteria simultaneously for precise results

Advanced Search Features
  • Tag intersections: Find jobs that have all specified tags
  • Pagination: Navigate through large result sets
  • Sorting options: Order results by creation date, update date, or ID
  • Search history: Recent searches are saved for quick access
  • Bulk actions: Perform operations on search results
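
The way these criteria combine can be sketched as a single filter function over in-memory job records. This is an illustration under assumptions, not GatherHub's search implementation: jobs are represented as plain dictionaries, and the function name, parameter names, and default page size are hypothetical.

```python
def search_jobs(jobs, text=None, status=None, media_type=None, tags=(),
                sort_by="created_at", page=1, per_page=20):
    """Filter, sort, and paginate a list of job dictionaries."""
    wanted = {t.strip().lower() for t in tags}
    needle = text.lower() if text else None
    matches = []
    for job in jobs:
        # Text search looks in the URL and the title.
        if needle:
            haystack = (job["url"] + " " + (job.get("title") or "")).lower()
            if needle not in haystack:
                continue
        if status and job["status"] != status:
            continue
        if media_type and job["media_type"] != media_type:
            continue
        # Tag intersection: the job must carry every requested tag.
        if not wanted <= set(job.get("tags", [])):
            continue
        matches.append(job)
    matches.sort(key=lambda j: j[sort_by])
    start = (page - 1) * per_page
    return matches[start:start + per_page]
```

Every criterion is optional, so calling search_jobs(jobs, status="completed", tags=["videos"]) applies only those two filters and leaves the rest open.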

Job Indexing and Content Extraction

Jobs that have been successfully completed can be indexed for content search and analysis.

When a job is completed, its content may be automatically processed for indexing:

  • Content extraction: Text and metadata are extracted from downloaded files
  • Search indexing: Content is added to the search index for full-text search
  • Metadata enrichment: Additional information is extracted and stored with the job
  • File type detection: Content is categorized based on file type analysis

Indexed content enables powerful search capabilities across your entire archive, allowing you to find information by content rather than just by job properties.
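
To show why indexing makes content search possible, the sketch below builds a tiny inverted index that maps each word to the jobs whose extracted text contains it. It is a simplified stand-in for whatever index GatherHub actually uses; the function names and the tokenization rule are assumptions.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Build word -> set of job IDs from {job_id: extracted_text}."""
    index = defaultdict(set)
    for job_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(job_id)
    return index


def search_index(index, query):
    """Return the IDs of jobs whose text contains every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result
```

With an index built this way, a query is answered by intersecting small sets of job IDs instead of rereading every downloaded file.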

Job Management API

For developers and advanced users, GatherHub provides a comprehensive API for programmatic job management.

API Operations
  • Create jobs: Add new download jobs programmatically
  • Update status: Change job status (pause, resume, cancel)
  • Modify metadata: Update job properties and custom metadata
  • Tag management: Add or remove tags from jobs
  • Batch operations: Process multiple jobs in a single request
  • Search and filter: Find jobs matching specific criteria

Activity Logging

All job operations are recorded in the activity log, which tracks:

  • Status changes: When a job transitions between states
  • File updates: Changes to file paths or file names
  • Metadata changes: Updates to job metadata
  • Tag changes: When tags are added or removed
  • Error events: Recording of failure reasons

The activity log provides a complete audit trail of all operations performed on jobs, which is valuable for troubleshooting and understanding system behavior.
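
An audit trail of this kind can be modeled as an append-only list of timestamped entries. The sketch below is illustrative only: the class names, the field names, and the short kind labels ("status", "file", "metadata", "tag", "error") are assumptions chosen to mirror the categories listed above.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List


@dataclass
class ActivityEntry:
    job_id: int
    kind: str      # assumed labels: "status", "file", "metadata", "tag", "error"
    detail: str
    at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


class ActivityLog:
    """Append-only, in-memory log of job operations."""

    def __init__(self):
        self._entries: List[ActivityEntry] = []

    def record(self, job_id, kind, detail):
        entry = ActivityEntry(job_id, kind, detail)
        self._entries.append(entry)
        return entry

    def for_job(self, job_id, kind=None):
        """Return a job's entries in recorded order, optionally by kind."""
        return [e for e in self._entries
                if e.job_id == job_id and (kind is None or e.kind == kind)]
```

When troubleshooting a failed download, log.for_job(job_id, "error") would return only the recorded failure reasons for that job.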

Job Management Features

Queue Control

Manage the download queue with controls for pausing, resuming, prioritizing, and canceling jobs. The queue can be manually sorted or automatically prioritized based on criteria like age, size, or media type.
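
Automatic prioritization can be sketched with a heap. The ordering rules here are assumptions, since the text above does not define them: a higher priority number is taken to mean "download sooner", ties go to the older job, and only pending jobs are eligible (so paused jobs are skipped). The function name and the dictionary keys are hypothetical.

```python
import heapq


def processing_order(jobs):
    """Return job IDs in the order an assumed scheduler would pick them."""
    heap = []
    for job in jobs:
        if job["status"] != "pending":
            continue  # paused, completed, and failed jobs are not eligible
        # Negate priority so the highest number pops first; older jobs win ties.
        heapq.heappush(heap, (-job["priority"], job["created_at"], job["id"]))
    order = []
    while heap:
        _, _, job_id = heapq.heappop(heap)
        order.append(job_id)
    return order
```

If the real scheduler treats 1 as the most urgent priority instead, dropping the negation reverses the ordering.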

Batch Operations

Perform actions on multiple jobs at once, including tagging, status changes, priority changes, and deletion. Use the checkbox selectors on the Jobs page to select jobs for batch operations.

Job Analytics

View comprehensive statistics on your download history, including total storage used, success rates, most common content types, and trends over time. The analytics dashboard provides:

  • Status distribution (pending, completed, failed, etc.)
  • Media type breakdown
  • Source analysis
  • Tag frequency and relationships
  • Historical performance metrics
  • Content indexing statistics
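
Several of these figures reduce to simple counting. The sketch below is not the dashboard's implementation: it assumes jobs are plain dictionaries, and it defines success rate as completed jobs divided by completed plus failed jobs, which is an assumption about how the metric is calculated.

```python
from collections import Counter


def summarize(jobs):
    """Compute status, media type, and tag counts plus a success rate."""
    status_counts = Counter(job["status"] for job in jobs)
    media_counts = Counter(job["media_type"] for job in jobs)
    tag_counts = Counter(tag for job in jobs for tag in job.get("tags", []))
    finished = status_counts["completed"] + status_counts["failed"]
    # Assumed definition: only finished jobs count toward the success rate.
    success_rate = status_counts["completed"] / finished if finished else None
    return {
        "status": dict(status_counts),
        "media_type": dict(media_counts),
        "top_tags": tag_counts.most_common(5),
        "success_rate": success_rate,
    }
```

Jobs that are still pending or paused are left out of the success rate so that an active queue does not drag the figure down.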

Tip: You can quickly access job management features by using the action icons in the job list, or by right-clicking a job to open the context menu with additional options.