GatherHub has special support for downloading and archiving Git repositories. This page details how Git repositories are handled, security considerations, and troubleshooting steps.
The following URL patterns are automatically detected as Git repositories:
Pattern | Example |
---|---|
GitHub repositories | https://github.com/username/repository |
GitLab repositories | https://gitlab.com/username/repository |
Bitbucket repositories | https://bitbucket.org/username/repository |
Any URL ending with .git | https://example.org/repo.git |
GitHub, GitLab, and Bitbucket URLs do not need to end with .git. For example, https://github.com/username/repository will be recognized as a Git repository.
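The patterns in the table can be approximated with a shell `case` statement. This is a minimal sketch rather than GatherHub's actual detection code: the function name `is_git_url` is hypothetical, and the real matching rules may be stricter or looser (for example, this sketch also accepts deeper paths below a repository).

```shell
# Hypothetical helper: succeeds (exit status 0) if the URL matches one of
# the documented patterns, fails (exit status 1) otherwise.
is_git_url() {
    case "$1" in
        *.git)                        return 0 ;;  # any URL ending with .git
        https://github.com/?*/?*)     return 0 ;;  # GitHub user/repository
        https://gitlab.com/?*/?*)     return 0 ;;  # GitLab user/repository
        https://bitbucket.org/?*/?*)  return 0 ;;  # Bitbucket user/repository
        *)                            return 1 ;;
    esac
}
```

Usage: `is_git_url "https://github.com/username/repository" && echo "git repository"`.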
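For security reasons, GatherHub is designed to operate without using Git credentials. This means that only public repositories can be cloned; a clone that requires authentication is meant to fail rather than prompt.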
The following security measures are implemented for Git operations:
GIT_CONFIG_NOGLOBAL=1: Prevents git from using global configuration
GIT_TERMINAL_PROMPT=0: Prevents git from prompting for credentials
GIT_ASKPASS=/bin/echo: Makes git fail instead of prompting for passwords
-c credential.helper=: Temporarily disables any configured credential helpers
--depth=1: Creates a shallow clone to minimize data transfer
--shallow-submodules: Also creates shallow clones of submodules
--single-branch: Clones only the default branch
--filter=blob:none: Requests a partial clone that fetches file contents (blobs) only when they are needed, which speeds up the clone
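Put together, the settings above correspond to a clone invocation along the following lines. This is a minimal sketch, not GatherHub's actual code: the function name `safe_clone` and its two arguments are assumptions made for illustration. Only the environment variables and flags come from the list above.

```shell
# Hypothetical wrapper around "git clone" using the documented settings.
#   $1 = repository URL, $2 = destination directory
safe_clone() {
    # The three variable assignments apply only to this git invocation.
    # "-c credential.helper=" clears any configured credential helper.
    GIT_CONFIG_NOGLOBAL=1 \
    GIT_TERMINAL_PROMPT=0 \
    GIT_ASKPASS=/bin/echo \
    git -c credential.helper= clone \
        --depth=1 \
        --shallow-submodules \
        --single-branch \
        --filter=blob:none \
        "$1" "$2"
}
```

Usage: `safe_clone https://github.com/user/project ./downloads/git/user_project`.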
Git repositories are stored in the directory specified by the storage.by_type.git configuration setting, which defaults to ./downloads/git/.
When a repository is cloned, it's saved in a directory named according to its source:
[storage.base_path]/[storage.by_type.git]/[username]_[repository]
For example, a GitHub repository at https://github.com/user/project would be saved as:
./downloads/git/user_project/
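The naming rule can be sketched with POSIX parameter expansion. The function name `repo_dir` and the `GIT_STORAGE_DIR` variable are hypothetical stand-ins for the configured storage path. How GatherHub names repositories whose URL has no username segment is not stated here, so the sketch simply uses the last two path components.

```shell
# Hypothetical helper: derive the storage directory for a repository URL.
# GIT_STORAGE_DIR is an assumed stand-in for the configured git storage path.
repo_dir() {
    url=${1%/}          # drop a trailing slash, if any
    url=${url%.git}     # drop a trailing .git, if any
    repo=${url##*/}     # last path component: the repository name
    rest=${url%/*}      # everything before the repository name
    user=${rest##*/}    # second-to-last component: the username
    printf '%s/%s_%s/\n' "${GIT_STORAGE_DIR:-./downloads/git}" "$user" "$repo"
}

repo_dir "https://github.com/user/project"   # prints ./downloads/git/user_project/
```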
After a Git repository is successfully cloned, GatherHub automatically creates a ZIP archive for easy download through the web interface.
This is handled by the gatherhub_zip_git_repos.sh hook script.
ZIP creation requires that the zip command is installed on your system and that the hook script is properly configured.
The ZIP creation is implemented as a post-download hook. You can find this script at:
data/hooks/gatherhub_zip_git_repos.sh
This script is automatically registered to run after Git repository downloads. It will:
If Git downloads get stuck in the "downloading" state, it is usually because the system is waiting for authentication, which is never provided under the credential-free design.
To recover from stuck downloads:
Use the unstuck jobs tool:
./data/tools/unstuck_jobs.sh
This will reset any stuck jobs to "failed" status so they can be retried.
You can also use the web interface to reset stuck jobs by clicking the "Reset Stuck Jobs" button.
Or use the API endpoint:
curl -X POST http://localhost:5000/api/reset-stuck
Problem | Possible Cause | Solution |
---|---|---|
Authentication failed | Attempted to clone a private repository | Only use public repositories with GatherHub |
Permission denied | Insufficient permissions to write to the downloads directory | Ensure the user running GatherHub has write permissions to the downloads directory |
No ZIP file created | Missing zip command or hook script not executed | Verify that zip is installed and check the hooks configuration |
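The checks suggested in the last row of the table can be sketched as a small shell function. The function name `check_zip_hook` is hypothetical, and the requirement that the hook script be executable is an assumption. The default path is the hook location given earlier on this page.

```shell
# Hypothetical preflight check for ZIP creation.
#   $1 = path to the hook script (optional)
check_zip_hook() {
    hook=${1:-data/hooks/gatherhub_zip_git_repos.sh}
    status=0
    # The hook depends on the zip command being available on PATH.
    if ! command -v zip >/dev/null 2>&1; then
        echo "zip command not found"
        status=1
    fi
    if [ ! -f "$hook" ]; then
        echo "hook script not found: $hook"
        status=1
    elif [ ! -x "$hook" ]; then
        # Assumption: the hook must be executable to run.
        echo "hook script not executable: $hook"
        status=1
    fi
    return "$status"
}
```

Usage: run `check_zip_hook` from the GatherHub directory. It prints nothing and returns 0 when both prerequisites are met.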
To diagnose git-related issues, check the error log:
grep -i git data/logs/error.log
For authentication issues specifically:
grep -i credential data/logs/error.log
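Whether a repository is reachable without credentials can also be checked before queuing it. The sketch below is not part of GatherHub: the function name `is_public_repo` is hypothetical, and it reuses the prompt-disabling settings described above with `git ls-remote`, which contacts the remote without cloning it.

```shell
# Hypothetical check: succeeds if the remote answers without credentials.
#   $1 = repository URL
is_public_repo() {
    GIT_TERMINAL_PROMPT=0 \
    GIT_ASKPASS=/bin/echo \
    git -c credential.helper= ls-remote --quiet "$1" HEAD >/dev/null 2>&1
}
```

Usage: `is_public_repo https://github.com/user/project || echo "not reachable without credentials"`. A failure may also indicate a network problem or a mistyped URL, so treat the result as a hint rather than proof that the repository is private.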
Before marking a Git repository download as complete, GatherHub performs these verification steps:
To manually test Git repository handling: