🪝 Developing User Hooks
Hooks allow you to inject custom processing into the fetchez pipeline. You can write hooks to process files immediately after they are downloaded, or to run setup/teardown tasks.
How it Works
fetchez scans ~/.fetchez/hooks/ at runtime.
It registers any class that inherits from fetchez.hooks.FetchHook.
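The discovery step can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not fetchez's actual loader: `FetchHook` here is a stand-in base class, `sketch_hooks` and `discover_hooks` are hypothetical names, and a temporary directory stands in for ~/.fetchez/hooks/.

```python
import importlib.util
import inspect
import sys
import tempfile
import types
from pathlib import Path


class FetchHook:
    """Stand-in for fetchez.hooks.FetchHook (assumption: the real class differs)."""
    name = None


# Expose the stand-in under a hypothetical module name so hook files can import it.
_stub = types.ModuleType("sketch_hooks")
_stub.FetchHook = FetchHook
sys.modules["sketch_hooks"] = _stub


def discover_hooks(hook_dir):
    """Import every .py file in hook_dir and collect FetchHook subclasses by name."""
    registry = {}
    for py_file in sorted(Path(hook_dir).glob("*.py")):
        spec = importlib.util.spec_from_file_location(
            f"user_hook_{py_file.stem}", str(py_file)
        )
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        for _, obj in inspect.getmembers(module, inspect.isclass):
            # Skip the base class itself, which the hook file also imports.
            if issubclass(obj, FetchHook) and obj is not FetchHook:
                registry[obj.name] = obj
    return registry


# Demo: a temporary directory plays the role of the user's hooks folder.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "audit_log.py").write_text(
        "from sketch_hooks import FetchHook\n"
        "class AuditLog(FetchHook):\n"
        "    name = 'audit'\n"
    )
    found = discover_hooks(tmp)

print(sorted(found))
```

Keying the registry on the class-level `name` attribute mirrors how a hook is later selected with --hook.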
Example Hook
Create a file named ~/.fetchez/hooks/audit_log.py to log every download to a file:
from fetchez.hooks import FetchHook


class AuditLog(FetchHook):
    # This name is used in the CLI: --hook audit
    name = "audit"
    meta_desc = "Log downloaded files to audit.txt"
    meta_stage = "file"  # Runs per-file

    def run(self, entries):
        # Hooks receive a list of entry dicts; this hook reads the
        # 'url', 'dst_fn', and 'status' keys of each one.
        for entry in entries:
            url = entry.get("url")
            path = entry.get("dst_fn")
            status = entry.get("status")
            if status == 0:
                # Append so that earlier log lines are preserved.
                with open("audit.txt", "a") as f:
                    f.write(f"DOWNLOADED: {path} FROM {url}\n")
        # Always return the entries so the pipeline continues!
        return entries
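Why the return value matters can be illustrated with a toy runner. This is a hedged sketch, not the fetchez pipeline: `FetchHook` is a stand-in base class, and `run_hooks`, `KeepSuccessful`, and `ForgetfulHook` are hypothetical names. It assumes, as the example above suggests, that a status of 0 marks a successful download and that each hook's output is passed to the next hook.

```python
class FetchHook:
    """Stand-in base class (assumption: not the real fetchez.hooks.FetchHook)."""
    name = None

    def run(self, entries):
        return entries


class KeepSuccessful(FetchHook):
    name = "keep_ok"

    def run(self, entries):
        # Pass along only the entries whose status is 0.
        return [e for e in entries if e.get("status") == 0]


class ForgetfulHook(FetchHook):
    name = "forgetful"

    def run(self, entries):
        # Deliberate bug: no return statement, so the caller receives None.
        for entry in entries:
            entry.get("url")


def run_hooks(hooks, entries):
    """Feed each hook's return value to the next hook (hypothetical runner)."""
    for hook in hooks:
        entries = hook.run(entries)
        if entries is None:
            raise RuntimeError(f"hook {hook.name!r} did not return its entries")
    return entries


sample = [
    {"url": "https://example.com/a.zip", "dst_fn": "a.zip", "status": 0},
    {"url": "https://example.com/b.zip", "dst_fn": "b.zip", "status": 1},
]

print(len(run_hooks([KeepSuccessful()], sample)))

try:
    run_hooks([ForgetfulHook(), KeepSuccessful()], sample)
except RuntimeError as err:
    print(err)
```

A hook that forgets to return its entries leaves every later hook with nothing to work on, which is the failure the comment in the example warns about.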
Testing Your Hook
# Check if it loaded
fetchez --list-hooks
# Run it
fetchez srtm_plus --hook audit
🔗 Sharing a Plugin or Hook
Did you build a plugin that would be useful for the wider community? We’d love to incorporate it!
Submit a Pull Request adding your file to fetchez/modules/ or fetchez/hooks/.
Developing & Sharing Presets
Presets (Macros) are the easiest way to share complex data engineering workflows without writing Python code. They allow you to bundle multiple processing steps into a single, shareable YAML snippet.
The Preset Structure
Presets are standalone .yaml files placed in ~/.fetchez/presets/. You can quickly see all active presets on your system by running fetchez --list-presets.
A preset requires a name, a description, and a list of hooks and their arguments.
~/.fetchez/presets/clean_download.yaml
name: clean-download
description: "Unzip files and remove the original archive."
hooks:
  - name: unzip
    args:
      remove: true
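The three required parts of a preset can be checked with a few lines of Python. This is a minimal sketch, not fetchez's own validation: `validate_preset` and `REQUIRED_KEYS` are hypothetical names, and the dictionary below is what a YAML parser would be expected to produce from clean_download.yaml.

```python
REQUIRED_KEYS = ("name", "description", "hooks")


def validate_preset(preset):
    """Return a list of problems found in a parsed preset mapping."""
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in preset]
    for i, hook in enumerate(preset.get("hooks") or []):
        if "name" not in hook:
            problems.append(f"hooks[{i}] has no name")
        # 'args' is treated as optional here (an assumption), but must be a mapping.
        if not isinstance(hook.get("args", {}), dict):
            problems.append(f"hooks[{i}].args must be a mapping")
    return problems


# Parsed form of the clean-download preset shown above.
clean_download = {
    "name": "clean-download",
    "description": "Unzip files and remove the original archive.",
    "hooks": [{"name": "unzip", "args": {"remove": True}}],
}

print(validate_preset(clean_download))
print(validate_preset({"name": "broken", "hooks": [{"args": "remove"}]}))
```

Running a check like this before sharing a preset catches a missing description or a mis-indented args block early.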
Module-Specific Overrides
Sometimes you build a macro that is only relevant to one specific dataset. For example, if you use the multibeam module, you might want a shortcut to only download the .inf metadata files.
You can restrict a preset so it only appears in the CLI menu for a specific module by adding the target_module key!
~/.fetchez/presets/inf_only.yaml
name: inf_only
target_module: multibeam
description: 'multibeam Only: Fetch only inf files'
hooks:
  - name: filename_filter
    args:
      match: '.inf'
      stage: manifest
Now, --inf-only will show up when you run fetchez multibeam --help, but it won’t clutter the global menu!
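The filtering behaviour can be sketched as below. This is an illustration under assumptions, not fetchez's implementation: `flag_for` and `presets_for` are hypothetical helpers, the underscore-to-hyphen conversion is inferred from inf_only appearing as --inf-only, and showing untargeted presets in every module's menu is assumed.

```python
def flag_for(preset_name):
    """Derive a CLI flag from a preset name (assumed rule: '_' becomes '-')."""
    return "--" + preset_name.replace("_", "-")


def presets_for(module_name, presets):
    """List the flags visible for a module: untargeted presets plus matching ones."""
    return [
        flag_for(p["name"])
        for p in presets
        if p.get("target_module") in (None, module_name)
    ]


presets = [
    {"name": "clean-download"},
    {"name": "inf_only", "target_module": "multibeam"},
]

print(presets_for("multibeam", presets))
print(presets_for("srtm_plus", presets))
```

Only the multibeam listing includes --inf-only; every other module sees just the untargeted preset.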
Best Practices for Sharing
If you have developed a robust workflow (e.g., “Standard Archival Prep” or “Cloud Optimized GeoTIFF Conversion”), you can share it easily!
Share the YAML: You can post your .yaml file in a GitHub Issue, a Gist, or on our Zulip chat. Users just drop it into their ~/.fetchez/presets/ folder.
Bundle in a Python Package: If you are building a Python package that extends fetchez (like globato), you can distribute presets automatically! Just place your YAML files in a package directory and register them in your pyproject.toml:
[project.entry-points."fetchez.presets"]
my_custom_presets = "my_package.presets"
Contribute to Core: If a preset is universally useful, you can propose adding it directly to the fetchez/presets/ core directory via a Pull Request.