πŸͺ Hooks and Presets#

Fetchez is designed to be highly extensible. Rather than only downloading files, you can build automated pipelines that process data on the fly.

Processing Hooks

Fetchez includes a powerful Hook System that allows you to chain actions together. Hooks run in a pipeline, meaning the output of one hook (e.g., unzipping a file) becomes the input for the next (e.g., streaming and processing it).

There are three stages in the Hook lifecycle:

  1. PRE/MANIFEST Stage (pre stage): Runs before any data is downloaded (e.g., filtering URLs, masking regions).

  2. FILE Stage: Runs on each individual file as it is downloaded (e.g., unzipping, converting formats, or piping to stdout).

  3. POST/COLLECTION Stage (post stage): Runs after all files are downloaded (e.g., merging grids, calculating checksums).

Each hook defines its default stage, which can be changed at any time.
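The three stages can be pictured with a small, self-contained sketch. This is not fetchez code: the `Hook` class, the stage constants, and `run_pipeline` are hypothetical names invented for illustration, and the "unzip" step only renames strings instead of touching real files.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stage names mirroring the lifecycle described above;
# these are not fetchez identifiers.
PRE, FILE, POST = "pre", "file", "post"


@dataclass
class Hook:
    name: str
    stage: str
    func: Callable[[List[str]], List[str]]


def run_pipeline(entries, hooks):
    """Run hooks stage by stage, feeding each hook's output to the next."""
    # PRE: operate on the whole list before anything is "downloaded".
    for hook in (h for h in hooks if h.stage == PRE):
        entries = hook.func(entries)

    # FILE: each file passes through the file-stage hooks on its own.
    processed = []
    for entry in entries:
        items = [entry]
        for hook in (h for h in hooks if h.stage == FILE):
            items = hook.func(items)
        processed.extend(items)

    # POST: operate on the full collection once every file is done.
    for hook in (h for h in hooks if h.stage == POST):
        processed = hook.func(processed)
    return processed


hooks = [
    Hook("only_zip", PRE, lambda urls: [u for u in urls if u.endswith(".zip")]),
    Hook("fake_unzip", FILE, lambda files: [f[:-4] + ".tif" for f in files]),
    Hook("sort", POST, lambda files: sorted(files)),
]

result = run_pipeline(["b.zip", "notes.txt", "a.zip"], hooks)
print(result)
```

Because every hook takes a list and returns a list, changing a hook's stage in this sketch only means changing its `stage` field; the function itself stays the same.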

Common Built-in Hooks:

  • unzip: Automatically extracts .zip or .gz files.

  • pipe: Prints the final absolute path to stdout (useful for piping to GDAL/PDAL).

  • audit: Generates a JSON manifest of everything downloaded and processed.

  • exec: Runs a shell command on a file (uses the "{file}" formatter).
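To make the "{file}" formatter concrete, here is one way such a placeholder could be expanded safely. It is a sketch of the general technique, not the exec hook's actual implementation: `build_command` and the command template are hypothetical, and the Python interpreter stands in for a real tool so the example runs anywhere.

```python
import shlex
import subprocess
import sys


def build_command(template, path):
    """Return an argv list with {file} replaced by the quoted path."""
    # shlex.quote keeps a path with spaces or shell metacharacters as one
    # argument once the command string is split again.
    return shlex.split(template.replace("{file}", shlex.quote(path)))


# Hypothetical template: echo back the file argument using Python itself.
template = shlex.quote(sys.executable) + ' -c "import sys; print(sys.argv[1])" {file}'

argv = build_command(template, "/tmp/my data.tif")
completed = subprocess.run(argv, capture_output=True, text=True, check=True)
print(completed.stdout.strip())
```

Quoting the path before substitution is the design choice worth noting: without it, a file name containing a space would be split into two arguments.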

Example (CLI):

# Download data.zip
# Extract data.tif (via unzip hook)
# Print /path/to/data.tif (via pipe hook)
fetchez charts --hook unzip --hook pipe

# Warp the Copernicus files as soon as they're downloaded
fetchez -R loc:denver copernicus --pipe | xargs gdalwarp -t_srs EPSG:3857

# build a vrt of the fetched files
gdalbuildvrt cop_merged.vrt $(fetchez -R -105/-104/39/40 copernicus --pipe)
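The piped paths can also be consumed from Python rather than from a shell tool. The sketch below assumes the pipe hook prints one absolute path per line, which the commands above suggest but this page does not state outright. The helper names and the sample paths are hypothetical.

```python
import io
from pathlib import Path


def read_paths(stream):
    """Yield one Path per non-empty line of a text stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield Path(line)


def group_by_suffix(paths):
    """Group paths by lowercase file extension."""
    groups = {}
    for path in paths:
        groups.setdefault(path.suffix.lower(), []).append(path)
    return groups


# Hypothetical sample standing in for the text a pipe hook might print.
sample = io.StringIO("/data/cop/tile_a.tif\n/data/cop/tile_b.TIF\n\n/data/cop/readme.txt\n")

groups = group_by_suffix(read_paths(sample))
for suffix, files in sorted(groups.items()):
    print(f"{suffix}: {len(files)} file(s)")
```

Replacing `sample` with `sys.stdin` (after `import sys`) would let a script like this sit on the receiving end of `fetchez -R loc:denver copernicus --pipe | python summarize.py`, where `summarize.py` is a placeholder file name.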

Pipeline Presets (Macros)

Tired of typing the same chain of hooks every time? Presets allow you to define reusable workflow macros.

Instead of running this long command:

fetchez copernicus --hook checksum:algo=sha256 --hook enrich --hook audit:file=log.json

You can define a preset and simply run:

fetchez copernicus --audit-full

How to create a Preset:

Presets are simply YAML files that live in your ~/.fetchez/presets/ directory. fetchez automatically scans this folder and turns any valid YAML file into a CLI flag!

  1. Create a file: ~/.fetchez/presets/audit_full.yaml

  2. Define your workflow:

name: audit-full
description: Generate SHA256 hashes, enrichment, and a full JSON audit log.
hooks:
  - name: checksum
    args:
      algo: sha256
  - name: enrich
  - name: audit
    args:
      file: audit_full.json

  3. Run it: Your new preset automatically appears as a CLI flag in fetchez!

fetchez charts --audit-full
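The relationship between a preset and the long-form command can be sketched as a small expansion function. This only illustrates the idea and is not how fetchez parses presets: the dictionary mirrors the YAML above, `preset_to_cli_args` is a hypothetical helper, and joining several arguments with commas is an assumption (the examples on this page show a single `key=value` per hook).

```python
def preset_to_cli_args(preset):
    """Expand a preset mapping into the equivalent --hook arguments."""
    args = []
    for hook in preset.get("hooks", []):
        spec = hook["name"]
        hook_args = hook.get("args") or {}
        if hook_args:
            # Assumption: multiple key=value pairs are comma-separated.
            spec += ":" + ",".join(f"{key}={value}" for key, value in hook_args.items())
        args += ["--hook", spec]
    return args


# The same structure as audit_full.yaml, written as a Python dict so the
# example needs no YAML parser.
preset = {
    "name": "audit-full",
    "hooks": [
        {"name": "checksum", "args": {"algo": "sha256"}},
        {"name": "enrich"},
        {"name": "audit", "args": {"file": "audit_full.json"}},
    ],
}

flag = "--" + preset["name"]
expanded = preset_to_cli_args(preset)
print(flag, "is shorthand for:", " ".join(expanded))
```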

Extending Hooks and Presets (Plugins and Extensions)

Fetchez is generic. If you are building a custom tool and want your own processing hooks and presets, you can register them either in your project or in the .fetchez configuration directory. They are then discoverable through fetchez.registry.HookRegistry and fetchez.registry.PresetRegistry.

In your project, make a directory called "hooks" or "presets", add your Python hooks and presets to the appropriate directory, and register them with fetchez in your pyproject.toml:

[project.entry-points."fetchez.hooks"]
my_project_hooks = "my_project.hooks"

[project.entry-points."fetchez.presets"]
my_project_presets = "my_project.presets"