2.2. Architecture Overview
2.2.1. What Is a Plugin?
A plugin is a standard Python package containing one or more Celery tasks. Plugins are:
- Discovered at runtime via Python setuptools entry points
- Installed into the Ngenea Worker Python virtual environment
- Executed on Worker nodes as steps within Ngenea Hub workflows
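Entry-point discovery can be illustrated with the standard library alone. The following is a minimal sketch, not Ngenea Worker's actual loader: the group name `example.worker.plugins` is a placeholder assumed for illustration, and `discover_plugins` is a hypothetical helper. It lists the entry points that installed packages advertise under a group and loads each one on demand.

```python
from importlib.metadata import entry_points


def discover_plugins(group):
    """Return {name: entry_point} for every entry point registered under `group`."""
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the result supports filtering by group.
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: the result is a plain dict keyed by group name.
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


# Placeholder group name (an assumption, not the real Ngenea group).
plugins = discover_plugins("example.worker.plugins")
for name, ep in plugins.items():
    func = ep.load()  # imports the module and returns the registered object
    print(name, func)
```

If no installed package registers anything under the group, the dictionary is empty and the loop does nothing, which is the expected outcome in an environment without plugins.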
2.2.1.1. Use Cases
Plugins can perform any operation expressible as a Python function operating on a list of file paths:
- Post-processing hooks: send email notifications, trigger webhooks, update databases
- Custom file operations: move, copy, rename, or transform files
- External system integration: call cloud APIs, submit HPC jobs, sync metadata to catalogues
- Script execution: run shell scripts or external programs on Worker hosts
- Validation gates: check file integrity or enforce naming conventions before downstream tasks proceed
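As an illustration of the "function operating on a list of file paths" shape, here is a minimal sketch of a validation gate that enforces a naming convention. The function name `check_naming`, the `pattern` parameter, and the returned dictionary layout are all assumptions made for this example; the real task signature and result payload are defined by the plugin API, not by this sketch.

```python
import re
from pathlib import PurePosixPath


def check_naming(paths, pattern=r"^[a-z0-9_]+\.[a-z0-9]+$"):
    """Split `paths` into those whose file name matches `pattern` and those that do not."""
    rule = re.compile(pattern)
    passed, failed = [], []
    for path in paths:
        name = PurePosixPath(path).name  # compare the file name only, not the directory
        (passed if rule.match(name) else failed).append(path)
    # Hypothetical result shape, chosen for illustration.
    return {"passed": passed, "failed": failed}


result = check_naming(["/data/run_01.csv", "/data/Final Report.docx"])
print(result)
```

Here `/data/run_01.csv` passes, while `/data/Final Report.docx` fails because its name contains an uppercase letter and a space.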
Danger
Plugin tasks execute with the full permissions of the Worker service (typically root) and have unrestricted access to the entire filesystem. A bug in your plugin code (e.g., an incorrect path variable, an overly broad glob pattern) can delete or overwrite any file on the system, including other users’ data, system configuration, or the Worker’s own state. Thoroughly test all file operations in a non-production environment first, validate all paths, and never use recursive delete without strict path validation.
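One way to apply the "validate all paths" advice is to resolve every path and confirm it lies strictly inside an allowed base directory before any destructive call. The sketch below is one possible guard, not a facility provided by Ngenea Worker; `is_within` and `safe_unlink` are hypothetical helper names.

```python
from pathlib import Path


def is_within(base, candidate):
    """Return True if `candidate` resolves to a location strictly inside `base`."""
    base_resolved = Path(base).resolve()
    candidate_resolved = Path(candidate).resolve()  # collapses ".." and follows symlinks
    try:
        candidate_resolved.relative_to(base_resolved)
    except ValueError:
        return False
    # Reject the base directory itself so it can never be the target.
    return candidate_resolved != base_resolved


def safe_unlink(base, candidate):
    """Delete a single file only if it lies strictly inside `base`."""
    if not is_within(base, candidate):
        raise ValueError(f"refusing to touch {candidate!r}: outside {base!r}")
    Path(candidate).unlink()
```

Comparing resolved paths with `relative_to` avoids the string-prefix trap in which `/srv/scratch_other` appears to start with `/srv/scratch`. The check is still subject to race conditions if another process changes symlinks between validation and use.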
2.2.2. Where Plugins Fit
The execution flow for a plugin task is:
1. A user (or a schedule or policy) triggers a Workflow via the Ngenea Hub UI or REST API.
2. Ngenea Hub converts the workflow definition into a directed acyclic graph (DAG) of tasks: a Job containing Task nodes.
3. Ngenea Hub submits each task to the appropriate Celery queue based on site and queue configuration.
4. A Ngenea Worker picks up the task, discovers the plugin function via its entry point, and executes it.
5. The plugin receives a list of paths, processes them, and returns a structured result payload.
6. Ngenea Hub’s DAG engine receives the result, updates job progress in the UI, and submits downstream tasks.
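The flow above can be mimicked in a single process to show how results move through a task graph. This is a toy model under stated assumptions, not Ngenea Hub's DAG engine: it uses the standard-library `graphlib` module (Python 3.9+) instead of Celery queues, and the step functions, the `processed`/`failed` payload keys, and the rule that downstream tasks receive the paths their upstream tasks processed are all invented for illustration.

```python
from graphlib import TopologicalSorter


def checksum_step(paths, **fields):
    # Stand-in for a plugin: takes paths, returns a structured result.
    return {"processed": list(paths), "failed": []}


def notify_step(paths, recipient="ops@example.com", **fields):
    # `recipient` plays the role of a runtime field passed as a keyword argument.
    return {"processed": list(paths), "failed": [], "notified": recipient}


# Each key is a task; its value is the set of tasks it depends on.
dag = {"checksum": set(), "notify": {"checksum"}}
steps = {"checksum": checksum_step, "notify": notify_step}


def run_job(dag, steps, paths, fields):
    """Run every task in dependency order and collect the result payloads."""
    results = {}
    for task in TopologicalSorter(dag).static_order():
        upstream = dag[task]
        if upstream:
            # A downstream task receives what its upstream tasks processed.
            task_paths = [p for dep in sorted(upstream) for p in results[dep]["processed"]]
        else:
            task_paths = paths
        results[task] = steps[task](task_paths, **fields)
    return results


job = run_job(dag, steps, ["/data/a.bin", "/data/b.bin"], {"recipient": "team@example.com"})
print(job["notify"])
```

In a real deployment each call to a step would instead be a task submitted to a queue and executed on a Worker, with the Hub deciding what to submit next once a result arrives.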
2.2.3. Key Concepts
| Concept | Description |
|---|---|
| Workflow | A reusable template defining a sequence or graph of tasks, path filters, and runtime fields. Created via the Workflows API. |
| Job | A single execution instance of a workflow, containing input paths and runtime field values. |
| Task | A single task node within a Job’s execution graph. Maps to a Celery task invocation. |
| Filter Rule | A rule within a workflow that matches paths by type and state, then routes them to an ordered chain of task actions. |
| Field | A runtime parameter defined in a workflow, whose value is provided at execution time and passed to tasks as a keyword argument. |
| Plugin | A Python package installed on the Worker, discovered via the |