2.2. Architecture Overview

2.2.1. What Is a Plugin?

A plugin is a standard Python package containing one or more Celery tasks. Plugins are:

  • Discovered at runtime via Python setuptools entry points in the worker_plugin group

  • Installed into the Ngenea Worker Python virtual environment

  • Executed on Worker nodes as steps within Ngenea Hub workflows
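
Entry-point discovery can be illustrated with the standard library alone. The sketch below is not the Worker's actual loader. It only shows how any Python process can list the packages registered under an entry point group. The group name worker_plugin is the one this guide uses for plugins; the helper name discover_plugins is hypothetical.

```python
from importlib.metadata import entry_points


def discover_plugins(group="worker_plugin"):
    """Return {name: entry point} for every package registered under `group`.

    `discover_plugins` is a hypothetical helper for illustration only; it is
    not part of Ngenea Worker.
    """
    eps = entry_points()
    if hasattr(eps, "select"):
        # Python 3.10+: the result supports selection by group
        selected = eps.select(group=group)
    else:
        # Python 3.8/3.9: the result is a mapping of group name to entry points
        selected = eps.get(group, [])
    return {ep.name: ep for ep in selected}


if __name__ == "__main__":
    for name, ep in discover_plugins().items():
        # ep.load() would import the module and return the registered object
        print(f"{name} -> {ep.value}")
```

Running this inside the Worker virtual environment after installing a plugin is one way to check that the package's entry point is visible before a workflow references it.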

2.2.1.1. Use Cases

Plugins can perform any operation expressible as a Python function operating on a list of file paths:

  • Post-processing hooks — send email notifications, trigger webhooks, update databases

  • Custom file operations — move, copy, rename, or transform files

  • External system integration — call cloud APIs, submit HPC jobs, sync metadata to catalogues

  • Script execution — run shell scripts or external programs on Worker hosts

  • Validation gates — check file integrity, enforce naming conventions before downstream tasks proceed

Danger

Plugin tasks execute with the full permissions of the Worker service (typically root) and have unrestricted access to the entire filesystem. A bug in your plugin code (e.g., an incorrect path variable, an overly broad glob pattern) can delete or overwrite any file on the system, including other users’ data, system configuration, or the Worker’s own state. Thoroughly test all file operations in a non-production environment first, validate all paths, and never use recursive delete without strict path validation.
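
One way to apply the path validation advice above is to resolve every path and refuse anything that falls outside an explicitly allowed root. This is a minimal sketch under stated assumptions: the helper name ensure_within and the example root /srv/data are hypothetical, and the check alone does not make a destructive operation safe.

```python
import os


def ensure_within(root, path):
    """Return the resolved `path` if it is `root` or lies beneath it.

    Raises ValueError otherwise. Symlinks and ".." segments are resolved
    first, so "/srv/data/../etc" is rejected for the root "/srv/data".
    """
    real_root = os.path.realpath(root)
    real_path = os.path.realpath(path)
    # commonpath compares whole path components, so "/srv/database" is not
    # mistaken for a child of "/srv/data" the way a string prefix test would
    if os.path.commonpath([real_root, real_path]) != real_root:
        raise ValueError(f"refusing to touch {path!r}: outside {root!r}")
    return real_path
```

A plugin would call such a check on each incoming path before any move, overwrite, or delete, and treat a ValueError as a failed path rather than continuing.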

2.2.2. Where Plugins Fit

The execution flow for a plugin task is:

  1. A user (or schedule/policy) triggers a Workflow via the Ngenea Hub UI or REST API.

  2. Ngenea Hub converts the workflow definition into a Job: a DAG (Directed Acyclic Graph) of Task nodes.

  3. Ngenea Hub submits each task to the appropriate Celery queue based on site and queue configuration.

  4. A Ngenea Worker picks up the task, discovers the plugin function via its entry point, and executes it.

  5. The plugin receives a list of paths, processes them, and returns a structured result payload.

  6. Ngenea Hub’s DAG engine receives the result, updates job progress in the UI, and submits downstream tasks.
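
Step 5 of the flow above can be sketched as a plain Python function. This is an illustration only: it omits Celery registration and the Worker's entry point wiring, the function name checksum_files is hypothetical, and the keys in the returned dictionary are assumptions rather than the Hub's actual result schema. The algorithm parameter stands in for a workflow field passed as a keyword argument.

```python
import hashlib
from pathlib import Path


def checksum_files(paths, algorithm="sha256", **fields):
    """Hash each path and report per-path outcomes.

    `paths` is the list of paths handed to the task (assumed to be strings).
    `algorithm` stands in for a workflow field; any other fields arrive in
    `fields` and are ignored here. The result keys are illustrative.
    """
    processed, failed = [], []
    for raw in paths:
        path = Path(raw)
        try:
            # Reads the whole file into memory; acceptable for a sketch only
            digest = hashlib.new(algorithm, path.read_bytes()).hexdigest()
            processed.append({"path": str(path), "digest": digest})
        except OSError as exc:
            # Record the failure and continue so one bad path does not
            # abort the rest of the batch
            failed.append({"path": str(path), "error": str(exc)})
    return {"processed": processed, "failed": failed}
```

Returning per-path outcomes instead of raising on the first error lets the caller see exactly which paths succeeded, which matters when downstream tasks depend on the result.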

2.2.3. Key Concepts

Concept       Description
-----------   -----------------------------------------------------------------
Workflow      A reusable template defining a sequence or graph of tasks, path filters, and runtime fields. Created via the Workflows API.
Job           A single execution instance of a workflow, containing input paths and runtime field values.
Task          A single task node within a Job’s execution graph. Maps to a Celery task invocation.
Filter Rule   A rule within a workflow that matches paths by type and state, then routes them to an ordered chain of task actions.
Field         A runtime parameter defined in a workflow, whose value is provided at execution time and passed to tasks as a keyword argument.
Plugin        A Python package installed on the Worker, discovered via the worker_plugin entry point group.