7.4. Workflow Steps
The following are all the currently supported steps in Ngenea Hub. When a workflow runs, each of these steps is run by the Ngenea Worker on the provided site, on its respective function queue. See Custom Workflows for guidance on how to use these steps in your own workflows.
Each task is listed with its name and its function queue. These functions can be controlled through the Ngenea Worker configuration.
7.4.1. dynamo.tasks.migrate
- Function: Worker
Migrates a list of files to a pre-defined remote target using Ngenea.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
Retain the content of every migrated file and do not set the OFFLINE flag for the file. |
|
|
|
retain a segment of every migrated file starting from its beginning and having a specified approximate length in bytes. |
|
|
|
Overwrite remote objects if they already exist; do not create remote object instances with various UUID suffixes. |
|
|
|
Fail a file migration if a remote object exists but has a different hash or metadata. In that case, the task errors. |
|
|
|
Defines the locking mode that ngenea will use when performing the migrate |
|
|
specify the endpoint to migrate |
|
|
|
|
allow the migrate task to make any missing files end up in the "aborted" state instead of "failed" |
|
|
|
If set to True, the migrate task will process offline files in the same manner as regular online files. If set to False, the migrate task will process the offline files using |
7.4.2. dynamo.tasks.recall
- Function: Worker
Recalls a list of files from a pre-defined remote target using Ngenea.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
If the recall should skip checking the hash of the file |
|
|
specify which endpoint(site) to recall from |
|
|
|
|
Defines the locking level ngenea will use during the recall |
|
|
When a file is recalled, it uses this UID if one is not set on the remote object |
|
|
|
When a file is recalled, it uses this GID if one is not set on the remote object |
|
|
|
|
When a file is recalled, update its access time (atime) to 'now' |
|
|
|
When a file is recalled, update its modification time (mtime) to 'now' |
|
|
|
If set, when a file is recalled, it deletes the file in the remote location |
7.4.3. dynamo.tasks.reverse_stub
- Function: Worker
Recalls a list of files to a pre-defined remote target using Ngenea.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
If the file should be premigrated instead of a regular stub |
|
|
|
The max file size before files will be stubbed for this task |
|
|
|
If the recall should skip checking the hash of the file |
|
|
|
Overwrite local files if they already exist, except files with only metadata changes. |
|
|
specify which endpoint(site) to recall from. |
|
|
|
None |
Controls whether the worker should attempt to retry file failures due to stale file handles. The value can be either stub, to remove only reverse-stubbed files, or all. |
|
|
|
Defines the locking level ngenea will use during the recall |
|
|
When a file is recalled, it uses this UID if one is not set on the remote object |
|
|
|
When a file is recalled, it uses this GID if one is not set on the remote object |
|
|
|
|
When a file is recalled, update its access time (atime) to 'now' |
|
|
|
When a file is recalled, update its modification time (mtime) to 'now' |
|
|
|
Dictates what state the local file should be in to pass the check. Options are "newest", which passes if the local file is the latest version of the file on either site; "local", which accepts the local file version regardless of the check; and "ignore", which always uses the other site's file version. |
7.4.4. dynamo.tasks.ngenea_sync_metadata
- Function: Worker
Syncs the local ngenea metadata on a file with the remote target using Ngenea.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
If the sync should skip checking the hash of the file |
|
|
Specify which endpoint(site) to recall from |
|
|
|
When a file is synced, it uses this UID if one is not set on the remote object |
|
|
|
When a file is synced, it uses this GID if one is not set on the remote object |
7.4.5. dynamo.tasks.delete_paths_from_gpfs
- Function: Worker
Removes a list of files from a GPFS filesystem.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
If any directory path is provided and this is set, it will remove the entire file tree, otherwise it will only remove empty directories. It is important to note that the recursive behaviour of removing the entire directory tree will not apply to filesets or if the target directory contains files or directories that are restricted from being deleted such as |
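The difference between removing a whole tree and removing only empty directories can be pictured with plain Python file operations. This is a minimal sketch, not the task's implementation: `remove_path` and its `recursive` flag are hypothetical names, and the sketch does not model filesets or paths that are restricted from deletion.

```python
import os
import shutil
import tempfile


def remove_path(path: str, recursive: bool = False) -> bool:
    """Remove a file or directory; return True if something was removed.

    `recursive` is a hypothetical flag standing in for the task argument.
    """
    if os.path.isdir(path) and not os.path.islink(path):
        if recursive:
            shutil.rmtree(path)  # remove the entire file tree
            return True
        try:
            os.rmdir(path)  # succeeds only when the directory is empty
            return True
        except OSError:
            return False  # a non-empty directory is left in place
    os.remove(path)
    return True


root = tempfile.mkdtemp()
full = os.path.join(root, "full")
empty = os.path.join(root, "empty")
os.makedirs(full)
os.makedirs(empty)
open(os.path.join(full, "a.txt"), "w").close()

kept = remove_path(full)                           # not empty, so not removed
removed_empty = remove_path(empty)                 # empty, so removed
removed_tree = remove_path(full, recursive=True)   # whole tree removed
```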
7.4.6. dynamo.tasks.check_sync_state
- Function: Worker
Checks a provided site against the calling site to ensure that the local file is in a specified state compared to the other site. Using this task will also perform dynamo.tasks.stat_paths on the provided site before execution.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
Dictates what state the local file should be in to pass the check. Options are "newest", which passes if the local file is the latest version of the file on either site; "local", which accepts the local file version regardless of the check; and "ignore", which always uses the other site's file version. |
|
|
The target site to compare |
|
|
|
|
If this bool is set, files that do not need to be processed are marked as aborted rather than skipped |
|
|
|
If this bool is set, the metadata comparison for directories will also compare Access Control Lists |
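The three sync-state options described in the table above can be read as a small decision function. The following is only an illustrative sketch under loud assumptions: it treats "latest version" as a comparison of modification times, counts a tie as the local file being latest, and reads "ignore" as the local file never passing. None of these details is confirmed by this page, and `passes_sync_check` is a hypothetical name.

```python
def passes_sync_check(mode: str, local_mtime: float, remote_mtime: float) -> bool:
    """Return True if the local file passes the check for the given mode.

    Assumption: versions are compared by mtime (seconds since epoch).
    """
    if mode == "newest":
        # Passes only when the local file is at least as new as the other site's.
        return local_mtime >= remote_mtime
    if mode == "local":
        # The local version is accepted regardless of the comparison.
        return True
    if mode == "ignore":
        # The other site's version is always used, so local never passes.
        return False
    raise ValueError(f"unknown sync mode: {mode!r}")
```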
7.4.7. dynamo.tasks.move_paths_on_gpfs
- Function: Worker
Moves files on the filesystem using provided paths with a source key.
This task moves files in two steps (via an intermediate temporary location) to avoid move conflicts. For example, it correctly handles cases where files are 'swapped': fileA moved to fileB, and fileB moved to fileA.
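To see why the intermediate step matters, here is a minimal sketch of a two-step move using only the Python standard library. It is not the task's implementation: `two_step_move` and the temporary naming scheme are assumptions made for illustration.

```python
import os
import tempfile
import uuid


def two_step_move(moves: list[tuple[str, str]]) -> None:
    """Move each (source, destination) pair via a temporary name."""
    staged = []
    # Step 1: move every source out of the way to a unique temporary name.
    for source, destination in moves:
        temp = f"{source}.tmp-{uuid.uuid4().hex}"
        os.rename(source, temp)
        staged.append((temp, destination))
    # Step 2: move each staged file to its final destination.
    for temp, destination in staged:
        os.replace(temp, destination)


workdir = tempfile.mkdtemp()
file_a = os.path.join(workdir, "fileA")
file_b = os.path.join(workdir, "fileB")
with open(file_a, "w") as handle:
    handle.write("content of A")
with open(file_b, "w") as handle:
    handle.write("content of B")

# Swap the two files: neither is overwritten before it has been staged.
two_step_move([(file_a, file_b), (file_b, file_a)])

with open(file_a) as handle:
    swapped_a = handle.read()
with open(file_b) as handle:
    swapped_b = handle.read()
```

Because every source is staged before any destination is written, the order of the pairs does not affect the result.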
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
If set, after a file has been moved all remote location xattrs will be removed |
|
|
|
A signature to which any paths with a missing source are sent. If this isn't set, a missing source is treated as a failure. |
|
|
|
If the target file exists, it is overwritten by default. If this optional parameter is set, the target has to have a ctime <= this timestamp for the move to proceed. Otherwise, the source file is removed. Given in seconds since epoch. Primarily used for sync workflows. |
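The ctime condition in the last row can be sketched as follows. This is an illustration of the described rule, not the worker's code: `move_with_ctime_gate` and `max_target_ctime` are hypothetical names, and the cutoff is given in seconds since epoch as the description states.

```python
import os
import tempfile
import time
from typing import Optional


def move_with_ctime_gate(source: str, target: str,
                         max_target_ctime: Optional[float] = None) -> str:
    """Move source onto target, honouring an optional ctime cutoff."""
    if max_target_ctime is not None and os.path.exists(target):
        if os.stat(target).st_ctime > max_target_ctime:
            # Target changed after the cutoff: keep it and drop the source.
            os.remove(source)
            return "source_removed"
    # No cutoff, no existing target, or target old enough: overwrite.
    os.replace(source, target)
    return "moved"


def write(path: str, text: str) -> None:
    with open(path, "w") as handle:
        handle.write(text)


workdir = tempfile.mkdtemp()
source = os.path.join(workdir, "incoming")
target = os.path.join(workdir, "existing")

write(source, "new")
write(target, "old")
# Cutoff far in the past: the target is newer, so the source is removed.
stale_cutoff_result = move_with_ctime_gate(source, target, max_target_ctime=0)

write(source, "new")
# Cutoff in the future: the target is old enough, so it is overwritten.
fresh_cutoff_result = move_with_ctime_gate(
    source, target, max_target_ctime=time.time() + 60
)

with open(target) as handle:
    final_target_text = handle.read()
```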
7.4.8. dynamo.tasks.one_step_move_paths_on_gpfs
- Function: Worker
Moves files on the filesystem using provided paths with a source key.
This task is not used in the default Ngenea Hub workflows. It is less robust than dynamo.tasks.move_paths_on_gpfs because it moves files directly from the source to their new location (in one step). For example, trying to 'swap' files (fileA moved to fileB, and fileB moved to fileA) with this task will effectively result in one of these files being deleted on the target site.
However, this task is for cases where doing moves in two steps is not an acceptable option.
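The swap hazard described above can be reproduced with a direct, one-step move in plain Python. This is a sketch for illustration only; `one_step_move` is a hypothetical helper, not the task's implementation.

```python
import os
import tempfile


def one_step_move(moves: list[tuple[str, str]]) -> None:
    """Move each source directly onto its destination, in order."""
    for source, destination in moves:
        os.replace(source, destination)  # overwrites an existing destination


workdir = tempfile.mkdtemp()
file_a = os.path.join(workdir, "fileA")
file_b = os.path.join(workdir, "fileB")
with open(file_a, "w") as handle:
    handle.write("content of A")
with open(file_b, "w") as handle:
    handle.write("content of B")

# Attempted swap: the first move overwrites fileB before it can be moved.
one_step_move([(file_a, file_b), (file_b, file_a)])

remaining = sorted(os.listdir(workdir))
with open(file_a) as handle:
    surviving_text = handle.read()
```

After the attempted swap only one file remains, and the original content of fileB is lost.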
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
If set, after a file has been moved all remote location xattrs will be removed |
|
|
|
A signature to which any paths with a missing source are sent. If this isn't set, a missing source is treated as a failure. |
|
|
|
If the target file exists, it is overwritten by default. If this optional parameter is set, the target has to have a ctime <= this timestamp for the move to proceed. Otherwise, the source file is removed. Given in seconds since epoch. Primarily used for sync workflows. |
7.4.9. dynamo.tasks.remove_location_xattrs_for_moved
- Function: Worker
This task removes all remote location xattrs on all provided paths.
This step takes no additional arguments.
7.4.10. dynamo.tasks.move_in_cloud
- Function: Worker
Moves a file on the filesystem's related cloud storage platform using provided paths with a source key.
This step takes no additional arguments.
7.4.11. dynamo.tasks.remove_from_cloud
- Function: Worker
Deletes a file on the filesystem's related cloud storage platform using provided paths.
This step takes no additional arguments.
7.4.12. dynamo.tasks.ensure_cloud_file_exists
- Function: Worker
Ensures all files provided to the task exist on the filesystem's related cloud storage platform. If some do not, it retries the check two more times before failing.
This step takes no additional arguments.
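The check-then-retry behaviour described for this step (one initial check plus two retries) can be sketched like this. The names `ensure_all_exist`, `exists`, `retries` and `delay_seconds` are assumptions for illustration, and the existence check is passed in as a callable rather than talking to any real cloud API.

```python
import time
from typing import Callable, Iterable


def ensure_all_exist(paths: Iterable[str], exists: Callable[[str], bool],
                     retries: int = 2, delay_seconds: float = 0.0) -> None:
    """Raise FileNotFoundError if any path is still missing after all checks."""
    missing = list(paths)
    for attempt in range(1 + retries):
        # Re-check only the paths that were missing on the previous attempt.
        missing = [path for path in missing if not exists(path)]
        if not missing:
            return
        if attempt < retries:
            time.sleep(delay_seconds)
    raise FileNotFoundError(
        f"still missing after {1 + retries} checks: {missing}"
    )


calls = {"count": 0}


def appears_on_third_check(path: str) -> bool:
    """Fake existence check that succeeds from the third call onwards."""
    calls["count"] += 1
    return calls["count"] >= 3


ensure_all_exist(["example/object"], appears_on_third_check)
```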
7.4.13. dynamo.tasks.import_files_from_external_target
- Function: Worker
Recalls a list of files from an external target to a predefined local object using Ngenea. The input files must be a list of remote object paths, not local file paths (e.g., projects/).
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
Specify which endpoint(ngenea target name) to recall from |
|
|
|
|
Defines the locking level ngenea will use during the recall |
|
|
|
Specify an absolute path for where to place the file/folder in the local filesystem |
|
|
|
If the remote object should be premigrated instead of a regular stub |
|
|
List of extra arguments that can be passed to recall command |
|
|
|
|
If the recall should skip checking the hash of the file |
|
|
When a file is recalled, it uses this UID if one is not set on the remote object |
|
|
|
When a file is recalled, it uses this GID if one is not set on the remote object |
|
|
|
|
When a file is recalled, update its access time (atime) to 'now' |
|
|
|
When a file is recalled, update its modification time (mtime) to 'now' |
7.4.14. dynamo.tasks.copy_files_from_external_target
Copies a list of files from an external target to a predefined local location using Ngenea. The input files should be provided as a list of remote object paths, not local file paths (e.g., projects/). Once copied, these files are no longer associated with the target.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
Specify which endpoint(ngenea target name) to recall from |
|
|
|
|
Defines the locking level ngenea will use during the recall |
|
|
|
Specify an absolute path for where to place the file/folder in the local filesystem |
|
|
|
When a file is recalled, it uses this UID if one is not set on the remote object |
|
|
|
When a file is recalled, it uses this GID if one is not set on the remote object |
|
|
List of extra arguments that can be passed to recall command |
|
|
|
|
Do not read user, group, timestamps, etc. from the data to be copied |
7.4.15. dynamo.tasks.import_bytes
- Function: Worker
Recalls a specific byte range of a single file from an ngenea target to a predefined local location using Ngenea. The input files should be provided as a list of remote object paths (e.g., projects/), and the recall operation will be based on the byte-level data of these files, rather than entire files.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
Specify which endpoint(ngenea target name) to recall from |
|
|
|
|
Defines the locking level ngenea will use during the recall |
|
|
|
Specify an absolute path for where to place the file/folder in the local filesystem |
|
|
|
Beginning offset of a segment of a remote object to download |
|
|
|
Ending offset of a segment of a remote object to download |
|
|
When a file is recalled, it uses this UID if one is not set on the remote object |
|
|
|
When a file is recalled, it uses this GID if one is not set on the remote object |
|
|
|
Specify mode bits (in octal format) to set for stubbed/premigrated files if there are no mode bits associated with remote objects |
|
|
|
Specify mode bits (in octal format) to set for directories created locally if there are no mode bits associated with remote objects |
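As a reminder of what mode bits in octal format mean, the sketch below parses octal strings and renders them as permission strings using Python's standard library. The values 640 and 750 are arbitrary examples, not defaults of this task, and `parse_mode` is a hypothetical helper.

```python
import stat


def parse_mode(octal_text: str) -> int:
    """Convert an octal mode string such as '640' into an integer."""
    return int(octal_text, 8)


file_mode = parse_mode("640")  # example value for files
dir_mode = parse_mode("750")   # example value for directories

# Render the modes the way `ls -l` would show them.
file_permissions = stat.filemode(stat.S_IFREG | file_mode)
dir_permissions = stat.filemode(stat.S_IFDIR | dir_mode)
```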
Warning
This task is incompatible with recursive discovery.
7.4.16. dynamo.tasks.recall_bytes
- Function: Worker
Recalls a specific byte range of a single local file present in an ngenea target to a predefined local location using Ngenea. The input should be a list of local stubbed file paths, and the recall will only retrieve the byte-level data of those files, rather than entire files.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
Defines the locking level ngenea will use during the recall |
|
|
|
Specify an absolute path for where to place the file/folder in the local filesystem |
|
|
|
Beginning offset of a segment of a remote object to download |
|
|
|
Ending offset of a segment of a remote object to download |
|
|
When a file is recalled, it uses this UID if one is not set on the remote object |
|
|
|
When a file is recalled, it uses this GID if one is not set on the remote object |
|
|
|
Specify mode bits (in octal format) to set for stubbed/premigrated files if there are no mode bits associated with remote objects |
|
|
|
Specify mode bits (in octal format) to set for directories created locally if there are no mode bits associated with remote objects |
Warning
This task is incompatible with recursive discovery.
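The beginning and ending offsets used by the byte-range tasks can be pictured with a read from a local file. This is only a sketch: `read_byte_range` is a hypothetical helper, and it assumes the ending offset is exclusive, which this page does not confirm for Ngenea.

```python
import os
import tempfile


def read_byte_range(path: str, begin: int, end: int) -> bytes:
    """Return the bytes in [begin, end); the exclusive end is an assumption."""
    if begin < 0 or end < begin:
        raise ValueError("offsets must satisfy 0 <= begin <= end")
    with open(path, "rb") as handle:
        handle.seek(begin)
        return handle.read(end - begin)


workdir = tempfile.mkdtemp()
sample = os.path.join(workdir, "sample.bin")
with open(sample, "wb") as handle:
    handle.write(b"0123456789")

segment = read_byte_range(sample, 2, 6)
```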