7.4. Workflow Steps
This section documents all the currently supported steps in Ngenea Hub. See Custom Workflows for guidance on how to use these steps in your own workflows.
7.4.1. dynamo.tasks.migrate
Migrates a list of files to a pre-defined remote target using Ngenea.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
Retain the content of every migrated file and do not set the OFFLINE flag for the file. |
|
|
|
Retain a segment of every migrated file, starting from its beginning and having a specified approximate length in bytes. |
|
|
|
Overwrite remote objects if they already exist; do not create remote object instances with various UUID suffixes. |
|
|
|
Fail a file migration if a remote object exists but has a different hash or metadata. In that case, the task errors. |
|
|
|
Defines the locking mode that Ngenea will use when performing the migration. |
|
|
Specify the endpoint to migrate to. |
|
|
|
|
Allow the migrate task to place any missing files in the “aborted” state instead of “failed”. |
7.4.2. dynamo.tasks.recall
Recalls a list of files from a pre-defined remote target using Ngenea.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
Whether the recall should skip checking the hash of the file. |
|
|
Specify which endpoint (site) to recall from. |
|
|
|
|
Defines the locking level Ngenea will use during the recall. |
|
|
When a file is recalled, it uses this UID if one is not set on the remote object |
|
|
|
When a file is recalled, it uses this GID if one is not set on the remote object |
|
|
|
|
When a file is recalled, update its access time (atime) to ‘now’. |
|
|
|
When a file is recalled, update its modification time (mtime) to ‘now’. |
|
|
|
If set, when a file is recalled, it deletes the file in the remote location |
7.4.3. dynamo.tasks.reverse_stub
Creates local stubs for a list of files from a pre-defined remote target using Ngenea.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
Whether the file should be premigrated instead of being created as a regular stub. |
|
|
|
The maximum file size above which files are stubbed by this task. |
|
|
|
Whether the recall should skip checking the hash of the file. |
|
|
|
Overwrite local files if they already exist, except files with only metadata changes. |
|
|
Specify which endpoint (site) to recall from. |
|
|
|
None |
Controls whether the worker should retry file failures caused by stale file handles. This string can be either stub, to remove only reverse-stubbed files, or all. |
|
|
|
Defines the locking level Ngenea will use during the recall. |
|
|
When a file is recalled, it uses this UID if one is not set on the remote object |
|
|
|
When a file is recalled, it uses this GID if one is not set on the remote object |
|
|
|
|
When a file is recalled, update its access time (atime) to ‘now’. |
|
|
|
When a file is recalled, update its modification time (mtime) to ‘now’. |
|
|
|
Dictates what state the local file should be in to pass the check. Options are “newest”, which passes if the local file is the latest version of the file on either site; “local”, which accepts the local file version regardless of the check; and “ignore”, which always uses the other site’s file version. |
7.4.4. dynamo.tasks.ngenea_sync_metadata
Syncs the local Ngenea metadata of a file with the remote target using Ngenea.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
Whether the sync should skip checking the hash of the file. |
|
|
Specify which endpoint (site) to recall from. |
|
|
|
When a file is synced, it uses this UID if one is not set on the remote object |
|
|
|
When a file is synced, it uses this GID if one is not set on the remote object |
7.4.5. dynamo.tasks.delete_paths_from_gpfs
Removes a list of files from a GPFS filesystem.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
If any directory path is provided and this is set, it will remove the entire file tree, otherwise it will only remove empty directories. It is important to note that the recursive behaviour of removing the entire directory tree will not apply to filesets or if the target directory contains files or directories that are restricted from being deleted such as |
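The difference between the two deletion modes described in this table can be sketched in plain Python. This is only an illustration of the described behaviour, not the task's actual implementation: the function name `delete_path` and the flag name `recursive` are hypothetical, and the sketch does not model filesets or paths that are restricted from deletion.

```python
import os
import shutil


def delete_path(path, recursive=False):
    """Delete a file or directory; return True if something was removed.

    `recursive` is a hypothetical name standing in for the step's argument.
    """
    if os.path.isdir(path) and not os.path.islink(path):
        if recursive:
            # Remove the directory and everything beneath it.
            shutil.rmtree(path)
            return True
        try:
            # Without the flag, only an empty directory can be removed.
            os.rmdir(path)
            return True
        except OSError:
            # The directory is not empty (or is protected): leave it in place.
            return False
    # Regular files and symlinks are simply unlinked.
    os.remove(path)
    return True
```

With this sketch, a non-empty directory survives a call without the flag and is removed, along with its contents, once the flag is set.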
7.4.6. dynamo.tasks.check_sync_state
Checks a provided site against the calling site to ensure that the local file is in a specified state compared to the other site. Using this task also performs dynamo.tasks.stat_paths on the provided site before execution.
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
Dictates what state the local file should be in to pass the check. Options are “newest”, which passes if the local file is the latest version of the file on either site; “local”, which accepts the local file version regardless of the check; and “ignore”, which always uses the other site’s file version. |
|
|
The target site to compare |
|
|
|
|
If set, files that do not need to be processed are marked as aborted rather than skipped. |
|
|
|
If set, the metadata comparison for directories also compares Access Control Lists (ACLs). |
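The three state options (“newest”, “local”, “ignore”) can be read as a small decision rule about which site's version wins. The sketch below is one possible reading, not the task's real logic: the function name `preferred_version` is hypothetical, modification times are assumed to be the basis for "latest version", and a tie is assumed to favour the local file.

```python
def preferred_version(mode, local_mtime, remote_mtime):
    """Return "local" or "remote" for the given state option.

    Assumption: "latest version" is decided by comparing modification times,
    and equal times count as the local file being up to date.
    """
    if mode == "local":
        # Accept the local version regardless of the comparison.
        return "local"
    if mode == "ignore":
        # Always use the other site's version.
        return "remote"
    if mode == "newest":
        # The local file passes only if it is at least as new as the remote one.
        return "local" if local_mtime >= remote_mtime else "remote"
    raise ValueError(f"unknown mode: {mode!r}")
```

For example, `preferred_version("newest", 100, 200)` selects the remote version, while `preferred_version("local", 100, 200)` keeps the local one.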
7.4.7. dynamo.tasks.move_paths_on_gpfs
Moves files on the filesystem using provided paths with a source key.
This task moves files in two steps (via an intermediate temporary location) to avoid move conflicts. For example, this task correctly handles cases where files are ‘swapped’: fileA moved to fileB, and fileB moved to fileA.
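A minimal sketch of the two-step idea, assuming a staging directory on the same filesystem as the files being moved. The function names `two_step_move` and `demo_swap` are hypothetical and only illustrate why staging makes a swap safe; they are not part of Ngenea Hub.

```python
import os
import tempfile


def two_step_move(moves, staging_dir):
    """Move files via a staging directory so that swaps do not clobber data.

    `moves` is a list of (source, destination) path pairs. `staging_dir` must
    be on the same filesystem as the sources for os.rename to succeed.
    """
    staged = []
    # Step 1: move every source out of the way into the staging directory.
    for index, (source, destination) in enumerate(moves):
        temp_path = os.path.join(staging_dir, f"move-{index}")
        os.rename(source, temp_path)
        staged.append((temp_path, destination))
    # Step 2: move each staged file to its final destination.
    for temp_path, destination in staged:
        os.replace(temp_path, destination)


def demo_swap():
    """Swap fileA and fileB and return their contents afterwards."""
    with tempfile.TemporaryDirectory() as root:
        file_a = os.path.join(root, "fileA")
        file_b = os.path.join(root, "fileB")
        staging = os.path.join(root, "staging")
        os.mkdir(staging)
        for path, text in ((file_a, "A"), (file_b, "B")):
            with open(path, "w") as handle:
                handle.write(text)
        two_step_move([(file_a, file_b), (file_b, file_a)], staging)
        with open(file_a) as a, open(file_b) as b:
            return a.read(), b.read()
```

Because both sources are staged before either destination is written, the swap completes with both files intact and their contents exchanged.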
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
If set, all remote location xattrs are removed after a file has been moved. |
|
|
|
A signature to which any paths with a missing source are sent. If this isn’t set, a missing source is treated as a failure. |
|
|
|
If the target file exists, it is overwritten by default. If this optional parameter is set, the target has to have a ctime <= this timestamp for the move to proceed. Otherwise, the source file is removed. Given in seconds since epoch. Primarily used for sync workflows. |
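The ctime condition in the last row can be sketched as follows. This is an illustration under stated assumptions rather than the task's implementation: the function name `move_with_ctime_guard` and the parameter name `max_target_ctime` are hypothetical, and the sketch performs a direct move for brevity.

```python
import os


def move_with_ctime_guard(source, destination, max_target_ctime=None):
    """Move source onto destination unless the target changed too recently.

    `max_target_ctime` (seconds since epoch) is a hypothetical name for the
    optional timestamp. Returns "moved" or "source_removed".
    """
    if max_target_ctime is not None and os.path.exists(destination):
        if os.stat(destination).st_ctime > max_target_ctime:
            # The target is newer than the cut-off: keep it, drop the source.
            os.remove(source)
            return "source_removed"
    # No cut-off, no existing target, or target old enough: overwrite it.
    os.replace(source, destination)
    return "moved"
```

When no timestamp is supplied, the function always overwrites the target, which matches the default behaviour described in the table.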
7.4.8. dynamo.tasks.one_step_move_paths_on_gpfs
Moves files on the filesystem using provided paths with a source key.
This task is not used in the default Ngenea Hub workflows. It is less robust than dynamo.tasks.move_paths_on_gpfs because it moves files directly from source to their new location (in one step). For example, trying to ‘swap’ files (fileA moved to fileB, and fileB moved to fileA) with this task will effectively result in one of these files being deleted on the target site.
However, this task is for cases where doing moves in two steps is not an acceptable option.
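The data loss in the ‘swap’ case can be reproduced with a short sketch of a direct, in-order move. The names `one_step_move` and `demo_swap_loses_data` are hypothetical, and skipping a source that no longer exists is an assumption made so the demonstration runs to completion.

```python
import os
import tempfile


def one_step_move(moves):
    """Move each source directly onto its destination, in the order given."""
    for source, destination in moves:
        # Assumption: a source that has already been moved away is skipped.
        if os.path.exists(source):
            os.replace(source, destination)


def demo_swap_loses_data():
    """Attempt to swap fileA and fileB; return the surviving files."""
    with tempfile.TemporaryDirectory() as root:
        file_a = os.path.join(root, "fileA")
        file_b = os.path.join(root, "fileB")
        for path, text in ((file_a, "A"), (file_b, "B")):
            with open(path, "w") as handle:
                handle.write(text)
        one_step_move([(file_a, file_b), (file_b, file_a)])
        survivors = {}
        for name in sorted(os.listdir(root)):
            with open(os.path.join(root, name)) as handle:
                survivors[name] = handle.read()
        return survivors
```

The first move overwrites fileB with the contents of fileA, so the second move only carries that same content back: a single file remains and the original contents of fileB are gone.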
Argument |
Type |
Default |
Description |
---|---|---|---|
|
|
|
If set, all remote location xattrs are removed after a file has been moved. |
|
|
|
A signature to which any paths with a missing source are sent. If this isn’t set, a missing source is treated as a failure. |
|
|
|
If the target file exists, it is overwritten by default. If this optional parameter is set, the target has to have a ctime <= this timestamp for the move to proceed. Otherwise, the source file is removed. Given in seconds since epoch. Primarily used for sync workflows. |
7.4.9. dynamo.tasks.remove_location_xattrs_for_moved
This task removes all remote location xattrs on all provided paths.
This step takes no additional arguments.
7.4.10. dynamo.tasks.move_in_cloud
Moves a file on the filesystem’s related cloud storage platform using provided paths with a source key.
This step takes no additional arguments.
7.4.11. dynamo.tasks.remove_from_cloud
Deletes a file on the filesystem’s related cloud storage platform using provided paths.
This step takes no additional arguments.
7.4.12. dynamo.tasks.ensure_cloud_file_exists
Ensures all files provided to the task exist on the filesystem’s related cloud storage platform. If some do not, the check is retried up to two more times before the task fails.
This step takes no additional arguments.
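The check-then-retry behaviour of dynamo.tasks.ensure_cloud_file_exists can be sketched as a loop of one initial check plus two retries. The function name `ensure_exists` and the `exists` callback are hypothetical stand-ins for the real cloud lookup; any delay between attempts is not described in this section, so none is modelled.

```python
def ensure_exists(paths, exists, retries=2):
    """Verify that every path exists, re-checking missing ones on failure.

    `exists` is a caller-supplied function (hypothetical here) that returns
    True when a path is present on the cloud storage platform.
    """
    missing = list(paths)
    # One initial check plus `retries` further attempts.
    for _attempt in range(1 + retries):
        missing = [path for path in missing if not exists(path)]
        if not missing:
            return
    raise FileNotFoundError(f"still missing after retries: {missing}")
```

Only paths that were missing on the previous attempt are checked again, so a file that appears between attempts lets the step succeed.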