7.4. Workflow Steps

This section documents all the currently supported steps in Ngenea Hub. See Custom Workflows for guidance on how to use these steps in your own workflows.

7.4.1. dynamo.tasks.migrate

Migrates a list of files to a pre-defined remote target using Ngenea.

Argument

Type

Default

Description

premigrate

bool

False

Retain the content of every migrated file and do not set the OFFLINE flag for the file.

stub_size

int

0

Retain a segment at the beginning of every migrated file, of approximately the specified length in bytes.

overwrite

bool

False

Overwrite remote objects if they already exist, rather than creating additional remote object instances with UUID suffixes.

fail_on_mismatch

bool

False

Fail a file migration if a remote object exists but has a different hash or metadata. In that case, the task errors.

lock_level

string

implicit

Defines the locking mode that ngenea will use when performing the migration.

endpoint

string

Specify the endpoint to migrate to.

abort_missing

bool

False

If set, any missing files end up in the "aborted" state instead of "failed".

migrate_offline_files

bool

False

If set to True, the migrate task will process offline files in the same manner as regular online files. If set to False, the migrate task will process the offline files using --sync-metadata
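As a rough illustration of how the arguments above might be combined, the following sketch builds a step definition for dynamo.tasks.migrate. The shape of the step (a mapping with a "name" key plus one key per argument) and the helper name migrate_step are assumptions made for this example; see Custom Workflows for the actual workflow format. The defaults are copied from the table above.

```python
# Defaults copied from the argument table; "endpoint" has no documented default.
MIGRATE_DEFAULTS = {
    "premigrate": False,
    "stub_size": 0,
    "overwrite": False,
    "fail_on_mismatch": False,
    "lock_level": "implicit",
    "abort_missing": False,
    "migrate_offline_files": False,
}
MIGRATE_ARGUMENTS = set(MIGRATE_DEFAULTS) | {"endpoint"}


def migrate_step(**overrides):
    """Build a step definition for dynamo.tasks.migrate (hypothetical helper).

    ASSUMPTION: a step is a mapping with a "name" key plus argument keys.
    Only arguments that differ from the documented defaults are emitted.
    """
    unknown = set(overrides) - MIGRATE_ARGUMENTS
    if unknown:
        raise ValueError(f"unknown migrate arguments: {sorted(unknown)}")
    step = {"name": "dynamo.tasks.migrate"}
    for key, value in overrides.items():
        if key not in MIGRATE_DEFAULTS or MIGRATE_DEFAULTS[key] != value:
            step[key] = value
    return step
```

For example, migrate_step(premigrate=True, endpoint="site-b") yields a step that keeps file content local after migration; "site-b" is a placeholder endpoint name.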

7.4.2. dynamo.tasks.recall

Recalls a list of files from a pre-defined remote target using Ngenea.

Argument

Type

Default

Description

skip_hash

bool

False

Whether the recall should skip checking the hash of the file.

endpoint

string

Specify which endpoint (site) to recall from.

lock_level

string

partial

Defines the locking level ngenea will use during the recall

default_uid

string

When a file is recalled, it uses this UID if one is not set on the remote object

default_gid

string

When a file is recalled, it uses this GID if one is not set on the remote object

update_atime

bool

false

When a file is recalled, update its access time (atime) to 'now'

update_mtime

bool

false

When a file is recalled, update its modification time (mtime) to 'now'

delete_remote

bool

false

If set, the file is deleted from the remote location when it is recalled.

7.4.3. dynamo.tasks.reverse_stub

Reverse stubs a list of files from a pre-defined remote target using Ngenea.

Argument

Type

Default

Description

hydrate

bool

False

Whether the file should be premigrated instead of being created as a regular stub.

stub_size

int

0

The max file size before files will be stubbed for this task

skip_hash

bool

False

Whether the recall should skip checking the hash of the file.

overwrite

bool

False

Overwrite local files if they already exist, except files with only metadata changes.

endpoint

string

Specify which endpoint (site) to recall from.

retry_stale

string

None

Controls whether the worker should retry file failures caused by stale file handles. The value can be either "stub", which only removes reverse stubbed files, or "all".

lock_level

string

implicit

Defines the locking level ngenea will use during the recall

default_uid

string

When a file is recalled, it uses this UID if one is not set on the remote object

default_gid

string

When a file is recalled, it uses this GID if one is not set on the remote object

update_atime

bool

false

When a file is recalled, update its access time (atime) to 'now'

update_mtime

bool

false

When a file is recalled, update its modification time (mtime) to 'now'

conflict_preference

string

None

Dictates what state the local file must be in to pass the check. Options are "newest", which passes if the local file is the latest version of the file on either site; "local", which accepts the local file version regardless of the check; and "ignore", which always uses the other site's file version.

7.4.4. dynamo.tasks.ngenea_sync_metadata

Syncs the local ngenea metadata on a file with the remote target using Ngenea.

Argument

Type

Default

Description

skip_hash

bool

False

Whether the sync should skip checking the hash of the file.

endpoint

string

Specify which endpoint (site) to recall from.

default_uid

string

When a file is synced, it uses this UID if one is not set on the remote object

default_gid

string

When a file is synced, it uses this GID if one is not set on the remote object

7.4.5. dynamo.tasks.delete_paths_from_gpfs

Removes a list of files from a GPFS filesystem.

Argument

Type

Default

Description

recursive

bool

False

If a directory path is provided and this is set, the entire file tree is removed; otherwise only empty directories are removed. Note that recursive removal does not apply to filesets, or to target directories that contain files or directories restricted from deletion, such as snapshots. In such cases the task silently ignores those files or directories and reports the task as successful.
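The difference between the two modes can be sketched with the Python standard library. This is only an illustration of the recursive flag, not the task's implementation: the function name delete_path is made up for this example, and the special handling of filesets and snapshots described above is not modelled. In this sketch a non-empty directory without recursive raises OSError; how the real task reports that case is not stated here.

```python
import os
import shutil
import tempfile


def delete_path(path, recursive=False):
    """Remove a file or directory, loosely following the `recursive` flag.

    ASSUMPTION: illustrative helper only; filesets and protected entries
    such as snapshots are not handled.
    """
    if os.path.isdir(path) and not os.path.islink(path):
        if recursive:
            shutil.rmtree(path)  # remove the whole tree
        else:
            os.rmdir(path)  # only succeeds on an empty directory
    else:
        os.remove(path)


def demo_delete():
    """Try both modes on a small non-empty tree."""
    with tempfile.TemporaryDirectory() as root:
        tree = os.path.join(root, "tree")
        os.makedirs(os.path.join(tree, "sub"))
        with open(os.path.join(tree, "sub", "file"), "w") as handle:
            handle.write("data")
        try:
            delete_path(tree)
            refused = False
        except OSError:
            refused = True
        survived = os.path.isdir(tree)
        delete_path(tree, recursive=True)
        return refused, survived, os.path.exists(tree)
```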

7.4.6. dynamo.tasks.check_sync_state

Checks a provided site against the calling site to ensure that the local file is in a specified state compared to the other site. This task also runs dynamo.tasks.stat_paths on the provided site before execution.

Argument

Type

Default

Description

sync_preference

string

ignore

Dictates what state the local file must be in to pass the check. Options are "newest", which passes if the local file is the latest version of the file on either site; "local", which accepts the local file version regardless of the check; and "ignore", which always uses the other site's file version.

site

string

The target site to compare

abort_outdated

bool

False

If set, files that do not need to be processed are marked as aborted rather than skipped.

hash_includes_acl

bool

False

If set, the metadata comparison for directories also compares Access Control Lists.
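One way to read the three sync_preference options is as a choice of which site's version of a file is used. The sketch below makes that reading concrete. It assumes that "latest version" is decided by comparing modification times and that a tie favours the local file; both are assumptions for illustration, and the function name preferred_version is hypothetical.

```python
def preferred_version(sync_preference, local_mtime, remote_mtime):
    """Return "local" or "remote" for the given preference.

    ASSUMPTION: versions are compared by modification time, and a tie
    counts as the local file being the latest.
    """
    if sync_preference == "local":
        return "local"  # accept the local version regardless of the check
    if sync_preference == "ignore":
        return "remote"  # always use the other site's version
    if sync_preference == "newest":
        return "local" if local_mtime >= remote_mtime else "remote"
    raise ValueError(f"unknown sync_preference: {sync_preference!r}")
```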

7.4.7. dynamo.tasks.move_paths_on_gpfs

Moves files on the filesystem using provided paths with a source key.

This task moves files in two steps (via an intermediate temporary location), to avoid move conflicts (for example, this task correctly handles cases where files are 'swapped': fileA moved to fileB, and fileB moved to fileA).
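The two-step approach can be sketched as follows. This is a minimal illustration of the idea, not the task's code: the staging directory placed beside the first source and the function names are assumptions made for the example.

```python
import os
import tempfile


def two_step_move(moves):
    """Apply {source: target} moves via an intermediate staging directory.

    Step 1 moves every source into staging; step 2 moves each staged file
    to its target. No target is written until every source has left its
    original path, so swapped files do not overwrite each other.
    ASSUMPTION: staging sits beside the first source (same filesystem).
    """
    if not moves:
        return
    first = next(iter(moves))
    staging = tempfile.mkdtemp(dir=os.path.dirname(os.path.abspath(first)))
    staged = {}
    for index, (source, target) in enumerate(moves.items()):
        temp_path = os.path.join(staging, str(index))
        os.rename(source, temp_path)
        staged[temp_path] = target
    for temp_path, target in staged.items():
        os.replace(temp_path, target)
    os.rmdir(staging)


def demo_swap():
    """Swap fileA and fileB and return their contents afterwards."""
    with tempfile.TemporaryDirectory() as root:
        a, b = os.path.join(root, "fileA"), os.path.join(root, "fileB")
        for path, text in ((a, "A"), (b, "B")):
            with open(path, "w") as handle:
                handle.write(text)
        two_step_move({a: b, b: a})
        with open(a) as fa, open(b) as fb:
            return fa.read(), fb.read()
```

In demo_swap, fileA ends up with the original content of fileB and vice versa, so neither file is lost.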

Argument

Type

Default

Description

delete_remote_xattrs

bool

False

If set, after a file has been moved all remote location xattrs will be removed

source_missing_signature

json

null

A signature to which any paths with a missing source are sent. If this isn't set, a missing source is treated as a failure.

target_max_age

int

null

If the target file exists, it is overwritten by default. If this optional parameter is set, the target has to have a ctime <= this timestamp for the move to proceed. Otherwise, the source file is removed. Given in seconds since epoch. Primarily used for sync workflows.
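The target_max_age rule can be expressed as a small decision function. The sketch below is illustrative only: the function name and the returned labels are made up for this example, and it reads the ctime through os.stat, which reports the inode change time on Unix-like systems.

```python
import os
import tempfile


def move_decision(target, target_max_age=None):
    """Return "move" or "remove_source" following the target_max_age rule.

    ASSUMPTION: hypothetical helper; target_max_age is given in seconds
    since the epoch, as described in the table.
    """
    if target_max_age is None or not os.path.exists(target):
        return "move"  # default: an existing target is overwritten
    if os.stat(target).st_ctime <= target_max_age:
        return "move"  # target is old enough to be replaced
    return "remove_source"  # target is newer than the cut-off


def demo_target_max_age():
    """Evaluate the rule against a freshly created target file."""
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "target")
        with open(target, "w") as handle:
            handle.write("existing")
        ctime = os.stat(target).st_ctime
        return (
            move_decision(target),
            move_decision(target, ctime + 60),
            move_decision(target, ctime - 60),
        )
```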

7.4.8. dynamo.tasks.one_step_move_paths_on_gpfs

Moves files on the filesystem using provided paths with a source key.

This task is not used in the default Ngenea Hub workflows. It is less robust than dynamo.tasks.move_paths_on_gpfs, because it moves files directly from source to their new location (in one step). So for example, trying to 'swap' files (fileA moved to fileB, and fileB moved to fileA) with this task will effectively result in one of these files being deleted on the target site.

However, this task is for cases where doing moves in two steps is not an acceptable option.
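The swap hazard described above can be reproduced with a direct, one-step move. The following sketch is an illustration using the Python standard library, not the task's implementation, and the function names are hypothetical.

```python
import os
import tempfile


def one_step_move(moves):
    """Move each source directly onto its target, in order."""
    for source, target in moves.items():
        os.replace(source, target)


def demo_one_step_swap():
    """Attempt to swap fileA and fileB with direct moves."""
    with tempfile.TemporaryDirectory() as root:
        a, b = os.path.join(root, "fileA"), os.path.join(root, "fileB")
        for path, text in ((a, "A"), (b, "B")):
            with open(path, "w") as handle:
                handle.write(text)
        # fileA overwrites fileB first, so the original fileB content is lost
        one_step_move({a: b, b: a})
        remaining = sorted(os.listdir(root))
        with open(a) as handle:
            return remaining, handle.read()
```

After the attempted swap only fileA remains, and it holds its own original content; the content of fileB has been overwritten.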

Argument

Type

Default

Description

delete_remote_xattrs

bool

False

If set, after a file has been moved all remote location xattrs will be removed

source_missing_signature

json

null

A signature to which any paths with a missing source are sent. If this isn't set, a missing source is treated as a failure.

target_max_age

int

null

If the target file exists, it is overwritten by default. If this optional parameter is set, the target has to have a ctime <= this timestamp for the move to proceed. Otherwise, the source file is removed. Given in seconds since epoch. Primarily used for sync workflows.

7.4.9. dynamo.tasks.remove_location_xattrs_for_moved

This task removes all remote location xattrs on all provided paths.

This step takes no additional arguments.

7.4.10. dynamo.tasks.move_in_cloud

Moves a file on the filesystem's related cloud storage platform using provided paths with a source key.

This step takes no additional arguments.

7.4.11. dynamo.tasks.remove_from_cloud

Deletes a file on the filesystem's related cloud storage platform using provided paths.

This step takes no additional arguments.

7.4.12. dynamo.tasks.ensure_cloud_file_exists

Ensures all files provided to the task exist on the filesystem's related cloud storage platform. If some do not, the check is retried two more times before the task fails.

This step takes no additional arguments.
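The retry behaviour can be sketched as one initial check followed by up to two further attempts. In the sketch below, the exists callable stands in for whatever lookup is made against the cloud storage platform, and the delay between attempts is a placeholder; both are assumptions, as are the function names.

```python
import time


def ensure_exists(paths, exists, extra_attempts=2, delay=0.0):
    """Check every path, retrying missing ones up to `extra_attempts` times.

    ASSUMPTION: `exists` is a hypothetical callable returning True when a
    path is present remotely; the real delay between checks is not documented.
    """
    missing = list(paths)
    for attempt in range(1 + extra_attempts):
        missing = [path for path in missing if not exists(path)]
        if not missing:
            return True
        if attempt < extra_attempts:
            time.sleep(delay)
    raise FileNotFoundError(
        f"missing after {1 + extra_attempts} attempts: {missing}"
    )


def demo_retry():
    """A file that only appears on the third check is still found."""
    seen = []

    def exists(path):
        seen.append(path)
        return len(seen) >= 3

    return ensure_exists(["object"], exists), len(seen)
```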

7.4.13. dynamo.tasks.import_files_from_external_target

Recalls a list of files from an external target to a predefined local object using Ngenea. The input files must be a list of remote object paths, not local file paths (e.g. apsearch/).

Argument

Type

Default

Description

endpoint

string

Specify which endpoint (ngenea target name) to recall from

lock_level

string

implicit

Defines the locking level ngenea will use during the recall

location

string

null

Specify an absolute path for where to place the file/folder in the local filesystem

hydrate

bool

false

Whether the remote object should be premigrated instead of being created as a regular stub.

extra_flags

list

List of extra arguments that can be passed to the recall command

skip_hash

bool

false

Whether the recall should skip checking the hash of the file.

default_uid

string

When a file is recalled, it uses this UID if one is not set on the remote object

default_gid

string

When a file is recalled, it uses this GID if one is not set on the remote object

update_atime

bool

false

When a file is recalled, update its access time (atime) to 'now'

update_mtime

bool

false

When a file is recalled, update its modification time (mtime) to 'now'