5.10. Configuration

5.10.1. Global Configurations

Some configurations are stored in the Ngenea Hub configuration file, as described in Hub Configuration. These are generally static or sensitive settings, and changes to them require a service restart.

In addition, there are configurations which can be changed on-the-fly, typically to change Ngenea Hub behaviour. These settings can be viewed and changed via the REST API, as described below.

5.10.1.1. Available Settings

| Name | Description | Default |
| --- | --- | --- |
| jobs_ttl | How long job details are stored after job completion, in days. | 90 |
| search_backend | Backend to use when performing searches. Currently supported backends: analytics, pixstor_search. | pixstor_search |
| search_result_ttl | How long search results are stored, in days. | 7 |
| search_max_results | Maximum number of search results to fetch from the search backend, per site. Fetching more results makes queries slower and requires more storage space. Fetching fewer results may cause some matching files to be left out. Note that some backends have a hard limit of 10,000 results. | 200 |
| snapshot_create_delete_retry_timeout | Time in seconds after which workers give up retrying snapshot create and delete operations. These occur in the dynamo.tasks.snapdiff and dynamo.tasks.rotate_snapshot tasks. The default of 1000 seconds gives 4 or more retries. Set to 0 to disable retries. | 1000 |
| stat_timeout | How long to wait for results from the /api/file/ endpoint, in seconds. | 10 |
| stat_refresh_period | How long a file's details are cached within the file browser, in seconds. | 10 |
| task_invalidation_timeout | How long a task may remain in the STARTED state before it is invalidated, in minutes. | 360 |
| snapdiff_stream_timeout | Idle timeout when waiting for delete and move tasks to complete during snapdiff workflows, in minutes. | 10 |
| force_local_managed_users | Allow local NAS users to be managed on AD joined sites. | False |
| custom_statistics_tasks | A list of names of custom tasks whose reported file counts are added to the job total, regardless of when these tasks are executed within a workflow. | ["dynamo.custom.find_all_mp4_files", "dynamo.custom.find_all_png_files"] |
| maximum_external_results | Maximum number of items to retrieve per page from an external target scan. | 100 |

Note

If file details are not returned for a large fileset, increase the value of stat_refresh_period, either in the configuration tab of the Hub admin page (http://<hub-host>:<port>/admin) or by sending a PATCH request to the configurations API.

5.10.1.2. REST API

Configurations can be listed and set via the Ngenea Hub REST API.

Note

The configurations endpoint does not support client key authentication. You must use JWT Authentication.

To list the current configuration settings:

$ curl -s 'http://example.com/api/configurations/' -H 'Accept: application/json' -H "Authorization: Bearer $JWT_ACCESS_TOKEN"
{
    "search_backend": "analytics",
    "search_max_results": 200,
    "search_result_ttl": 7,
    ...
}

To change one or more configuration settings, make a PATCH request to the same endpoint:

$ curl -s -X PATCH 'http://example.com/api/configurations/' -H 'Accept: application/json' -H "Authorization: Bearer $JWT_ACCESS_TOKEN" -H 'Content-Type: application/json' -d '{"search_max_results": 500}'
{
    "search_max_results": 500,
    ...
}
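
The two curl calls above can also be scripted. The following is a minimal Python sketch using only the standard library. The function names build_config_request and send are illustrative and not part of Ngenea Hub, and http://example.com is the same placeholder host used in the examples above. It assumes a valid JWT access token has already been obtained.

```python
import json
import urllib.request

BASE_URL = "http://example.com"  # placeholder host, as in the curl examples


def build_config_request(token, changes=None, base_url=BASE_URL):
    """Build a GET (list) or PATCH (update) request for /api/configurations/.

    Hypothetical helper for illustration; not provided by Ngenea Hub.
    """
    headers = {
        "Accept": "application/json",
        # This endpoint requires JWT authentication, not a client key.
        "Authorization": f"Bearer {token}",
    }
    data = None
    method = "GET"
    if changes is not None:
        # Only the settings named in `changes` are sent in the PATCH body.
        headers["Content-Type"] = "application/json"
        data = json.dumps(changes).encode("utf-8")
        method = "PATCH"
    return urllib.request.Request(
        f"{base_url}/api/configurations/",
        data=data,
        headers=headers,
        method=method,
    )


def send(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

For example, send(build_config_request(token, {"search_max_results": 500})) would issue the same PATCH request as the curl command above, while send(build_config_request(token)) would list the current settings.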

5.10.1.3. Settings Migration

In versions 1.9.0 and earlier, some of the above settings were configured via the Ngenea Hub config file (Hub Configuration).

Upon updating to version 1.10.0 or above, any values currently set in that config file will be captured. Thereafter, any changes to those settings within the config file will be ignored.

5.10.2. Site-specific Configurations

Some configuration options can be set on a per-site level, and may differ between sites.

These can be viewed and changed via the REST API, as described below. They can also be viewed and changed in the Ngenea Hub UI, from the 'Sites' tab on the Administration page.

5.10.2.1. Available Settings

| Name | Description | Default |
| --- | --- | --- |
| bandwidth | Limit the bandwidth for the site, in MB/s. In the UI, this setting is hidden behind the bandwidth_controls feature flag (see Feature Flags). | not set (unlimited) |
| elasticsearch_url | URL used to interact with Elasticsearch when search_backend is set to analytics (see Global Configurations above). The URL is evaluated on the node(s) on which the site worker is running. | localhost:9200 |
| pixstor_search_url | URL used to interact with PixStor Search when search_backend is set to pixstor_search (see Global Configurations above). The URL is evaluated on the node(s) on which the site worker is running. | https://localhost/ |
| public_url | Public URL that can be used to reach this site. Typically this will be the hostname or external IP address of the site management node. | not set |
| file_batch_gb | Limit the total size of file data in a batch, in gigabytes. See File Batching below. | 1 |
| file_batch_size | Limit the total number of files in a batch. See File Batching below. | 40 |
| enable_auto_file_batch_sizing | Whether dynamic file batching is enabled for this site. See Dynamic File Batching below. | True |
| lock_threshold | The snapdiff discovery uses locking to prevent multiple snapdiffs from running against the same fileset at once. To prevent stale locks, locks are considered 'expired' after the lock_threshold, given in seconds. | 86400 (one day) |
| include | A list of include glob patterns which apply to all workflows run against this site. | not set |
| exclude | A list of exclude glob patterns which apply to all workflows run against this site. | not set |
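
To illustrate how include and exclude glob patterns can combine, here is a small Python sketch. It is not the Hub's own matching code. It assumes that a path is selected when it matches at least one include pattern (or no include patterns are set) and matches no exclude pattern, and it uses shell-style matching from Python's fnmatch module, where * also matches path separators. The function name filter_paths and the example paths are hypothetical.

```python
from fnmatch import fnmatchcase


def filter_paths(paths, include=None, exclude=None):
    """Return the paths that pass the include/exclude glob filters.

    Assumed semantics (not confirmed by the Hub documentation):
    an unset or empty include list selects everything, and exclude
    patterns take precedence over include patterns.
    """
    selected = []
    for path in paths:
        included = not include or any(fnmatchcase(path, p) for p in include)
        excluded = bool(exclude) and any(fnmatchcase(path, p) for p in exclude)
        if included and not excluded:
            selected.append(path)
    return selected


# Hypothetical paths, for illustration only.
example_paths = [
    "/mmfs1/data/a.mp4",
    "/mmfs1/data/tmp/b.mp4",
    "/mmfs1/data/c.txt",
]
```

Under these assumptions, filter_paths(example_paths, include=["*.mp4"], exclude=["*/tmp/*"]) keeps only /mmfs1/data/a.mp4.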

5.10.2.1.1. File Batching

The file list generated by discovery tasks may be broken into smaller batches before passing them to workflow steps.

This makes the overall job execution more granular. Individual tasks will be smaller and faster. This also makes it easier to cancel a job, given that only PENDING tasks can be cancelled.

On the other hand, if the batching is too small, the large number of tasks generated may saturate the job queue, blocking out tasks from other jobs.

File batching is based on both file_batch_gb and file_batch_size. Whichever limit results in a smaller batch is the one which is used. For example, given 100 files of 500MB each, a file_batch_gb of 1 and file_batch_size of 10 will result in 50 batches of 2 files each (1GB total per batch), because 1GB (2 files) is smaller than 10 files (5GB).
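
The arithmetic in the example above can be checked with a short sketch. This is not the Hub's actual batching code. It is a simple greedy split that assumes files are taken in order, that 1 GB is treated as 1000 MB, and that a batch is closed as soon as adding another file would exceed either limit. The function name split_into_batches is hypothetical.

```python
def split_into_batches(sizes_mb, file_batch_gb, file_batch_size, mb_per_gb=1000):
    """Greedily split a list of file sizes (in MB) into batches.

    A batch is closed when it already holds file_batch_size files, or when
    adding the next file would push it over file_batch_gb. Illustrative
    only; the unit conversion (mb_per_gb) is an assumption.
    """
    limit_mb = file_batch_gb * mb_per_gb
    batches = []
    current = []
    current_mb = 0
    for size in sizes_mb:
        over_count = len(current) >= file_batch_size
        over_size = current_mb + size > limit_mb
        # Never close an empty batch, so an oversized file still gets its own batch.
        if current and (over_count or over_size):
            batches.append(current)
            current = []
            current_mb = 0
        current.append(size)
        current_mb += size
    if current:
        batches.append(current)
    return batches
```

With 100 files of 500 MB each, split_into_batches([500] * 100, file_batch_gb=1, file_batch_size=10) produces 50 batches of 2 files, matching the example above. Raising file_batch_gb to 5 makes the file count the tighter limit and produces 10 batches of 10 files.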

5.10.2.1.2. Dynamic File Batching

The Dynamic File Batching feature is designed to improve the processing of tasks with large sets of files. It adjusts the batch size based on the total number of files being processed, using a predefined set of ranges.

When the enable_auto_file_batch_sizing option is enabled, the system adjusts file batch sizes automatically to optimize resource utilization and performance.

5.10.2.2. REST API

Configurations can be listed and set via the Ngenea Hub REST API.

The sites endpoint supports client key authentication.

To list the current site configuration settings:

$ curl -s 'http://example.com/api/sites/1/' -H 'Accept: application/json' -H "Authorization: Api-Key $APIKEY"
{
    "name": "site1",
    "elasticsearch_url": "localhost:19200",
    "file_batch_size": 100,
    ...
}

Note - configurations are only included when fetching a specific site, not when listing all sites.

To change one or more configuration settings, make a PATCH request to the same endpoint:

$ curl -s -X PATCH 'http://example.com/api/sites/1/' -H 'Accept: application/json' -H "Authorization: Api-Key $APIKEY" -H 'Content-Type: application/json' -d '{"file_batch_gb": 5}'
{
    "file_batch_gb": 5,
    ...
}
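
Because site settings are changed one site at a time, a script can be convenient when the same change is needed on several sites. The following Python sketch uses only the standard library. The helper names build_site_patch and apply_to_sites are hypothetical, http://example.com is the placeholder host from the examples above, and the site ids passed in are assumed to be known already.

```python
import json
import urllib.request


def build_site_patch(site_id, api_key, changes, base_url="http://example.com"):
    """Build a PATCH request for one site's settings, using client key auth.

    Hypothetical helper for illustration; not provided by Ngenea Hub.
    """
    return urllib.request.Request(
        f"{base_url}/api/sites/{site_id}/",
        data=json.dumps(changes).encode("utf-8"),
        headers={
            "Accept": "application/json",
            "Authorization": f"Api-Key {api_key}",
            "Content-Type": "application/json",
        },
        method="PATCH",
    )


def apply_to_sites(site_ids, api_key, changes):
    """Send the same changes to each site; return decoded responses by site id."""
    results = {}
    for site_id in site_ids:
        request = build_site_patch(site_id, api_key, changes)
        with urllib.request.urlopen(request) as response:
            results[site_id] = json.load(response)
    return results
```

For example, apply_to_sites([1], api_key, {"file_batch_gb": 5}) would send the same PATCH request as the curl command above.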