5.13. Limitations¶
5.13.1. Site Sync¶
5.13.1.1. Site Sync must only be used in one direction¶
The intent of site_sync is for one source site to synchronise all required changes to any number of destination sites without considering the state of the destination site(s). Site Sync must only be used when data needs to be synchronised without concern for any active changes on the destination site(s).
5.13.1.2. Site Sync only considers changes within an applicable time window for synchronisation¶
Synchronisation is not ‘event driven’. Changes are collated within a window of time (defined by the associated schedule), and sent as a group.
The order in which certain events occurred cannot always be determined. One example is delete determination: when a file or directory was deleted before the sync time window, no data point remains from which to determine its ‘change time’ [ctime].
During bidirectional synchronisation, when conflicting create/modify and delete events occur for a file, the create/modify event takes precedence over the delete event to prevent data loss. In such scenarios, a newer version of the previously deleted file will be present after synchronisation. Directory behaviour is not affected.
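The precedence rule above can be sketched as a small function. This is an illustration only, not Site Sync's implementation: the `Event` enum and `resolve_file_conflict` helper are hypothetical names introduced here.

```python
from enum import Enum


class Event(Enum):
    """Hypothetical labels for the file events a site may report."""
    CREATE = "create"
    MODIFY = "modify"
    DELETE = "delete"


def resolve_file_conflict(event_a, event_b):
    """Return the event that wins when two sites report conflicting file events.

    Assumed rule, taken from the text above: a create/modify event
    always takes precedence over a delete event. In every other case
    event_a is returned unchanged.
    """
    if event_a is Event.DELETE and event_b is not Event.DELETE:
        return event_b
    return event_a
```

With this rule, a delete on one site paired with a modify on the other resolves to the modify, so the file is kept.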
5.13.1.3. Site Sync supports Independent Filesets¶
Site Sync is incompatible with any requirement to synchronise an entire file system. Dependent Filesets and arbitrary directory trees are also not supported.
5.13.1.4. Destination site Independent Filesets must exist prior to synchronisation¶
Site Sync does not create Independent Filesets on destination site(s) prior to synchronisation. Destination Independent Fileset creation is an administrative function and must be undertaken prior to configuration and operation of a Site Sync methodology to the destination site.
5.13.1.5. Directory deletion is not supported¶
Site Sync does not support directory deletion. Deleting a large file tree on a source site deletes the files within that tree on the destination site but leaves the directories in place, resulting in an empty directory tree on the destination site.
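The effect can be modelled with two sets of paths. This is a minimal sketch under stated assumptions: `apply_tree_deletion` is a hypothetical helper, and the destination site is reduced to a set of file paths and a set of directory paths.

```python
from pathlib import PurePosixPath


def apply_tree_deletion(dest_files, dest_dirs, deleted_root):
    """Model how deleting a tree on the source appears on the destination.

    Files under deleted_root are removed; directories are left in place,
    because directory deletion is not synchronised.
    """
    root = PurePosixPath(deleted_root)
    remaining_files = {
        path for path in dest_files
        if root not in PurePosixPath(path).parents
    }
    # Directories are returned unchanged: the empty tree stays behind.
    return remaining_files, set(dest_dirs)
```

For instance, deleting a tree containing two files and one subdirectory leaves both directories on the destination with no files inside them.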
5.13.2. Bidirectional Site Sync¶
5.13.2.1. Bidirectional Site Sync implements eventual consistency¶
Site Sync adheres to the principle of eventual consistency, whereby one or more subsequent Site Sync jobs or tasks must run before the source and destination sites are in sync. Until all required Site Sync jobs have run, the synchronisation status is considered partially synchronised. Each subsequent job increases the completeness of synchronisation.
Data which has failed to be synchronised in prior synchronisations is placed into subsequent synchronisation runs, leading to eventual consistency.
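The carry-over behaviour described above can be sketched as a loop in which each job retries whatever the previous job failed to send. The names `run_sync_jobs`, `send` and `max_jobs` are hypothetical and do not correspond to any Site Sync setting.

```python
def run_sync_jobs(pending, send, max_jobs):
    """Run up to max_jobs jobs, carrying failed items into the next job.

    pending  -- items still to be synchronised
    send     -- callable returning True when an item synchronised successfully
    max_jobs -- upper bound on the number of jobs to run
    Returns the items still outstanding and the number of jobs run.
    """
    pending = list(pending)
    jobs_run = 0
    while pending and jobs_run < max_jobs:
        # Anything that fails in this job becomes the input of the next one.
        pending = [item for item in pending if not send(item)]
        jobs_run += 1
    return pending, jobs_run
```

The sites are in sync once the returned list is empty; until then the state is only partially synchronised.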
5.13.2.2. Bidirectional Site Sync is only supported via schedules¶
Ref: Schedules
A bidirectional Site Sync is created by defining a schedule using the bidirectional_snapdiff discovery and subscribing the bidirectional_sync workflow to the schedule.
A path may only be managed by one schedule per site, and at most one workflow may be subscribed to a bidirectional_snapdiff schedule.
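The two constraints in the preceding paragraph can be checked mechanically. The sketch below uses a hypothetical `Schedule` record and `find_violations` helper. It is not Hub's data model or API, only an illustration of the rules.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Schedule:
    """Hypothetical, simplified view of a schedule (not Hub's schema)."""
    name: str
    site: str
    path: str
    discovery: str
    workflows: list = field(default_factory=list)


def find_violations(schedules):
    """Return human-readable descriptions of any broken constraints."""
    problems = []
    # Rule 1: a path may only be managed by one schedule per site.
    managed = Counter((s.site, s.path) for s in schedules)
    for (site, path), count in managed.items():
        if count > 1:
            problems.append(f"{path} on {site} is managed by {count} schedules")
    # Rule 2: at most one workflow per bidirectional_snapdiff schedule.
    for s in schedules:
        if s.discovery == "bidirectional_snapdiff" and len(s.workflows) > 1:
            problems.append(
                f"schedule {s.name} has {len(s.workflows)} workflows subscribed"
            )
    return problems
```

An empty result means the given schedules satisfy both rules.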
Hub does not support multiple bidirectional synchronisations with the same source site for the same Independent Fileset (e.g. between site1 and site2, and between site1 and site3).
The bidirectional_snapdiff discovery causes all iterative Independent Fileset changes to be tracked. Identified changes are compared across iterative runs. When both sets of changes have been evaluated, only valid changes are synchronised to the destination site.
5.13.2.3. Bidirectional synchronisation is sequential¶
When setting up a bidirectional site sync, ‘site1’ is the site which is configured when creating the schedule, and ‘site2’ is the site configured as the destinationsite when subscribing the workflow to the schedule.
A bidirectional site sync will first synchronise changes from site1 to site2, and then from site2 to site1.
A failure while synchronising site1 to site2 will not block the reverse direction sync from enacting.
Because synchronisation is sequential, in conflict situations the synchronisation from site2 to site1 wins, by virtue of the last write [most recent write at any site] of a file taking precedence.
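The ordering and failure handling described in this section can be sketched as follows. The `bidirectional_sync` function and its `sync_one_way` callable are hypothetical stand-ins used for illustration. They are not the bidirectional_sync workflow itself.

```python
def bidirectional_sync(sync_one_way, site1, site2):
    """Run site1 -> site2 first, then site2 -> site1.

    sync_one_way is a caller-supplied callable taking (source, destination).
    A failure in the first direction is recorded but does not prevent
    the reverse direction from running.
    """
    results = {}
    for source, destination in ((site1, site2), (site2, site1)):
        try:
            sync_one_way(source, destination)
            results[(source, destination)] = "ok"
        except Exception as exc:
            results[(source, destination)] = f"failed: {exc}"
    return results
```

Because the site2 to site1 pass always runs last, any write it carries is the last one applied.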
5.13.2.4. Swapping files / Last write wins¶
Synchronising a set of file moves in which the paths of two files are swapped is complex and can result in conflicts. Where identical file paths exist at the destination site, file moves will fail rather than overwriting the existing files.
Manual intervention is required before a synchronisation will again succeed. N.B.: the replaying of the swap of the files on the destination site is not sufficient to resolve the conflict.
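A simplified model shows why a swap conflicts. It assumes only the rule stated above, that a move fails when its target path already exists on the destination. The `apply_moves` helper is hypothetical.

```python
def apply_moves(dest_paths, moves):
    """Apply (old_path, new_path) moves to a set of destination paths.

    A move whose target already exists is recorded as failed instead of
    overwriting the existing file. Returns the resulting paths and the
    list of failed moves.
    """
    paths = set(dest_paths)
    failed = []
    for old_path, new_path in moves:
        if new_path in paths:
            failed.append((old_path, new_path))
            continue
        paths.discard(old_path)
        paths.add(new_path)
    return paths, failed
```

When two files are swapped, each move targets a path still occupied by the other file, so in this model both moves fail and the destination is left unchanged.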
5.13.2.5. Renaming or deleting files and directories causes re-sending of data¶
Where a file is moved on site1 and the same file is deleted on site2 during an active synchronisation, the moved file from site1 will be resent to site2. This behaviour will be observed even if the delete event occurred later chronologically.
5.13.2.6. Renaming files and directories on both bidirectional Site Sync sites causes duplication of data¶
This behaviour is observed when a file or directory is moved on both sites to different locations during an active synchronisation. For example:
/mmfs1/data/path1 is moved to /mmfs1/data/path2 on site1
/mmfs1/data/path1 is moved to /mmfs1/data/path3 on site2
This scenario results in duplicate data at both sites in both /mmfs1/data/path2 and /mmfs1/data/path3.
5.13.2.7. Creation or deletion of empty directories on site 1 does not synchronise to site 2¶
Site Sync does not perform deletions of empty directories on a destination site and does not create empty directories on a destination site.