The utils module provides convenience methods built on the GPFS C API.
It provides more complex functionality than CLib’s other classes,
which are thin wrappers over the functions available in the GPFS C API.
Most methods require root permission.
Miscellaneous Functions
Filesystem Snapshot Identifier Convenience Functions
- arcapix.fs.gpfs.clib.utils.get_fsname_by_path(path)
Get the name of the filesystem a path belongs to.
>>> get_fsname_by_path('/mmfs1/data')
'mmfs1'
- arcapix.fs.gpfs.clib.utils.get_snapname_by_path(path)
Get the name of the snapshot a path belongs to.
>>> get_snapname_by_path('/mmfs1/.snapshots/snap1/data')
'snap1'
- arcapix.fs.gpfs.clib.utils.get_path_in_snapshot(path, snap, fileset=None)
Get the equivalent of a path within a given snapshot.
>>> get_path_in_snapshot('/mmfs1/data', 'snapshot1')
'/mmfs1/.snapshots/snapshot1/data'
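The path mapping performed by these helpers can be illustrated with a pure-Python sketch. This is not the library's implementation: it assumes the mount point is already known, that the snapshot directory is named `.snapshots` as in the examples above, and it ignores fileset snapshots. The names `sketch_path_in_snapshot` and `sketch_snapname_by_path` are hypothetical.

```python
import posixpath


def sketch_path_in_snapshot(path, snap, mount_point, snapdir=".snapshots"):
    """Map a live path to its counterpart inside a filesystem-level snapshot."""
    relative = posixpath.relpath(path, mount_point)
    if relative == ".." or relative.startswith("../"):
        raise ValueError("%s is not under %s" % (path, mount_point))
    # normpath tidies the trailing '.' produced when path == mount_point
    return posixpath.normpath(
        posixpath.join(mount_point, snapdir, snap, relative))


def sketch_snapname_by_path(path, mount_point, snapdir=".snapshots"):
    """Return the snapshot name embedded in a path, or None if there is none."""
    parts = posixpath.relpath(path, mount_point).split("/")
    if len(parts) >= 2 and parts[0] == snapdir:
        return parts[1]
    return None


in_snap = sketch_path_in_snapshot("/mmfs1/data", "snapshot1", "/mmfs1")
snap_name = sketch_snapname_by_path("/mmfs1/.snapshots/snap1/data", "/mmfs1")
```

Unlike this sketch, the real functions take only the path, so they presumably determine the mount point and snapshot directory themselves.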
Directory Scan Convenience Functions
- class arcapix.fs.gpfs.clib.utils.scandir(path, snapName=None)
scandir is a directory iterator, similar to the one in the Python 3.5 stdlib, implemented using the GPFS C lib.
It is a generator version of os.listdir() that returns an iterator over files in a directory, and also exposes extra information (such as type and stat information).
When snapName is specified, the returned paths will be children of the specified snapshot’s directory, e.g.
>>> for i in scandir('/mmfs1/data', 'snap1'):
...     print(i.path)
/mmfs1/.snapshots/snap1/data
- Parameters
- Returns
iterator of GpfsDirEntry objects for the given path
- class arcapix.fs.gpfs.clib.utils.GpfsDirEntry
Object representing a directory entry, as returned by scandir.
- inode(self) gpfs_ino64_t
Returns the inode number of the entry.
- name
Returns the name of the entry.
- path
Returns the full path of the entry.
- stat(self)
for the entry. Result comes from
- arcapix.fs.gpfs.clib.utils.listdir(path)
List the contents of a directory.
Like Python's os.listdir(), implemented using the GPFS C lib. As with Python, this method follows symlinks.
The list is in arbitrary order. It does not include the special entries ‘.’ and ‘..’ even if they are present in the directory.
- Parameters
path (str) – path to a directory in a GPFS filesystem
- arcapix.fs.gpfs.clib.utils.walk(top, bool topdown=True, bool followlinks=False)
Walk a filesystem directory tree.
Like Python's os.walk(), implemented using the GPFS C lib.
Note: unlike os.walk(), clib walk doesn’t ‘see’ the .snapshots directory.
- arcapix.fs.gpfs.clib.utils.parallel_walk(root, mapfn, reducefn=<built-in function iadd>, workers=None)
Perform a parallel walk of a GPFS directory tree.
- Parameters
mapfn – function to call for each directory entry. Receives a GpfsDirEntry object.
reducefn – function to combine results from mapfn. Default = addition
workers – number of worker processes to spawn. Default = CPU count/2, up to a maximum of 8
Requires root permission
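How mapfn and reducefn fit together can be seen in a serial, stdlib-only stand-in. This sketch is an assumption about the map/reduce semantics, not the library's parallel implementation: it uses os.scandir in place of the GPFS directory scan, runs in a single process, and returns None for an empty tree (the real behaviour in that case is not documented here). The name `serial_map_reduce_walk` is hypothetical.

```python
import operator
import os
import tempfile
from functools import reduce


def serial_map_reduce_walk(root, mapfn, reducefn=operator.iadd):
    """Serial stand-in for a map/reduce directory walk (illustration only)."""

    def mapped(path):
        with os.scandir(path) as entries:
            for entry in entries:
                # mapfn sees every entry, files and directories alike
                yield mapfn(entry)
                # recurse into real directories, not symlinks to them
                if entry.is_dir(follow_symlinks=False):
                    for value in mapped(entry.path):
                        yield value

    results = list(mapped(root))
    # Assumption: an empty tree gives None; the real function may differ.
    return reduce(reducefn, results) if results else None


def count_files(dirent):
    """Map function: 1 for regular files, 0 for anything else."""
    return 1 if dirent.is_file() else 0


# Build a small throwaway tree: two files at the top, one in a subdirectory
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    for rel in ("a.txt", "b.txt", os.path.join("sub", "c.txt")):
        open(os.path.join(top, rel), "w").close()
    count = serial_map_reduce_walk(top, count_files)

print(count)
```

Because the per-entry results are combined pairwise, a reducefn for a parallel walk should presumably not depend on the order in which entries are visited; addition satisfies this.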
Inode Scan Convenience Functions
- class arcapix.fs.gpfs.clib.utils.inode_iterator(fsName, snapName=None, prevSnap=None, fromInode=0, toInode=0)
inode_iterator is an iterator object, which allows users to perform inode scans
>>> for i in inode_iterator(...):
...     # do something
>>> iscan = inode_iterator(...)
>>> i = iscan.next()
>>> j = next(iscan)
It acts as a convenience for the various inode scan methods.
- Parameters
fsName (str) – Name of the Filesystem to be scanned
snapName (str) – Name of a snapshot within the named filesystem to scan
prevSnap – Name of a previous snapshot, older than snapName. If provided, only files that have changed since this snapshot will be returned. Pass None to return all inodes from fsName
fromInode (int) – The minimum inode number to scan from
toInode (int) – The maximum inode number to scan to. If not specified or 0, all inodes will be returned.
The fromInode and toInode parameters can be used to perform multi-threaded scans.
- Returns
- close(self)
Close the inode scan.
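One way to prepare a multi-threaded scan is to split the inode number space into contiguous ranges and give each worker its own fromInode/toInode pair. The helper below is a hypothetical sketch, not part of the library. It assumes the upper bound of each range is exclusive (suggested by the `max(inodes)+1` usage in the examples, but not confirmed here) and that the caller already knows an upper limit on inode numbers.

```python
def split_inode_range(max_inode, parts):
    """Split [0, max_inode) into up to `parts` contiguous half-open ranges."""
    if max_inode <= 0 or parts <= 0:
        raise ValueError("max_inode and parts must be positive")
    step = -(-max_inode // parts)  # ceiling division
    return [(start, min(start + step, max_inode))
            for start in range(0, max_inode, step)]


ranges = split_inode_range(100, 4)

# Each worker would then run its own scan, for example:
#   for lo, hi in ranges:
#       scan = inode_iterator('mmfs1', fromInode=lo, toInode=hi)
# Note that toInode=0 means "all inodes", so an upper bound of 0 is never produced.
```

If the upper bound turns out to be inclusive, adjacent ranges would overlap by one inode and the results would need de-duplicating.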
Reset Times
- class arcapix.fs.gpfs.clib.utils.SetTimesError(message)
Exception raised by reset_times when precheck is True and times cannot be changed
- arcapix.fs.gpfs.clib.utils.reset_times(path, follow=True, precheck=True)
Reset the timestamps on a file on context exit
>>> with reset_times('/mmfs1/file'):
...     # do stuff with file
- Parameters
path (str) – path of the file whose times should be reset
follow (bool) – whether to follow symlinks
precheck (bool) –
Pre-check if we will be able to reset times.
If this option is True and times can’t be changed, a SetTimesError will be thrown before any code is run inside the context. This ensures that the existing times are preserved.
If this option is False, then resetting times may fail silently.
(Re)setting times may fail, for example, if you aren’t the file owner or root.
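The save-and-restore pattern behind this context manager can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's implementation: it uses os.stat and os.utime rather than the GPFS C API, has no precheck or follow option, and the name `restore_times` is hypothetical.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def restore_times(path):
    """Restore the atime/mtime of `path` when the block exits (stdlib sketch)."""
    before = os.stat(path)
    try:
        yield
    finally:
        # Runs even if the body raises; may itself fail without permission
        os.utime(path, ns=(before.st_atime_ns, before.st_mtime_ns))


# Demonstration on a temporary file with known timestamps
with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
    tmp.write("hello world")
known = (10**9, 2 * 10**9)  # atime, mtime in nanoseconds
os.utime(tmp.name, ns=known)

with restore_times(tmp.name):
    with open(tmp.name, "a") as f:
        f.write("!")  # this write changes mtime

after = os.stat(tmp.name)
restored = (after.st_atime_ns, after.st_mtime_ns) == known
os.remove(tmp.name)
```

In this sketch a failed os.utime call raises on exit, after the body has already run. The precheck option described above exists to surface that failure before any code in the context touches the file.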
ACL Convenience Functions
- arcapix.fs.gpfs.clib.utils.acl.get_ace_name(ace)
Get the user or group name associated with an ACE.
ACE is an entry returned by
User and group name lookup is performed with
These may not work for identifying users and groups in an AD environment.
- Returns
tuple of (type, name) where type is one of (special, group, user)
- Raises
KeyError if the ACE id can’t be translated to a name
- arcapix.fs.gpfs.clib.utils.acl.append_nfs4_aces(pathname, aces)
Append one or more entries to the NFSv4 ACL for a path.
This is slightly more efficient than using
since both steps are performed at the C level.
- Parameters
pathname (str) – path of the file or directory whose ACL will be appended to.
aces – an NFSv4 entry or list of entries to append
Walk the filesystem
>>> import os
>>> from arcapix.fs.gpfs.clib.utils import walk
>>> for root, dirs, files in walk("/mmfs1"):
...     for name in files:
...         print(os.path.join(root, name))
...     for name in dirs:
...         print(os.path.join(root, name))
Get the filesystem a given path belongs to
>>> from arcapix.fs.gpfs import Filesystem
>>> from arcapix.fs.gpfs.clib.utils import get_fsname_by_path
>>> fs = Filesystem(get_fsname_by_path('/mmfs1/data'))
>>> print(fs.name)
Walk the filesystem for a given snapshot
>>> import os
>>> from arcapix.fs.gpfs.clib.utils import scandir
>>> def walk(root, snap):
...     for i in scandir(root, snap):
...         yield i.path
...         # recurse into the directory
...         if i.is_dir():
...             for d in walk(i.path, snap):
...                 yield d
>>> for i in walk('/mmfs1', 'snap1'):
...     print(i)
Calculate the total size of temporary files on the filesystem
>>> import os
>>> from arcapix.fs.gpfs.clib.utils import scandir, inode_iterator
>>> # iterator of inode numbers for files that end '.tmp'
>>> def find_inodes(root):
...     for i in scandir(root):
...         if i.name.endswith('.tmp'):
...             yield i.inode()
...         # recurse into the directory
...         if i.is_dir():
...             for d in find_inodes(i.path):
...                 yield d
>>> # list of inode number of '.tmp' files
>>> inodes = list(find_inodes('/mmfs1'))
>>> # create iterator - use max and min to limit scope of scan
>>> itr = inode_iterator('mmfs1', fromInode=min(inodes), toInode=max(inodes)+1)
>>> # add up sizes of inodes in the inode list
>>> print(sum(x.ia_size for x in itr if x.ia_inode in inodes))
Count files in a directory tree in parallel
>>> from arcapix.fs.gpfs.clib.utils import parallel_walk
>>> # define a map function to count files only
>>> def count_files(dirent):
...     if dirent.is_file():
...         return 1
...     return 0
>>> # perform a parallel directory tree walk
>>> count = parallel_walk('/mmfs1/data', count_files, workers=4)
>>> print(count)
Read a file without updating its atime
>>> import os
>>> from arcapix.fs.gpfs.clib.utils import reset_times
>>> print(os.stat('/mmfs1/hello.txt').st_atime)
>>> with reset_times('/mmfs1/hello.txt'):
...     with open('/mmfs1/hello.txt', 'r') as f:
...         print(f.read())
hello world
>>> print(os.stat('/mmfs1/hello.txt').st_atime)
Add a new entry to a file ACL
Grant read/write permission for the ‘admin’ group
This may be combined with arcapix.fs.gpfs.clib.utils.walk()
to add the new ACE to a directory tree, recursively.
>>> import grp
>>> from arcapix.fs.gpfs.clib.utils.acl import append_nfs4_aces
>>> from arcapix.fs.gpfs.clib.acl import ace_v4, AM_READ, AM_WRITE, AF_GROUP_ID
>>> # define the new entry
>>> gid = grp.getgrnam('admin').gr_gid
>>> ace = ace_v4(aceWho=gid, aceFlags=AF_GROUP_ID, aceMask=AM_READ|AM_WRITE)
>>> # append the new entry to the target file
>>> append_nfs4_aces('/mmfs1/test', ace)