Walkthrough: Creating a Space
Introduction
On this page, we’re going to walk through creating a templated space using the PixStor Management REST API.
You can see a more thorough overview of the REST API, as well as examples of using it via cURL on the REST API Overview page.
For this guide, we’re going to work in Python, using the requests library.
We’re going to assume the REST server is running on localhost, behind an NGINX proxy, which provides SSL termination.
For convenience, let’s store the server url in a variable
url = "https://localhost"
Objective
We want to create a Space for a new project we’re working on, called ‘Sleepy Snake’
Here, we’re assuming that the underlying filesystem is PixStor.
So in PixStor terms, what we want to do is
- create an independent fileset
- on the ‘mmfs1’ filesystem
- we want the data to be placed in the ‘sas1’ pool
- we want the fileset to have a particular project layout
- we want the fileset to have a size (block quota) of 4GB
In PixStor Management terms, this means creating a templated space
Authentication
Before we can do anything else, we need to get an auth token
The auth server url will be configured in APConfig (see Configuration Directives), so let’s look up that url
from arcapix.config import config
authserver = config['arcapix.auth.server.url'] # typically https://localhost
Now we can request an access token from the auth server
import requests
payload = {'grant_type': 'password', 'username': 'myuser', 'password': 'mypassword'}
resp = requests.post(authserver + '/oauth2/token', data=payload)
assert resp.status_code == 200
token = resp.json()['access_token']
Tip
If the request raises SSLError("bad handshake ..."), it’s likely because of self-signed certificates.
This can be resolved by adding verify=False to the request.
For more information, see SSL Cert Verification.
To make use of this access token, we have to use HTTP basic auth, with the token as the username and with an empty password.
For convenience, let’s create a requests session and apply the auth to it, so that we don’t have to explicitly pass auth to every future request
session = requests.Session()
session.auth = (token, '')
# if you get SSLErrors from the self-signed certificates, add the following
# session.verify = False
From now on, we’ll assume that all our requests are successful, but in practice you should always check status codes.
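As a minimal sketch of what such a check could look like, the helper below raises when a response has an unexpected status code. The name check_status is hypothetical (it is not part of PixStor Management or requests), and a SimpleNamespace stands in for a real response so the example runs without a server. With requests itself, resp.raise_for_status() offers a similar built-in check for 4xx and 5xx responses.

```python
from types import SimpleNamespace


def check_status(resp, expected=(200,)):
    """Return resp unchanged, or raise if its status code is not expected."""
    if resp.status_code not in expected:
        raise RuntimeError(
            "Unexpected status %s (expected one of %s)"
            % (resp.status_code, list(expected))
        )
    return resp


# Stand-in for a real requests response (assumption for illustration only)
fake = SimpleNamespace(status_code=202)
check_status(fake, expected=(200, 202))
```

In the walkthrough this could wrap any call, for example check_status(session.get(url + '/spaces/')).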
Auth Roles
Before we try to create a space, we should make sure that we’re actually allowed to create a space.
Our user/group will have associated with it a collection of authentication roles, and these roles determine what operations we’re allowed to perform on what endpoints.
We don’t have to check the actual roles, though. We can just do an OPTIONS request against the /spaces endpoint and check the Allow header
from __future__ import print_function
resp = session.options(url + '/spaces/')
print(resp.headers['Allow'])
# HEAD, GET, POST, OPTIONS
Here, we can see that we do have permission to perform POST requests against the /spaces endpoint, so we can indeed create spaces!
Note
Different endpoints may have different permissions - you may have permission to create spaces but not profiles, for example.
In general it’s a good idea to check OPTIONS before trying to create an object.
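One way to turn that check into code is to parse the Allow header value rather than reading it by eye. The following is a small sketch. The function name is_allowed is an assumption made for illustration, and the header string is the one shown in the example output above.

```python
def is_allowed(allow_header, method):
    """Return True if the HTTP method is listed in an Allow header value."""
    allowed = {m.strip().upper() for m in allow_header.split(',') if m.strip()}
    return method.upper() in allowed


allow = "HEAD, GET, POST, OPTIONS"  # e.g. resp.headers['Allow']
print(is_allowed(allow, 'POST'))    # True
print(is_allowed(allow, 'DELETE'))  # False
```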
Collection+JSON Template
To create a Space, we need to know what fields should be provided with our POST request.
Fortunately, PixStor Management uses the Collection+JSON (C+J) format, which provides us with a template for creating new objects.
So let’s check the template from the /spaces endpoint
resp = session.get(url + '/spaces/')
print(resp.json()['collection']['template'])
{
"data": [
{
"prompt": "space name",
"name": "name",
"value": ""
},
{
"prompt": "path of the space relative to its exposers",
"name": "relativepath",
"value": ""
},
{
"prompt": "templates applied to the space",
"name": "templates",
"value": ""
},
{
"prompt": "exposers providing access to the space",
"name": "exposers",
"value": ""
},
{
"prompt": "profile applied to the space",
"name": "profile",
"value": ""
},
{
"prompt": "hard limit on the size of the space in blocks",
"name": "size",
"value": ""
}
]
}
So we need to provide a name, a relative path, templates, exposers, a profile, and a size.
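Rather than reading the field names off the printed template, they can be pulled out programmatically. This is a sketch only. template_fields is a hypothetical helper, and the sample dictionary is a shortened copy of the template shown above.

```python
def template_fields(collection):
    """Return the field names listed in a C+J creation template."""
    return [field['name'] for field in collection['collection']['template']['data']]


# Shortened copy of the /spaces template, for illustration
sample = {"collection": {"template": {"data": [
    {"prompt": "space name", "name": "name", "value": ""},
    {"prompt": "profile applied to the space", "name": "profile", "value": ""},
]}}}

print(template_fields(sample))  # ['name', 'profile']
```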
Okay, so going back to the objective (above), let’s call the Space sleepy-snake, with relative path projects/sleepy_snake - that’s relative to the exposers (filesystem). And we’ll give it a size of 4GB.
Warning
Space names can’t contain whitespace - they can only contain letters, numbers, hyphens and underscores.
If you try to use a name containing ‘invalid’ characters, the POST request will return a 422 (Unprocessable Entity) error
resp = session.post(url + '/spaces/', json={'name': 'sleepy snake', ...})
print(resp.json())
# {'collection': {'error': {'message': 'Insertion failure: 1 document(s) contain(s) error(s)', 'code': 422, 'title': 'Error'}}}
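To catch a bad name before sending the request, a client-side check along these lines could be used. The regular expression is an assumption derived only from the rule stated in the warning above (letters, numbers, hyphens and underscores). The server remains the authority on what is accepted.

```python
import re

# Assumed pattern, based only on the naming rule described in the warning
VALID_NAME = re.compile(r'[A-Za-z0-9_-]+')


def is_valid_space_name(name):
    """Return True if name uses only letters, numbers, hyphens and underscores."""
    return VALID_NAME.fullmatch(name) is not None


print(is_valid_space_name('sleepy-snake'))  # True
print(is_valid_space_name('sleepy snake'))  # False
```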
But what about the Profile and the Exposers?
Finding where to put things
It’s not possible to create a filesystem or a pool via the REST api (yet).
However, PixStor Management is populated with objects based on what already exists - so we get an exposer for every filesystem, and a data store for every pool. In addition, a special placement policy rule is created for each pool, resulting in corresponding ‘default’ profiles.
So we need to look up the exposer and profile for our filesystem and placement pool of choice.
Pool
We want our data to be placed in pool sas1. As mentioned above, the database should have been pre-populated with a datastore for pool sas1, and with a profile to assign data to that datastore.
The naming scheme for the pre-populated default placement profiles is {filesystem}-{pool}, so in our case, we want to find the profile named mmfs1-sas1.
Tip
It is possible to create your own profile with additional ilm steps (migration rules), and with placement controls, such as matching certain file types or file size ranges, etc.
But that won’t be covered in this walkthrough
So how do we find the profile with a particular name?
Collection+JSON Queries
Once again, the C+J helps us out by providing models for queries [1]
resp = session.get(url + '/profiles/')
print(resp.json()['collection']['queries'])
[
{
"prompt": "Search by Name",
"href": "/profiles?where={\"name\":\"{name}\"}",
"data": [
{
"prompt": "profile name",
"name": "name",
"value": ""
}
],
"rel": "search",
"encoding": "uri-template"
},
...
]
This shows us how to construct a query to search for a profile by name.
The href gives the template for the url, and the data block tells us what parameter we need to replace in that href.
Here, we have to replace the name parameter {name} with the actual name we want to search for, giving us
"/profiles?where={\"name\":\"mmfs1-sas1\"}"
So lets perform this query in python - we’re using params
for readability, but the result is the same
params = {'where': '{"name": "mmfs1-sas1"}'}
resp = session.get(url + '/profiles/', params=params)
resp.headers['X-Total-Count'] # '1'
print(resp.json())
{
"collection": {
"items": [
{
"href": "/profiles/8f01ab22-cc0b-2056-ff8c-e1d829dc806c",
"data": [
{
"prompt": "current status of the item",
"name": "status",
"value": "ACTIVE"
},
{
"prompt": "unique identifier for the item",
"name": "id",
"value": "8f01ab22-cc0b-2056-ff8c-e1d829dc806c"
},
{
"prompt": "profile name",
"name": "name",
"value": "mmfs1-sas1"
},
...
],
"links": [...]
}
],
"href": "/profiles?where={\"name\": \"mmfs1-sas1\"}",
"links": [...],
"template": {...},
"queries": [...],
"version": "1.0",
}
}
The full C+J response is quite long and unwieldy, so the above has been truncated.
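The placeholder substitution described for the query href can also be done in code. The sketch below uses plain string replacement, because the href mixes literal JSON braces with the {name} placeholder, which rules out str.format. fill_query is a hypothetical helper, and it does not URL-encode the values it inserts.

```python
def fill_query(href, **values):
    """Substitute {param} placeholders in a C+J query href."""
    for name, value in values.items():
        href = href.replace('{%s}' % name, value)
    return href


template_href = '/profiles?where={"name":"{name}"}'
print(fill_query(template_href, name='mmfs1-sas1'))
# /profiles?where={"name":"mmfs1-sas1"}
```

Passing the where clause through params, as done above, has the advantage that requests takes care of the encoding.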
Tip
For a quick sanity check, we can look at the X-Total-Count
header to see how many results the query returned.
Profile names are unique, so logically, there should be only one.
Referencing Items
When referencing an item in a POST request, we can provide either the item’s href or its id.
The href is preferred, since it uniquely identifies an item - it’s possible for items of different types to have the same id, whereas the href explicitly includes the id AND the type of item it is.
The href is also easier to extract from the C+J response.
In the response above, we see that we can get our profile item as resp.json()['collection']['items'][0].
And at the very top of that item, we see a field for href.
So we can get the profile href from our query like so
profile = resp.json()['collection']['items'][0]['href']
# "/profiles/8f01ab22-cc0b-2056-ff8c-e1d829dc806c"
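Since the walkthrough expects exactly one match for each lookup, it can be useful to fail loudly when a query returns none or several. The helper below is a sketch under that assumption. single_item_href is a made-up name, and the sample response is a trimmed version of the one above.

```python
def single_item_href(collection):
    """Return the href of the only item in a C+J response, or raise."""
    items = collection['collection'].get('items', [])
    if len(items) != 1:
        raise LookupError("Expected exactly one item, got %d" % len(items))
    return items[0]['href']


# Trimmed copy of the profile query response, for illustration
sample = {"collection": {"items": [
    {"href": "/profiles/8f01ab22-cc0b-2056-ff8c-e1d829dc806c", "data": [], "links": []},
]}}

print(single_item_href(sample))
# /profiles/8f01ab22-cc0b-2056-ff8c-e1d829dc806c
```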
Filesystem
We want to create our space in the mmfs1 filesystem, so we need to find the corresponding exposer.
Unlike profiles, there are multiple different types of exposer - including native, nfs, smb.
So in addition to a name, we also have to query for the right exposer type.
A PixStor filesystem is represented in PixStor Management as a GPFSNativeExposer, which has type='gpfsnative'
So to find the mmfs1 filesystem we can query the /exposers endpoint, again using the where url parameter
params = {'where': '{"type": "gpfsnative", "name": "mmfs1"}'}
resp = session.get(url + '/exposers/', params=params)
print(resp.json())
{
"collection": {
"items": [
{
"href": "/exposers/d672fde7-ba69-5d16-0acd-5868d2a8f3b9",
"data": [
{
"prompt": "current status of the item",
"name": "status",
"value": "ACTIVE"
},
{
"prompt": "specific type of the exposer",
"name": "type",
"value": "gpfsnative"
},
{
"prompt": "path at which the exposer is mounted",
"name": "mountpoint",
"value": "/mmfs1"
},
{
"prompt": "unique identifier for the item",
"name": "id",
"value": "d672fde7-ba69-5d16-0acd-5868d2a8f3b9"
},
{
"prompt": "exposer name",
"name": "name",
"value": "mmfs1"
},
...
],
"links": [...]
}
],
"href": "/exposers?where={\"type\": \"gpfsnative\", \"name\": \"mmfs1\"}",
"links": [...],
"template": {...},
"queries": [...],
"version": "1.0",
}
}
Again, we expect exactly one result, and we can get the exposer href the same as before
exposer = resp.json()['collection']['items'][0]['href']
# "/exposers/d672fde7-ba69-5d16-0acd-5868d2a8f3b9"
Hint
If there are no results for the exposers query, it’s possible the database hasn’t been populated (yet). You can manually populate PixStor Management by running the command
$ adminctl populate now
Making a Project Template
The last thing we want before we can create our space is a template - a pre-defined directory layout that we can apply to our new space, and to any future spaces we might create
Note
It can be confusing, but try not to mix up Template objects with the C+J creation template discussed above
Template Model
The template we want to use doesn’t exist yet, so we have to create it.
To do this, we create a ‘model’ of the directory layout we want
$ tree /mmfs1/project_template/
/mmfs1/project_template/
├── assets # <-- this is a dependent fileset
│ ├── models
│ └── rigs
├── flame
├── houdini
├── maya
├── mudbox
├── nuke
├── published
└── rendering
Along with the directory layout, the model can include files and dependent filesets. We can even set up permissions, which the template will capture.
POSTing the Template
Once we have our template model, we create the actual template via the REST interface.
As with spaces, we can look up the C+J ‘template’ for the fields we need to POST
resp = session.get(url + '/templates/')
print(resp.json()['collection']['template'])
{
"data": [
{
"prompt": "template name",
"name": "name",
"value": ""
},
{
"prompt": "specific type of the template",
"name": "type",
"value": ""
},
{
"prompt": "path to the template",
"name": "template_location",
"value": ""
}
]
}
We need to provide the name, the type (filesystemtemplate in this case), and the location of the template’s model.
To perform a POST request with C+J, we have to fill in the value fields in the C+J template we just looked up
template_data = {
"template": {
"data": [
{"name": "name", "value": "project_template"},
{"name": "type", "value": "filesystemtemplate"},
{"name": "template_location", "value": "/mmfs1/project_template"}
]
}
}
(You don’t need to include the prompt fields, but if you do include them, they will just be ignored)
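Writing out the nested template structure by hand gets repetitive, so a small builder can help. The following is a sketch. make_template is a hypothetical helper, not part of any PixStor library, and it simply produces the same shape as template_data above from keyword arguments.

```python
def make_template(**fields):
    """Build a C+J 'template' body from name=value keyword arguments."""
    return {
        "template": {
            "data": [{"name": name, "value": value} for name, value in fields.items()]
        }
    }


body = make_template(
    name="project_template",
    type="filesystemtemplate",
    template_location="/mmfs1/project_template",
)
print(body["template"]["data"][0])
# {'name': 'name', 'value': 'project_template'}
```

The result can be passed as json=body in the POST request, in the same way as a hand-written dictionary.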
We then POST this data to the /templates endpoint
resp = session.post(
url + '/templates/',
json=template_data,
headers={"Content-Type": "application/vnd.collection+json"}
)
print(resp.status_code) # 202 (Accepted)
Important
As shown above, we need to include the Content-Type: application/vnd.collection+json header so the REST server knows what JSON format it is receiving.
We’re using json= because data= would ‘form encode’ the data, resulting in a 400 (Bad Request) error.
Important
The trailing slash on the end of the url /templates/ is required.
Without it, the POST request will fail.
If there were no issue with the request, we should get back status code 202 (Accepted).
Template Builder
A 202 status means the new template has been added to the database, and a task has been submitted to the job engine.
This builder task will copy the template model into the configured template store (see Configuration Directives).
The response from the POST request will include a Location header, which we can query to check the status of our template
print(resp.headers['Location'])
# 'https://localhost/templates/e00e872b-f4c5-2557-795d-4ccf4715b602?projection={"status":1}'
Because of the way C+J is structured, it’s not easy to grab just the status field from the response.
Let’s write a little helper function
def get_status(collection, item):
    # scan the item's data fields for the one named 'status'
    for data in collection['collection']['items'][item]['data']:
        if data['name'] == 'status':
            return data['value']
    raise KeyError("Status field not found")
Now let’s check the status of our template
resp = session.get(resp.headers['Location'])
print(get_status(resp.json(), 0))
# 'ACTIVE'
The status will transition from NEW (data POSTed), to PENDING (task submitted), to INPROGRESS (task running), to ACTIVE.
When the template reaches state ACTIVE, we know that the builder job has completed successfully, and it’s ready to use.
template = resp.json()['collection']['items'][0]['href']
# "/templates/e00e872b-f4c5-2557-795d-4ccf4715b602"
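Because the template passes through several states before becoming ACTIVE, a single GET may arrive too early. A polling loop is one way to handle that. The sketch below is an assumption-laden illustration: wait_for_status is a hypothetical helper, the status fetcher and sleep function are injected so the example runs without a server, and the ERRORED state is treated as a failure.

```python
import time


def wait_for_status(fetch_status, target='ACTIVE', failed=('ERRORED',),
                    attempts=10, delay=1.0, sleep=time.sleep):
    """Poll fetch_status() until it returns target; raise on failure or timeout."""
    status = None
    for _ in range(attempts):
        status = fetch_status()
        if status == target:
            return status
        if status in failed:
            raise RuntimeError("Item entered state '%s'" % status)
        sleep(delay)
    raise RuntimeError(
        "Gave up after %d attempts - last status '%s'" % (attempts, status)
    )


# Simulated status sequence standing in for repeated GET requests
states = iter(['NEW', 'PENDING', 'INPROGRESS', 'ACTIVE'])
print(wait_for_status(lambda: next(states), sleep=lambda seconds: None))  # ACTIVE
```

Against a real server, fetch_status could be something like lambda: get_status(session.get(location).json(), 0), where location is the value of the Location header.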
Note
Once the template has been ingested, its template_location field is updated (internally) to point to the location of the template within the template store.
On subsequent GET requests, the template_location field will be returned as null
Tip
Once the template is built, the original model can be modified or even deleted without affecting the template.
Checking the Builder Task
Say we want to check the status of the builder task itself, rather than watching the template’s status, or say the template enters state ERRORED, implying that the builder task failed.
When we look up an item, included in the response (in the items block) is a links block.
print(resp.json()['collection']['items'][0]['links'])
[
{
"href": "/jobs?where={\"resource_type\": \"templates\", \"resource_id\": \"e00e872b-f4c5-2557-795d-4ccf4715b602\"}",
"prompt": "Jobs",
"name": "Jobs",
"render": "link",
"rel": "jobs"
}
]
Here we see a link to a jobs query.
Let’s write another helper function
def get_link_href(collection, item, name):
    # scan the item's links for the one with the given name
    for link in collection['collection']['items'][item]['links']:
        if link['name'] == name:
            return link['href']
    raise KeyError(name)
Visiting the Jobs link will return a collection of all jobs associated with our template
href = get_link_href(resp.json(), 0, 'Jobs')
resp = session.get(url + href)
In this instance, there should be only one job returned, since we’ve only submitted one task for our template (the builder task)
If the job is no longer active (COMPLETED or ERRORED), we can check the job’s logs to try and diagnose any issues.
In the job’s links section, we should see links for stdout and stderr
print(resp.json()['collection']['items'][0]['links'])
[
{
"href": "/templates/e00e872b-f4c5-2557-795d-4ccf4715b602",
"prompt": "Resource",
"name": "resource",
"render": "link",
"rel": "resource"
},
{
"href": "/jobs/407ee461-5cc1-43e5-8f3e-a375f8f9b9c8/stderr",
"prompt": "Stderr",
"name": "stderr",
"render": "link",
"rel": "stderr"
},
{
"href": "/jobs/407ee461-5cc1-43e5-8f3e-a375f8f9b9c8/stdout",
"prompt": "Stdout",
"name": "Stdout",
"render": "link",
"rel": "stdout"
}
]
(We also get a link back to our template in the resource link)
Performing a GET request against the stderr link will return the log in plain text format
href = get_link_href(resp.json(), 0, "stderr")
resp = session.get(url + href)
print(resp.headers['Content-Type']) # text/plain
print(resp.text)
# 'DEBUG:...
Note
Python logging is written to stderr by default, so typically will end up in the stderr log.
The stdout log would contain anything written to stdout, such as print statements.
But since none of the PixStor Management tasks use print statements, the stdout log will usually be empty.
However, this behaviour may vary depending on which job engine is being used.
Creating the Space
Now, finally, we have everything we need to create our Space.
So, as we did for our project template above, we need to fill in the C+J template values
space_data = {
"template": {
"data": [
{"name": "name", "value": "sleepy-snake"},
{"name": "relativepath", "value": "projects/sleepy_snake"},
{"name": "exposers", "value": exposer},
{"name": "profile", "value": profile},
{"name": "templates", "value": template},
{"name": "size", "value": 4*1024*1024*1024} # 4GB
]
}
}
Tip
Here we only have one exposer and one template.
If instead, we wanted to create a space with multiple exposers (or templates) we would send their hrefs as a comma-separated list - e.g.
"value": '/exposers/bb2873a8-c489-4530-a8a5-ece70598f3ea,/exposers/8b234c97-23a1-4f2b-9fce-1200d40e96c1'
Then we POST the data to the /spaces endpoint and wait for our new space to enter an ACTIVE state
resp = session.post(
url + '/spaces/',
json=space_data,
headers={"Content-Type": "application/vnd.collection+json"}
)
checkurl = resp.headers['Location']
from time import sleep
# query the status every second for 10 seconds
for _ in range(10):
    sleep(1)
    resp = session.get(checkurl)
    status = get_status(resp.json(), 0)
    if status == 'ACTIVE':
        break
else:
    raise Exception("Space didn't become active after 10s - got status '%s'" % status)
And we’re done!
Checking Our Work
Database
First of all, let’s check what our space looks like in the PixStor Management database.
The Location header for our space uses a projection to only return the status field, not the other data fields, so we can’t use that.
Let’s try doing a query for our space
params = {'where': '{"name": "sleepy-snake"}'}
resp = session.get(url + '/spaces/', params=params)
print(resp.json())
{
"collection": {
"href": "/spaces/",
"items": [
{
"href": "/spaces/de95cf28-fdf9-5455-b1d9-831cf8a1869b",
"data": [
{
"prompt": "current status of the item",
"name": "status",
"value": "ACTIVE"
},
{
"prompt": "space name",
"name": "name",
"value": "sleepy-snake"
},
{
"prompt": "path of the space relative to its exposers",
"name": "relativepath",
"value": "projects/sleepy_snake"
},
{
"prompt": "unique identifier for the item",
"name": "id",
"value": "de95cf28-fdf9-5455-b1d9-831cf8a1869b"
},
{
"prompt": "hard limit on the size of the space in blocks",
"name": "size",
"value": 4294967296
},
...
],
"links": [
{
"href": "/templates/?where=id==\"e00e872b-f4c5-2557-795d-4ccf4715b602\"",
"prompt": "templates applied to the space",
"name": "templates",
"render": "link",
"rel": "templates collection"
},
{
"href": "/exposers/?where=id==\"d672fde7-ba69-5d16-0acd-5868d2a8f3b9\"",
"prompt": "exposers providing access to the space",
"name": "exposers",
"render": "link",
"rel": "exposers collection"
},
{
"href": "/profiles/8f01ab22-cc0b-2056-ff8c-e1d829dc806c",
"prompt": "profile applied to the space",
"name": "profile",
"render": "link",
"rel": "profile item"
},
{
"href": "/jobs?where={\"resource_type\": \"spaces\", \"resource_id\": \"de95cf28-fdf9-5455-b1d9-831cf8a1869b\"}",
"prompt": "Jobs",
"name": "Jobs",
"render": "link",
"rel": "jobs"
}
]
}
],
"links": [...],
"template": {...},
"queries": [...],
"version": "1.0",
}
}
Looks good.
Notice that the related items - exposers, templates, profile - appear in the links section. This allows us to look up those items without having to figure out their urls.
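The name/value pairs in an item’s data block, and the name/href pairs in its links block, are easier to work with once flattened into dictionaries. The sketch below shows one way to do that. item_to_dicts is a hypothetical helper, and the sample item is a trimmed copy of the space shown above.

```python
def item_to_dicts(item):
    """Flatten a C+J item into (data, links) dictionaries keyed by name."""
    data = {field['name']: field['value'] for field in item.get('data', [])}
    links = {link['name']: link['href'] for link in item.get('links', [])}
    return data, links


# Trimmed copy of the space item from the response above
sample_item = {
    "href": "/spaces/de95cf28-fdf9-5455-b1d9-831cf8a1869b",
    "data": [
        {"prompt": "current status of the item", "name": "status", "value": "ACTIVE"},
        {"prompt": "space name", "name": "name", "value": "sleepy-snake"},
    ],
    "links": [
        {"href": "/profiles/8f01ab22-cc0b-2056-ff8c-e1d829dc806c", "name": "profile"},
    ],
}

data, links = item_to_dicts(sample_item)
print(data['name'])      # sleepy-snake
print(links['profile'])  # /profiles/8f01ab22-cc0b-2056-ff8c-e1d829dc806c
```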
Note
It is possible to have multiple spaces with the same name, but only if they have different profiles.
In general, you should avoid creating multiple spaces with the same name.
Filesystem
Now, let’s check that a fileset was actually created for our Space
$ mmlsfileset mmfs1
Filesets in file system 'mmfs1':
Name Status Path
...
sas1-sleepy-snake Linked /mmfs1/projects/sleepy_snake
sata1-sleepy-snake-8b234c97 Linked /mmfs1/projects/sleepy_snake/assets
The first one, sas1-sleepy-snake, is the fileset created for our space.
Notice that the fileset doesn’t have exactly the name we specified for the space - it has the name of its placement pool stuck on the front. This prefix is used for matching the space (fileset) to the right pool for the space’s profile (placement policy rule)
$ mmlspolicy mmfs1 -L
...
RULE 'sas1-placement' SET POOL 'sas1'
WHERE FILESET_NAME LIKE 'sas1-%'
RULE 'default' SET POOL 'sata1'
The second fileset shown, sata1-sleepy-snake-8b234c97, is the dependent fileset installed by our template.
Instead of the space’s placement pool, this name is prefixed with the pool that the original, model fileset was assigned to - in this case sata1. The name also includes the name of our space (plus a random suffix).
Let’s check the full template was installed
$ tree /mmfs1/projects/sleepy_snake
/mmfs1/projects/sleepy_snake
├── assets
│ ├── models
│ └── rigs
├── flame
├── houdini
├── maya
├── mudbox
├── nuke
├── published
└── rendering
Great! Finally, the size value we specified should have been translated into a block quota
$ mmlsquota -j sas1-sleepy-snake mmfs1
Block Limits | File Limits
Filesystem type KB quota limit in_doubt grace | files quota limit in_doubt grace Remarks
mmfs1 FILESET 109 3865470566 4294967296 0 none | 10 0 0 0 none
Perfect! Our space is now ready to use.
Exercise: Create a Snapshot
Now that you’ve made yourself a space, why not practice what you’ve learned by creating a snapshot of that space
Hints
- The endpoint you’re looking for is /snapshots
- To create a space snapshot, the only fields you need to POST are name, type, and space
- The type you want to use is gpfsspacesnapshot
Solution
params = {'where': '{"name": "sleepy-snake"}'}
resp = session.get(url + '/spaces/', params=params)
space = resp.json()['collection']['items'][0]['href']
snapshot_data = {
"template": {
"data": [
{"name": "name", "value": "snake-snap"},
{"name": "type", "value": "gpfsspacesnapshot"},
{"name": "space", "value": space},
]
}
}
resp = session.post(
url + '/snapshots/',
json=snapshot_data,
headers={"Content-Type": "application/vnd.collection+json"}
)
Full Code
The full code for this walkthrough can be found here:
Footnotes
[1] uri-templates are not part of the standard Collection+JSON spec