API
Introduction
At the time of writing, Grid5000 is mainly accessed and operated via shell commands executed from frontend machines. To ease scripting and provide better access to the grid functionalities, an API is currently being developed by the Grid5000 Technical Committee. This API consists of a layer on top of most of the Grid5000 tools and will provide stable, well-defined interfaces accessible over HTTP.
Most of the standardization effort consists of staying as close as possible to the principles of a REST (REpresentational State Transfer) architecture, which recommends defining the semantics of an operation on a resource by means of the standard HTTP methods (e.g. GET, PUT, POST, DELETE) and the standard HTTP status codes (e.g. 200, 401, 404, 500).
The concept of a REST resource must not be confused with the concept of a resource in the context of Grid5000: a REST resource, identified by a URI (e.g. /sites/rennes), represents a specific view into an object, which MAY incidentally be a Grid5000 resource (e.g. node, switch) but not necessarily.
As a consequence of using the HTTP protocol as an application protocol, any HTTP client can be used to query the API: command-line tools (cURL), browsers and the numerous HTTP libraries available in the vast majority of programming languages.
Access to the API
The API entry-point is:
https://api.grid5000.fr
However, during the first phase of the deployment of the Grid5000 API, access is restricted to requests coming from one of the Grid5000 ACCESS machines. Thus, if you are working from a machine that is not an ACCESS machine, you must create an SSH tunnel between your machine and an ACCESS machine of your choice:
$ ssh -NL 3443:api.grid5000.fr:443 login@access.bordeaux.grid5000.fr
As a consequence, the API entry-point, as seen from your local machine, will be:
https://localhost:3443
When you're done with the API, hit CTRL-C to destroy the tunnel, unless you added the -f flag to run it in the background.
Versioning
The Grid5000 API is made up of multiple APIs that may evolve independently from each other. Changes occurring in one API are reflected by a new version number for that particular API. In the long term, scripts or applications that are built upon multiple APIs may have to deal with a separate version number for each API, which is not a very enjoyable prospect. That's why the Grid5000 API is globally versioned: a global version id maps transparently to the specific API version numbers. All you have to do is suffix the API entry-point with the global version id of your choice:
https://api.grid5000.fr/global-version-id
Or, if you connect via an SSH tunnel:
https://localhost:3443/global-version-id
The format of the global version id is as follows:
major.minor-quality
e.g. 1.0-stable
Scripts and applications that are programmed against a specific global version id of STABLE quality should never encounter problems due to changes in the APIs, until the official deprecation of the version.
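As an illustration of the major.minor-quality format, here is a minimal Python sketch that splits a global version id into its three parts. The function name, the tuple type and the exact pattern (digits, a dot, digits, a hyphen, a lowercase label) are assumptions made for this example; they are not part of the Grid5000 API.

```python
import re
from typing import NamedTuple


class GlobalVersion(NamedTuple):
    major: int
    minor: int
    quality: str


# Assumed pattern: digits, a dot, digits, a hyphen, then a lowercase quality label.
_VERSION_RE = re.compile(r"(\d+)\.(\d+)-([a-z]+)")


def parse_global_version(version_id: str) -> GlobalVersion:
    """Split an id such as '1.0-stable' into (major, minor, quality)."""
    match = _VERSION_RE.fullmatch(version_id)
    if match is None:
        raise ValueError(f"not a major.minor-quality id: {version_id!r}")
    major, minor, quality = match.groups()
    return GlobalVersion(int(major), int(minor), quality)


print(parse_global_version("1.0-stable"))
# GlobalVersion(major=1, minor=0, quality='stable')
```

A script could use such a check to refuse to run against anything other than a stable id.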
The list of the past and current versions supported by the Grid5000 API is available at: #Global_Version_IDs.
Resources
The resources that can be queried are described in the documentation of each API.
The resources are identified by a URI, which should be added to the end of the API entry-point to form the URL.
For example, /sites/rennes/versions/current is a resource exposed by the Reference API. If you would like to get the representation of this resource as it is returned by the 1.0-stable version of the Grid5000 API, the URL to enter in your HTTP client would be:
https://api.grid5000.fr/1.0-stable/sites/rennes/versions/current
Or, if you connect via an SSH tunnel:
https://localhost:3443/1.0-stable/sites/rennes/versions/current
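The URL construction described above (entry-point, then global version id, then resource URI) can be sketched in Python as follows. The helper name build_url and the two constants are hypothetical and only serve this example.

```python
def build_url(entry_point: str, global_version_id: str, resource: str) -> str:
    """Join entry-point, global version id and resource URI with single slashes."""
    return "/".join([
        entry_point.rstrip("/"),
        global_version_id.strip("/"),
        resource.lstrip("/"),
    ])


API = "https://api.grid5000.fr"      # direct access from an ACCESS machine
TUNNEL = "https://localhost:3443"    # access through the SSH tunnel

print(build_url(API, "1.0-stable", "/sites/rennes/versions/current"))
# https://api.grid5000.fr/1.0-stable/sites/rennes/versions/current
print(build_url(TUNNEL, "1.0-stable", "/sites/rennes/versions/current"))
# https://localhost:3443/1.0-stable/sites/rennes/versions/current
```

Keeping the entry-point in one variable makes it easy to switch between direct access and the tunnel.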
The list of the APIs and their documentation is available at: #APIs.
Format
When you query a REST resource, you ask for a representation of an object. Depending on the resource you are requesting, there could be multiple representations available. It is the responsibility of the client to indicate which representation (or format) it accepts.
To set the format that you accept, you could either explicitly insert the format at the end of the URL, e.g.:
GET entry-point/global-version-id/resource.json
Or, you may prefer to set the Accept HTTP header to the correct MIME type, e.g.:
GET entry-point/global-version-id/resource
Accept: application/json
If you put both, the Accept HTTP header will be ignored.
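The two ways of selecting a format can be sketched with Python's standard urllib.request module. The sketch only builds the request objects and does not send them; the two helper names are hypothetical.

```python
import urllib.request


def request_with_suffix(url: str, fmt: str = "json") -> urllib.request.Request:
    """Select the format by appending it to the end of the URL."""
    return urllib.request.Request(f"{url}.{fmt}", method="GET")


def request_with_accept(url: str, mime_type: str = "application/json") -> urllib.request.Request:
    """Select the format through the Accept HTTP header."""
    return urllib.request.Request(url, headers={"Accept": mime_type}, method="GET")


url = "https://api.grid5000.fr/1.0-stable/sites/rennes/versions/current"
by_suffix = request_with_suffix(url)
by_header = request_with_accept(url)

print(by_suffix.full_url)                # the URL now ends in .json
print(by_header.get_header("Accept"))    # application/json
```

Either request object could then be passed to urllib.request.urlopen from a machine that is allowed to reach the API.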
Status Codes
A subset of the standard HTTP status codes [1] is returned by the APIs:
- A 200 status code is returned when the request is successful.
- A 201 status code is returned when the request has been fulfilled and resulted in a new resource being created.
- A 304 status code is returned on a conditional GET request [2] when the requested resource has not been modified.
- A 403 status code is returned when access to the requested resource is forbidden to the currently logged-in user.
- A 404 status code is returned when a resource does not exist.
- A 406 status code is returned when the requested format is not available.
- A 500 status code is returned when the server encountered an error.
The body of responses whose status code is 4xx or 5xx may contain additional information about the error.
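A client script may want to turn these codes into readable messages. The following Python sketch restates the list above as a lookup table; the names STATUS_MEANINGS, describe_status and may_carry_error_details are illustrative and not part of any Grid5000 library.

```python
# Meanings of the status codes listed above, as a lookup table.
STATUS_MEANINGS = {
    200: "request successful",
    201: "new resource created",
    304: "not modified since the last conditional GET",
    403: "access forbidden for the current user",
    404: "resource does not exist",
    406: "requested format not available",
    500: "server error",
}


def describe_status(code: int) -> str:
    """Return a short description, or a fallback for codes not listed here."""
    return STATUS_MEANINGS.get(code, "status code not listed in this document")


def may_carry_error_details(code: int) -> bool:
    """4xx and 5xx responses may include extra error information in the body."""
    return 400 <= code <= 599


for code in (200, 406, 418):
    print(code, describe_status(code), may_carry_error_details(code))
```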
Authentication
For now, the authentication is transparently managed for the user: since the requests must come from an ACCESS machine, the login of the connected user can be retrieved by the API. In the long term, new authentication schemes may be implemented.
Caching
Warning: On a first read, this section can be skipped without any regrets. For those interested in improving the performance of their client applications, read on!
As described in RFC 2616:
HTTP is typically used for distributed information systems, where performance can be improved by the use of response caches. The HTTP/1.1 protocol includes a number of elements intended to make caching work as well as possible.
Hence, client applications can (and should) cache the responses so that subsequent requests for the same information use the cached data. The use of caching strategies can dramatically reduce delays and save bandwidth.
That's why most of the responses returned by the Grid5000 APIs include HTTP headers to support one or both of the caching models: expiration-based or validation-based. If you don't know what this means, read this article: http://tomayko.com/writings/things-caches-do.
Below are the different schemes that can exist when the API returns cacheable responses.
1. no caching
     O                          |-----|
    -|- <-------internet------> | API |
    / \                         |-----|
2. the API builds the response once and stores it into cache for a certain amount of time.
     O                          |-----|       |-----|
    -|- <-------internet------> |CACHE| <---> | API |
    / \                         |-----|       |-----|
3. the client receives the response once and stores it into cache for a certain amount of time.
     O        |-----|                         |-----|
    -|- <---> |CACHE| <-------internet------> | API |
    / \       |-----|                         |-----|
4. both the client and the API have a cache in front of them.
     O        |-----|                         |-----|       |-----|
    -|- <---> |CACHE| <-------internet------> |CACHE| <---> | API |
    / \       |-----|                         |-----|       |-----|
Most of the Grid5000 APIs will use some kind of caching strategy on their side (scheme 2). It is recommended that client applications also include a caching strategy in their implementation (scheme 4): this will save bandwidth, reduce latency and may improve the tolerance of the client to network outages.
Note that only a few HTTP libraries natively support client-side caching (e.g. httplib2 in Python). For Ruby, a gem that extends the RestClient library is available from GitHub (cryx-cacheability).
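To make the validation-based model more concrete, here is a minimal Python sketch of a client-side cache that revalidates with a conditional GET. It assumes the server sends an ETag header and answers 304 to a matching If-None-Match header, which is the standard HTTP mechanism; whether a given Grid5000 API uses ETag or Last-Modified is not stated here. The ValidatingCache class, the fetch callback and the fake server with its payload are all made up for this example, so that it runs without network access.

```python
from typing import Callable, Dict, Tuple

# fetch(url, request_headers) -> (status, response_headers, body)
Fetch = Callable[[str, Dict[str, str]], Tuple[int, Dict[str, str], bytes]]


class ValidatingCache:
    """Tiny in-memory cache that revalidates entries with If-None-Match."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        self._entries: Dict[str, Tuple[str, bytes]] = {}  # url -> (etag, body)

    def get(self, url: str) -> bytes:
        headers: Dict[str, str] = {}
        cached = self._entries.get(url)
        if cached is not None:
            headers["If-None-Match"] = cached[0]  # makes the GET conditional
        status, response_headers, body = self._fetch(url, headers)
        if status == 304 and cached is not None:
            return cached[1]  # not modified: reuse the stored body
        etag = response_headers.get("ETag")
        if status == 200 and etag:
            self._entries[url] = (etag, body)  # remember the validator
        return body


def fake_fetch(url: str, headers: Dict[str, str]) -> Tuple[int, Dict[str, str], bytes]:
    """Stand-in for a real HTTP call; the payload is invented for the demo."""
    if headers.get("If-None-Match") == '"v1"':
        return 304, {}, b""
    return 200, {"ETag": '"v1"'}, b'{"example": true}'


cache = ValidatingCache(fake_fetch)
first = cache.get("https://localhost:3443/1.0-stable/sites/rennes")
second = cache.get("https://localhost:3443/1.0-stable/sites/rennes")
print(first == second)  # the second body comes from the cache after a 304
```

In a real client, fetch would wrap an HTTP library call; libraries such as httplib2 handle this logic for you.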
APIs
For each API, you can access its documentation by clicking on the 'Documentation' link.
However, if this is the first time you read this page, we recommend that you take some time to go through the tutorials:
- Beginning with the ReferenceAPI tutorial, you will learn how to access the current description of the platform: the sites, clusters, nodes, etc. as well as the description of the platform as it was at a time in the past. We will also show you how to get the latest changes made to a specific resource, and how to select nodes based on their properties.
- The MonitoringAPI tutorial builds on the first tutorial and introduces the MonitoringAPI, which returns the status of specific Grid5000 resources so that you can monitor them or make decisions based on their availability. In particular, you'll learn how to select nodes that match specific criteria in terms of properties and availability.
- The JobsetAPI tutorial describes the notion of jobset and shows you how to submit a grid job using the API. The final script nicely builds on the two previous tutorials to showcase how to submit jobs that match your requirements in terms of resource distribution and properties, without any human interaction.
Reference API
Monitoring API
Jobset API
Global Version IDs
The sid quality should not be used for production purposes as the interfaces may change at any time.
- 1.0-stable : uses ReferenceAPI-0.1, MonitoringAPI-0.1
- 1.0-sid : uses JobsetAPI-0.1, ReferenceAPI-0.1, MonitoringAPI-0.1
FAQ
"Certificate verification failed"
Your HTTP client may produce a warning or refuse to connect if you do not have the Grid5000 Certification Authority (CA) certificate on your machine. Currently though, the certificate infrastructure of Grid5000 is undergoing massive changes and we recommend that you disable the verification during this period.
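For a Python client built on the standard library, disabling the verification could look like the sketch below. The helper names are hypothetical. Turning verification off removes protection against impersonation of the server, so this is only meant for the interim period described above.

```python
import ssl
import urllib.request


def unverified_context() -> ssl.SSLContext:
    """Build an SSL context that skips certificate verification."""
    context = ssl.create_default_context()
    context.check_hostname = False       # must be disabled before relaxing verify_mode
    context.verify_mode = ssl.CERT_NONE  # accept any certificate (interim measure only)
    return context


def unverified_opener() -> urllib.request.OpenerDirector:
    """Return an opener whose HTTPS requests use the unverified context."""
    handler = urllib.request.HTTPSHandler(context=unverified_context())
    return urllib.request.build_opener(handler)


opener = unverified_opener()
# Example call, to be run from a machine that can reach the API:
# opener.open("https://localhost:3443/1.0-stable/sites/rennes/versions/current")
```

With cURL, the equivalent is the -k (--insecure) option.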
Resources
- cURL, the recommended tool for command-line access to the API.
- RestClient, a Ruby HTTP client.
- py-RestClient, a Python HTTP client.
References
- http://json.org/ Introducing JSON
- http://www.w3.org/Protocols/rfc2616/rfc2616-sec10.html HTTP/1.1: Status Code Definitions