API: Difference between revisions

From Grid5000
No edit summary
Line 2: Line 2:
{{Maintainer|Cyril Rohr}}
{{Maintainer|Cyril Rohr}}


== ReferenceAPI ==
== Introduction ==
=== Synopsis ===
At the time of writing, Grid5000 is mainly accessed and operated via shell commands executed from frontend machines. To ease scripting and provide better access to the grid functionalities, an API is currently being developed by the Grid5000 Technical Committee. This API consists of a layer on top of most of the Grid5000 tools that will provide stable and well-defined interfaces accessible via the HTTP protocol.
The reference API provides the reference data of Grid5000. Information such as the list of sites, clusters, nodes, environments, etc. can be queried using this API.
You can also obtain a specific version of any data, list all the versions of a given information, and get an archive of all or part of the data.


=== REST API ===
Most of the standardization effort consists in staying as close as possible to the principles of a '''REST''' (REpresentational State Transfer) architecture, which recommends defining the semantics of an operation on a '''resource''' by means of the '''standard HTTP operations''' (e.g. GET, PUT, POST, DELETE) and the '''standard HTTP status codes''' (e.g. 200, 401, 404, 500).
* API entry-point: http://localhost:8080/reference/0_1
* note on formats: it is not required to explicitly insert the format in the URL, e.g.:
  GET /resource.json
You may prefer to set the Accept HTTP header to the correct mime type, e.g.:
  GET /resource
  Accept: application/json
If you put both, the Accept HTTP header will be ignored.


==== Get a specific version of a resource ====
The concept of a REST resource must not be confused with the concept of a resource in the context of Grid5000: a REST resource, identified by a URI (e.g. /sites/rennes), represents a specific view into an object, which MAY incidentally be a Grid5000 resource (e.g. node, switch) but not necessarily.
  GET /<code class="replace">%resource%</code>.<code class="replace">%format%</code>?version=<code class="replace">%version%</code>&depth=<code class="replace">%depth%</code>
* Accepted formats: json, zip
* Version parameter: can be omitted (the latest version is returned), or it can be a commit id (40 characters long) or a Unix UTC time (number of seconds since the Unix epoch: 1970-01-01 00:00:00 UTC)
* Depth parameter: can be omitted (by default, only one level of nested sub-resources is displayed in a json response). If you want to get more details with one request, you can set the depth parameter to a value between 1 and 3. This parameter has no effect with the zip format.
* Comments: the zip format will return a zip archive containing the set of directories and files corresponding to the required data, '''with all its sub-resources'''.
* Examples:
  GET /.json
will return the description of the "root" resource, which is Grid5000:
  HTTP/1.1 200 OK
  Etag: "d69bfd1891582824f5a192fa41a3f444a3d52854"
  Last-Modified: Wed, 28 Jan 2009 13:46:19 GMT
  Content-Type: application/json;charset=utf-8
  Content-Length: 345
  Connection: close
 
  {
    "environments": [
      "\/environments\/sid-x64-base-1.0"
    ],
    "sites": [
      "\/sites\/bordeaux",
      "\/sites\/grenoble",
      "\/sites\/lille",
      "\/sites\/lyon",
      "\/sites\/nancy",
      "\/sites\/orsay",
      "\/sites\/rennes",
      "\/sites\/sophia",
      "\/sites\/toulouse"
    ],
    "uid": "grid5000",
    "type": "grid",
    "uri": "\/"
  }
 
From there, you can discover the composition of the grid by following the URIs to get more information about the sub-resources. For example, you can get more details about Bordeaux by querying:
  GET /sites/bordeaux.json
this will return something like:
  HTTP/1.1 200 OK
  Etag: "45f78b07665ed58f843a741d6927d60a4db35ba3"
  Last-Modified: Wed, 28 Jan 2009 13:46:19 GMT
  Content-Type: application/json;charset=utf-8
  Content-Length: 604
  Connection: close
 
  {
    "environments": [
      "\/sites\/bordeaux\/environments\/sid-x64-base-1.0"
    ],
    "name": "Bordeaux",
    "location": "Bordeaux, France",
    "latitude": null,
    "security_contact": null,
    "clusters": [
      "\/sites\/bordeaux\/clusters\/bordemer",
      "\/sites\/bordeaux\/clusters\/bordeplage",
      "\/sites\/bordeaux\/clusters\/bordereau",
      "\/sites\/bordeaux\/clusters\/borderline"
    ],
    "uid": "bordeaux",
    "type": "site",
    "user_support_contact": null,
    "description": "",
    "longitude": null,
    "email_contact": null,
    "web": null,
    "uri": "\/sites\/bordeaux",
    "sys_admin_contact": null
  }
and so on.
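
As a rough illustration of the version and depth parameters described in this section, the following Ruby sketch assembles the path and query string of a request. The helper name <code>reference_query</code> and its validation rules are assumptions made for this example (in particular, commit ids are assumed to be hexadecimal); they are not part of the API.

```ruby
# Hypothetical helper (not part of the API): builds the path and query
# string of a versioned request, e.g. /sites/bordeaux.json?version=...&depth=...
# Assumption: a commit id is 40 lowercase hexadecimal characters.
COMMIT_ID = /\A[0-9a-f]{40}\z/

def reference_query(resource, format: 'json', version: nil, depth: nil)
  params = []
  unless version.nil?
    # A Time is converted to the number of seconds since the Unix epoch.
    version = version.to_i if version.is_a?(Time)
    valid = version.is_a?(Integer) || version.to_s.match?(COMMIT_ID)
    raise ArgumentError, "invalid version: #{version}" unless valid
    params << "version=#{version}"
  end
  unless depth.nil?
    raise ArgumentError, 'depth must be between 1 and 3' unless (1..3).cover?(depth)
    params << "depth=#{depth}"
  end
  # The root resource "/" becomes "/.json", as in the examples above.
  path = "#{resource.chomp('/')}.#{format}"
  path = "/#{path}" unless path.start_with?('/')
  params.empty? ? path : "#{path}?#{params.join('&')}"
end
```

For instance, <code>reference_query('/', version: Time.utc(2009, 1, 28, 11, 11, 36))</code> yields <code>/.json?version=1233141096</code>, the timestamp used in the cURL example for Wed Jan 28 12:11:36 +0100 2009.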


==== Get the list of all versions of a resource ====
As a consequence of using the HTTP protocol as an application protocol, any HTTP client can be used to query the API: command-line tools (cURL), browsers and the numerous HTTP libraries available in the vast majority of programming languages.
  GET /<code class="replace">%resource%</code>/versions.<code class="replace">%format%</code>
* Accepted formats: json
* Response: the list of all changes that were made to the resource.


==== Status codes ====
=== Access to the API ===
* A 200 status code is returned when the request is successful.
The API entry-point is:
* A 304 status code is returned when the requested resource has not been modified (useful for caching purposes).
  https://api.grid5000.fr
* A 404 status code is returned when a resource does not exist.
* A 406 status code is returned when the requested format is not available.
* A 500 status code is returned when the server encountered an error.
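
A client can branch on the status codes listed above. The sketch below is only one possible way to do so; the constant and method names are invented for this example.

```ruby
# Status codes documented above, mapped to symbols a client can branch on.
# The names are hypothetical; only the numeric codes come from the documentation.
STATUS_MEANINGS = {
  200 => :ok,
  304 => :not_modified,
  404 => :not_found,
  406 => :format_not_available,
  500 => :server_error
}.freeze

# Accepts an Integer or a String (Net::HTTP returns the code as a String).
def interpret_status(code)
  STATUS_MEANINGS.fetch(Integer(code), :unexpected)
end
```

Any code outside the documented list is reported as <code>:unexpected</code> so that the caller can decide how to handle it.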


=== Getting Started ===
However, during the first phase of the deployment of the Grid5000 API, the access is restricted to requests coming from one of the Grid5000 ACCESS machines.
==== Creating an SSH tunnel ====
Thus, if you are connected to a machine that is not an ACCESS machine, you must create an SSH tunnel between your machine and an ACCESS machine of your choice.
The API is only available from the Grid5000 frontends. As a consequence, the first thing you have to do when trying to query the API is to create an SSH tunnel to the API server, via your site's access machine:
   $ ssh -NL 3443:api.grid5000.fr:443 <code class="replace">login</code>@access.<code class="replace">site</code>.grid5000.fr
   ssh -N -L 8080:131.254.202.98:8080 <code class="replace">login</code>@access.<code class="replace">site</code>.grid5000.fr
If you want to always get the nearest (geographically speaking) ACCESS machine, you may choose to enter:
  $ ssh -NL 3443:api.grid5000.fr:443 <code class="replace">login</code>@access.grid5000.fr
As a consequence, the API entry-point, as seen from your local machine, will be:
  https://localhost:3443


Note: when you are done with your queries, you can hit CTRL-C to destroy the tunnel.
=== Versioning ===
The Grid5000 API is made up of multiple APIs that may evolve independently from each other. Changes occurring in one API are reflected by a new version number for this particular API. In the long term, scripts or applications that are built upon multiple APIs may have to deal with a different version number for each API, which is not a very enjoyable prospect. That's why the Grid5000 API is globally versioned: a global version id will map to the specific API version numbers in a transparent manner. All you have to do is suffix the API entry-point with the global version id of your choice:
  https://api.grid5000.fr/<code class="replace">global-version-id</code>
Or, if you connect via an SSH tunnel:
  https://localhost:3443/<code class="replace">global-version-id</code>


==== cURL example ====
The format of the global version id is as follows:
Get the latest version of the composition of the platform:
   <code class="replace">major</code>.<code class="replace">minor</code>-<code class="replace">quality</code>
   $ curl -i http://localhost:8080/reference/0_1/.json
e.g.
  1.0-stable


  HTTP/1.1 200 OK
Scripts and applications that are programmed against a specific global version id should never encounter problems due to changes in the APIs, until the official deprecation of the version.
  Date: Thu, 29 Jan 2009 15:23:10 GMT
  Server: Apache
  X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.0.6
  Etag: "2e16727e9012f1d12e25010921a6a5fe71bda895"
  Last-Modified: Wed, 28 Jan 2009 13:46:19 GMT
  Content-Length: 345
  Connection: close
  Content-Type: application/json;charset=utf-8
 
  {
    "environments": [
      "\/environments\/sid-x64-base-1.0"
    ],
    "uri": "\/",
    "type": "grid",
    "sites": [
      "\/sites\/bordeaux",
      "\/sites\/grenoble",
      "\/sites\/lille",
      "\/sites\/lyon",
      "\/sites\/nancy",
      "\/sites\/orsay",
      "\/sites\/rennes",
      "\/sites\/sophia",
      "\/sites\/toulouse"
    ],
    "uid": "grid5000"
  }


Get the latest version of the platform with a depth of 2 (this will automatically "resolve" the first level of URIs so that the response body includes the description of the sites and environments):
The list of the past and current versions supported by the Grid5000 API is available at: [[#Global_Version_IDs]].
  $ curl -i http://localhost:8080/reference/0_1/.json?depth=2
 
  HTTP/1.1 200 OK
  Date: Fri, 30 Jan 2009 10:34:59 GMT
  Server: Apache
  X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.0.6
  Etag: "9a69b8ce42e1618b0c0dee3112f5269eaaa28d6d"
  Last-Modified: Thu, 29 Jan 2009 16:50:28 GMT
  Content-Length: 6827
  Connection: close
  Content-Type: application/json;charset=utf-8
 
  ... the response body is too long to be displayed here ...
 
Get the composition of the platform as it was on Wed Jan 28 12:11:36 +0100 2009:
  $ curl -i http://localhost:8080/reference/0_1/.json?version=1233141096


  HTTP/1.1 200 OK
=== Resources ===
  Date: Thu, 29 Jan 2009 15:22:20 GMT
The resources that can be queried are described in the documentation of each API.  
  Server: Apache
The resources are identified by a URI, which should be added to the end of the API entry-point to form the URL.
  X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.0.6
For example, <code>/sites/rennes/versions/current</code> is a resource exposed by the Reference API. If you would like to get the representation of this resource as it is returned by the <code>1.0-stable</code> version of the Grid5000 API, the URL to enter in your HTTP client would be:
  Etag: "7a518f34482cd7a898b8428bac26eb9a6e457379"
   https://api.grid5000.fr/1.0-stable/sites/rennes/versions/current
  Last-Modified: Wed, 28 Jan 2009 11:11:36 GMT
Or, if you connect via an SSH tunnel:
   Content-Length: 299
  https://localhost:3443/1.0-stable/sites/rennes/versions/current
  Connection: close
  Content-Type: application/json;charset=utf-8
 
  {
    "environments": [
      "\/environments\/sid-x64-base-1.0"
    ],
    "uri": "\/",
    "type": "grid",
    "sites": [
      "\/sites\/grenoble",
      "\/sites\/lille",
      "\/sites\/nancy",
      "\/sites\/orsay",
      "\/sites\/rennes",
      "\/sites\/sophia",
      "\/sites\/toulouse"
    ],
    "uid": "grid5000"
  }


The list of the APIs and their documentation is available at: [[#APIs]].


Get the composition of the platform as it was after the change #30dc4c4fa25ee63c86a093cf6108259a128561b1
=== Format ===
  $ curl -i http://localhost:8080/reference/0_1/.json?version=30dc4c4fa25ee63c86a093cf6108259a128561b1
When you query a REST resource, you ask for a representation of an object. Depending on the resource you are requesting, there could be multiple representations available. It is the responsibility of the user to indicate which representation (or format) it accepts.


  HTTP/1.1 200 OK
To set the format that you accept, you could either explicitly insert the format at the end of the URL, e.g.:
  Date: Fri, 30 Jan 2009 08:48:39 GMT
   GET <code class="replace">entry-point</code>/<code class="replace">global-version-id</code>/<code class="replace">resource</code>.json
  Server: Apache
Or, you may prefer to set the Accept HTTP header to the correct mime type, e.g.:
  X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.0.6
   GET <code class="replace">entry-point</code>/<code class="replace">global-version-id</code>/<code class="replace">resource</code>
   Etag: "c6beebf69985bb57642defb4a3e6c0a3a1491783"
   Accept: application/json
  Last-Modified: Wed, 28 Jan 2009 10:27:37 GMT
If you put both, the Accept HTTP header will be ignored.
  Content-Length: 255
  Connection: close
  Content-Type: application/json;charset=utf-8
 
  {
    "environments": [
      "\/environments\/sid-x64-base-1.0"
    ],
    "uri": "\/",
    "type": "grid",
    "sites": [
      "\/sites\/grenoble",
      "\/sites\/orsay",
      "\/sites\/rennes",
      "\/sites\/sophia",
      "\/sites\/toulouse"
    ],
    "uid": "grid5000"
  }
 
Get all the versions of the platform:
  $ curl -i http://localhost:8080/reference/0_1/versions.json
 
Get all the versions of a cluster:
  $ curl -i http://localhost:8080/reference/0_1/sites/rennes/clusters/paramount/versions.json
 
Get the zip archive that contains the files describing the latest version of the rennes platform:
  $ curl http://localhost:8080/reference/0_1/sites/rennes.zip > rennes.zip
 
etc.
 
==== Ruby example ====
First, make sure you've got *Ruby* and *Rubygems* installed. Then install the required gems:
   sudo gem install rest-client json --no-ri --no-rdoc
Put this code in a g5k-reference-api-client.rb file. It will output the current list of sites and, for each one, the list of its clusters:
  require 'pp'
  require 'rubygems'
  require 'rest_client' # sudo gem install rest-client
  require 'json'        # sudo gem install json
 
  api = RestClient::Resource.new('http://localhost:8080/reference/0_1')
  begin
    puts "---- Getting Grid5000"
    # start at the root of the reference data (= grid5000)
    grid5000 = JSON.parse api['/'].get(:accept => 'application/json')
    pp grid5000
    puts "\n---- Getting sites"
    grid5000['sites'].each do |site_uri|
      site = JSON.parse api[site_uri].get(:accept => 'application/json')
      pp site
      puts "\n---- Getting #{site['uid']} clusters"
      site['clusters'].each do |cluster_uri|
        cluster = JSON.parse api[cluster_uri].get(:accept => 'application/json')
        pp cluster
      end
    end
  rescue RestClient::ResourceNotFound
    puts 'Resource not found.'
  rescue RestClient::RequestTimeout
    puts 'Timeout.'
  rescue RestClient::Unauthorized
    puts 'Unauthorized.'
  rescue RestClient::RequestFailed
    puts 'Request failed.'
  rescue RestClient::ServerBrokeConnection
    puts 'Connection broken.'
  rescue StandardError => e
    puts e.message
  end
 
Run with:
  ruby g5k-reference-api-client.rb
 
The above example could be simplified by setting the depth parameter to 3:
  ... 
  api = RestClient::Resource.new('http://localhost:8080/reference/0_1')
  begin
    grid5000 = JSON.parse api['/?depth=3'].get(:accept => 'application/json')
    pp grid5000
  rescue
    ...
  end
 
=== Planned ===
* Refine node description scheme;
* Add more data;
* Add XML format;
 
== Caching ==
As described in the [http://tools.ietf.org/html/rfc2616#section-13 RFC2616]:
  HTTP is typically used for distributed information systems, where
  performance can be improved by the use of response caches. The
  HTTP/1.1 protocol includes a number of elements intended to make
  caching work as well as possible.
Hence, client applications can (and should) cache the responses so that subsequent requests for the same information use the cached data. The use of caching strategies can dramatically reduce delays and save bandwidth.
 
That's why most of the responses returned by the Grid5000 APIs include HTTP headers to support one or both of the caching models: expiration-based or validation-based. If you don't know what this means, read this article: http://tomayko.com/writings/things-caches-do.
 
Below are the different schemes that can exist when the API returns cacheable responses.


1. no caching
== APIs ==
  O                          |-----|
=== Reference API ===
  -|-  <-------internet------> | API |
=== Monitoring API ===
  / \                          |-----|
=== Jobset API ===
2. the API builds the response once and stores it into cache for a certain amount of time.
  O                          |-----|      |-----|
  -|-  <-------internet------> |CACHE| <---> | API |
  / \                          |-----|      |-----|
3. the client receives the response once and stores it into cache for a certain amount of time.
  O        |-----|                        |-----|
  -|-  <---> |CACHE| <-------internet------> | API |
  / \        |-----|                        |-----|
4. both the client and the API have a cache in front of them.
  O        |-----|                        |-----|      |-----|
  -|-  <---> |CACHE| <-------internet------> |CACHE| <---> | API |
  / \        |-----|                        |-----|      |-----|


Most of the Grid5000 APIs will use some kind of caching strategy on their side (scheme 2). It is highly recommended that client applications also include a caching strategy in their implementation (scheme 4): this will save bandwidth, reduce latency and may improve the tolerance of the client to network outages.
== Global Version IDs ==


Only a few HTTP libraries natively support client-side caching (e.g. [http://code.google.com/p/httplib2/ httplib2] in Python). In Ruby, there is none that correctly supports all the specifications of RFC 2616. Yet, one can very easily implement a basic caching strategy (in-memory or file-based) or use my [http://gist.github.com/58095 snippet of code], which subclasses the RestClient Resource into a CacheableResource that uses the great Rack::Cache library to provide in-memory, file-based or memcached-based transparent caching.
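
To make the validation-based model more concrete, here is a minimal in-memory sketch built on Ruby's standard Net::HTTP request objects. It relies on the standard <code>If-None-Match</code> header together with the Etag values and the 304 status code shown earlier; the <code>EtagCache</code> class and its method names are hypothetical, and whether a given API honours conditional requests is an assumption to verify.

```ruby
require 'net/http'

# Hypothetical in-memory, validation-based cache keyed by request path.
class EtagCache
  Entry = Struct.new(:etag, :body)

  def initialize
    @entries = {}
  end

  # Builds a GET request, adding If-None-Match when a copy is already cached.
  def request_for(path)
    request = Net::HTTP::Get.new(path)
    entry = @entries[path]
    request['If-None-Match'] = entry.etag if entry
    request
  end

  # Returns the body to use for a response with the given status code:
  # the cached copy on 304, otherwise the fresh body (stored on 200).
  def body_for(path, code, etag, body)
    return @entries.fetch(path).body if code.to_i == 304
    @entries[path] = Entry.new(etag, body) if code.to_i == 200 && etag
    body
  end
end
```

The request returned by <code>request_for</code> would be sent with an HTTP client of your choice, and the status code, Etag header and body of the response passed to <code>body_for</code>.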
== Tutorials ==


== Links of interest ==
== References ==
* http://json.org/ Introducing JSON
* http://rest-client.heroku.com/rdoc/ The RestClient Ruby gem documentation
* http://tomayko.com/writings/things-caches-do A great write-up on things caches do.

Revision as of 14:56, 17 March 2009


Introduction

At the time of writing, Grid5000 is mainly accessed and operated via shell commands executed from frontend machines. To ease scripting and provide better access to the grid functionalities, an API is currently being developed by the Grid5000 Technical Committee. This API consists of a layer on top of most of the Grid5000 tools that will provide stable and well-defined interfaces accessible via the HTTP protocol.

Most of the standardization effort consists in staying as close as possible to the principles of a REST (REpresentational State Transfer) architecture, which recommends defining the semantics of an operation on a resource by means of the standard HTTP operations (e.g. GET, PUT, POST, DELETE) and the standard HTTP status codes (e.g. 200, 401, 404, 500).

The concept of a REST resource must not be confused with the concept of a resource in the context of Grid5000: a REST resource, identified by a URI (e.g. /sites/rennes), represents a specific view into an object, which MAY incidentally be a Grid5000 resource (e.g. node, switch) but not necessarily.

As a consequence of using the HTTP protocol as an application protocol, any HTTP client can be used to query the API: command-line tools (cURL), browsers and the numerous HTTP libraries available in the vast majority of programming languages.

Access to the API

The API entry-point is:

 https://api.grid5000.fr

However, during the first phase of the deployment of the Grid5000 API, access is restricted to requests coming from one of the Grid5000 ACCESS machines. Thus, if you are connected to a machine that is not an ACCESS machine, you must create an SSH tunnel between your machine and an ACCESS machine of your choice.

 $ ssh -NL 3443:api.grid5000.fr:443 login@access.site.grid5000.fr

If you want to always get the nearest (geographically speaking) ACCESS machine, you may choose to enter:

 $ ssh -NL 3443:api.grid5000.fr:443 login@access.grid5000.fr

As a consequence, the API entry-point, as seen from your local machine, will be:

 https://localhost:3443
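
The choice between the two entry-points can be captured in a small Ruby helper. This is only a sketch: the method names are hypothetical, and the default local port 3443 simply mirrors the ssh commands above.

```ruby
require 'net/http'
require 'uri'

PUBLIC_ENTRY_POINT = 'https://api.grid5000.fr'

# Returns the entry-point URL: the public one, or the local end of the tunnel.
def entry_point(via_tunnel: false, local_port: 3443)
  via_tunnel ? "https://localhost:#{local_port}" : PUBLIC_ENTRY_POINT
end

# Prepares an HTTPS client for the entry-point.
# No connection is opened until a request is actually sent.
def api_client(entry_point_url)
  uri = URI.parse(entry_point_url)
  http = Net::HTTP.new(uri.host, uri.port)
  http.use_ssl = (uri.scheme == 'https')
  http
end
```

Through the tunnel, the server certificate will presumably name api.grid5000.fr rather than localhost, so hostname verification may need attention on the client side.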

Versioning

The Grid5000 API is made up of multiple APIs that may evolve independently from each other. Changes occurring in one API are reflected by a new version number for this particular API. In the long term, scripts or applications that are built upon multiple APIs may have to deal with a different version number for each API, which is not a very enjoyable prospect. That's why the Grid5000 API is globally versioned: a global version id will map to the specific API version numbers in a transparent manner. All you have to do is suffix the API entry-point with the global version id of your choice:

 https://api.grid5000.fr/global-version-id

Or, if you connect via an SSH tunnel:

 https://localhost:3443/global-version-id

The format of the global version id is as follows:

 major.minor-quality

e.g.

 1.0-stable

Scripts and applications that are programmed against a specific global version id should never encounter problems due to changes in the APIs, until the official deprecation of the version.

The list of the past and current versions supported by the Grid5000 API is available at: #Global_Version_IDs.
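
A script that pins a global version id may want to check its shape before using it. The Ruby sketch below assumes that major and minor are non-negative integers and that quality is a lowercase word such as stable; the page only gives the general major.minor-quality form, so the exact pattern and the method names are assumptions.

```ruby
# Assumed pattern: integer major and minor, lowercase quality label.
GLOBAL_VERSION_ID = /\A(?<major>\d+)\.(?<minor>\d+)-(?<quality>[a-z]+)\z/

# Splits a global version id such as "1.0-stable" into its three parts.
def parse_global_version_id(id)
  match = GLOBAL_VERSION_ID.match(id)
  raise ArgumentError, "malformed global version id: #{id.inspect}" unless match
  { major: match[:major].to_i, minor: match[:minor].to_i, quality: match[:quality] }
end

# Suffixes the entry-point with the global version id, as described above.
def versioned_entry_point(entry_point, id)
  parse_global_version_id(id) # raises ArgumentError on malformed ids
  "#{entry_point.chomp('/')}/#{id}"
end
```

With these helpers, <code>versioned_entry_point('https://localhost:3443', '1.0-stable')</code> gives <code>https://localhost:3443/1.0-stable</code>.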

Resources

The resources that can be queried are described in the documentation of each API. The resources are identified by a URI, which should be added to the end of the API entry-point to form the URL. For example, /sites/rennes/versions/current is a resource exposed by the Reference API. If you would like to get the representation of this resource as it is returned by the 1.0-stable version of the Grid5000 API, the URL to enter in your HTTP client would be:

 https://api.grid5000.fr/1.0-stable/sites/rennes/versions/current

Or, if you connect via an SSH tunnel:

 https://localhost:3443/1.0-stable/sites/rennes/versions/current

The list of the APIs and their documentation is available at: #APIs.
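
The way a URL is formed from the entry-point, the global version id and the resource URI can be sketched in Ruby as follows. The method name <code>resource_url</code> is hypothetical.

```ruby
# Joins the entry-point, the global version id and the resource URI,
# avoiding doubled slashes between the parts.
def resource_url(entry_point, global_version_id, resource_uri)
  [
    entry_point.chomp('/'),
    global_version_id,
    resource_uri.sub(%r{\A/+}, '')
  ].join('/')
end
```

Calling it with <code>https://localhost:3443</code>, <code>1.0-stable</code> and <code>/sites/rennes/versions/current</code> reproduces the tunnel URL shown above.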

Format

When you query a REST resource, you ask for a representation of an object. Depending on the resource you are requesting, there could be multiple representations available. It is the responsibility of the user to indicate which representation (or format) it accepts.

To set the format that you accept, you could either explicitly insert the format at the end of the URL, e.g.:

 GET entry-point/global-version-id/resource.json

Or, you may prefer to set the Accept HTTP header to the correct mime type, e.g.:

 GET entry-point/global-version-id/resource
 Accept: application/json

If you put both, the Accept HTTP header will be ignored.
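
Both ways of selecting a format can be expressed with Ruby's standard Net::HTTP request objects. The sketch below only builds the requests and does not send them; the method names and the small MIME type table are assumptions made for the example.

```ruby
require 'net/http'

# Hypothetical lookup table; only the json entry appears on this page.
MIME_TYPES = { 'json' => 'application/json' }.freeze

# Option 1: insert the format at the end of the URL.
def request_with_suffix(path, format)
  Net::HTTP::Get.new("#{path}.#{format}")
end

# Option 2: leave the URL untouched and set the Accept header instead.
def request_with_accept(path, format)
  request = Net::HTTP::Get.new(path)
  request['Accept'] = MIME_TYPES.fetch(format)
  request
end
```

Since the Accept header is ignored when both are present, a client would normally pick one of the two options and use it consistently.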

APIs

Reference API

Monitoring API

Jobset API

Global Version IDs

Tutorials

References