API: Difference between revisions

Revision as of 14:55, 7 March 2017

Introduction

Until 2009, Grid'5000 was mainly accessed and operated via shell commands executed from frontend machines. To ease scripting and provide better access to the grid functionalities, the Grid'5000 development team built an API on top of most of the Grid'5000 tools. It has been available to users since November 2009.

To achieve ease of use and world-wide accessibility, we use the HTTP protocol (RFC2616). As a consequence of using the HTTP protocol as an application protocol, any HTTP client can be used to query the API: command-line tools (cURL), browsers, and the numerous HTTP libraries available in your favorite programming language.

We have chosen to be as close as possible to the principles of a REST (REpresentational State Transfer) architecture, which recommends defining the semantics of an operation on a resource by means of the standard HTTP operations (e.g. POST, GET, PUT, DELETE) and the standard HTTP status codes (e.g. 200, 401, 404, 500). Uses of REST include, but are not limited to, implementing CRUD (Create, Read, Update, Delete) semantics over the network. We also implement the concept of "Hypermedia as the Engine of Application State" (HATEOAS) by specifying a set of hyperlinks in all responses returned by the API. This allows a user agent to discover at runtime the set of available resources as well as their semantics and content types, and to transition from one resource to another.

Security is ensured by encrypting the traffic using HTTP over SSL/TLS (HTTPS, RFC5246), and requesting login:password credentials using the HTTP Basic Authentication (RFC2617) scheme. A mutual authentication scheme using SSL Client Certificates has been envisioned, but not implemented due to lack of resources.

Finally, fault tolerance and scalability are ensured by distributing all the APIs over the Grid'5000 sites, and by using proven HTTP technology such as load balancers and proxies to monitor the application servers and route the traffic to live servers.

Overview of the Current API Landscape

[Figure: Api Overview.png]

How to Use the API

The API entry-point is (both from within Grid'5000 and outside):

 https://api.grid5000.fr/

Platform state and reproducibility

The API data is based on a central Grid'5000 git repository, which contains numerous characteristics of the nodes: https://github.com/grid5000/reference-repository

Those characteristics can change over time and affect performance or other sensitive elements, so it is important to keep a reference to the state of the nodes you used at the time of an experiment.

At the moment, there is no automatic association between an experiment and the state of used nodes.

You are advised to manually retrieve and keep the last commit identifier of the git repository for later reference.

To do so, access https://api.github.com/repos/grid5000/reference-repository/git/refs/heads/master

 {
   "ref": "refs/heads/master",
   "url": "https://api.github.com/repos/grid5000/reference-repository/git/refs/heads/master",
   "object": {
       "sha": "fa6830cb904414522929fcea57d2941404e31621",
       "type": "commit",
       "url": "https://api.github.com/repos/grid5000/reference-repository/git/commits/fa6830cb904414522929fcea57d2941404e31621"
   }
 }

and save the "sha" property.

It is not possible to access the data as it was at the time of a past experiment through the API, because the API only serves the latest state. To access such data, you must go through the git repository directly.

For example: https://github.com/grid5000/reference-repository/tree/fa6830cb904414522929fcea57d2941404e31621/data/grid5000
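The manual step above can be scripted. The following is a minimal sketch, assuming the GitHub response keeps the shape shown in this section; the function names (extract_sha, tree_url, fetch_current_sha) are illustrative and not part of any Grid'5000 tool.

```python
import json
import urllib.request

REF_URL = (
    "https://api.github.com/repos/grid5000/reference-repository"
    "/git/refs/heads/master"
)


def extract_sha(ref_document: str) -> str:
    """Return the commit identifier from a GitHub git-ref JSON document."""
    ref = json.loads(ref_document)
    return ref["object"]["sha"]


def tree_url(sha: str) -> str:
    """Build the URL of the repository data as it was at the given commit."""
    return (
        "https://github.com/grid5000/reference-repository"
        f"/tree/{sha}/data/grid5000"
    )


def fetch_current_sha(url: str = REF_URL) -> str:
    """Download the ref document (network access required) and extract the sha."""
    with urllib.request.urlopen(url) as response:
        return extract_sha(response.read().decode("utf-8"))


# Typical use at the start of an experiment (performs a network request):
#   sha = fetch_current_sha()
#   print(sha, tree_url(sha))
```

Storing the returned identifier alongside the experiment results makes it possible to look up the matching node descriptions later.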

HTTP

To understand how the API works, it is important to know how the HTTP protocol works. For an in-depth view, please refer to RFC 2616.

In short, a client sends a request to the server, with control metadata added as HTTP headers, and the server sends back a response. For instance:

Example of HTTP Request (we Accept any kind of content, and we provide Authorization credentials):

 GET /2.1/grid5000 HTTP/1.1
 Authorization: Basic xxx
 User-Agent: curl/7.21.2 (x86_64-apple-darwin10.4.0) libcurl/7.21.2 OpenSSL/1.0.0a zlib/1.2.5 libidn/1.19
 Host: api.grid5000.fr
 Accept: */*

Example of HTTP Response:

HTTP/1.1 200 OK
Date: Thu, 29 Nov 2012 13:50:13 GMT
Server: Apache/2.2.21 (Debian)
X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.2.3
Vary: Accept,Accept-Encoding
ETag: "e1b4d4072ab6fe1099ffe9af75278b03d069043d"
Allow: GET
Cache-Control: public, must-revalidate, proxy-revalidate, max-age=60, s-maxage=60
Last-Modified: Thu, 29 Nov 2012 13:41:38 GMT
Content-Length: 2527
Status: 200
Content-Type: application/vnd.fr.grid5000.api.site+json;level=1
X-Cache: MISS from api-proxy.rennes.grid5000.fr
X-Cache-Lookup: MISS from api-proxy.rennes.grid5000.fr:3128
Via: 1.0 api-server.bordeaux.grid5000.fr, 1.0 api-proxy.rennes.grid5000.fr:3128 (squid/2.6.STABLE18)
Via: 1.1 api-proxy.rennes.grid5000.fr:4444

{
  "uid": "rennes",
  "type": "site",
  "version": "7bcfecfda77eaee7822791b62685f70f9f50770f",
  "links": [
    {
      "href": "/2.1/grid5000/sites/rennes/versions/7bcfecfda77eaee7822791b62685f70f9f50770f",
      "title": "version",
      "rel": "member",
      "type": "application/vnd.fr.grid5000.api.Version+json;level=1"
    },
    {
      "href": "/2.1/grid5000/sites/rennes/versions",
      "title": "versions",
      "rel": "collection",
      "type": "application/vnd.fr.grid5000.api.Collection+json;level=1"
    },
    {
      "href": "/2.1/grid5000/sites/rennes",
      "rel": "self",
      "type": "application/vnd.fr.grid5000.api.Site+json;level=1"
   },
    {
      "href": "/2.1/grid5000/sites/rennes/clusters",
      "title": "clusters",
      "rel": "collection",
      "type": "application/vnd.fr.grid5000.api.Collection+json;level=1"
    },
    {
      "href": "/2.1/grid5000/sites/rennes/environments",
      "title": "environments",
      "rel": "collection",
      "type": "application/vnd.fr.grid5000.api.Collection+json;level=1"
    },
    {
      "href": "/2.1/grid5000/sites/rennes/network_equipments",
      "title": "network_equipments",
      "rel": "collection",
      "type": "application/vnd.fr.grid5000.api.Collection+json;level=1"
    },
    {
      "href": "/2.1/grid5000/sites/rennes/pdus",
      "title": "pdus",
      "rel": "collection",
      "type": "application/vnd.fr.grid5000.api.Collection+json;level=1"
    },
    {
      "href": "/2.1/grid5000",
      "rel": "parent",
      "type": "application/vnd.fr.grid5000.api.Grid+json;level=1"
    },
    {
      "href": "/2.1/grid5000/sites/rennes/status",
      "title": "status",
      "rel": "collection",
      "type": "application/vnd.fr.grid5000.api.Collection+json;level=1"
    },
    {
      "href": "/2.1/grid5000/sites/rennes/metrics",
      "title": "metrics",
      "rel": "collection",
      "type": "application/vnd.fr.grid5000.api.Collection+json;level=1"
    },
    {
      "href": "/2.1/grid5000/sites/rennes/jobs",
      "title": "jobs",
      "rel": "collection",
      "type": "application/vnd.fr.grid5000.api.Collection+json;level=1"
    },
    {
      "href": "/2.1/grid5000/sites/rennes/deployments",
      "title": "deployments",
      "rel": "collection",
      "type": "application/vnd.fr.grid5000.api.Collection+json;level=1"
    },
    {
      "href": "/2.1/grid5000/sites/rennes/vlans",
      "title": "vlans",
      "rel": "collection",
      "type": "application/vnd.grid5000.collection+json"
    }
  ]
}
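The "links" array in the response body is what lets a client discover related resources at runtime. As a rough sketch of how a client might follow those links, the helpers below search a decoded response by "title" or by "rel"; the helper names and the shortened sample document are assumptions made for illustration.

```python
import json
from typing import List, Optional


def find_link(resource: dict, title: str) -> Optional[str]:
    """Return the href of the first link with the given title, or None."""
    for link in resource.get("links", []):
        if link.get("title") == title:
            return link["href"]
    return None


def links_by_rel(resource: dict, rel: str) -> List[str]:
    """Return the hrefs of all links with the given relation type."""
    return [
        link["href"]
        for link in resource.get("links", [])
        if link.get("rel") == rel
    ]


# Shortened version of the response body shown above.
SAMPLE = json.loads("""
{
  "uid": "rennes",
  "type": "site",
  "links": [
    {"href": "/2.1/grid5000/sites/rennes", "rel": "self"},
    {"href": "/2.1/grid5000/sites/rennes/clusters",
     "title": "clusters", "rel": "collection"},
    {"href": "/2.1/grid5000/sites/rennes/jobs",
     "title": "jobs", "rel": "collection"},
    {"href": "/2.1/grid5000", "rel": "parent"}
  ]
}
""")

jobs_href = find_link(SAMPLE, "jobs")
parent_hrefs = links_by_rel(SAMPLE, "parent")
```

Looking links up by title or relation, rather than hard-coding paths, keeps a client working if the URL layout changes between versions.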

Authentication

From outside Grid'5000

  • When accessing the API from outside Grid'5000, you MUST send your Grid'5000 credentials (login and password) via the use of the HTTP Basic Authentication mechanism (every browser and the vast majority of HTTP libraries support this mechanism). Example with cURL:
 crohr@local-machine:~$ curl -k -u mylogin:mypassword -X POST https://api.grid5000.fr/stable/grid5000/sites/rennes/jobs -d "command=sleep 100"
  • You may also choose to set up an SSH tunnel with one of the access machines available from the Internet, so that the authentication of your requests is handled transparently for you (see below #How to set up an SSH Tunnel).
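For clients that build requests by hand, the Authorization header used by HTTP Basic Authentication is simply the base64 encoding of login:password. A minimal sketch follows; the function name is illustrative, and most HTTP libraries can do this for you.

```python
import base64


def basic_auth_header(login: str, password: str) -> str:
    """Return the value of the Authorization header for HTTP Basic auth."""
    credentials = f"{login}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Basic {token}"


# Example pair taken from RFC 2617, not real Grid'5000 credentials.
header = basic_auth_header("Aladdin", "open sesame")
```

Since base64 is an encoding and not encryption, the header only stays confidential because the traffic goes over HTTPS.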

From within Grid'5000

  • From within Grid'5000, you are transparently authenticated IF AND ONLY IF you connect from a frontend or access machine.
 crohr@frontend:~$ curl -k -X POST https://api.grid5000.fr/stable/grid5000/sites/rennes/jobs -d "command=sleep 100"
  • Connections from other Grid'5000 machines require setting up an SSH tunnel between your node and a frontend machine (see #How to set up an SSH Tunnel).

Versioning

The Grid'5000 API is made up of multiple APIs that may evolve independently from each other. Changes occurring in one API are reflected by a new version number for this particular API. In the long term, scripts or applications built upon multiple APIs would have to track a separate version number for each API, which is not a very enjoyable prospect. That's why the Grid'5000 API is globally versioned: a global version id maps to the specific API version numbers in a transparent manner. All you have to do is suffix the API entry-point with the global version id of your choice:

 https://api.grid5000.fr/global-version-id

See https://api.grid5000.fr for the list of available global version ids. Starting from the 2.0 version, the format of the global version ids for STABLE versions will be as follows:

 major.minor

The latest stable version is available under the alias stable.

The unstable version will always be called sid.

Scripts and applications that are programmed against a specific global version id of STABLE quality should never encounter problems due to changes in the APIs, until the official deprecation of the version.

Resources

The resources that can be queried are described in the documentation of each API. The resources are identified by a relative URI, which should be added to the end of the API entry-point to form the URL.

For example, /sites/rennes is a resource exposed by the Reference API. If you would like to get the representation of this resource as it is returned by the 3.0 version of the Grid'5000 API, the URL to enter in your HTTP client would be:

 https://api.grid5000.fr/3.0/sites/rennes
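The URL construction described above (entry-point, then global version id, then relative resource URI) can be sketched as follows. The function name is hypothetical; the entry-point and the example values come from this page.

```python
ENTRY_POINT = "https://api.grid5000.fr"


def resource_url(version: str, resource: str,
                 entry_point: str = ENTRY_POINT) -> str:
    """Join entry-point, global version id and relative resource URI."""
    return "/".join([
        entry_point.rstrip("/"),
        version.strip("/"),
        resource.lstrip("/"),
    ])


url = resource_url("3.0", "/sites/rennes")
stable_url = resource_url("stable", "/sites/rennes")
```

Keeping the version id as a single parameter makes it easy to pin a script to a specific stable version, or to switch to the stable alias.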

Content Negotiation

When you query a resource, you ask for a representation of an object. Depending on the resource you are requesting, multiple representations (of different content types) may be available (e.g. XML, JSON, PDF, HTML, etc.). It is the responsibility of the user to indicate which representation (or format) they want.

To set the format that you accept, the Grid'5000 APIs support two methods:

  • you could either explicitly insert the format at the end of the URL, e.g.:
 GET entry-point/global-version-id/resource.json
  • or, you may prefer to set the Accept HTTP header to the correct mime type, e.g.:
 GET entry-point/global-version-id/resource
 Accept: application/json

If you put both, the Accept HTTP header will be ignored.
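Both methods can be illustrated with Python's standard urllib, which builds request objects without sending them. This is only a sketch: the two function names are illustrative, and the URL is the example used in the Resources section.

```python
import urllib.request


def request_with_suffix(url: str, fmt: str = "json") -> urllib.request.Request:
    """Method 1: append the format to the end of the URL."""
    return urllib.request.Request(f"{url}.{fmt}", method="GET")


def request_with_accept(
    url: str, mime_type: str = "application/json"
) -> urllib.request.Request:
    """Method 2: leave the URL untouched and set the Accept header."""
    return urllib.request.Request(
        url, headers={"Accept": mime_type}, method="GET"
    )


base = "https://api.grid5000.fr/3.0/sites/rennes"
by_suffix = request_with_suffix(base)
by_header = request_with_accept(base)

# Neither request has been sent; pass one to urllib.request.urlopen()
# (with suitable authentication) to actually query the API.
```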

Status Codes

Every response that you'll get from the APIs will contain a status code. In Grid'5000 we use a subset of the standard HTTP status codes [1]. You may encounter one of the following:

  • A 200 status code is returned when the request is successful.
  • A 201 status code is returned when the request has been fulfilled and resulted in a new resource being created.
  • A 202 status code is returned when the request has been accepted. The real processing may happen later.
  • A 304 status code is returned on a conditional GET request [2] when the requested resource has not been modified since the last access.
  • A 400 status code is returned when something in the request of the user agent is not correct.
  • A 401 status code is returned when the user agent needs to authenticate.
  • A 403 status code is returned when the access to the requested resource is forbidden to the currently logged-in user.
  • A 404 status code is returned when a resource does not exist.
  • A 405 status code is returned when the user agent uses an HTTP method not supported by the resource.
  • A 406 status code is returned when the requested format is not available.
  • A 415 status code is returned when the payload of the request uses an unsupported content-type (e.g. application/xml instead of application/json).
  • A 500 status code is returned when the server encountered an error.
  • A 503 status code is returned when the service is not available.
  • A 504 status code is returned if the request took too much time to complete.

The body of responses whose status code is 4xx or 5xx may contain additional information about the error.
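A client may find it convenient to keep the list above as a lookup table, for instance to produce readable log messages. The sketch below is one possible arrangement; the wording of the descriptions is condensed from the list, and the function names are illustrative.

```python
STATUS_MEANINGS = {
    200: "request successful",
    201: "new resource created",
    202: "request accepted, processing may happen later",
    304: "resource not modified since the last access",
    400: "something in the request is not correct",
    401: "authentication required",
    403: "access forbidden to the current user",
    404: "resource does not exist",
    405: "HTTP method not supported by the resource",
    406: "requested format not available",
    415: "unsupported content-type in the request payload",
    500: "server encountered an error",
    503: "service not available",
    504: "request took too much time to complete",
}


def describe_status(code: int) -> str:
    """Return a short description for a status code listed on this page."""
    return STATUS_MEANINGS.get(code, "status code not listed on this page")


def is_error(code: int) -> bool:
    """True for 4xx and 5xx codes, whose body may carry error details."""
    return 400 <= code <= 599
```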


Caching

Warning

On a first read, this section can safely be skipped. It is targeted at advanced users and developers interested in improving the performance of their client applications.

As described in RFC 2616:

  HTTP is typically used for distributed information systems, where
  performance can be improved by the use of response caches. The
  HTTP/1.1 protocol includes a number of elements intended to make
  caching work as well as possible.

Hence, client applications can (and should) cache the responses so that subsequent requests for the same information use the cached data. The use of caching strategies can dramatically reduce delays and save bandwidth.

That's why most of the responses returned by the Grid'5000 APIs include HTTP headers to support one or both of the caching models: expiration-based or validation-based. If you don't know what this means, read this article: http://tomayko.com/writings/things-caches-do.

Below are the different schemes that can exist when the API returns cacheable responses.

1. no caching

  O                           |-----|
 -|-  <-------internet------> | API |
 / \                          |-----| 

2. the API builds the response once and stores it into cache for a certain amount of time.

  O                           |-----|       |-----|
 -|-  <-------internet------> |CACHE| <---> | API |
 / \                          |-----|       |-----|

3. the client receives the response once and stores it into cache for a certain amount of time.

  O         |-----|                         |-----|
 -|-  <---> |CACHE| <-------internet------> | API |
 / \        |-----|                         |-----|

4. both the client and the API have a cache in front of them.

  O         |-----|                         |-----|       |-----|
 -|-  <---> |CACHE| <-------internet------> |CACHE| <---> | API |
 / \        |-----|                         |-----|       |-----|

Most of the Grid'5000 APIs use some kind of caching strategy on their side (scheme 2). It is recommended that client applications also include a caching strategy in their implementation (scheme 4): this will save bandwidth, reduce latency and may improve the tolerance of the client to network outages.

Note that only a few HTTP libraries natively support client-side caching (e.g. httplib2 in Python). For Ruby, the rest-client-components gem offers this functionality.
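To make the validation-based model more concrete, here is a minimal sketch of a client-side cache that revalidates with the ETag header via a conditional GET. It is not a full HTTP cache (it ignores Cache-Control and expiration entirely). The class name and the fetch callback signature are assumptions made for this example; the callback stands in for whatever HTTP library you use.

```python
from typing import Callable, Dict, Tuple

# fetch(url, request_headers) -> (status, response_headers, body)
Fetch = Callable[[str, Dict[str, str]], Tuple[int, Dict[str, str], bytes]]


class ValidatingCache:
    """Keep the last body per URL and revalidate it using its ETag."""

    def __init__(self, fetch: Fetch) -> None:
        self._fetch = fetch
        self._entries: Dict[str, Tuple[str, bytes]] = {}

    def get(self, url: str) -> bytes:
        headers: Dict[str, str] = {}
        cached = self._entries.get(url)
        if cached is not None:
            # Conditional GET: ask the server to answer 304 if unchanged.
            headers["If-None-Match"] = cached[0]
        status, response_headers, body = self._fetch(url, headers)
        if status == 304 and cached is not None:
            # Not modified: reuse the stored body, nothing was re-sent.
            return cached[1]
        etag = response_headers.get("ETag")
        if status == 200 and etag:
            self._entries[url] = (etag, body)
        return body
```

On the second request for the same URL, the server can answer with an empty 304 response instead of sending the full body again, which is where the bandwidth saving comes from.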

FAQ & Howtos

"Certificate verification failed"

Your HTTP client may produce a warning or refuse to connect if it does not trust the certification authority (CA) that signed our certificate. We are using a widely deployed CA (TERENA), so this should not happen. Please check your configuration before disabling certificate validation.

How to set up an SSH Tunnel

Replace login, machine and site in the following command:

 $ ssh -NL 3443:api.grid5000.fr:443 login@machine.site.grid5000.fr

For example, if you want to access the API from a Grid'5000 node (within Grid'5000), you might set up an SSH tunnel as follows:

 grid5000-node $ ssh -NL 3443:api.grid5000.fr:443 your_user@frontend

Conversely, from outside Grid'5000 you would set up an SSH tunnel as follows:

 your-machine $ ssh -NL 3443:api.grid5000.fr:443 your_user@access.grid5000.fr

As a consequence, the API entry-point, as seen from your machine, will become:

 https://localhost:3443

Thus, an example request would be:

 $ curl -k -X POST https://localhost:3443/sid/grid5000/sites/rennes/jobs -d "command=sleep 100"

When you're done with the API, hit CTRL-C to destroy the tunnel, unless you added the -f flag to run it in the background.
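Scripts that already hold full API URLs can be pointed at the tunnel by swapping the host part for the local end of the tunnel. A small sketch, assuming the local port 3443 used in the commands above; the function name is illustrative.

```python
from urllib.parse import urlsplit, urlunsplit


def through_tunnel(url: str, local_port: int = 3443) -> str:
    """Rewrite an API URL so that it goes through the local SSH tunnel."""
    parts = urlsplit(url)
    return urlunsplit(parts._replace(netloc=f"localhost:{local_port}"))


tunneled = through_tunnel(
    "https://api.grid5000.fr/sid/grid5000/sites/rennes/jobs"
)
```

Because the client now connects to localhost, the host name no longer matches the server certificate, which is presumably why the curl example above passes -k.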

Resources

References