Grid5000:UsagePolicy

From Grid5000
Latest revision as of 07:38, 12 June 2024


General principles

Grid'5000 is a scientific instrument supporting experiment-driven research in all areas of computer science, with a focus on distributed computing, cloud computing, high performance computing, big data and networking. Its use should lead to scientific results or contribute to education in this area. Other uses require a special permission from the executive committee. Unauthorized uses might result in prosecution. Access policy is detailed on that page.

This document defines rules to allow the shared use of this infrastructure by different communities of users, with different needs. If your intended usage does not fit within the detailed rules presented below, you can request a special permission from the executive committee. Exceptions are granted on a regular basis, as can be seen on the page listing those.

The Grid'5000 General Conditions of Use also specify the legal frame and the application of the law.

Acknowledging Grid'5000 usage

The sustainability of the Grid'5000 testbed is ensured through regular funding requests, for which information must be provided about usage and results obtained by experimenters. Grid'5000 users must contribute to this effort by providing information about their usage upon request and in a timely manner. Examples of such requests include highlights, one- to two-page usage reports (for the major consumers), short slide decks about an experiment, etc.

Additionally, Grid'5000 must be acknowledged in all publications presenting results or contents obtained or derived from the usage of Grid'5000. All those publications must be added to the Grid'5000 collection on the HAL Open Archive. This can be achieved by adding your publication to HAL (possibly without the full text), and specifying "GRID5000" in the collaboration field. The official acknowledgment to use in your publication is the following:

Experiments presented in this paper were carried out using the Grid'5000 testbed, supported by a scientific interest group hosted by Inria and including CNRS, RENATER and several Universities as well as other organizations (see https://www.grid5000.fr).

Resources reservation

Experiments performed on Grid'5000 typically require several resources reservations (called jobs in the system). The scheduling of reservations can either be left to the system (batch mode: resources are obtained whenever they become available under the scheduling policy), or the exact time frame can be given by the user (advance reservations). Batch mode is better suited to the automated execution of tasks. Advance reservation is better suited to interactive experiments, and to experiments scheduled at night or during weekends. The selection of resources supports filters, which let users specify the features required on machines. Grid'5000 is split into several sites, each operated by a different reservation system. A job cannot reserve resources on several sites at once, but some tools allow multi-site reservations.

Resources can be reserved using three different queues (default, production, besteffort), with different usage policies described below. Unless specified otherwise, jobs are submitted in the default queue. Each queue has its own usage policy rules, described in its respective section. The queues can be summarized as follows.

  • The default queue aims to maximize resources availability during the workday for small or interactive experiments, while allowing for longer jobs to run at night or during the weekends.
  • The production queue is reserved for specific teams and uses an independent set of clusters for long running experiments.
  • The besteffort queue can be used to execute jobs that may be interrupted at any time, whenever resources are requested by another user.

Users can check whether their jobs are within the bounds of the usage policy using the usagepolicycheck tool described here.

Users whose intended usage does not fit within the detailed rules presented above can request a special permission from the executive committee. Exceptions are granted on a regular basis, as can be seen on the page listing those.

Rules for the default queue

Note: Throughout this document, times are expressed in the Europe/Paris timezone.

Daytime is dedicated to smaller-scale experiments, and preparatory work for large-scale experiments. Large-scale jobs must be executed during nights or weekends (generally, using advance reservations). Specifically:

  1. Between 09:00 and 19:00 during working days (Monday to Friday, excluding public holidays in France), you should not use more than the equivalent of 2 hours on all the cores of the cluster during a given day (e.g. on a cluster of 64 dual-processor, quad-core nodes, you should not use more than (2 hours) × (2 CPUs) × (4 cores) × (64 nodes) = 1024 core-hours).
  2. Your jobs must not cross the 09:00 and 19:00 boundaries during weekdays (to extend an overnight reservation, for example). This also means that you must not have jobs that last more than 14 hours outside weekends.
  3. Restrictions apply to the number of advance reservations allowed and to how far in advance they can be made (this is enforced by the resources reservation system).
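
As an illustration of rules 1 and 2, the following is a minimal Python sketch, not an official tool. The function names are hypothetical, datetimes are assumed to be naive values expressed in Europe/Paris local time, and public holidays as well as the special exceptions listed further down are deliberately ignored.

```python
from datetime import datetime, time, timedelta


def daily_quota_core_hours(nodes, cpus_per_node, cores_per_cpu, hours=2):
    """Daytime allowance: the equivalent of `hours` on every core of the cluster."""
    return hours * cpus_per_node * cores_per_cpu * nodes


def crosses_day_boundary(start, end):
    """Return True if the job strictly spans 09:00 or 19:00 on a Monday-Friday.

    Assumption: public holidays and the special exceptions are not handled.
    """
    day = start.date()
    while day <= end.date():
        if day.weekday() < 5:  # Monday is 0, Friday is 4
            for boundary in (time(9, 0), time(19, 0)):
                instant = datetime.combine(day, boundary)
                if start < instant < end:
                    return True
        day += timedelta(days=1)
    return False


# The worked example from rule 1: 64 nodes, 2 CPUs per node, 4 cores per CPU.
print(daily_quota_core_hours(nodes=64, cpus_per_node=2, cores_per_cpu=4))
```

A job running from 19:00 on a weekday to 09:00 the next morning lasts exactly 14 hours and touches both boundaries without crossing them, which matches the 14-hour limit mentioned in rule 2.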

For more details, see the privilege levels table.

To help with compliance, OAR provides the -t day and -t night options, which constrain jobs to run during the daytime or during nights and weekends, respectively. Find more information here.

As special exceptions, to allow additional usage of free resources during the day:

  • Jobs with a duration of one hour or less, submitted less than 10 minutes before the job starts, are excluded from daily quotas.
    • This means that one can always reserve resources for up to one hour when they are immediately available.
  • Similarly, job extensions requested less than 10 minutes before the end of the job, and for a duration of one hour or less, are also excluded from daily quotas. Those extensions can be renewed several times (always during the last 10 minutes of the job).
    • This means that, when resources are still available, one can always extend jobs for up to one hour.
  • Crossing the 19:00 boundary is allowed for jobs submitted at or after 17:00 the same day. The portion of those jobs from 17:00 to 19:00 is excluded from daily quotas. This exception also applies to job extensions.
    • This means that if at 17:00 or later on a given day, resources are not reserved for the following night, then it is possible to reserve them and start the night job earlier or extend an existing job for the whole night.
  • Crossing the 09:00 boundary is allowed for jobs submitted on the same day, but the portion of those jobs after 09:00 is still included in the daily quota.
    • This means that when resources are free in the morning, people are free to start working earlier.
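
Two of the exceptions above can be expressed as simple predicates. This is a rough Python sketch under stated assumptions: the function names are hypothetical, datetimes are naive Europe/Paris local times, and job extensions are not modelled.

```python
from datetime import date, datetime, time, timedelta


def excluded_from_daily_quota(submitted, start, end):
    """Short-job exception: duration of one hour or less, and submission
    less than 10 minutes before the job starts."""
    lead = start - submitted
    short_enough = (end - start) <= timedelta(hours=1)
    return short_enough and timedelta(0) <= lead < timedelta(minutes=10)


def may_cross_1900(submitted, boundary_day):
    """Evening exception: crossing 19:00 on `boundary_day` is allowed for
    jobs submitted at or after 17:00 on that same day."""
    return submitted.date() == boundary_day and submitted.time() >= time(17, 0)


# A one-hour job submitted five minutes before it starts is not counted.
print(excluded_from_daily_quota(
    datetime(2024, 6, 12, 10, 0),
    datetime(2024, 6, 12, 10, 5),
    datetime(2024, 6, 12, 11, 5),
))
```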

Rules for the production queue

The production queue provides access to a different and smaller set of resources (currently located only at the Nancy and Rennes sites), with a policy that is better suited to long-running, non-interactive jobs.

Access to the production queue is restricted depending on the user group you are a member of.

To reserve resources in the production queue, see the "Using production resources" section of the Production page.

For more details, see the dedicated Production page.

Rules for the besteffort queue

The besteffort queue provides a way to submit low-priority, interruptible jobs. Access to the besteffort queue is restricted to members of silver and gold access groups. If you are planning to use the besteffort queue for large-scale experiments, you should contact the Grid'5000 staff beforehand.

Jobs submitted to the besteffort queue are only scheduled on resources that no other job is using. Moreover, a besteffort job is killed as soon as a newly submitted regular job needs the resources it is using.

Rules for disks reservations

On some machines, hard disk drives can be reserved in order to store datasets between reservations of the machines (and avoid moving data at the beginning of each machine reservation again). See Disk reservation for more information. The maximum duration for a disk reservation is 14 days. Reservations in advance are allowed.

Usage Policy violations

Once a week, users who have violated the usage policy in the previous 7 days receive an email titled "[Grid'5000] <User>'s usage has been violating the Usage Policy" detailing their violations. This email is meant as a warning and does not trigger an automatic sanction.

The report abuse form can be used to report a Grid'5000 usage that does not meet the rules and is preventing you from accessing the resources you need for your work. Users reported in this fashion will receive an email titled "[URGENT] <user> Grid'5000 jobs are going to be terminated as they are currently violating the Usage Policy", and must immediately correct their resource usage to fall within the bounds of the usage policy. Should a user fail to correct their usage quickly enough, the Grid'5000 teams will kill their jobs.

Generally, the usage of the testbed is actively monitored by the Grid'5000 staff. In case of usage not following the rules, your account will be locked.

Privilege levels table

Gold

  • Members:
    • Members of labs / research that contribute to the infra by providing funding for resources or manpower, and direct collaborators. As of 2018-03-01, those are:
      • Laboratories: I3S (Sophia Antipolis), IRISA (Rennes), IRIT (Toulouse), LIFL (Lille), LIG (Grenoble), LIP (Lyon), LORIA (Nancy), CSC (Luxembourg)
      • Inria research centers: Grenoble - Rhône-Alpes, Lille - Nord Europe, Nancy - Grand Est, Rennes - Bretagne Atlantique, Sophia Antipolis - Méditerranée
    • Paying users (industry, fed4fire open calls)
  • Advance reservations per site: 2
  • Max delay before reservations: no limit
  • Best-effort queue access: yes

Silver

  • Members:
    • teams from other Inria research centers (because Inria contributes globally through engineer positions and sometimes hardware budget)
    • teams from other labs at Grid'5000 sites (for example, same university that participated in funding)
    • other partners, such as fed4fire
  • Advance reservations per site: 2
  • Max delay before reservations: 48h
  • Best-effort queue access: yes
  • Notes: "max delay before reservation" is unlimited for managers of groups of the tutorials site to allow teachers to reserve in advance for tutorial or teaching labs. See Tutorial or Teaching Labs HowTo for details.

Bronze

  • Members:
    • all other labs (and open access users)
  • Advance reservations per site: 1
  • Max delay before reservations: 24h
  • Best-effort queue access: no

Notes:

  • A gold, silver or bronze level is associated with every Grid'5000 user group that grants access to the platform. Thus, every Grid'5000 user who has effective access to the platform is a member of at least one such group and makes resource reservations attached to such a group. Therefore, the resource reservations of every user are constrained by the privilege level of the user group attached to them, as described in the table above.
  • A document in French explains the privilege level policy in more detail.
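
The limits in the table above can be encoded as a small lookup structure. The sketch below is only illustrative: the names are hypothetical, and it assumes that "max delay before reservations" means the maximum time between submitting an advance reservation and its start, with the limit treated as inclusive. The actual enforcement is done by the resources reservation system and may differ in its details.

```python
from datetime import timedelta

# Values copied from the privilege levels table; None means "no limit".
PRIVILEGE_LEVELS = {
    "gold": {"advance_per_site": 2, "max_delay": None, "besteffort": True},
    "silver": {"advance_per_site": 2, "max_delay": timedelta(hours=48), "besteffort": True},
    "bronze": {"advance_per_site": 1, "max_delay": timedelta(hours=24), "besteffort": False},
}


def advance_reservation_allowed(level, existing_on_site, delay):
    """Check a new advance reservation against the per-level limits.

    `existing_on_site` is the number of advance reservations the user
    already holds on the site; `delay` is the time until the job starts.
    """
    limits = PRIVILEGE_LEVELS[level]
    if existing_on_site >= limits["advance_per_site"]:
        return False
    max_delay = limits["max_delay"]
    return max_delay is None or delay <= max_delay


print(advance_reservation_allowed("silver", 1, timedelta(hours=36)))
```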

Special cases

Special permissions

As a reminder, if your intended usage does not fit within the detailed rules presented above, you can request a special permission from the executive committee. Exceptions are granted on a regular basis, as can be seen on the page listing those.

Usage of the Internet access

Usage of the Internet from Grid'5000 is actively monitored and logged both by the Grid'5000 staff, and by Grid'5000 upstream network provider RENATER. While Internet access is mostly unfiltered, users must contact the Grid'5000 staff if their usage is likely to raise alerts (for example, experiments involving network traffic that could be considered as unauthorized usage of RENATER).

Crypto-currency mining and other usages generating revenue

In the case of experiments involving crypto-currency mining or other actions that might generate revenue as a side effect, experimenters must ask the Grid'5000 staff (support-staff@lists.grid5000.fr) for credentials to use so that potential revenue is directed to an account controlled by the Grid'5000 organization. The use of Grid'5000 resources must not generate direct revenue for the user.

Changing the BIOS or BMC settings or flashing firmware is not allowed

Changing the BIOS or Baseboard Management Controller (BMC) / IPMI settings of Grid'5000 nodes is not allowed. Neither is flashing any firmware (BIOS, network card, storage device, etc.) of nodes. Please contact the technical team if your experiment would require it.

Mailing lists

As a Grid'5000 user you are automatically subscribed to the Grid'5000 users' mailing lists. The traffic is not very high, so please keep an eye on those emails as they may contain important information (see Mailing lists for more information). More than 800 users are subscribed to those lists, so please be careful before asking questions on the lists.

Tools which can help you respect the Policy

The usagepolicycheck tool is available on all frontends. It lets you check whether your usage of the platform respects the policy. It can be useful to run it after you have submitted your jobs to make sure you are not violating any rules. Here are some usage examples:

You can check current usage with
  usagepolicycheck -t

You can discover daily allowance with
  usagepolicycheck -l [--sites site1,site2]

You can check your usage over a given period (here the first two weeks of January 2019, as an example) by running the following on any frontend:
  usagepolicycheck -v --start '2019-01-01 10:00:15 +0100' --end '2019-01-15 10:00:15 +0100'

Note that usagepolicycheck is automatically run on all frontends every week and reports usage policy violations to the offending users and Grid'5000 administrators.
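
When the check is scripted, the --start and --end values can be generated rather than typed by hand. The Python sketch below only builds the argument list, using the -v, --start and --end options shown above. The helper name is hypothetical, and a fixed +0100 offset is assumed to match the example (daylight saving time is not handled).

```python
from datetime import datetime, timedelta, timezone

# Fixed +0100 offset, as in the example above (assumption: no DST handling).
CET = timezone(timedelta(hours=1))


def usagepolicycheck_args(start, days=7):
    """Build the usagepolicycheck command line for a period of `days` days."""
    fmt = "%Y-%m-%d %H:%M:%S %z"
    end = start + timedelta(days=days)
    return [
        "usagepolicycheck", "-v",
        "--start", start.strftime(fmt),
        "--end", end.strftime(fmt),
    ]


args = usagepolicycheck_args(datetime(2019, 1, 1, 10, 0, 15, tzinfo=CET), days=14)
print(" ".join(args))
```

On a frontend, the resulting list could be passed to subprocess.run to execute the check.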