Production

From Grid5000
Revision as of 18:29, 5 February 2025 by Lmirtain


Note.png Note

2025-01-30 - A dedicated documentation website for Abaca will go live shortly. In the meantime, the pages specific to “Production” use are hosted in the Grid'5000 documentation.

Introduction

Note.png Note

Abaca is the name of Inria's national computing infrastructure dedicated to production applications.

Abaca clusters are hosted on Inria sites alongside clusters dedicated to the Grid'5000 platform. Abaca and Grid'5000 use the same technical management tools, and the Abaca and Grid'5000 support teams work together to administer both platforms.

In the remainder of this document, “Production” refers to the use of the Abaca platform.

The Abaca usage rules differ from the rest of Grid'5000.

Using Production resources

Getting an account

Users from Inria research centres who want access for production use must open an account using the account request form, like regular Grid'5000 users.

  • The following fields must be filled in as follows:
    • Group Granting Access (GGA): the group named after your research team
    • Laboratory: the name of your Inria research centre, or LORIA, or IRISA
    • Team: the name of your research team

Other users from Nancy (not belonging to the Loria laboratory) can ask to join using the nancy-misc Group Granting Access. Likewise, other users from Rennes (not belonging to the Irisa laboratory) can ask to join using the rennes-misc Group Granting Access.

  • Users are automatically subscribed to the Grid'5000 users mailing list: users@lists.grid5000.fr. This list is the user-to-user and user-to-admin channel for help and support requests about Grid'5000. The technical team can be reached at support-staff@lists.grid5000.fr.

Visualizing resources

Note.png Note

As of 2025-02-01, only the Nancy, Rennes, Grenoble and Sophia sites host clusters for Production use (Abaca).

See Hardware to learn about each site's resources and your priority access to them.

Using resources

The Production usage rules differ from those of the rest of Grid'5000:

  • Advance reservations (oarsub -r) are not allowed (to avoid fragmentation), except at the p1 priority level as described below. Only submissions (and reservations that start immediately) are allowed.
  • All Grid'5000 users can use Production nodes (provided they meet the conditions stated in Grid5000:UsagePolicy), but users are expected to use their local Production resources first, and to mostly use those resources for tasks that require Grid'5000 features.

To access production resources, you need to submit jobs to the production queue using the option -q production. Job submissions in the production queue are prioritized based on who funded the hardware. There are four levels of priority, each with a maximum job duration:

  • p1 -- 168h (one week)
  • p2 -- 96h (four days)
  • p3 -- 48h (two days)
  • p4 -- 24h (one day)
  • You may also have access to the clusters in besteffort mode.
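
As a quick reference, the limits listed above can be encoded in a small shell helper. This is only a sketch: the function names max_walltime_hours and walltime_fits are hypothetical (they are not part of OAR or Grid'5000), and the check ignores the maximum walltime per node, which still applies.

```shell
# Hypothetical helper: print the maximum job duration (in hours) for a
# Production priority level, using the values listed above.
max_walltime_hours() {
  case "$1" in
    p1) echo 168 ;;  # one week
    p2) echo 96 ;;   # four days
    p3) echo 48 ;;   # two days
    p4) echo 24 ;;   # one day
    *)  echo "unknown priority level: $1" >&2; return 1 ;;
  esac
}

# Hypothetical helper: succeed if a walltime given in whole hours fits
# within the limit of the given priority level.
walltime_fits() {
  _max="$(max_walltime_hours "$1")" || return 1
  [ "$2" -le "$_max" ]
}

# Example: 90 hours fits within p2 (96h) but not within p3 (48h).
walltime_fits p2 90 && echo "90h fits in p2"
walltime_fits p3 90 || echo "90h does not fit in p3"
```

Checking the requested walltime before submitting avoids a rejected or truncated job, but the authoritative limits remain those enforced by the platform.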

You can check your priority level for any cluster using https://api.grid5000.fr/explorer.

Note.png Note

With p1 priority, users can also submit advance reservations. See the Advanced OAR page for more information. For example, to reserve resources starting one week from now:

Terminal.png fnancy:
oarsub -q p1 -r "$(date +'%F %T' --date='+1 week')"

The p1 priority level also allows extending the duration of a job. The extension is only applied 24h before the end of the job and cannot be longer than 168h. More information about this feature can also be found on the Advanced OAR page.
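
To preview such a reservation before submitting it, the command can be assembled and printed first. The sketch below assumes GNU date (as the example above does); the function name p1_reservation_cmd and its default walltime of 24 hours are hypothetical choices made for illustration.

```shell
# Hypothetical helper: print (without running) a p1 advance reservation.
#   $1: start offset understood by GNU date, e.g. '+1 week'
#   $2: walltime in hours (optional; the default of 24 is an arbitrary choice)
p1_reservation_cmd() {
  _start="$(date +'%F %T' --date="$1")" || return 1
  printf 'oarsub -q p1 -r "%s" -l walltime=%s\n' "$_start" "${2:-24}"
}

# Show the command for a 48-hour reservation starting in one week.
p1_reservation_cmd '+1 week' 48
```

The printed line can be reviewed and then pasted on a frontend; nothing is submitted by the helper itself.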

Warning.png Warning

These limits DO NOT replace the maximum walltime per node, which is still in effect.

Note.png Note

As of today, the resources explorer only shows basic information. Additional information will be added in the near future.

When submitting a job, by default, you will be placed at the highest priority level that allows you to maximize resources:

Terminal.png fnancy:
oarsub -q production -I

The command above will generally place your job at your lowest priority level, so that it can use all clusters, even those where your priority is p4.

When you specify a cluster, your job will be set to your highest priority level for that cluster:

Terminal.png fnancy:
oarsub -q production -p grele -I

You can also limit a job submission to a cluster at a specific priority level using -q PRIORITY_LEVEL (for example, -q p2):

Terminal.png fnancy:
oarsub -q p2 -l nodes=2,walltime=90 './yourScript.py'
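
The submission forms shown above differ only in the queue and the optional cluster filter. As an illustration, the sketch below assembles and prints such a command instead of running it; build_oarsub and its argument order are hypothetical, and only the oarsub options already used on this page (-q, -p, -l) appear in the output.

```shell
# Hypothetical helper: print (without running) an oarsub batch submission.
#   $1: queue (production, or a priority level such as p2)
#   $2: number of nodes
#   $3: walltime in hours
#   $4: script to run
#   $5: cluster name (optional), passed with -p as in the example above
build_oarsub() {
  _cmd="oarsub -q $1"
  if [ -n "${5:-}" ]; then
    _cmd="$_cmd -p $5"
  fi
  printf "%s -l nodes=%s,walltime=%s '%s'\n" "$_cmd" "$2" "$3" "$4"
}

# Same submission as the p2 example above.
build_oarsub p2 2 90 ./yourScript.py
# Production queue restricted to the grele cluster.
build_oarsub production 1 2 ./yourScript.py grele
```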

Dashboards and status pages

Resources reservations (OAR) status

Drawgantt (past, current and future OAR jobs scheduling)

  • Grenoble nodes (production)
  • Nancy nodes (production)
  • Rennes nodes (production)
  • Sophia nodes (production)

Monika (current placement and queued jobs status)

  • Grenoble (production)
  • Nancy (production)
  • Rennes (production)
  • Sophia (production)

Learning to use Production

Refer to the Production:Getting Started tutorial (derived from the Grid'5000 Getting Started tutorial).

Information and support

Note.png Note

For the time being, support is shared between Abaca and Grid'5000.

Before asking for support, you're advised to verify your issue is not documented somewhere on the Grid'5000 website. In particular, you should check:

You may contact the Support staff directly by sending an e-mail to support-staff@lists.grid5000.fr.