Status: Difference between revisions

From Grid5000
(re-organize info)
No edit summary
(150 intermediate revisions by 26 users not shown)
= Grid administration schedule =
__NOEDITSECTION__
If you experience problems, please [[Current_events|check the grid administration schedule]], which lists past, present and future incidents (planned or unplanned) for all sites:
{{Status|In production}}
{{Link|text=[[Current_events|Sidebar # Users Portal > Platform events]]}}
{{Portal|User}}
{{Portal|Platform}}


{{Note|text=As far as possible, most administration operations are planned on Thursday mornings.}}
= [https://www.grid5000.fr/status/ Current platform events] (maintenance, outages, issues...) =
If you experience problems, please check '''[https://www.grid5000.fr/status/ the platform's operation schedule]''' ''(past, present and future incidents, planned or unplanned, are reported for all sites).''


= Problems =
[[#Monitoring_tools|Monitoring tools]] are available to help you trace your experiment.


If you still have questions, please:
For other long-running minor issues that may affect your experiment, you can check the list of known artifacts: '''[https://intranet.grid5000.fr/status/artifact/ Grid5000 Artifacts]''' ''(this list is also displayed when you connect to the frontends).''
* send email to the [mailto:users@grid5000.fr grid5000 community]
* or report your problem on [https://helpdesk.grid5000.fr/bugzilla/ BugZilla]


= Monitoring tools =
= Resources reservations (OAR) status =
== OAR and OARgrid ==
OAR is the grid scheduler. It can also be queried for current and planned jobs and node reservations, either from the command line (see [[OAR|OAR documentation]]) or through web services.


=== Monika ===
{|
[http://oar.imag.fr/ Monika] displays current and scheduled jobs.
|bgcolor="#aaaaaa" colspan="8"|
 
'''Monika''' ''(current placement and queued jobs status)''
You can select an individual site:
|-
{| width="100%"
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/oar/Grenoble/monika.cgi '''Grenoble''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/oar/Lille/monika.cgi '''Lille''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/oar/Luxembourg/monika.cgi '''Luxembourg''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/oar/Lyon/monika.cgi '''Lyon''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/oar/Nancy/monika.cgi '''Nancy''']<br>
[https://intranet.grid5000.fr/oar/Nancy/monika-prod.cgi '''Nancy (production)''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/oar/Nantes/monika.cgi '''Nantes''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/oar/Rennes/monika.cgi '''Rennes''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/oar/Sophia/monika.cgi '''Sophia''']
|-
|bgcolor="#aaaaaa" colspan="8"|
'''Drawgantt''' ''(past, current and future OAR job scheduling)''
|-
|bgcolor="#eeeeee" colspan="8"|
Default view:
|-
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Grenoble'''</big><br>
[https://intranet.grid5000.fr/oar/Grenoble/drawgantt-svg/ '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Grenoble/drawgantt-svg-disks/ disks]<br>
[https://intranet.grid5000.fr/oar/Grenoble/drawgantt-svg-subnets/ subnets]<br>
[https://intranet.grid5000.fr/oar/Grenoble/drawgantt-svg-vlans/ vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Lille'''</big><br>
[https://intranet.grid5000.fr/oar/Lille/drawgantt-svg/ '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Lille/drawgantt-svg-disks/ disks]<br>
[https://intranet.grid5000.fr/oar/Lille/drawgantt-svg-subnets/ subnets]<br>
[https://intranet.grid5000.fr/oar/Lille/drawgantt-svg-vlans/ vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Luxembourg'''</big><br>
[https://intranet.grid5000.fr/oar/Luxembourg/drawgantt-svg/ '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Luxembourg/drawgantt-svg-subnets/ subnets]<br>
[https://intranet.grid5000.fr/oar/Luxembourg/drawgantt-svg-vlans/ vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Lyon'''</big><br>
[https://intranet.grid5000.fr/oar/Lyon/drawgantt-svg/ '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Lyon/drawgantt-svg-disks/ disks]<br>
[https://intranet.grid5000.fr/oar/Lyon/drawgantt-svg-subnets/ subnets]<br>
[https://intranet.grid5000.fr/oar/Lyon/drawgantt-svg-vlans/ vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Nancy'''</big><br>
[https://intranet.grid5000.fr/oar/Nancy/drawgantt-svg/ '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Nancy/drawgantt-svg-prod/ '''nodes (production)''']<br>
[https://intranet.grid5000.fr/oar/Nancy/drawgantt-svg-disks/ disks]<br>
[https://intranet.grid5000.fr/oar/Nancy/drawgantt-svg-subnets/ subnets]<br>
[https://intranet.grid5000.fr/oar/Nancy/drawgantt-svg-vlans/ vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Nantes'''</big><br>
[https://intranet.grid5000.fr/oar/Nantes/drawgantt-svg/ '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Nantes/drawgantt-svg-subnets/ subnets]<br>
[https://intranet.grid5000.fr/oar/Nantes/drawgantt-svg-vlans/ vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Rennes'''</big><br>
[https://intranet.grid5000.fr/oar/Rennes/drawgantt-svg/ '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Rennes/drawgantt-svg-disks/ disks]<br>
[https://intranet.grid5000.fr/oar/Rennes/drawgantt-svg-subnets/ subnets]<br>
[https://intranet.grid5000.fr/oar/Rennes/drawgantt-svg-vlans/ vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Sophia'''</big><br>
[https://intranet.grid5000.fr/oar/Sophia/drawgantt-svg/ '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Sophia/drawgantt-svg-subnets/ subnets]<br>
[https://intranet.grid5000.fr/oar/Sophia/drawgantt-svg-vlans/ vlans]
|-
|-
| valign="top" width="33%"|
|bgcolor="#eeeeee" colspan="8"|
* [https://helpdesk.grid5000.fr/oar/Bordeaux/monika.cgi Bordeaux]
Forecast view for 1 week:
* [https://frontal38.imag.fr/cgi-bin/monika.cgi Grenoble]
* [https://helpdesk.grid5000.fr/oar/Lille/monika.cgi Lille]
| valign="top" width="33%"|
* [https://helpdesk.grid5000.fr/oar/Lyon/monika.cgi Lyon]
* [https://helpdesk.grid5000.fr/oar/Nancy/monika.cgi Nancy]
* [https://helpdesk.grid5000.fr/oar/Orsay/monika.cgi Orsay]
| valign="top" width="33%"|
* [https://helpdesk.grid5000.fr/oar/Rennes/monika.cgi Rennes]
* [https://helpdesk.grid5000.fr/oar/Sophia/monika.cgi Sophia]
* [https://helpdesk.grid5000.fr/oar/Toulouse/monika.cgi Toulouse]
|}
 
Or view the [https://frontal38.imag.fr/cgi-bin/oargridmonika.cgi entire grid snapshot].
 
=== DrawOARGantt ===
[http://oar.imag.fr/ DrawOARGantt] displays past, current and scheduled jobs.
 
You can select an individual site:
{| width="100%"
|-
|-
| valign="top" width="33%"|
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
* [https://helpdesk.grid5000.fr/oar/Bordeaux/DrawOARGantt.pl Bordeaux]
<big>'''Grenoble'''</big><br>
* [https://frontal38.imag.fr/cgi-bin/DrawOARGantt.pl Grenoble]
[https://intranet.grid5000.fr/oar/Grenoble/drawgantt-svg/?relative_start=-28800&relative_stop=604800 '''nodes''']<br>
* [https://helpdesk.grid5000.fr/oar/Lille/DrawOARGantt.pl Lille]
[https://intranet.grid5000.fr/oar/Grenoble/drawgantt-svg-disks/?relative_start=-28800&relative_stop=604800 disks]<br>
| valign="top" width="33%"|
[https://intranet.grid5000.fr/oar/Grenoble/drawgantt-svg-subnets/?relative_start=-28800&relative_stop=604800 subnets]<br>
* [https://helpdesk.grid5000.fr/oar/Lyon/DrawOARGantt.pl Lyon]
[https://intranet.grid5000.fr/oar/Grenoble/drawgantt-svg-vlans/?relative_start=-28800&relative_stop=604800 vlans]
* [https://helpdesk.grid5000.fr/oar/Nancy/DrawOARGantt.pl Nancy]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
* [https://helpdesk.grid5000.fr/oar/Orsay/DrawOARGantt.pl Orsay]
<big>'''Lille'''</big><br>
| valign="top" width="33%"|
[https://intranet.grid5000.fr/oar/Lille/drawgantt-svg/?relative_start=-28800&relative_stop=604800 '''nodes''']<br>
* [https://helpdesk.grid5000.fr/oar/Rennes/DrawOARGantt.pl Rennes]
[https://intranet.grid5000.fr/oar/Lille/drawgantt-svg-disks/?relative_start=-28800&relative_stop=604800 disks]<br>
* [https://helpdesk.grid5000.fr/oar/Sophia/DrawOARGantt.pl Sophia]
[https://intranet.grid5000.fr/oar/Lille/drawgantt-svg-subnets/?relative_start=-28800&relative_stop=604800 subnets]<br>
* [https://helpdesk.grid5000.fr/oar/Toulouse/DrawOARGantt.pl Toulouse]
[https://intranet.grid5000.fr/oar/Lille/drawgantt-svg-vlans/?relative_start=-28800&relative_stop=604800 vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Luxembourg'''</big><br>
[https://intranet.grid5000.fr/oar/Luxembourg/drawgantt-svg/?relative_start=-28800&relative_stop=604800 '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Luxembourg/drawgantt-svg-subnets/?relative_start=-28800&relative_stop=604800 subnets]<br>
[https://intranet.grid5000.fr/oar/Luxembourg/drawgantt-svg-vlans/?relative_start=-28800&relative_stop=604800 vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Lyon'''</big><br>
[https://intranet.grid5000.fr/oar/Lyon/drawgantt-svg/?relative_start=-28800&relative_stop=604800 '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Lyon/drawgantt-svg-disks/?relative_start=-28800&relative_stop=604800 disks]<br>
[https://intranet.grid5000.fr/oar/Lyon/drawgantt-svg-subnets/?relative_start=-28800&relative_stop=604800 subnets]<br>
[https://intranet.grid5000.fr/oar/Lyon/drawgantt-svg-vlans/?relative_start=-28800&relative_stop=604800 vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Nancy'''</big><br>
[https://intranet.grid5000.fr/oar/Nancy/drawgantt-svg/?relative_start=-28800&relative_stop=604800 '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Nancy/drawgantt-svg-prod/?relative_start=-28800&relative_stop=604800 '''nodes (production)''']<br>
[https://intranet.grid5000.fr/oar/Nancy/drawgantt-svg-disks/?relative_start=-28800&relative_stop=604800 disks]<br>
[https://intranet.grid5000.fr/oar/Nancy/drawgantt-svg-subnets/?relative_start=-28800&relative_stop=604800 subnets]<br>
[https://intranet.grid5000.fr/oar/Nancy/drawgantt-svg-vlans/?relative_start=-28800&relative_stop=604800 vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Nantes'''</big><br>
[https://intranet.grid5000.fr/oar/Nantes/drawgantt-svg/?relative_start=-28800&relative_stop=604800 '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Nantes/drawgantt-svg-subnets/?relative_start=-28800&relative_stop=604800 subnets]<br>
[https://intranet.grid5000.fr/oar/Nantes/drawgantt-svg-vlans/?relative_start=-28800&relative_stop=604800 vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Rennes'''</big><br>
[https://intranet.grid5000.fr/oar/Rennes/drawgantt-svg/?relative_start=-28800&relative_stop=604800 '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Rennes/drawgantt-svg-disks/?relative_start=-28800&relative_stop=604800 disks]<br>
[https://intranet.grid5000.fr/oar/Rennes/drawgantt-svg-subnets/?relative_start=-28800&relative_stop=604800 subnets]<br>
[https://intranet.grid5000.fr/oar/Rennes/drawgantt-svg-vlans/?relative_start=-28800&relative_stop=604800 vlans]
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
<big>'''Sophia'''</big><br>
[https://intranet.grid5000.fr/oar/Sophia/drawgantt-svg/?relative_start=-28800&relative_stop=604800 '''nodes''']<br>
[https://intranet.grid5000.fr/oar/Sophia/drawgantt-svg-subnets/?relative_start=-28800&relative_stop=604800 subnets]<br>
[https://intranet.grid5000.fr/oar/Sophia/drawgantt-svg-vlans/?relative_start=-28800&relative_stop=604800 vlans]
|}
|}
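
The forecast links above differ from the default ones only by the <code>relative_start</code> and <code>relative_stop</code> query parameters. Their values (-28800 and 604800) are consistent with offsets in seconds from the current time (8 hours back, 7 days ahead), although this page does not state that explicitly. The following is a minimal Python sketch under that assumption; the helper name <code>drawgantt_url</code> and its arguments are hypothetical and only reproduce the URL pattern of the links listed here.

```python
from urllib.parse import urlencode

# Base path taken from the links on this page.
BASE = "https://intranet.grid5000.fr/oar"


def drawgantt_url(site, resource="nodes", hours_back=8, days_ahead=7):
    """Build a Drawgantt URL following the pattern of the links on this page.

    Assumption: relative_start / relative_stop are offsets in seconds
    from the current time (not confirmed by the page itself).
    """
    # "nodes" has no path suffix; other views use e.g. "-disks", "-vlans".
    suffix = "" if resource == "nodes" else f"-{resource}"
    query = urlencode({
        "relative_start": -hours_back * 3600,
        "relative_stop": days_ahead * 86400,
    })
    return f"{BASE}/{site}/drawgantt-svg{suffix}/?{query}"


if __name__ == "__main__":
    print(drawgantt_url("Lyon", "disks"))
```

With the default arguments, the function rebuilds the one-week forecast links shown in the table; other windows can be tried by changing <code>hours_back</code> and <code>days_ahead</code>, assuming the service accepts arbitrary offsets.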


Or view the [https://frontal38.imag.fr/cgi-bin/DrawGridGantt.cgi entire grid Gantt Diagram].
= Network Monitoring =
== Backbone network status and load ==
[http://pasillo.renater.fr/weathermap/weathermap_g5k.html Grid'5000 Weathermap] (courtesy of Renater)


== GridPrems ==
Shows the current state of the optical links between the Grid'5000 10Gb-ready sites. A link painted in black on the weathermap means that you won't be able to access that site's nodes from the Grid'5000 internal network.
[http://gforge.inria.fr/projects/gridprems GridPrems] is an alternate grid scheduler running on certain sites.


https://helpdesk.grid5000.fr/gridprems/
== Sites network traffic ==


== Ganglia ==
A dashboard combining links and real-time data is available on the [https://intranet.grid5000.fr/net/Lille/ Grid'5000 Backbone Network Monitoring] page.
[http://ganglia.sourceforge.net/ Ganglia] provides resource usage metrics (memory, CPU, jobs...) for individual sites or the whole grid.


https://helpdesk.grid5000.fr/ganglia/
= Power Monitoring =


== Nagios ==
{|
[http://www.nagios.org/ Nagios] monitors critical grid servers and services and automatically reports incidents and failures.
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/supervision/grenoble/monitoring/energy/last/minute/ '''Grenoble''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/supervision/lyon/monitoring/energy/last/minute/ '''Lyon''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/supervision/nancy/monitoring/energy/last/minute/ '''Nancy''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/supervision/rennes/monitoring/energy/last/minute/ '''Rennes''']
|}


https://helpdesk.grid5000.fr/nagios/
Clusters where kwapi is available are listed on this page: https://intranet.grid5000.fr/jenkins-status/?job=test_kwapi


== phpLDAPadmin ==
= Usage statistics =
[http://phpldapadmin.sourceforge.net/ phpLDAPadmin] provides easy administration for your LDAP account entries.
[https://intranet.grid5000.fr/stats/ Stats5k] gathers a lot of statistics about the testbed.


https://helpdesk.grid5000.fr/phpldapadmin/
= Ganglia =
[https://intranet.grid5000.fr/ganglia/ Ganglia] provides resource usage metrics (memory, CPU, jobs...) for individual sites or the whole platform.


== Current status map ==
= Jenkins =
This tool places the sites on a map and displays their current status.
[https://intranet.grid5000.fr/jenkins-status/ Jenkins] tests most Grid'5000 services. The web interface provides a summary of results, indicating platform health. Detailed logs are not normally available to users, but access can be requested if needed.


http://www.lri.fr/~herault/G5K/action.html
{|
 
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
{{Warning|text=This tool is currently experimental (i.e. unstable).}}
[https://intranet.grid5000.fr/jenkins-status/?site=grenoble '''Grenoble''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/jenkins-status/?site=lille '''Lille''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/jenkins-status/?site=luxembourg '''Luxembourg''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/jenkins-status/?site=lyon '''Lyon''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/jenkins-status/?site=nancy '''Nancy''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/jenkins-status/?site=nantes '''Nantes''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/jenkins-status/?site=rennes '''Rennes''']
|bgcolor="#ffffff" valign="top" style="border:1px solid #cccccc;padding:1em;padding-top:0.5em;"|
[https://intranet.grid5000.fr/jenkins-status/?site=sophia '''Sophia''']
|}
''<small>Last generated from the Grid'5000 Reference API on 2019-11-27 ([https://github.com/grid5000/reference-repository/commit/fa8f5c834 commit fa8f5c834])</small>''

Revision as of 14:24, 27 November 2019


Current platform events (maintenance, outages, issues...)

If you experience problems, please check the platform's operation schedule (past, present and future incidents, planned or unplanned, are reported for all sites).


For other long-running minor issues that may affect your experiment, you can check the list of known artifacts: Grid5000 Artifacts (this list is also displayed when you connect to the frontends).

Resources reservations (OAR) status

Monika (current placement and queued jobs status)

Grenoble

Lille

Luxembourg

Lyon

Nancy
Nancy (production)

Nantes

Rennes

Sophia

Drawgantt (past, current and future OAR job scheduling)

Default view:

Grenoble
nodes
disks
subnets
vlans

Lille
nodes
disks
subnets
vlans

Luxembourg
nodes
subnets
vlans

Lyon
nodes
disks
subnets
vlans

Nancy
nodes
nodes (production)
disks
subnets
vlans

Nantes
nodes
subnets
vlans

Rennes
nodes
disks
subnets
vlans

Sophia
nodes
subnets
vlans

Forecast view for 1 week:

Grenoble
nodes
disks
subnets
vlans

Lille
nodes
disks
subnets
vlans

Luxembourg
nodes
subnets
vlans

Lyon
nodes
disks
subnets
vlans

Nancy
nodes
nodes (production)
disks
subnets
vlans

Nantes
nodes
subnets
vlans

Rennes
nodes
disks
subnets
vlans

Sophia
nodes
subnets
vlans

Network Monitoring

Backbone network status and load

Grid'5000 Weathermap (courtesy of Renater)

Shows the current state of the optical links between the Grid'5000 10Gb-ready sites. A link painted in black on the weathermap means that you won't be able to access that site's nodes from the Grid'5000 internal network.

Sites network traffic

A dashboard combining links and real-time data is available on the Grid'5000 Backbone Network Monitoring page.

Power Monitoring

Grenoble

Lyon

Nancy

Rennes

Clusters where kwapi is available are listed on this page: https://intranet.grid5000.fr/jenkins-status/?job=test_kwapi

Usage statistics

Stats5k gathers a lot of statistics about the testbed.

Ganglia

Ganglia provides resource usage metrics (memory, CPU, jobs...) for individual sites or the whole platform.

Jenkins

Jenkins tests most Grid'5000 services. The web interface provides a summary of results, indicating platform health. Detailed logs are not normally available to users, but access can be requested if needed.

Grenoble

Lille

Luxembourg

Lyon

Nancy

Nantes

Rennes

Sophia

Last generated from the Grid'5000 Reference API on 2019-11-27 (commit fa8f5c834)