Traceable performance evaluation
Revision as of 15:57, 17 September 2021
This page presents good practices when performing performance evaluations (benchmarks, ...) using Grid'5000 nodes. It is not about benchmarking Grid'5000 itself.
About the Grid'5000 nodes' operating system
Grid'5000 nodes can be used with a choice of system environments: either the default (standard) environment, which is pre-provisioned on nodes, or one of a variety of environments deployed on demand by the user. The options are compared below.
Using a node running the default (standard) environment
In this case, using the node does not require a deployment (kadeploy), so access is quicker. However, the system comes with many features and services that may affect benchmark results. It may be relevant to uninstall or deactivate them before running a performance evaluation.
- The list of services running on the system can be shown using systemctl.
- Services can be uninstalled (e.g. with apt-get remove package) or just deactivated (with systemctl stop service) after becoming root (with sudo-g5k).
- The prometheus-node-exporter.service service is typically known to cause power consumption spikes every 15s.
Also, the Grid'5000 default environment includes some common system tweaks that may bias benchmarks. The CPU parameters page provides relevant information.
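The service checks described in this subsection can be sketched in Python as follows. This is only an illustration: the function names and the SUSPECT_SERVICES set are invented for this example, and the only service name taken from this page is prometheus-node-exporter.service. The sketch assumes systemctl is available on the node.

```python
import subprocess

# Services to flag before a benchmark. Only prometheus-node-exporter.service
# is named on this page; extending the set is up to the user (assumption).
SUSPECT_SERVICES = {"prometheus-node-exporter.service"}


def parse_running_services(systemctl_output):
    """Return unit names from `systemctl list-units --plain --no-legend` output."""
    units = []
    for line in systemctl_output.splitlines():
        fields = line.split()
        # The unit name is the first column of each line.
        if fields and fields[0].endswith(".service"):
            units.append(fields[0])
    return units


def list_running_services():
    """Ask systemd for the running services (requires systemctl on the node)."""
    result = subprocess.run(
        ["systemctl", "list-units", "--type=service", "--state=running",
         "--plain", "--no-legend"],
        capture_output=True, text=True, check=True,
    )
    return parse_running_services(result.stdout)


def services_to_stop(running, suspects=SUSPECT_SERVICES):
    """Keep only the running services that are in the suspect set."""
    return sorted(set(running) & set(suspects))
```

On a node, services_to_stop(list_running_services()) gives the candidates, which can then be deactivated with systemctl stop after becoming root with sudo-g5k. The script only reports; it deliberately does not stop anything itself.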
Using a deployed environment
Using kadeploy is very relevant in the context of performance evaluation, as it allows a user to operate the nodes with the most minimal system that is sufficient for the use case. For instance, one can deploy the -min variant of one of the Grid'5000 supported environments, or a customized environment, possibly built on top of a minimal environment and featuring only the software required for running a benchmark.
Note | |
---|---|
By design, the |
The recipes from which Grid'5000 environments are built are available at https://gitlab.inria.fr/grid5000/environments-recipes. See Environment creation for more information about how environments are built.
About the traceability of the infrastructure
Grid'5000 makes a special effort to keep track of infrastructure changes, so that any user can find out, for instance, which side effects may have affected an experiment.
The following sources of information are available:
- News
- The most notable changes in the infrastructure are announced on the News page and on the Grid'5000 Twitter account, and are sent to the users' mailing list (users@list.grid5000.fr). It is strongly recommended to follow the news.
- Environment recipes
- Changes in the way the operating system of nodes is built are tracked in the environment recipes git repository. It's a good idea to keep track of the version of the system running for an experiment, for instance by storing the information provided in /etc/grid5000/release for the standard environment or any other Grid'5000 supported environment. Older environments used for a previous experiment can be redeployed on nodes to help reproduce results.
- Reference repository & Reference API
- Detailed information and related changes in the components of the infrastructure are recorded in the reference repository, which is served by the Grid'5000 Reference API (REST API). Updates of firmware versions, for instance, are tracked there. It is a good idea to keep track of the state of the platform at the time of the experiment, as explained in the "Platform state and reproducibility" section of the API tutorial.
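The two record-keeping suggestions above (the environment release and the platform state) can be combined into a small provenance snapshot stored with the results. The following Python sketch is only an illustration: the function names and the JSON layout are invented for this example, and the Reference API URL pattern is an assumption that should be checked against the API tutorial.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

# Assumed base URL of the Grid'5000 Reference API (verify in the API tutorial).
API_BASE = "https://api.grid5000.fr/stable"


def node_description_url(site, cluster, node):
    """Build the assumed Reference API URL that describes one node."""
    return f"{API_BASE}/sites/{site}/clusters/{cluster}/nodes/{node}"


def snapshot_experiment(site, cluster, node, release_path="/etc/grid5000/release"):
    """Collect what is needed to identify the system and platform later on."""
    path = Path(release_path)
    return {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        # Content of the release file, or None if the environment has none.
        "environment_release": path.read_text().strip() if path.exists() else None,
        # URL to query for the node description at the time of the experiment.
        "reference_api_url": node_description_url(site, cluster, node),
    }


def save_snapshot(snapshot, output_path):
    """Write the snapshot as JSON next to the experiment results."""
    Path(output_path).write_text(json.dumps(snapshot, indent=2) + "\n")
```

For example, save_snapshot(snapshot_experiment("mysite", "mycluster", "mycluster-1"), "provenance.json") could be run on the node at the start of a job; the site, cluster and node names here are placeholders. Fetching and archiving the document behind reference_api_url is left out of the sketch, since how the API is reached depends on where the request is made from.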
About the Grid'5000 metering probes
Grid'5000 provides the kwollect service that can collect information about nodes during a job: energy, network traffic, and more. See the kwollect page or the Energy consumption monitoring tutorial.