Conda

{{Author|Laurent Mirtain}}
{{Maintainer|Laurent Mirtain}}
{{Status|In production}}
{{Portal|User}}
{{Portal|Tutorial}}
{{Pages|HPC}}
{{Portal|HPC}}
{{TutorialHeader}}
{{Note|text='''This document was written by consolidating the following different information resources:'''
* Grid'5000 documentation: Environment modules, HPC and HTC tutorial, Deep Learning Frameworks
* An [[User:Ibada/Tuto Deep Learning|in-depth Deep Learning tutorial]], Ismael Bada
* [[User:Bjonglez/Debian11/Deep Learning Frameworks|Deep Learning Frameworks tutorial]], Benjamin Jonglez
}}




= Introduction =


[https://docs.conda.io/projects/conda/en/latest/index.html Conda] is an open source package management system and environment management system for installing multiple versions of software packages and their dependencies and switching easily between them. It works on Linux, OS X and Windows, and was created for Python programs but can package and distribute any software.


To get started with Conda, have a look at this [https://docs.conda.io/projects/conda/en/latest/user-guide/cheatsheet.html Conda cheat sheet] and this [https://towardsdatascience.com/managing-project-specific-environments-with-conda-b8b50aa8be0e Getting Started with Conda] guide.


== Conda, Miniconda, Anaconda? ==


* '''conda''' is the package manager.
* '''miniconda''' is a minimal Python distribution for '''conda''' that includes a base set of packages.
* '''anaconda''' is another Python distribution for '''conda''' that includes 160+ additional packages on top of miniconda.


On Grid'5000, we installed ''conda'' using the ''miniconda'' installer, but you are free to create an anaconda environment, using the ''anaconda'' meta-package.


More information about Miniconda vs Anaconda is available on the [https://docs.conda.io/projects/conda/en/latest/user-guide/install/download.html#anaconda-or-miniconda Conda website].


== Conda or Mamba? ==


[https://mamba.readthedocs.io/en/latest/index.html mamba] is a reimplementation of the conda package manager in C++. Conda has a reputation for taking time when dealing with complex sets of dependencies. Mamba is much faster, is fully compatible with Conda packages, and supports most of Conda's commands. It consists of:
* mamba: a Python-based CLI conceived as a drop-in replacement for conda, offering higher speed and more reliable environment solutions
* micromamba: a pure C++-based CLI, self-contained in a single-file executable
* libmamba: a C++ library exposing low-level and high-level APIs on top of which both mamba and micromamba are built


=== Mamba on Grid'5000 ===


Like Conda, Mamba is available as a module on Grid'5000:
{{Term|location=frontal|cmd=<code class="command">module load mamba</code>}}


Then, since its syntax is generally compatible with Conda, you can use the <code class="command">mamba</code> command where you would use the <code class="command">conda</code> command.
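
As a minimal illustration (the environment and package names below are only examples), the usual create/install workflow runs unchanged through <code class="command">mamba</code>:
<pre>
# assuming the mamba module is already loaded (module load mamba)
mamba create -y -n demo-env python=3.11            # same syntax as conda create
mamba install -y -n demo-env -c conda-forge numpy  # same syntax as conda install
mamba env list                                     # environments are shared with conda
</pre>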


= Conda on Grid'5000 =


Conda is already available in Grid'5000 as a module. '''You do not need to install Anaconda or Miniconda on Grid'5000!'''


== Load Conda module ==


* To make it available on a node or on a frontend, load the Conda module as follows (default version):
{{Term|location=frontal|cmd=<code class="command">module load conda</code>}}


== Optional: Conda initialization and activation in your shell ==


Conda initialization is the process of defining some shell functions that facilitate activating and deactivating Conda environments, as well as some optional features such as updating PS1 to show the active environment. It is not required to use Conda.


The conda shell function is mainly a forwarder function. It will delegate most of the commands to the real conda executable driven by the Python library.


There are two ways to initialize conda in a standard installation:
 
* 1. Occasionally: activate conda in your current shell (e.g. bash)
{{Term|location=$|cmd=<code class="command">eval "$(conda shell.bash hook)"</code>}}  


* 2. Permanently: activate conda in your login shell (this command modifies your <code>.bashrc</code> by adding conda setup directives)
{{Term|location=$|cmd=<code class="command">conda init</code>}}


{{Warning|text=bash is the default shell for conda.<br>
If you use tcsh or zsh, use:
* <code class="command">eval "$(conda shell.{tcsh,zsh} hook)"</code>
* <code class="command">conda init {tcsh,zsh}</code>}}


In Grid'5000, the '''conda''' initialization is made transparently by loading the conda module.


The <code class="command">conda activate</code> and <code class="command">conda deactivate</code> commands rely on the conda shell initialization to load/unload the corresponding conda environment variables in the current shell session.


By default, you are located in the <code>base</code> Conda environment that corresponds to the base installation of Conda.


If you’d prefer that conda’s base environment not be activated on startup, set the auto_activate_base parameter to false:
{{Term|location=$|cmd=<code class="command">conda config --set auto_activate_base false</code>}}


Verify your conda configuration with this command:
{{Term|location=$|cmd=<code class="command">conda config --show</code>}}


Look at all available configuration options with:
{{Term|location=$|cmd=<code class="command">conda config --describe</code>}}
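
For instance, to inspect only the settings discussed above (a quick sketch; the keys are standard conda configuration keys):
<pre>
conda config --show auto_activate_base   # print a single setting
conda config --show envs_dirs pkgs_dirs  # print where environments and package caches are stored
</pre>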


== Conda environments ==


Conda allows you to create separate environments containing files, packages, and their dependencies that will not interact with other environments.


When you begin using conda, you already have a default environment named <code>base</code>.
You can create separate environments to keep your programs isolated from each other. Specifying the environment name confines conda commands to that environment.


{{Warning|text=The <code>base</code> environment is stored in a read-only directory, as shown by the <code>conda info</code> command.<br>
'''That is why you always need to create your own conda environments to install the software you need.'''}}


* List all your environments
{{Term|location=$|cmd=<code class="command">conda info --envs</code>}}
or
{{Term|location=$|cmd=<code class="command">conda env list</code>}}


* Create a new environment
{{Term|location=$|cmd=<code class="command">conda create --name ENVNAME</code>}}


* Activate this environment before installing packages
{{Term|location=$|cmd=<code class="command">conda activate ENVNAME</code>}}


For further information:
* https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html
* [https://towardsdatascience.com/managing-project-specific-environments-with-conda-406365a539ab Managing your data science project environments with Conda]
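
As a short, purely illustrative sketch of how specifying the environment name confines conda commands to that environment (<code>ENVNAME</code> and the package name are placeholders):
<pre>
conda create -y -n ENVNAME python=3.11   # create the environment non-interactively
conda list -n ENVNAME                    # list its packages without activating it
conda install -y -n ENVNAME numpy        # install into it, again without activating it
conda env list                           # ENVNAME now appears in the environment list
</pre>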


== Conda package installation ==


In its default configuration (the default Conda channel), Conda can install and manage the over 7,500 packages at https://repo.anaconda.com/pkgs/ that are built, reviewed, and maintained by Anaconda. Note that this default channel may require a paid (commercial) license, as described in the repository's terms of service.


{{Term|location=$|cmd=<code class="command">conda install <package></code>}}


* Install a specific version of a package:
{{Term|location=$|cmd=<code class="command">conda install <package>=<version></code>}}


* Uninstall a package:
{{Term|location=$|cmd=<code class="command">conda uninstall <package></code>}}


For further information:
* https://docs.conda.io/projects/conda/en/latest/user-guide/concepts/packages.html
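
For example, with a concrete (purely illustrative) package, you can first check which versions the configured channels provide, then pin one:
<pre>
conda search scipy            # list the versions available in the configured channels
conda install -y scipy=1.11   # install a specific version
conda uninstall -y scipy      # remove it again if needed
</pre>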


== Conda package installation from channels ==


Channels are the locations of the repositories where Conda looks for packages. Channels may point to a Cloud repository or a private location on a remote or local repository that you or your organization created. Useful channels are:
* <code>conda-forge</code> from https://conda-forge.org. It is free for all to use.  
* <code>nvidia</code> from https://anaconda.org/nvidia. It provides Nvidia's software.


To install a package from a specific channel:
{{Term|location=$|cmd=<code class="command">conda install -c <channel_name> <package></code>}}


* List all packages installed with their source channels
{{Term|location=$|cmd=<code class="command">conda list --show-channel-urls</code>}}


For further information:
* https://docs.conda.io/projects/conda/en/latest/user-guide/concepts/channels.html
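
A minimal example (the package name is illustrative): install something from <code>conda-forge</code>, then verify which channel it actually came from:
<pre>
conda install -y -c conda-forge rich   # pull a package from the conda-forge channel
conda list --show-channel-urls rich    # the channel column confirms where it was installed from
</pre>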


{{Warning|text=Installing Conda packages can be time- and resource-consuming. Preferably use a node (instead of a frontend) to perform such operations. Note that using a node is mandatory if you need access to specific hardware resources like GPUs.}}


= Application examples =


== Create an environment ==


For example, create the environment <code class="replace"><env_name></code> (specify a Python version; otherwise, the module's default version is used):
{{Term|location=fgrenoble|cmd=<code class="command">conda create -y -n </code><code class="replace"><env_name></code> <code class="command">python=x.y</code>}}


== Load this environment ==
{{Term|location=fgrenoble|cmd=<code class="command">conda activate </code><code class="replace"><env_name></code>}}


== Install a package into this environment ==
{{Term|location=fgrenoble|cmd=<code class="command">conda install </code><code class="replace"><package_name></code>}}


== Exit from the loaded environment ==
{{Term|location=fgrenoble|cmd=<code class="command">conda deactivate</code>}}
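
Putting these steps together, a complete session could look like the following sketch (the environment name, Python version and package are placeholders; run it on a node, as recommended above):
<pre>
module load conda                         # conda initialization is done by the module
conda create -y -n demo-env python=3.11
conda activate demo-env
conda install -y numpy
python -c "import numpy; print(numpy.__version__)"   # quick check inside the environment
conda deactivate
</pre>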


== Remove unused Conda environments ==


{{Warning|text=Conda packages are installed in <code>$HOME/.conda</code>. You could, therefore, rapidly saturate your [[Storage#.2Fhome|homedir quota]] (25GB by default). Do not forget to occasionally remove unused Conda environments to free up space.}}


* To delete an environment
{{Term|location=fgrenoble|cmd=<code class="command">conda deactivate</code><br>
<code class="command">conda env remove --name </code><code class="replace"><env_name></code>}}


* To remove unused packages and the cache. Do not be concerned if this appears to try to delete the packages of the system environment (i.e. non-local).
{{Term|location=fgrenoble|cmd=<code class="command">conda clean -a</code>}}


== Use a Conda environment in a job ==


As seen in the previous section, Conda environments are stored by default in the user's home directory (under <code>~/.conda</code>). Once an environment is created and its packages installed, it is usable on all nodes of the given site.


=== For interactive jobs ===


Load, initialize, and activate your conda environment <code class="replace">env_name</code> in an interactive job:


{{Term|location=frontal|cmd=<code class="command">oarsub -I</code>}}
{{Term|location=node|cmd=<code class="command">module load conda</code><br>
<code class="command">conda activate </code><code class="replace">env_name</code>}}
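
For example, to work interactively on a GPU node (the reservation syntax is the standard OAR one; the environment name is a placeholder):
<pre>
# on the frontend: reserve one GPU and its associated cores interactively
oarsub -l gpu=1 -I
# on the allocated node:
module load conda
conda activate demo-env
which python      # should point to ~/.conda/envs/demo-env/bin/python
</pre>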


=== For batch jobs ===


Load, initialize, and activate your conda environment <code class="replace">env_name</code> in a batch job.


First prepare your conda environment on the frontend:
* module loading and conda initialization
* creation of an environment <code>testconda</code> containing <code>gcc</code> from the <code>conda-forge</code> channel
* listing of the installed packages with their source channel
{{Term|location=fsiteA|cmd=<code class="command">module load conda</code><br>
<code class="command">conda create --name testconda</code><br>
<code class="command">conda activate testconda</code><br>
<code class="command">conda install -c conda-forge gcc_linux-64 gxx_linux-64</code>}}
* launch these commands and keep the output for comparison
{{Term|location=fsiteA|cmd=<code class="command">conda info</code><br>
<code class="command">conda list -n testconda --show-channel-urls</code>}}


Then launch a job that performs the same tasks, but as a batch job.
* The important step is to run the commands through a login shell (<code>bash -l</code>) so that the <code>module</code> function is available and conda can be activated
{{Term|location=fsiteA|cmd=<code class="command">oarsub 'bash -l -c "module load conda ; conda activate testconda ; conda info ; conda list -n testconda --show-channel-urls"'</code>}}
<pre>OAR_JOB_ID=1539228</pre>


* Is the job finished?
{{Term|location=fsiteA|cmd=<code class="command">oarsub -C 1539228</code>}}
<pre># Error: job 1539228 is not running. Its current state is Finishing.</pre>


* Compare the output with the previous one: they should be identical
{{Term|location=fsiteA|cmd=<code class="command">cat OAR.1539228.std</code>}}
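
Instead of inlining the commands in the <code>oarsub</code> call, you can also wrap them in a small script. This is only a sketch, assuming a hypothetical <code>run.sh</code> and workload:
<pre>
#!/bin/bash -l
# run.sh -- the login shell (-l) makes the 'module' function available, as explained above
module load conda
conda activate testconda
conda info
python my_script.py    # replace with your actual workload
</pre>
Make it executable (<code class="command">chmod +x run.sh</code>) and submit it, for instance, with <code class="command">oarsub ./run.sh</code>.
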
= Advanced Conda environment operations =


== Synchronize Conda environments between Grid'5000 sites ==


* To synchronize your Conda directory from ''siteA'' to ''siteB'':
{{Term|location=fsiteA|cmd=<code class="command">rsync --dry-run --delete -avz ~/.conda siteB.grid5000.fr:~</code>}}


To actually perform the synchronization, remove the <code>--dry-run</code> argument and replace ''siteB'' with a real site name.


== Share Conda environments between multiple users ==


You can use two different approaches to share Conda environments with other users.
 


=== Export an environment as a YAML file ===


* Export it as follows:
{{Term|location=fgrenoble|cmd=<code class="command">conda env export > environment.yml</code>}}
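
The exported file is a plain YAML description of the environment. Depending on what is installed, it will look roughly like this (illustrative content only):
<pre>
name: demo-env
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.11
  - numpy=1.26.4
prefix: /home/<login>/.conda/envs/demo-env
</pre>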


* Share it by putting the yaml file in your public folder
{{Term|location=fgrenoble|cmd=<code class="command">cp environment.yml ~/public/</code>}}


* Other users can create the environment from the <code>environment.yml</code> file
{{Term|location=fgrenoble|cmd=<code class="command">conda env create -f /home/<login>/public/environment.yml</code>}}


* Advantage: it prevents other users from damaging the environment, since they could otherwise add packages that conflict with existing ones or even delete packages that another user might need.
* Drawback: it is not a true shared environment. The environment is duplicated in the other users' home directories, and a modification of one copy is not automatically replicated to the others.


=== Use a group storage ===


[[Group Storage]] gives you the possibility to share storage among multiple users. You can take advantage of a group storage to share a single Conda environment among multiple users.


* Create a shared Conda environment with <code>--prefix</code> to specify the path to use to store the conda environment
{{Term|location=flyon|cmd=<code class="command">conda create --prefix /srv/storage/</code><code class="replace">storage_name</code>@<code class="replace">server_hostname_(fqdn)/ENVNAME</code>}}


* Activate the shared environment (share this command with the targeted users)
{{Term|location=flyon|cmd=<code class="command">conda activate /srv/storage/</code><code class="replace">storage_name</code>@<code class="replace">server_hostname_(fqdn)/ENVNAME</code>}}


* Advantage: it avoids storing duplicate packages and makes any modification immediately visible to all users
* Drawbacks:
** Users could potentially harm the environment by installing or removing packages.
** When installing additional packages, conda still stores them in the package cache located in your home directory. Use <code class="command">conda clean</code> as described above to clean those files.


* Create your environments by default in a group storage location

You can modify your <code>~/.condarc</code> file to use this location by default for conda environments and package caches, as follows (change the location to suit your group). Add these lines:
 
<pre>
pkgs_dirs:
  - /srv/storage/storage_name@server_hostname_(fqdn)/conda_shared_envs/pkgs/
envs_dirs:
  - /srv/storage/storage_name@server_hostname_(fqdn)/conda_shared_envs/envs/
</pre>
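
With such a configuration in place (the storage path is the same placeholder as above, and the directory must be writable by you), newly created environments and downloaded packages land in the group storage instead of your home directory:
<pre>
# after editing ~/.condarc as shown above
conda create -y -n shared-env python=3.11   # created under .../conda_shared_envs/envs/
conda env list                              # the new environment is listed with its full path
</pre>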
 


= Build your HPC-IA framework with conda =


Here are some pointers to help you set up your software environment for HPC or AI with conda:
* [[HPC_and_HTC_tutorial]]
* Running [[Run_MPI_On_Grid'5000|MPI applications on Grid'5000]]
* [[Deep_Learning_Frameworks|Deep Learning Frameworks documentation]]
