Software and Package Management

The standard environment on the MIT Supercloud System is sufficient for most. If it is not, first check to see if the tool you need is included in a module. Modules contain environment variables that set you up to use other software, packages, or compilers not included in the standard stack. If there is no module with what you need, you can often install your package or software in your home directory. If you have explored both these options or are having trouble, contact us.

Modules

Modules are a handy way to set up environment variables for particular work, especially in a shared environment. They provide an easy way to load a particular version of a language or compiler.

To see what modules are available, type the command:

module avail

See a list currently the modules available here.

To load a module, use the command:

module load moduleName

Where moduleName can be any of the modules listed by the module avail command.

If you want to list the modules you currently have loaded, you can use the module list command:

module list

If you want to change to a different version of the module you have loaded, you can switch the module you have loaded. This is important to do when loading a different version of a module you already have loaded, as environment variables from one version could interfere with those of another. To switch modules:

module switch oldModuleName newModuleName

Where oldModuleName is the name of the module you currently have loaded, and newModuleName is the new module that you would like to load.

If you would like to unload the module, or remove the changes the module has made to your environment, use the following command:

module unload moduleName

Finally, in order to use the module command inside a script, you will need to initialize it first.

The following shows a Bourne shell script example:

#!/bin/bash
# Initialize the module command first source
source /etc/profile
# Then use the module command to load the module needed for your work
module load anaconda/2020a

Installing Software/Packages in your Home Directory

Many packages and software can be installed in user space, meaning they are installed just for the user installing the package or software. Often the flag to do this is "--user". The way to do this for Julia, Python, and R packages is described below, for other tools refer to its documentation to find out how to do this.

Julia Packages

Adding new packages in Julia doesn't require doing anything special. On the login node, load a julia module and start Julia. You can enter package mode by pressing the "]" key and entering add packagename, where packagename is the name of your package. Or you can load Pkg and run Pkg.add("packagename").

The easiest way to check if a package already exists is to try to load it by running using packagename. The Pkg.status() command will only show packages that you have added to your home environment. If you would like a list of the packages we have installed, the following lines should do the trick (where v1.# is your version number, for example v1.3):

using Pkg; Pkg.activate(DEPOT_PATH[2]*"/environments/v1.#"); installed_pkgs = Pkg.installed(); Pkg.activate(DEPOT_PATH[1]*"/environments/v1.#"); installed_pkgs

Note: If you are using Jupyter there is an additional step you can optionally do so that Jupyter can find both our installed packages and your own. You can also run this if you are missing a Julia Kernel. First load a Julia module. Then, in a Julia shell, run:

using IJulia
installkernel("Julia MyKernel", env=Dict("JULIA_LOAD_PATH"=>ENV["JULIA_LOAD_PATH"]))

The first part "Julia MyKernel" is what you want to call your kernel, so feel free to change this. The second part makes sure both our packages and any you've installed in your home directory show up on the load path when you use a Jupyter Notebook with this kernel.

Python Packages

Many python packages are included in the Anaconda distribution. The quickest way to check this is to load the anaconda module, start python, and try to import the package you are interested in.

If the package you are looking to install is not included in Anaconda, then you can install it in user space- this allows you to install the package for you to use without affecting other users.

First, load the Anaconda module that you want to use if you haven’t already:

module load anaconda/2020a

Here we are loading the 2020a module with python 3.6, feel free to load any version of Anaconda you like. Then, install the package as you normally would, but with the --user flag:

pip install --user packageName

Where packageName is the name of the package that you are installing.

Note: If you are using a conda environment and would like to install the package with pip in that environment rather than in the standard home directory location, you should not include the --user flag. Further, if you are using a conda environment and want Python to use packages in your environment first, you can run the following two commands:

unset PYTHONPATH
export PYTHONNOUSERSITE=True

The first of these will make sure your conda environment packages are used before the packages installed with the Anaconda module, and the second will make sure your conda environment packages will be chosen before those that may be installed in your home directory. If you are using Jupyter, you will need to add these lines to the .jupyter/llsc_notebook_bashrc file. See the section on the bottom of the Jupyter page for more information.

R Libraries

There are two different ways we recommend that you use R. First, is using a preset R environment that comes with the anaconda module, second would be to create your own R conda environment. This first way works best if you don’t need to install any additional packages than what we already have.

To use our R conda environment, log in and load an anaconda module. Then you can activate the R environment with source activate. You can see what packages are installed with the conda list command. Any packages that start with r- are R libraries.

module load anaconda/2020a
source activate R
conda list

Then you can use R as you did before.

If you need to install additional packages, it’s best to do it in a new conda environment.

First thing to know is that many R packages are available through conda, and some are not. What you want to do is include as many R libraries that you’ll need as you can when you create your conda environment- this helps avoid version conflicts. Conda r libraries are all prefixed with r-, so for example if you need rjava, you’d search for r-rjava. You can check if conda has a library with the command: conda search r-LIBNAME, where LIBNAME is the name of the library you’re looking for. You’ll see a lot of versions, but as long as you see something you should be good to add it.

Once you have a list of all the libraries available through conda, create your conda environment (I’m calling the environment myR, feel free to change that):

conda create -n myR -c conda-forge r-essentials r-LIB1 r-LIB2…

Where LIB1, LIB2, etc are the additional R libraries you’d like to include. Sometimes this step takes a while. It’ll tell you which new packages are going to be installed, and then you can confirm by typing y.

If you have any other libraries that weren’t available through conda, install them now. First activate your new environment and then start R:

source activate myR
R

Then you can install your remaining libraries. You can do some test loads here as well, to make sure the libraries installed properly.

install.packages(“PKGNAME”)
library(PKGNAME)

In Jupyter, you should see your environment show up as a kernel. For a batch job, you’ll have to activate the environment either in your submission script or before you submit the job.