Singularity Container Pilot

Document Author: Simon Butcher, QMUL Research IT

Updated: 6th June 2017

Version: 1.2.2

1 Summary

This document describes the Singularity pilot phase on Apocrita, how to take part, and how to get started with running applications in a container.

2 Singularity Container Repository

During this test phase, container template files will be stored in the testing folder at https://github.research.its.qmul.ac.uk/itsresearch/singularity-containers along with the files necessary to build each container. If you have an Apocrita account but cannot log in to the QMUL GitHub, please raise a support ticket via its-research-support@qmul.ac.uk.

To contribute files or make changes to this repository, end users will be required to fork the repository and submit a pull request to merge the changes.

If this seems too complicated, you can send us the path to the relevant files (.def and .qsub) on Apocrita and we will add them for you.

3 Known Issues

3.1 Overlayfs not supported on SL6.2

Scientific Linux 6.2, which Apocrita currently runs, does not support overlayfs, so any directory we want to mount inside the container must be created manually in the bootstrap phase by adding a mkdir command to the %post section of the Singularity def file. One such example is the home directory, which on Apocrita lives under /data/home. For seamless home directory mounting on CentOS6-family nodes (such as Apocrita currently), we must therefore create /data in the %post section of the .def file:

%post
    mkdir /data

Note that we are in the process of moving Apocrita to CentOS 7, which does not suffer from this issue.

3.2 Issues when bootstrapping containers for other OS versions

When building containers for other Linux distributions, or other versions of the same distribution, some extra settings or packages may be required.

For example, building a CentOS6 container on a CentOS7 machine requires additional commands to be added to the %post section of the .def file:

# set yum variables so packages resolve for CentOS 6 rather than the host's release
RELEASEVER=6
ARCH=x86_64
echo $RELEASEVER > /etc/yum/vars/releasever
echo $ARCH > /etc/yum/vars/arch
# rebuild the RPM database so it is usable by the older rpm inside the container
rpm --rebuilddb

The same is required when building a CentOS container on Ubuntu. We are trying to establish whether this is a widespread issue, and whether there are better workarounds.

Bootstrapping an Ubuntu container on CentOS requires the debootstrap package to be installed on the host, which allows Debian/Ubuntu systems to be built without apt/dpkg.
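
For example, on a CentOS build host, debootstrap can be installed from the EPEL repository (a sketch, assuming EPEL provides the package for your release):

yum -y install epel-release
yum -y install debootstrap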

3.3 Package installation log messages appearing to come from the host server

Currently, syslog messages generated inside a container (such as those from package installs) appear in the host's system logs as if they came from the host machine. A feature request has been raised to optionally mute bootstrap log messages, or send them to a file.

4 Guidelines

4.1 Use of foreign container formats

For compatibility, trust and repeatability, we want to refrain from loading Docker images; instead, please create Singularity def files (and store them in the GitHub repository for re-use) and build images from those def files. It is usually quite easy to inspect a Docker template file and produce a Singularity template from it.
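
As an illustration, a hypothetical Dockerfile fragment such as:

FROM centos:7
RUN yum -y install python34
CMD ["python3.4"]

maps roughly onto the following def file sections (MirrorURL and UpdateURL omitted for brevity; see the full example in section 6.1):

BootStrap: yum
OSVersion: 7
Include: yum

%post
    yum -y install python34

%runscript
    python3.4 "$@"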

4.2 Image sizes

Singularity no longer uses sparse file systems, so an image takes up all of its allocated space on disk, and oversized containers waste resources. Since containers are meant to be static and can easily be grown later, image sizes should be kept to a reasonable minimum.

USAGE: singularity [...] expand [expand options...] <container path>
 -s/--size   Specify a size for an operation in MiB (default 512MiB)
EXAMPLES:
   $ singularity expand --size 10240 /tmp/Debian.img
   $ singularity expand ./CentOS7.img
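
The image size can also be set at creation time rather than grown afterwards; singularity create accepts the same --size option (in MiB). For example, to allocate 1 GiB up front:

$ sudo singularity create --size 1024 ./CentOS7.img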

4.3 One primary app per container

Each container should have the primary purpose of running one application/pipeline, and should have only the necessary dependencies to run the tools. This helps ensure that containers are easily supportable and usable by as many people as possible. Separate tools can be built in other containers as required.

5 Restrictions

5.1 Elevated permissions

Elevated permissions are required to build and bootstrap Singularity containers in releases older than 2.3.

Root privileges are required to create images and install (bootstrap) containers in the v2.2 release, whereas the 2.3 release allows users to create containers without sudo privileges.

5.2 Development module

During the pilot phase, we are keeping the singularity module as a development module.

During this phase, the use.dev module must be loaded first to gain access to the development modules:

module load use.dev
module load singularity

6 Using Singularity

6.1 How to build a container - example

Using the following basic python3 def file:

BootStrap: yum
OSVersion: 7
MirrorURL: https://www.mirrorservice.org/sites/mirror.centos.org/%{OSVERSION}/os/$basearch/
UpdateURL: https://www.mirrorservice.org/sites/mirror.centos.org/%{OSVERSION}/updates/$basearch/
Include: yum


%post
    yum -y install epel-release
    yum -y install python34

%runscript
    python3.4 "$@"

We first create an image (note that creation and bootstrapping require sudo):

sudo singularity create centos7-python-3.4.el7.img

Then we bootstrap the image:

sudo singularity bootstrap ./centos7-python-3.4.el7.img \
singularity-containers/official/python/centos7-python-3.4.el7.def

This will result in an image that can then be used. Bear in mind that if you want a very specific version of a package from a repository, that specific version may not be available in future, so where possible, try to future-proof your containers.

6.2 Shell access to the container

We can open an interactive shell to the container:

$ singularity shell ./centos7-python-3.4.el7.img
Singularity: Invoking an interactive shell within container...

Singularity.centos7-python-3.4.el7.img> python3.4

Python 3.4.3 (default, Aug  9 2016, 17:10:39)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>

6.3 Execute a script from outside the container

For typical use, you will want to use runscripts or exec commands, especially when submitting work via Grid Engine. The following example runs a Python script from the current directory:

$ singularity exec ./centos7-python-3.4.el7.img python3.4 hello_world.py
Hello, World!

6.4 Runscripts

Note that you can configure a Singularity container to run a pre-defined command. This is set up during the bootstrap process via the %runscript section, which creates a file called /singularity inside the container, allowing you to execute the container directly and pass in your own parameters:

$  ./centos7-python-3.4.el7.img
Python 3.4.3 (default, Aug  9 2016, 17:10:39)
[GCC 4.8.5 20150623 (Red Hat 4.8.5-4)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>>
$  ./centos7-python-3.4.el7.img ./hello_world.py
Hello, World!

You can also use the singularity run <container> command.
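
For example, using the hello_world.py script from earlier:

$ singularity run ./centos7-python-3.4.el7.img ./hello_world.py
Hello, World!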

6.5 Using with Grid Engine

One of the major benefits of Singularity is the simplicity with which it can be used in an HPC environment: your Grid Engine qsub script needs only to load the singularity module and you can run your container. Bear in mind that RAM usage and runtime may be slightly higher than for native code.

Example:

#!/bin/bash
#$ -cwd
#$ -S /bin/bash
#$ -pe smp 1
#$ -l h_vmem=2G

### create a PDF from a markdown file called example.md
module load use.dev
module load singularity
singularity exec ~/containers/pandoc.img pandoc example.md -o output.pdf

Another example involves a more complex command. A container was built to run CentOS6 with Python 2.7: CentOS6 ships with Python 2.6.6, and Python 2.7 is only available via the CentOS Software Collections, so running it requires the special command scl enable.

centos6.def

BootStrap: yum
OSVersion: 6
MirrorURL: https://www.mirrorservice.org/sites/mirror.centos.org/%{OSVERSION}/os/$basearch/
UpdateURL: https://www.mirrorservice.org/sites/mirror.centos.org/%{OSVERSION}/updates/$basearch/
Include: yum

%post
    # necessary for centos6 due to lack of overlayfs
    mkdir /data
    yum -y install centos-release-SCL
    yum -y install vim python27

%runscript
    scl enable python27 "python2.7 $@"

python27test.qsub

#!/bin/bash
#$ -cwd
#$ -l h_rt=12::
#$ -l h_vmem=4G
module load use.dev
module load singularity
singularity exec centos6.img scl enable \
python27 "python python_prime.py"

Alternatively, because of our runscript, we could have used singularity run centos6.img python_prime.py instead.

6.6 Getting help

In addition to the Singularity website, commands such as singularity help and singularity help exec provide documentation. Please raise a support ticket with our team at its-research-support@qmul.ac.uk for additional queries.

6.7 Version of singularity installed

We will have the latest release available as the default stable version, loaded when module load singularity is called after loading the use.dev module. There will also be a recent installation built from the development repository for bleeding-edge testing, in a module labelled singularity/git. This will have the latest features and bug fixes, but comes with a greater risk of things breaking. If there is a bug affecting a few users before the next release, we may make an additional git snapshot version and make that the default.
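
For example, to test the bleeding-edge build instead of the stable default:

module load use.dev
module load singularity/git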

6.8 Using QM-supplied containers

Some official containers will be placed in /data/containers, and documentation will be supplied on the HPC documentation site.
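
Once published, these can be run directly from the shared location. For example, assuming (for illustration) the pandoc container from section 6.5 were supplied there:

module load use.dev
module load singularity
singularity exec /data/containers/pandoc.img pandoc example.md -o output.pdf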

7 Best practices

When bootstrapping a container, it is best to consider the following:

  • Install packages, programs, and data into operating system locations (not /home, /tmp, or any other directories that might get squashed by a bind mount). Typically applications not added by package managers end up in the /usr/local tree.
  • If you require any special environment variables to be defined, add them to the /environment file inside the container (see the sketch after this list). The exception is SINGULARITY_LD_LIBRARY_PATH, a special variable which may be set outside of Singularity, because LD_LIBRARY_PATH itself otherwise gets unset by the OS for security reasons.
  • Files added to the system should always be owned by a system account (User ID < 500) not a user.
  • Ensure that the container’s /etc/passwd, /etc/group, /etc/shadow, and other sensitive files have nothing but the bare essentials within them.
  • Do all of your bootstrapping via the definition file, for support and repeatability: no manual hacks, and no passing containers around instead of .def files. Anyone should be able to build the same container, given the same .def file.
  • If older versions of a package are required, it is recommended to use http://vault.centos.org, a snapshot of the older trees that are removed from the main CentOS servers as new point releases come out.
  • Submit your .def files to the github repository to help others and get better support: https://github.research.its.qmul.ac.uk/itsresearch/singularity-containers
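
As a sketch of the environment-variable guideline above, variables can be appended to the /environment file during the bootstrap (the variable name here is hypothetical):

%post
    echo 'export MY_TOOL_HOME=/usr/local/mytool' >> /environment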

8 Future Plans

After a suitable pilot phase (timespan depending on the level of activity and issues), if there are no significant outstanding issues and the software meets some or all of the needs of its users, we will look to deploy it into production.

This will involve further documentation and tutorials, guidelines for use, and a dedicated GPFS fileset for official images, in the same way that we supply official applications via /share/apps.

The Singularity developers are working on a community container hub, a repository for a wide range of container images and definitions for your own use, similar to Docker Hub.

9 Pilot

The aim of the pilot is to try the software with a small set of users, on the understanding that in the course of real-world use we may experience issues needing further examination, and that assistance will be required in working through and replicating those issues.

Users joining the trial should understand that:

  • They may experience issues with the software that cause it to partially or totally malfunction.
  • Where possible, research data generated by the software should be compared against native applications or other systems to validate results.
  • Where possible, job performance and resource usage should be noted (along with the scheduler job id) and compared with like-for-like native applications on the same hardware (recognising that the main benefit of Singularity is to provide applications that aren't available natively).
  • Pilot users should be willing to interact with support staff during the trial, providing information on the nature of jobs run, performance, and troubleshooting of issues.

Anyone interested in the pilot should register their interest by emailing its-research-support@qmul.ac.uk, stating their expected use of the containers; they will then be added to the singularity@qmul.ac.uk mailing list for wider discussion amongst pilot users and for relevant announcements during the pilot.

Until we have a clearer process regarding container builds, you will need to build containers on a machine on which you have root, and upload them to Apocrita, or ask ITS Research to build the container from a given def file. Ideally you should build a container on a VM running the same OS that will run inside the container. A Vagrant configuration file is supplied in the singularity-containers repository to allow building of a VirtualBox VM running CentOS7 and Singularity.
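
Assuming the Vagrantfile sits in the root of a clone of the repository, the workflow is roughly:

git clone https://github.research.its.qmul.ac.uk/itsresearch/singularity-containers
cd singularity-containers
vagrant up
vagrant ssh

Containers can then be built inside the VM with sudo, as shown in section 6.1.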

Please feed back your experiences and questions via its-research-support@qmul.ac.uk and/or the mailing list. This document will continue to be developed during the pilot, based upon feedback.

10 Citations

Research performed using Singularity should reference the DOI for the most recent release:

10.5281/zenodo.60736

Published research should reference Apocrita using:

This research utilised Queen Mary's MidPlus computational facilities, supported by QMUL Research-IT and funded by EPSRC grant EP/K000128/1.

11 Further reading

Singularity official docs