SoT cluster - the School of Technology computing cluster


Smallvoice uses the Slurm workload manager to create a computing cluster.

When logged on to the cluster, the user is always on the login node, called freedom, and should do all their work there.
/home is hosted on an NFS server, so every node sees the same “physical” disks.
All user jobs should be submitted with slurm's sbatch (sbatch job.sh); please do not run jobs locally on the login node. A minimal job script sketch is shown below.
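
The following is a minimal sketch of a job script; the script name (job.sh), the Python entry point (train.py) and the resource requests are placeholders, so adjust them to your own workload:

<code bash>
#!/bin/bash
#SBATCH --job-name=my-train        # name shown in squeue
#SBATCH --output=my-train-%j.log   # stdout/stderr log file (%j = job id)
#SBATCH --gres=gpu:1               # request one GPU on the node
#SBATCH --time=08:00:00            # wall-clock limit (hh:mm:ss)

# the home directory is shared over NFS, so data and code
# prepared on freedom are visible on the compute node
python3 train.py
</code>

Submit the script from freedom with sbatch job.sh and follow its progress with squeue -u $USER.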

The computing (Slurm) environment

There are 4 partitions (queues), and access depends on your account:

^ Name ^ Nodes ^ GPU ^ Time limit ^ Usage ^
| doTrain | 3 | Nvidia A100 GPU | no limit | staff only |
| basic | 3 | Nvidia A100 GPU | 31 hours | for students |
| bigVoice | 2 | Nvidia A100 GPU | no limit | - |
| build | 3 | Nvidia A100 GPU | no limit | staff only |

The default queue for staff is doTrain (and basic for students), so it is not necessary to choose a queue, but it is possible to specify a different one, as sketched below.
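
A minimal sketch of how to request a specific partition; bigVoice is only used as an example here, pick whichever queue your account has access to:

<code bash>
# choose a partition on the command line ...
sbatch --partition=bigVoice job.sh

# ... or inside the job script itself
#SBATCH --partition=bigVoice
</code>

The available partitions and their current state can be listed with sinfo.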

Installed software and drivers

* NVIDIA A100 GPU drivers
* CUDA toolkit (version 11.7)
* Intel oneAPI Math Kernel Library
* Python 3.9.7
* pip 20.3.4
* ffmpeg + sox
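
To confirm what is available, the versions can be checked directly; the exact command names (e.g. python3 vs. python, and whether nvcc is on the PATH) may differ slightly on this cluster:

<code bash>
nvidia-smi          # GPU driver and A100 status
nvcc --version      # CUDA toolkit release
python3 --version   # Python interpreter
pip3 --version      # pip
ffmpeg -version     # ffmpeg build info
sox --version       # sox build info
</code>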

If additional software or a different version is needed, you can ask the sysadmin (compute@ru.is) for assistance.
