SoT cluster - the School of Technology computing cluster
The cluster, named Smallvoice, uses the Slurm workload manager.
When logged on to the cluster, users land on the login node, called freedom, and should do all their work from there.
Home folders for all users are hosted on an NFS server, so every node sees the same "physical" disks.
All user jobs should be submitted with slurm's sbatch (sbatch job.sh); please do not run jobs locally on the login node. A minimal example job script is sketched below.
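The following is only a sketch of such a job script; the job name, output file, GPU request, time limit, and the train.py command are illustrative assumptions, not taken from the cluster documentation.

```bash
#!/bin/bash
#SBATCH --job-name=example-job        # illustrative job name
#SBATCH --output=example-job.%j.out   # stdout/stderr file, %j expands to the job ID
#SBATCH --gres=gpu:1                  # request one GPU (assumes the GRES is named "gpu")
#SBATCH --time=24:00:00               # wall-clock limit; must fit within the partition's time limit

# The script runs on a compute node; because home folders are on NFS,
# the same files are visible there as on the login node.
python3 train.py                      # placeholder for your own program
```

Submit it from the login node with sbatch job.sh and follow its progress with squeue -u $USER.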
The computing (Slurm) environment
There are 4 partitions/queues, and access depends on your account:
| Name     | Nodes | GPU             | Time limit | Usage      |
|----------|-------|-----------------|------------|------------|
| doTrain  | 3     | Nvidia A100 GPU | no limit   | staff only |
| basic    | 3     | Nvidia A100 GPU | 31 hours   | students   |
| bigVoice | 2     | Nvidia A100 GPU | no limit   | -          |
| build    | 3     | Nvidia A100 GPU | no limit   | staff only |
The default queue for staff is doTrain (and basic for students), so it is not necessary to choose a queue, but it is possible to specify a different one, as shown below.
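For example, a different partition can be requested either on the command line or inside the job script itself; bigVoice is used below purely for illustration:

```bash
# Request a specific partition when submitting:
sbatch --partition=bigVoice job.sh

# Or add a directive line to job.sh instead:
#   #SBATCH --partition=bigVoice
```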
Installed software and drivers
* NVIDIA A100 GPU drivers
* CUDA toolkit [version 11.7]
* Intel oneAPI Math Kernel Library
* Python 3.9.7
* pip 20.3.4
* ffmpeg + sox
If additional software or a different version is needed, you can ask the sysadmin (compute@ru.is) for assistance.
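As a quick sanity check, the versions listed above can be confirmed from a compute node with srun; this is only a sketch and assumes the defaults (default partition, one task) are acceptable for a short command:

```bash
# Check the GPU driver and CUDA version visible on a compute node:
srun nvidia-smi

# Check the Python, pip, ffmpeg and sox versions:
srun python3 --version
srun pip --version
srun ffmpeg -version
srun sox --version
```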