
Smallvoice - the Language and Voice lab computing cluster


Smallvoice uses the Slurm workload manager to form a computing cluster.
The cluster has 6 nodes:

Node name   Role(s)
atlas       management node, worker
freedom     login node, worker
hercules    worker node
samson      worker node
goliath     worker node
obelix      worker node

When logged on to the cluster, users always land on the login node, freedom, and do all of their work there.
/home (and work) are hosted on an NFS server, so every node sees the same “physical” disks.
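
Since all interactive work happens on freedom, a quick way to confirm where a shell or script is running is to compare the hostname with the login node's name. This is a minimal sketch in Python; it assumes the node reports its short name as "freedom":

  import socket

  # The login node is called "freedom"; worker nodes are only reached
  # through Slurm and should not be used for interactive work.
  LOGIN_NODE = "freedom"

  hostname = socket.gethostname().split(".")[0]  # strip any domain suffix
  if hostname == LOGIN_NODE:
      print("On the login node - submit jobs from here with sbatch/srun.")
  else:
      print(f"Running on {hostname} - a worker node allocated by Slurm.")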

The computing (slurm) environment

There are 3 partitions / queues available:

Name      Cores         Memory (GB)   Nodes  GPU              Time limit  Usage
allWork   16+18+12+12   64+40+48+48   4      Nvidia A100 GPU  7 days      staff only
doTrain   16+18+12+18   64+40+48+40   4      Nvidia A100 GPU  no limit    staff only
beQuick   18+12         40+48         2      Nvidia A100 GPU  36 hours    for students

The default queue for staff is doTrain (and beQuick for students), so it is not necessary to choose a queue, but it is possible to specify a different one.
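
As an example, a partition can be selected explicitly when submitting a job. The Python sketch below simply wraps a standard sbatch call; the script name train.py, the GPU count and the time limit are placeholder assumptions, and only the partition name beQuick comes from the table above:

  import subprocess

  # Submit a (placeholder) training script to the student queue, beQuick.
  # Only the partition name comes from the table above; the GPU count,
  # time limit and script name are examples to adjust.
  result = subprocess.run(
      [
          "sbatch",
          "--partition=beQuick",   # drop this line to use your default queue
          "--gres=gpu:1",          # request one A100 GPU
          "--time=24:00:00",       # must stay within the 36 hour limit
          "--wrap", "python3 train.py",
      ],
      capture_output=True,
      text=True,
      check=True,
  )
  print(result.stdout.strip())     # e.g. "Submitted batch job 12345"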

Installed software and drivers

* NVIDIA A100 GPU drivers
* CUDA toolkit (version 11.7); a version check sketch follows this list
* Intel oneAPI Math Kernel Library
* Python 3.9.2
* pip 20.3.4
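
To confirm which of these versions a given node actually exposes, a quick check along the following lines can be run. This is a sketch; it assumes nvcc is on the PATH, which may not be the case depending on how the toolkit was installed:

  import shutil
  import subprocess
  import sys

  # Python interpreter version on the current node.
  print("Python:", sys.version.split()[0])

  # pip version, queried through the same interpreter.
  pip = subprocess.run([sys.executable, "-m", "pip", "--version"],
                       capture_output=True, text=True)
  print(pip.stdout.strip())

  # CUDA toolkit version, if nvcc is on the PATH.
  nvcc = shutil.which("nvcc")
  if nvcc:
      out = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
      print(out.stdout.strip())   # full nvcc banner, includes the release number
  else:
      print("nvcc not found on PATH - the CUDA toolkit may live elsewhere.")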
