QNIBTerminal - pure SLURM
The foundation of QNIBTerminal is an image that holds consul and glues everything together. I used the Easter break to refine my qnib/slurm
images - this blog post give a quick intro.
SLURM is a resource scheduler that helps out by freeing your mind from how to use resources in a cluster. Hence the name Simple Linux Utility for Resource Management.
To spin up a cluster three daemons are necessary:
-
Munge (MUNGE Uid 'N' Gid Emporium) creates and validates credentials for authentication.
-
SLURM Controller The SLURM controller (
slurmctld
) is in charge of the actual scheduling. It gathers all information and dispatches the jobs. -
SLURM Daemon The
slurmd
daemon runs on all nodes within the cluster and connects to theslurmctld
, reports the node ready for duty.
Docker Images
For this simple version of a SLURM cluster, three docker containers are needed.
- qnib/consul bundles everything together by providing service discovery and a key/value store
- qnib/slurmctld provides the SLURM Controller
- qnib/slurmd acts as the compute node
SLURM config file
The two slum daemons use a common configuration file slurm.conf
.
This bugger has to be equal among all daemons, otherwise they will complain. Within my first iteration of QNIBTerminal I used etcd and some bash scripts to keep everyone in sync - this time around I use the power of consul. :)
consul services
The containers are reporting all services back to consul. Among them slurmctld
and slurmd
. Within consul, these nodes can be queried using the consul HTTP API:
$ export CHOST="http://consul.service.consul:8500"
$ curl -s ${CHOST}/v1/catalog/service/slurmd|python -m json.tool
[
{
"Address": "172.17.0.217",
"Node": "001a982e8af1",
"ServiceAddress": "",
"ServiceID": "slurmd",
"ServiceName": "slurmd",
"ServicePort": 6818,
"ServiceTags": null
}
]
And if that would be not enough the guys behind consul provide consul-template
to use the information painlessly simple.
consul-template
As the name suggests, consul-tempate
uses consul and creates config files out of it.
Since our SLURM cluster is going to be fairly dynamic (the cluster should grow and shrink if we feel like it) it has to be configured dynamically as well.
The hard part within the slum config file is to get the nodes dynamically created.
$ tail -n5 /usr/local/etc/slurm.conf
NodeName=001a982e8af1 NodeAddr=172.17.0.217
PartitionName=qnib Nodes=001a982e8af1 Default=YES MaxTime=INFINITE State=UP
This information derives out of this template:
$ tail -n6 /etc/consul-template/templates/slurm.conf.tmpl
{{range service "slurmd" "any"}}
NodeName={{.Node}} NodeAddr={{.Address}}{{end}}
PartitionName=qnib Nodes={{range $i, $e := service "slurmd" "any"}}{{if ne $i 0}},{{end}}{{$e.Node}}{{end}} Default=YES MaxTime=INFINITE State=UP
consul-template
gets a list of all nodes providing slurmd
(by default it only takes services into account that are up'n'running, "any" gets them all).
Supervisors holds the serves that listens for changes, recreates the configuration and restarts the service, that's it. :)
$ cat /etc/supervisord.d/slurmd_update.ini
[program:slurmd_update]
command=consul-template -consul consul.service.consul:8500 -template "/etc/consul-template/templates/slurm.conf.tmpl:/usr/local/etc/slurm.conf:supervisorctl restart slurmd"
redirect_stderr=true
stdout_logfile=syslog
fig
Info
FIG was a project back when - the predecessor of docker-compose
.
I must admit I am stuck at fig, I should update to docker-compose - but still...
The following fig file spins up the stack:
consul:
image: qnib/consul
ports:
- "8500:8500"
environment:
- DC_NAME=dc1
- ENABLE_SYSLOG=true
dns: 127.0.0.1
hostname: consul
privileged: true
slurmctld:
image: qnib/slurmctld
ports:
- "6817:6817"
links:
- consul:consul
environment:
- DC_NAME=dc1
- SERVICE_6817_NAME=slurmctld
- ENABLE_SYSLOG=true
dns: 127.0.0.1
hostname: slurmctld
privileged: true
slurmd:
image: qnib/slurmd
links:
- consul:consul
- slurmctld:slurmctld
environment:
- DC_NAME=dc1
- ENABLE_SYSLOG=true
dns: 127.0.0.1
#hostname: slurmd
privileged: true
I do not use a hostname to have a dynamic hostname for each slurmd container.
Logging into the first node I can use the slum commands:
$ docker exec -ti dockerslurmd_slurmd_1 bash
bash-4.2# sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
qnib* up infinite 1 idle 001a982e8af1
bash-4.2# srun hostname
001a982e8af1
By using fig scale
the cluster can be expanded...
$ fig scale slurmd=5
Starting dockerslurmd_slurmd_2...
Starting dockerslurmd_slurmd_3...
Starting dockerslurmd_slurmd_4...
Starting dockerslurmd_slurmd_5...
$ docker exec -ti dockerslurmd_slurmd_1 bash
bash-4.2# sinfo
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
qnib* up infinite 5 idle 001a982e8af1,9d8960b0d3ae,46e07712f89e,988187e8255a,e10c39a5ea12
bash-4.2# srun -N 5 hostname
001a982e8af1
e10c39a5ea12
46e07712f89e
988187e8255a
9d8960b0d3ae
And that's it...