Question
I am looking for software to improve my workflow when running computational (numerical) experiments.
After seeing how things are done in the supercomputer world with PBS, I want to get some of the benefits for myself.
I want to write job scripts and then leave them to run, rather than my current approach of starting a job and watching it work like a mother hen, which is a waste of time. I want to schedule multiple jobs and have them run when computer capacity is free.
Basically, I have, say, 8 computational experiments. I can't run them all at once, as 2 of them take up 24 GB of RAM each and the rest use about 10 GB each, and my system only has 32 GB of RAM. My current work is memory limited, but I would expect any job scheduling software to also support CPU (/thread) limited tasks.
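To make the memory constraint concrete, here is a rough sketch of the arithmetic, using the numbers above. The job names and the `plan_batches` helper are hypothetical, and the grouping is a simple first-fit heuristic, not how any particular scheduler works. A real scheduler would start each job as soon as enough memory is free rather than running fixed batches, and this ignores memory used by other users.

```python
# Hypothetical job list: names are made up, sizes (in GB) come from the question.
TOTAL_RAM_GB = 32
jobs = {
    "big1": 24, "big2": 24,
    "small1": 10, "small2": 10, "small3": 10,
    "small4": 10, "small5": 10, "small6": 10,
}


def plan_batches(jobs, capacity_gb):
    """Group jobs into batches whose combined memory fits in capacity_gb.

    Uses a first-fit-decreasing heuristic: largest jobs first, each placed
    into the first batch that still has room.
    """
    batches = []  # each entry is [remaining_gb, [job names]]
    for name, need in sorted(jobs.items(), key=lambda kv: -kv[1]):
        if need > capacity_gb:
            raise ValueError(f"{name} needs {need} GB, more than the machine has")
        for batch in batches:
            if batch[0] >= need:
                batch[0] -= need
                batch[1].append(name)
                break
        else:
            # No existing batch has room, so open a new one.
            batches.append([capacity_gb - need, [name]])
    return [names for _, names in batches]


for i, names in enumerate(plan_batches(jobs, TOTAL_RAM_GB), start=1):
    print(f"batch {i}: {names}")
```

With these numbers, each 24 GB job has to run without any 10 GB job beside it (24 + 10 > 32), while the 10 GB jobs can run three at a time.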
I only have one server, but it is very grunty. I am not the only user; however, most of the time all other users are only doing simple things like running command-line chat clients and text editors.
OS: runs on Debian Linux (Wheezy)
Price: Free as in beer (gratis). I'm fine if it is free for not-for-profit or research use.
Must not: require it (or its jobs) to be the only thing running on the server
Should: support Python job scripts
Should: email me the results (including standard output, etc.)
Should: be installable via a package manager on Debian, whether that is apt-get, pip, or RubyGems. (This isn't a deal breaker, but it would be extremely nice to have.)
Does not: require any kind of support for clustering
Does not: require any kind of GUI
The only thing I've heard of is TORQUE, and I don't know anyone who has used it, nor do I know much about it.
Explanation / Answer
Take a look at SLURM.
OS: it has a Debian package (source)
Price: again, it has a Debian package
Other workloads: other things can run on the server
Python job scripts: you can submit anything (with as many parameters as you like), as long as it is executable
Email: I don't know if there is direct support for that, but I think it is better handled by the script you submit. If necessary, you can write a wrapper script that is submitted to SLURM and does nothing but call the actual script, collect the results when it finishes, and send an email.
GUI: I think there are GUIs, but I have never used one, as I use SLURM at university over SSH.
Clustering: supported (I'm running speech recognition on an HPC cluster)
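The wrapper idea from the Email point could look roughly like the sketch below. It assumes Python 3.7 or newer and a mail server accepting SMTP on `localhost`. The function names, the addresses, and the script name in the usage note are placeholders.

```python
import smtplib
import subprocess
import sys
from email.message import EmailMessage


def run_and_report(cmd, sender, recipient):
    """Run cmd, capture its output, and return an email describing the result."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    msg = EmailMessage()
    msg["Subject"] = f"Job finished (exit code {result.returncode}): {' '.join(cmd)}"
    msg["From"] = sender
    msg["To"] = recipient
    msg.set_content(
        f"--- stdout ---\n{result.stdout}\n--- stderr ---\n{result.stderr}"
    )
    return msg


def send(msg, host="localhost"):
    """Send the message via an SMTP server (assumed to run on localhost)."""
    with smtplib.SMTP(host) as smtp:
        smtp.send_message(msg)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Placeholder addresses: replace with real ones.
    report = run_and_report(sys.argv[1:], "jobs@example.com", "me@example.com")
    send(report)
```

Saved as, say, `wrapper.py`, it would be the thing submitted to SLURM, with the real command passed as arguments, for example `python wrapper.py python experiment1.py`.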
Important commands are:
sbatch: submit a job
scancel: cancel a job
squeue: see all jobs
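Since the question asks about Python, here is a hedged sketch of driving `sbatch` from a Python script. The helper names and the job and script names are made up. The flags `--job-name`, `--mem`, `--output`, `--mail-type`, `--mail-user` and `--wrap` are documented `sbatch` options. Whether `--mem` actually limits how many jobs run at once depends on the SLURM configuration tracking memory as a consumable resource, which is assumed here. The mail options send job state notifications only, so emailing the actual output would still need a wrapper script.

```python
import shlex
import subprocess


def build_sbatch_command(script, mem_gb, job_name, mail_user=None):
    """Build an sbatch command line that runs a Python script as one job."""
    cmd = [
        "sbatch",
        f"--job-name={job_name}",
        f"--mem={mem_gb}G",              # memory request for the job
        f"--output={job_name}-%j.out",   # %j is replaced by the job ID
    ]
    if mail_user:
        cmd += ["--mail-type=END,FAIL", f"--mail-user={mail_user}"]
    # --wrap puts the command in a small shell script, so the Python file
    # itself does not need a shebang line or the executable bit.
    cmd.append(f"--wrap=python {shlex.quote(script)}")
    return cmd


def submit(cmd):
    """Run the sbatch command; raises CalledProcessError if submission fails."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True).stdout


# Hypothetical experiment: a 10 GB job called "exp1".
print(" ".join(build_sbatch_command("exp1.py", 10, "exp1")))
```

Calling `submit(...)` on each built command queues the experiments, after which `squeue` shows them and `scancel` followed by a job ID removes one.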