These notes are more for myself than anything else, but someone might find them useful, as it's difficult to find SGE documentation amid all the LSF material (e.g. qsub exists in both, but takes different parameters).
With a default installation of SGE, here are the changes made for the cluster in IMAPS;
This may not be the best scheduling policy, but it is better than the default FIFO behaviour.
Activate the functional share policy by specifying the number of functional share tickets. The value is arbitrary in the sense that any value greater than zero will enable the policy, but it needs to be large enough that the tickets can be shared out evenly among your users. e.g. if you have 10 users and each gets 100 shares (1000 in total), a round figure like 10000 is plenty. So, as root (or an SGE admin), edit the config file;
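For the functional policy this is the scheduler configuration, which is edited with:

```
qconf -msconf
```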
and set the following;
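The parameter in question is weight_tickets_functional; using the figure from above:

```
weight_tickets_functional         10000
```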
Now we need to assign users some tickets. To do this, edit a different config;
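This one is the global cluster configuration:

```
qconf -mconf
```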
and set the following;
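The two parameters that matter here are enforce_user and auto_user_fshare, so that SGE automatically creates a user object with 100 functional shares the first time someone submits a job:

```
enforce_user                 auto
auto_user_fshare             100
```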
This gives each user 100 tickets.
NOTE: When I did this I played around a lot to see how it behaves. I found that once a user has submitted a job through SGE before the automatic assignment of 100 tickets was in place (e.g. if you remove that setting to see what happens), they will never get tickets assigned after you turn it back on. So you need to add the shares for that user manually (there may be a better way);
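To do that, edit the user object directly (the username here is just an example):

```
qconf -muser bob
```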
and that will give you something like;
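These are the fields of the standard user object; set fshare to 100 (the name is again just an example):

```
name            bob
oticket         0
fshare          100
delete_time     0
default_project NONE
```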
This is simple in practice, but it has a couple of issues to be aware of. I did this using the popular h_vmem method; however, I may change this at some point. The reason is that it assumes h_vmem is both what we want to use and what the limit is, which may not be the case. e.g. if you have a job that initialises for a few minutes, peaks at 4GB, but then only uses 2GB for the next two weeks, reserving the peak for the whole run is a waste of resources on an eight-core machine with 16GB of RAM (e.g. the nodes in the IMAPS cluster). For now this will be the case until I can give it some serious thought.
First you need to make sure that h_vmem is a consumable resource;
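Edit the complex configuration:

```
qconf -mc
```

and change the consumable column for h_vmem to YES, so its line ends up looking like this (columns are name, shortcut, type, relop, requestable, consumable, default, urgency):

```
h_vmem      h_vmem      MEMORY    <=    YES    YES    0    0
```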
Now that you've done that, you need to add this resource as a complex value on each node, like so (clearly you can write a script to do this for every node).
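Something like this, assuming a node called node001:

```
qconf -me node001
```

and in the editor set:

```
complex_values        h_vmem=16G
```

(qconf -rattr exechost complex_values h_vmem=16G node001 does the same thing non-interactively, which is handy for scripting it across every node.)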
As you can see, I've added h_vmem=16G to node001. This is the amount of consumable memory that can be allocated on that node.
WARNING: Once you do this, h_vmem HAS to be set on all jobs, otherwise they will fail. To combat this for the forgetful, let's add a default value by editing the sge_request file. So locate and open the file in your editor;
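It lives in the cell's common directory, usually:

```
vim $SGE_ROOT/default/common/sge_request
```

(default is the usual cell name; yours may differ.)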
Add to the bottom the following;
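sge_request just holds default qsub options, so this is an ordinary resource request:

```
-l h_vmem=2G
```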
And this will now give a 2G limit to every job unless otherwise stated.
There is one issue with this method: for some reason IDL won't start, even when you specify a very large amount of memory. This is all down to the h_stack flag. To stop this being an issue, add the following line to the sge_request file;
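The default size is a judgement call; 128m below is just an illustrative figure:

```
-l h_stack=128m
```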
Clearly, if the default stack size is not enough for some programs, users can specify a larger stack using the -l h_stack flag. I did have one user running some Perl code that needed 512MB of stack space.
Change max_reservation from 0 to a number; in this case I've chosen 32.
Change default_duration from INFINITY to something very long (see the snippet below).
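Both settings are in the scheduler configuration (qconf -msconf again); a year is just one example of "very long":

```
max_reservation              32
default_duration             8760:00:00
```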
If you discover that a node isn't accepting jobs, here is what it may be;
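One way to check is to list the queue instances and their states:

```
qstat -f
```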
If you see;
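something like this (the numbers are made up; the d in the states column is the important bit):

```
queuename                  qtype resv/used/tot. load_avg arch          states
------------------------------------------------------------------------------
all.q@node01               BIP   0/3/8          2.87     lx26-amd64
all.q@node02               BIP   0/0/8          0.01     lx26-amd64    d
```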
Then we know that node02 is disabled. To bring it back on we basically re-enable a queue, which goes through and enables all nodes in that queue;
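Assuming the queue is called all.q:

```
qmod -e all.q
```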
You can also of course disable a queue;
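with the matching command:

```
qmod -d all.q
```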
NB: This needs to be run on your master node as root.
OK, this is simple. Just type;
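Assuming your queue is called all.q (if you manage hosts through a host group instead, qconf -mhgrp @allhosts is the equivalent edit):

```
qconf -mq all.q
```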
This will bring up a vi-like editor. Then change the second line;
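The top of the queue configuration looks like this, and hostlist is the line in question (the node names are just examples):

```
qname                 all.q
hostlist              node001 node002 node003
```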
...to whatever nodes you wish to have on it.
So you notice that there are lots of things pending in the queue. To check why a job isn't running, type;
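using the job number from the pending list:

```
qstat -j <job-id>
```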
And it will tell you why.
If you see lots of;
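messages like this in the scheduling info (the queue and host names are illustrative):

```
queue instance "all.q@node02" dropped because it is in error state
```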
Then type:
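```
qstat -f -explain E
```

(-explain E prints the reason for the error state alongside the usual -f listing.)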
This will tell you the status of the nodes. If they are in E status it will also tell you why, which is usually that a job caused it to stop, e.g.
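a message along the lines of (the job number and host are illustrative):

```
queue all.q marked QERROR as result of job 173100's failure at host node02
```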
You can move them out of error by typing;
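either for a single instance or everything at once with a wildcard (older SGE releases use plain -c instead of -cq):

```
qmod -cq all.q@node02
qmod -cq '*'
```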
If you would like to move a job from one queue to another, you can do this by;
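qalter accepts most of the same options as qsub, so for a pending job:

```
qalter -q all.q 173143
```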
Where all.q is the queue you wish to move it to, and 173143 is the job-id number.
Basically I wanted to measure the performance of various job submissions with different thread counts. Each run produced an output and error file of the form file_threadnumber.o12345, where 12345 is the job number and threadnumber is the number of threads the program used. This quite large one-liner does the following;
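roughly: for each output file, pull the thread count out of the filename and the run time out of the file itself, then print the pairs sorted by thread count. A sketch of the idea; the "Elapsed:" line it greps for is an assumed output format, so adjust it to whatever your program actually prints:

```
# prints "threads seconds" for every run, sorted by thread count
for f in file_*.o*; do echo "$(echo "$f" | sed 's/file_\([0-9]*\)\.o.*/\1/') $(grep 'Elapsed:' "$f" | awk '{print $2}')"; done | sort -n
```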
Probably a bit long-winded, but who doesn't like a good one-liner...
Things I have found useful;