As bioinformaticians, we often have to work on compute clusters. Several cluster scheduling systems are available, such as SGE and LSF. Although the main idea is the same, the commands used to submit a job differ slightly between them. In this article I will quickly show how to submit a job to an SGE (Sun Grid Engine) based cluster.
Two commands are particularly useful:
- qsub: to submit a job to the cluster
- qstat: to check the jobs you submitted to the cluster
There are two ways to submit a job to a cluster:
1 – Submission via a script
An example script is shown below; let’s call the script file jobscript.sh. In this case, the job parameters are specified in the header of the script, on lines starting with #$.
    #!/bin/bash
    #$ -N jobName
    #$ -q jobQueue
    #$ -o /path_to/jobs.out
    #$ -e /path_to/jobs.err
    cd /folder
    $YOUR_COMMAND
Here, we are starting a job with the following parameters:
- -N: name of the job
- -q: queue on which to run the job
- -o: path to the output file
- -e: path to the error file
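- The rest of the script corresponds to the commands we want to run
Some additional helpful headers are as follows:
- -pe smp 8: request the parallel environment smp with 8 slots, i.e. 8 threads for the job (the name of the parallel environment depends on how the cluster is configured)
- -cwd: start the job in the current working directory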
- -M: email address (to get notifications about the job status)
- -m b|e|a|s|n: events to be notified about (beginning, end, aborted, suspended, no mail); letters can be combined, e.g. -m bea
For more information see: http://gridscheduler.sourceforge.net/htmlman/htmlman1/qsub.html
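To see how these headers fit together, here is a minimal sketch that writes a job script and runs it once locally as a sanity check. The file name jobscript_smp.sh, the e-mail address and the echo command are placeholders chosen for illustration, and the parallel environment name smp is assumed to exist on your cluster. NSLOTS is the environment variable SGE sets to the number of slots granted to the job.

```shell
# Write a hypothetical job script that uses the additional headers.
# The quoted 'EOF' keeps the shell from expanding anything inside the script.
cat > jobscript_smp.sh <<'EOF'
#!/bin/bash
#$ -N jobName
#$ -q jobQueue
#$ -cwd
#$ -pe smp 8
#$ -M user@example.com
#$ -m bea
#$ -o jobs.out
#$ -e jobs.err

# NSLOTS is set by SGE inside the job; fall back to 1 when run outside SGE.
THREADS="${NSLOTS:-1}"
echo "Using ${THREADS} thread(s)"
# Placeholder: replace the echo above with your own multi-threaded command.
EOF

# Local dry run: bash treats the "#$" lines as ordinary comments,
# so this only checks that the command part of the script works.
bash jobscript_smp.sh
```

Reading the thread count from NSLOTS instead of hard-coding 8 keeps the command in step with the -pe line if you later change the number of slots.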
To start your job, run:
    qsub jobscript.sh
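Once the job is submitted, qstat can be used to follow it. The sketch below assumes that qsub prints its usual confirmation message of the form Your job 12345 ("jobName") has been submitted; the helper extract_job_id is a hypothetical name introduced here, and it does not handle array jobs, whose message has a different form.

```shell
# Hypothetical helper: pull the numeric job ID out of qsub's confirmation message.
extract_job_id() {
    sed -n 's/^Your job \([0-9][0-9]*\) .*/\1/p'
}

if command -v qsub >/dev/null 2>&1; then
    job_id=$(qsub jobscript.sh | extract_job_id)
    echo "Submitted job ${job_id}"
    qstat -j "${job_id}"   # detailed information about this one job
    qstat -u "$USER"       # all jobs belonging to the current user
else
    # No SGE on this machine: demonstrate the helper on a sample message.
    echo 'Your job 12345 ("jobName") has been submitted' | extract_job_id
fi
```

If your version of qsub supports the -terse option, it prints only the job ID, which makes the parsing step unnecessary.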
2 – Direct submission from the command line
Alternatively, you can type the job directly at the command line. When no script file is given, qsub reads the script from standard input:
    qsub -N jobName -q jobQueue -o /path_to/jobs.out -e /path_to/jobs.err [ENTER]
    cd /folder [ENTER]
    $YOUR_COMMAND [ENTER]
    [Ctrl+D]
This way the job is submitted as soon as you press Ctrl+D, which signals the end of the input. You can enter several commands, pressing ENTER after each one, and then press Ctrl+D on a new line after the last command.
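The same submission can be done without typing interactively by feeding the commands to qsub through a here-document, which is convenient inside other scripts. This is a minimal sketch: submit_from_stdin is a hypothetical helper name, the echo line stands in for your own command, and on a machine without qsub the helper simply prints the script as a dry run.

```shell
# Hypothetical helper: submit whatever arrives on standard input as a job.
submit_from_stdin() {
    if command -v qsub >/dev/null 2>&1; then
        qsub -N jobName -q jobQueue -o /path_to/jobs.out -e /path_to/jobs.err
    else
        # Dry run: no qsub available, so just show what would be submitted.
        cat
    fi
}

# The quoted 'EOF' prevents the local shell from expanding variables,
# so the text reaches qsub exactly as written.
submit_from_stdin <<'EOF'
cd /folder
echo "hello from the cluster"
EOF
```

The end of the here-document plays the role of Ctrl+D: it closes standard input, and qsub submits what it has read.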