How to submit jobs on Combo

From Computational Biophysics and Materials Science Group
==Basic Commands==
 
Some basic commands that every cluster user should know before running jobs on this system:
 
{| class="wikitable"
! scope="col"|Command
! scope="col"|Description
|-
|qsub
|To submit a job to the queuing system
|-
|qdel
|To delete a job that has been submitted to the queuing system
|-
|qstat/showq
|List all information about queues and jobs
|}
 
  
 

Revision as of 14:58, 30 July 2014

Queuing System

For efficient use of the cluster, two monitoring and job management packages (PBS/Torque and Maui) have been installed.

After logging in to the cluster, the user is on the master node. Any program that is run there starts immediately on the master node itself. This is the "interactive mode", which is convenient for simple commands such as ls and vi, or for editing and compiling a program. Long computing jobs, however, should be submitted through the queuing system. A submitted job waits in a queue for its turn and is then sent to one or more compute nodes, to which it has dedicated access until it finishes. As a result, the job runs faster and the cluster is used more efficiently.

Sample PBS job scripts

PBS job script for Parallel OPENMPI
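The script itself is not reproduced on this page, so the following is only a minimal sketch of what a parallel Open MPI job script could look like. The file name `mpi_job.pbs`, the job name, the resource values, the walltime and the program `./my_mpi_program` are all hypothetical placeholders. `PBS_O_WORKDIR` and `PBS_NODEFILE` are environment variables that Torque sets for a running job.

```shell
# Write a sample Open MPI job script to mpi_job.pbs (hypothetical file name).
cat > mpi_job.pbs <<'EOF'
#!/bin/bash
#PBS -N mpi_job
#PBS -l nodes=2:ppn=4
#PBS -l walltime=01:00:00
#PBS -j oe

# Start in the directory from which the job was submitted.
cd "$PBS_O_WORKDIR"

# One line per allocated processor is listed in the node file.
NPROCS=$(wc -l < "$PBS_NODEFILE")

# ./my_mpi_program is a placeholder for your own MPI executable.
mpirun -np "$NPROCS" --hostfile "$PBS_NODEFILE" ./my_mpi_program
EOF
```

The `#PBS` lines are ordinary shell comments, so they are only interpreted when the file is handed to qsub. Adjust `nodes`, `ppn` and `walltime` to what the job actually needs.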

PBS job script for Serial job
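The script itself is not reproduced on this page, so the following is only a minimal sketch of a serial job script, assuming a single processor on a single node is enough. The file name `serial_job.pbs`, the job name, the walltime and the program `./my_program` are hypothetical placeholders.

```shell
# Write a sample serial job script to serial_job.pbs (hypothetical file name).
cat > serial_job.pbs <<'EOF'
#!/bin/bash
#PBS -N serial_job
#PBS -l nodes=1:ppn=1
#PBS -l walltime=01:00:00
#PBS -j oe

# Start in the directory from which the job was submitted.
cd "$PBS_O_WORKDIR"

echo "Job started on $(hostname) at $(date)"

# ./my_program is a placeholder for your own executable.
./my_program > output.log
EOF
```

The `-j oe` directive merges standard error into the standard output file that PBS writes for the job, which keeps all messages in one place.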

Submit Your Jobs

Submit your batch job from the front end with the command:

qsub [job_script]

The job is then assigned a job_name and a job_id, which can be used with the various commands described below.
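As a rough illustration, Torque's qsub prints the identifier of the new job in the form `sequence_number.server_name`. The helper below, whose name `job_number` is made up for this sketch, strips the server part so the number can be reused in later commands. The identifier `1234.master` and the script name `job.pbs` are invented examples.

```shell
# Print the numeric part of a Torque job identifier such as "1234.master".
job_number() {
    printf '%s\n' "${1%%.*}"
}

# Typical use on the cluster (commented out because it needs qsub):
# job_id=$(qsub job.pbs)
# echo "Submitted job number $(job_number "$job_id")"
```

Most PBS commands accept the full identifier as well, so the helper is only a convenience for logging or naming output files.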

Monitor Your Jobs

To follow the progress of running jobs, use the commands showq (Maui) and qstat (Torque). Both give a summary of the status of submitted jobs and queues, but they report slightly different information. qstat lists all running and waiting jobs in the queue, sorted by job identifier.

Please note that it sometimes takes a minute for a submitted job to show up under showq.

Another difference is that qstat shows the time used by running jobs, while showq displays the time left until the job will be killed by the queue system. When a job has finished, it no longer appears in the qstat or showq output.
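Because a finished job disappears from the qstat output, a script can wait for a job by polling qstat. The function below is a minimal sketch under that assumption: it relies on `qstat <jobid>` returning a non-zero exit status once the job is no longer known to the server. The function name `wait_for_job` and the default interval are choices made for this example.

```shell
# Block until the given job no longer appears in the qstat output.
# $1: job id, $2: polling interval in seconds (default 60)
wait_for_job() {
    local job_id=$1
    local interval=${2:-60}
    while qstat "$job_id" >/dev/null 2>&1; do
        sleep "$interval"
    done
}

# Typical use on the cluster (commented out because it needs qstat):
# wait_for_job 1234.master 30 && echo "Job 1234 has left the queue"
```

Keep the polling interval generous, since every call puts a small load on the master node.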

In addition, the web-based cluster monitor Ganglia is a helpful tool for monitoring the load and status of the compute nodes. Go to http://combos.tk/ganglia to view the status of Combo.

To delete a running job, use

qdel [jobid]

Frequently Used PBS Commands

PBS supplies a command line interface, which is used to submit, monitor, modify, and delete jobs. The following are some frequently used PBS user commands and their functions:

{| class="wikitable"
! scope="col"|Command
! scope="col"|Description
|-
|qsub
|Submit a job
|-
|qstat
|List all information of queues and jobs
|-
|qdel jobid
|Delete a job
|}

Frequently Used qsub Options

{| class="wikitable"
! scope="col"|Command
! scope="col"|Description
|-
|qsub -l list
|Set job resource list
|-
|qsub -N <jobname>
|Set job name to <jobname>
|-
|qsub -q <queue_name>
|Submit to queue <queue_name>
|}

Resources requested on the command line take precedence over the directive lines in the script file. For example, if the job is submitted with the command

qsub -l nodes=2:ppn=4 [jobscript]

the job will run on 2 compute nodes with 4 processors each, regardless of what is stated in the script file. Likewise, with

qsub -l nodes=compute-0-0:ppn=16 [jobscript]

the job will run on the specified node (compute-0-0 in this case) with 16 processors.
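To check a combination of options before submitting, it can help to print the command first. The sketch below only assembles and prints a qsub command line from the `-N` and `-l` options listed above. The function name `qsub_command`, the job name `test_run` and the script name `job.pbs` are hypothetical.

```shell
# Print (rather than run) a qsub command line.
# $1: job name, $2: node count or node name, $3: processors per node, $4: job script
qsub_command() {
    local name=$1 nodes=$2 ppn=$3 script=$4
    printf 'qsub -N %s -l nodes=%s:ppn=%s %s\n' "$name" "$nodes" "$ppn" "$script"
}

qsub_command test_run 2 4 job.pbs
# prints: qsub -N test_run -l nodes=2:ppn=4 job.pbs
```

Since command-line resources override the script directives, the same job script can be reused for runs of different sizes by changing only these arguments.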

Frequently Used qstat Options

{| class="wikitable"
! scope="col"|Command
! scope="col"|Description
|-
|qstat -a
|List all jobs with details
|-
|qstat -q
|List all queues on the system
|-
|qstat -n
|List all jobs with node information
|-
|qstat -u userid
|List all jobs owned by user userid
|-
|qstat -r
|List all running jobs
|-
|qstat -f jobid
|List all information known about the specified job (jobid)
|}
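The full listing from `qstat -f` is long, so scripts often pick out a single field. The following is a minimal sketch that extracts the state of a job, assuming the Torque output format in which the full listing contains a line of the form `job_state = R`. The function name `job_state` and the job identifier in the usage line are made up for this example.

```shell
# Print the state letter of a job (for example R for running, Q for queued).
# $1: job id
job_state() {
    qstat -f "$1" | awk -F' = ' '/job_state/ { print $2 }'
}

# Typical use on the cluster (commented out because it needs qstat):
# echo "Job 1234 is in state $(job_state 1234.master)"
```

The same pattern works for other fields of the full listing by changing the name that awk matches.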
