next up previous contents index
Next: KSIX Cluster Middleware Up: SCE: Scalable Cluster Environment Previous: KCAP Web-based Cluster Management   Contents   Index


SQMS Batch Scheduling


Overview

SQMS is a simple queueing management system developed to simplify task management on a Beowulf cluster. It is part of SCE (Scalable Cluster Environment, http://smile.cpe.ku.ac.th/research/sce). SQMS provides a set of commands that let developers in a Beowulf cluster environment submit programs to the computing nodes, query their status, and delete them. SQMS is released under an open-source (BSD-style) license.


Features

  1. Supports any type of cluster that meets the requirements below
  2. Supports both sequential and parallel tasks; parallel support currently covers only MPI programs
  3. Developers can easily create a new load-balancing policy using the SQMS API


Requirements

  1. GNU/Linux Red Hat 6.0 or later with a 2.2.x kernel
  2. rsh and rlogin
  3. pthreads
  4. A directory that is shared by all nodes in the cluster. For now, SQMS must be installed in such a shared directory.


Installation

  1. Download the package from http://prg.cpe.ku.ac.th/

  2. Unpack the SQMS distribution:

    
    # gzip -dc sqms-1.x.x.tar.gz | tar -xvf -
    

    or

    
    # tar xzvf sqms-1.x.x.tar.gz
    

    After unpacking, the directory structure of SQMS will be as listed below.

  3. Compile and install SQMS. Currently, SQMS supports only Red Hat 6.x and Red Hat 7.0. First, change to the sqms directory created by unpacking, and then run:

    
    # ./configure
    

    or

    
    # ./configure --prefix=<install directory>
    # make
    # make install
    

    where <install directory> is where you want SQMS installed. The default is /usr/software/sqms-<version>.

  4. Add <install directory>/sqms-1.x.x-x/bin to your default PATH:

    
    export PATH=$PATH:<install directory>/sqms-1.x.x-x/bin
    

    Example:

    
    export PATH=$PATH:/usr/software/sqms-1.0b-2/bin
    
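To make this setting survive future logins, the export line can also be appended to a shell startup file. A minimal sketch, assuming a Bourne-style login shell that reads ~/.bash_profile and the default install prefix from the example above:

```shell
# Append the SQMS bin directory to PATH for future login shells.
# /usr/software/sqms-1.0b-2 is the default install prefix shown above;
# adjust it to your actual <install directory>.
echo 'export PATH=$PATH:/usr/software/sqms-1.0b-2/bin' >> "$HOME/.bash_profile"
```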


Manual


Start SQMS Services

After configuring SQMS, you can start it by:


# $etcdir/rc.d/init.d/sqms start 
*****************
Start each daemon
*****************
 
Starting SQMS scheduler                                    [  OK  ]
Starting SQMS loadbalancer                                 [  OK  ]
Starting SQMS dispatcher                                   [  OK  ]

Or you may start the daemons manually:


# $sbindir/scheduler
# $sbindir/loadbalancer
# $sbindir/dispatcher


Stop SQMS Services

If you want to shut down SQMS, you can do it by:


# $etcdir/rc.d/init.d/sqms stop

or


# killall scheduler loadbalancer dispatcher


Submit job

sqsub - submits a job to SQMS. A job is any executable program. For example:


$ sqsub myprog
1

Output and errors will be redirected to myprog.1out and myprog.1err; 1 is the job ID. If you want to pass arguments to your program, create a shell script that contains the program and its arguments. For example, you might create a file called 'myprog.sh' which contains


#!/bin/sh 
myprog arg1 arg2 arg3 arg4 arg5

and submit this script to SQMS by


$ sqsub myprog.sh

For more information, see man sqsub.
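Since this wrapper-script pattern comes up for every job that takes arguments, the script can be generated on the fly. A minimal sketch; 'myprog' and its arguments are placeholders for your own program:

```shell
# Write a wrapper script that runs the program with fixed arguments,
# then mark it executable so SQMS can run it on a compute node.
cat > myprog.sh <<'EOF'
#!/bin/sh
myprog arg1 arg2 arg3
EOF
chmod +x myprog.sh
# The wrapper is then submitted as usual: sqsub myprog.sh
```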


Delete job

sqdel - deletes a job from SQMS:


$ sqdel 1

where 1 is the job ID. For more information, see man sqdel.


View job status

sqstat - shows the status of your jobs. Just type:


$ sqstat 
JOBID   OWNER   COMMAND         JOBTYPE STATUS  RUNHOSTS 
28      ssy     test            local   wait 
29      ssy     test            local   wait 
30      ssy     test            local   wait 
3       ssy     test            local   run     amata11 
5       ssy     test            local   run     amata1 
6       ssy     test            local   run     amata2 
7       ssy     test            local   run     amata3 
8       ssy     test            local   run     amata4 
9       ssy     test            local   run     amata5 
10      ssy     test            local   run     amata6 
11      ssy     test            local   run     amata7 
15      ssy     test            local   run     amata11 
17      ssy     test            local   run     amata1 
18      ssy     test            local   run     amata2 
19      ssy     test            local   run     amata3 
20      ssy     test            local   run     amata4 
21      ssy     test            local   run     amata12 
22      ssy     test            local   run     amata1 
23      ssy     test            local   run     amata2 
24      ssy     test            local   run     amata3 
25      ssy     test            local   run     amata4 
26      ssy     test            local   run     amata5 
27      ssy     test            local   run     amata6

For more information, see man sqstat.
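Because the column layout above is fixed, sqstat output is easy to post-process with standard tools. A sketch that extracts the IDs of waiting jobs, shown here against a captured sample rather than a live cluster (on a real system you would pipe sqstat directly into awk):

```shell
# Sample sqstat output (first lines of the listing above).
sample='JOBID   OWNER   COMMAND         JOBTYPE STATUS  RUNHOSTS
28      ssy     test            local   wait
29      ssy     test            local   wait
3       ssy     test            local   run     amata11'

# Skip the header row, keep rows whose STATUS column (field 5) is
# "wait", and print the job ID (field 1).
echo "$sample" | awk 'NR > 1 && $5 == "wait" { print $1 }'
```

Piped into xargs -n1 sqdel, the same filter would remove every waiting job at once.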


Configuration

Normally, SQMS reads its configuration from two files:

  1. $etcdir/sce/sqms.conf
  2. $etcdir/sce/sqms_hostlist

$etcdir is the etc directory that was set at compile time using the --with-etcdir option. After a fresh SCE installation, the SQMS RPM determines the installed hostnames automatically, and the SCE wizard updates sqms_hostlist to match the cluster configuration. If you wish, you can also edit sqms_hostlist by hand.


$etcdir/sce/sqms.conf

This file must exist on every node in the system. If it does not, users will not be able to submit jobs from any host other than the one where SQMS was installed.

$etcdir/sce/sqms_hostlist

This file contains the host list of the cluster, one node per line. You can give a host more weight by adding duplicate entries to the list.


Example

/etc/sce/sqms.conf



############################################################################# 
#Configuration file for SQMS 
#$Id: sqms.tex,v 1.5 2001/06/21 08:55:22 b40sup Exp $ 
############################################################################# 
 
#Below is the current working directory of SQMS. If not set, SQMS will 
#determine the working directory from the program's name 
#working_directory   /usr/local/share/sqms 
 
#Default scheduler, dispatcher, and load balancer addresses and ports. 
#These daemons may run on different hosts, but the hosts must share the 
#SQMS working directory 
 
#Scheduler address 
scheduler_host    amata1.cpe.ku.ac.th 
scheduler_port    3456 
 
#Dispatcher address 
dispatcher_host   amata1.cpe.ku.ac.th 
dispatcher_port   3457 
 
#Load balancer address 
loadbal_host      amata1.cpe.ku.ac.th 
loadbal_port      3458 
 
#Maximum number of jobs executing at once 
max_exec_job       20 

/etc/sce/sqms_hostlist

amata1 
amata2 
amata2 
amata2 
amata3 
amata4 
amata5 
amata6 
amata7 
amata8 
amata9 
amata10 
amata11 
amata12

In this example, amata2 will receive more jobs than the other hosts.
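Since a host's weight is just its number of occurrences in the file, the effective weights can be checked with sort and uniq. A sketch using an inline copy of part of the list above instead of the real file:

```shell
# Count how many times each host appears; the count is the host's
# relative weight. Against the real file you would run:
#   sort $etcdir/sce/sqms_hostlist | uniq -c | sort -rn
printf '%s\n' amata1 amata2 amata2 amata2 amata3 | sort | uniq -c | sort -rn
```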


Sugree Phatanapherom
2001-06-21