
Submitting a Number of Dependent Jobs

The postprocessing or preprocessing of data may sometimes be so involved that it should be performed as a separate LoadLeveler job, rather than combined with the main computational task.

The simplest way to proceed in such a situation is to submit one LoadLeveler job, wait for it to finish execution, and then submit the second job. The submission of the second job can be performed from within the LoadLeveler script of the first job.

The following two scripts split the example from Section 7.4.4 into two steps.

The first script, called env-1.ll, uses the commands env, grep, and awk to generate a data file, in this case Emacs Lisp code, which is saved in llenv.el (remember that programs are data and that, in Lisp in particular, there is no semantic difference between programs and data: both are stored in the same data section of a Lisp process, and both can be modified dynamically during program execution). Once awk exits, the script checks whether the data file is there (for example, an error may have occurred while executing awk). It also checks whether the second LoadLeveler script can be found in its working directory. If both files are present, the second script is submitted with the llsubmit command.

gustav@sp20:../LoadLeveler 23:40:22 !751 $ cat env-1.ll
# @ shell = /afs/ovpit.indiana.edu/@sys/gnu/bin/bash
# @ output = env-1.out
# @ error = env-1.err
# @ job_type = serial
# @ class = test
# @ notification = always
# @ environment = COPY_ALL
# @ queue
env | grep LOADL | \
awk ' BEGIN { 
         { printf "(defun llenv ()\n" } 
         { printf "   (princ \"LoadLeveler variables:\\n\") " }
      }
      { printf "   (princ \"\t%s\\n\")\n", $0 } 
      END { print ")"} ' > llenv.el
if [ -f llenv.el -a -f env-2.ll ]
then
   llsubmit env-2.ll
fi
gustav@sp20:../LoadLeveler 23:40:25 !752 $
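
The awk stage can be tried outside LoadLeveler by feeding it a hand-made line in place of the output of env | grep LOADL. The sketch below wraps the awk program from env-1.ll in a shell function. The function name gen_llenv and the sample line LOADL_DEMO=1 are made up for this illustration and do not come from LoadLeveler.

```shell
# Hypothetical wrapper around the awk program from env-1.ll:
# reads NAME=value lines on stdin, writes an Emacs Lisp defun on stdout.
gen_llenv() {
   awk ' BEGIN {
            { printf "(defun llenv ()\n" }
            { printf "   (princ \"LoadLeveler variables:\\n\") " }
         }
         { printf "   (princ \"\t%s\\n\")\n", $0 }
         END { print ")"} '
}

# LOADL_DEMO=1 is a made-up sample line, not a real LoadLeveler variable.
printf 'LOADL_DEMO=1\n' | gen_llenv
```

The first line of the output should be (defun llenv () and the last line a closing parenthesis, with one princ form per input line in between.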

The second script is called env-2.ll. First it checks if the file llenv.el exists. Even though we have already checked that within env-1.ll, here we do so again, because the scripts are separate, and there is always a possibility that env-2.ll may have been submitted without running env-1.ll first. If the file exists, we run emacs on it; if it doesn't, we flag an error and exit. The data file itself, llenv.el, is removed after emacs has had its way with it.

gustav@sp20:../LoadLeveler 23:43:10 !758 $ cat env-2.ll
# @ shell = /afs/ovpit.indiana.edu/@sys/gnu/bin/bash
# @ output = env-2.out
# @ error = env-2.err
# @ job_type = serial
# @ class = test
# @ notification = always
# @ environment = COPY_ALL
# @ queue
if [ -f llenv.el ]
then
   emacs -batch -l llenv.el -f llenv > llenv.out
else
   echo Error: env-2.ll job: llenv.el not found
   exit 1
fi
rm llenv.el
gustav@sp20:../LoadLeveler 23:43:27 !759 $
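
The existence test in env-2.ll can be tightened a little. The sketch below is not part of the original scripts. It assumes that an empty llenv.el should be treated like a missing one, and so uses the shell's -s test, which is true only for a file that exists and is not empty. The helper name require_data_file is hypothetical, and a scratch file stands in for llenv.el.

```shell
# Hypothetical helper: succeed only if the named file exists and is not empty.
require_data_file() {
   if [ -s "$1" ]
   then
      return 0
   fi
   echo "Error: $1 not found or empty" >&2
   return 1
}

# Demonstration with a scratch file standing in for llenv.el.
scratch=$(mktemp)
require_data_file "$scratch" 2>/dev/null || echo "empty file rejected"
echo '(defun llenv ())' > "$scratch"
require_data_file "$scratch" && echo "non-empty file accepted"
rm -f "$scratch"
```

In env-2.ll such a helper would take the place of the [ -f llenv.el ] test. Whether the stricter check is worth having depends on how the first job can fail.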

It is easy to restructure the two scripts above into one script, which first performs one task, then resubmits itself, and on the second invocation performs the second task.

In order to do that, the script must be able to find out on its own whether its current instantiation is the first or the second one. If you have a creepy feeling now that we are getting close to talking about reincarnation, well, yes, you're quite right. That's exactly what we're talking about! How can a process know that it already lived before?

The answer is: by inspecting its environment and finding a particular variable set. The variable would be set during the first instantiation of the LoadLeveler job. It would not exist at all outside of those LoadLeveler jobs, i.e., the user should make sure that it is unset in the user's normal environment.

Here's the script:

gustav@sp20:../LoadLeveler 23:51:33 !803 $ cat env-3.ll
# @ shell = /afs/ovpit.indiana.edu/@sys/gnu/bin/bash
# @ output = $(job_name).out
# @ error = $(job_name).err
# @ job_type = serial
# @ class = test
# @ notification = always
# @ environment = COPY_ALL
# @ queue
if [ -z "$ENV_SECOND_SUBMISSION" ]
then
   env | grep LOADL | \
   awk ' BEGIN { 
            { printf "(defun llenv ()\n" } 
            { printf "   (princ \"LoadLeveler variables:\\n\") " }
         }
         { printf "   (princ \"\t%s\\n\")\n", $0 } 
         END { print ")"} ' > llenv.el
   if [ $? -eq 0 ]
   then
      export ENV_SECOND_SUBMISSION="yes"
      llsubmit $LOADL_STEP_COMMAND
   else
      echo Error: problem executing awk
      exit 1
   fi
else
   emacs -batch -l llenv.el -f llenv > llenv.out
   rm llenv.el
fi
gustav@sp20:../LoadLeveler 23:52:16 !804 $

The script works as follows. The first step is to check whether the environment variable ENV_SECOND_SUBMISSION has been set to something. If not, this instantiation of the job has no ancestor. In this case the script calls env, grep, and awk to create the data file, llenv.el. After awk exits we inspect its exit status, $?, and only if it is 0 do we define and export the new environment variable, ENV_SECOND_SUBMISSION. The script then resubmits itself, because the script itself is what $LOADL_STEP_COMMAND evaluates to. The variable ENV_SECOND_SUBMISSION will be visible in the second instantiation of the job because of the LoadLeveler #@environment=COPY_ALL directive.
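
One detail of the test on $? deserves a note: after a pipeline, $? holds the exit status of the last command only, here awk, so a failure of env or grep goes unnoticed. The sketch below illustrates this with the stand-in commands false and true. The function names are made up for the illustration, and PIPESTATUS is a bash feature, which is acceptable here only because these scripts run under bash.

```shell
# $? after a pipeline reports the last stage only.
last_status() {
   false | true
   echo $?
}

# bash keeps the status of every stage in the PIPESTATUS array.
all_statuses() {
   false | true
   echo "${PIPESTATUS[*]}"
}

last_status     # prints 0
all_statuses    # prints 1 0
```

A script that needs to react to a failure in an earlier stage could inspect PIPESTATUS immediately after the pipeline, before any other command overwrites it.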

If the environment variable ENV_SECOND_SUBMISSION is found to have been set to a non-empty string, the second clause of the if statement is executed. Within that clause we invoke emacs on the llenv.el file. The file is removed after emacs exits.
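
The test at the top of env-3.ll can be tried in an ordinary shell, without LoadLeveler. The sketch below isolates it in a function, which_pass, a name made up for this illustration, and calls it once with the marker variable unset and once with the variable exported in a subshell.

```shell
# Report which pass this is, judging only by the marker variable.
which_pass() {
   if [ -z "$ENV_SECOND_SUBMISSION" ]
   then
      echo first
   else
      echo second
   fi
}

unset ENV_SECOND_SUBMISSION
which_pass                                            # prints first
( export ENV_SECOND_SUBMISSION="yes"; which_pass )    # prints second
```

Because -z is also true for a variable that is set to the empty string, an empty ENV_SECOND_SUBMISSION counts as a first pass.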

Observe that the #@output and #@error directives have been defined in terms of $(job_name) this time. Each instantiation of the script will have a different $(job_name), so that the output and error files for the second instantiation of the job will not overwrite output and error files written by the first instantiation of the job. That is important in case any execution problems arise.



 
Zdzislaw Meglicki
2001-02-26