VisIt is composed of several cooperating components. To unify them under one "visit" command, a launch script is used. The launch script encapsulates version-specific coding and takes care of setting environment variables and everything else that must be in place for VisIt to run properly. The launcher also translates various command line arguments into MPI job submission commands, enabling VisIt to run in parallel under many job-control systems. The main reason the launch is handled by a script is to allow for customizations.
- Organization
- VisIt 2.6
  - Customization
  - JobSubmitter
  - Debugger
  - MainLauncher
  - Differences from older versions
VisIt's launch scripts consist of two to three Python scripts: a rarely changed frontendlauncher (or visit) script and a version-specific internallauncher script. VisIt 2.6 and later also permit the use of a customlauncher script to allow for site-specific customizations.
The frontendlauncher is aliased to visit and is the command that users run. The frontendlauncher script takes care of the following:
- version selection
- architecture selection
In a typical VisIt installation, the binaries for many versions and platforms can coexist. The top level bin directory contains the frontendlauncher and visit command. The top level directory also contains several version subdirectories, each containing at least one architecture. The frontendlauncher selects the newest version for the appropriate architecture.
```
visit/bin
visit/2.4.0
visit/2.4.0/bin
visit/2.4.0/linux-intel
visit/2.4.0/darwin-x86_64
visit/2.5.0
visit/2.5.0/bin
visit/2.5.0/linux-intel
visit/2.5.0/darwin-x86_64
```
Within each version subdirectory, there is also a bin directory containing the internallauncher. The internallauncher is a script that actually takes care of setting up and executing the VisIt programs. The frontendlauncher runs the internallauncher. This division lets different VisIt versions have version-specific modifications in the launching code.
The internallauncher takes care of:
- running VisIt component programs
- setting up environment variables
- submitting parallel jobs
- running under a debugger
The customlauncher script is an optional script that allows maintainers to customize the VisIt launch procedure for their site. The customlauncher script will often be used to return custom subclasses of JobSubmitter that alter the way VisIt launches parallel jobs.
Individual customlauncher scripts get installed by visit-install, which copies customlauncher into the internal bin directory next to internallauncher. The customlauncher scripts should be placed in VisIt's source tree in src/resources/hosts/<site> where <site> is the name of a computing site. The site names typically have the same name as the options found in visit-install.
```
visit/bin
visit/bin/frontendlauncher
visit/2.6.0
visit/2.6.0/bin
visit/2.6.0/bin/internallauncher
visit/2.6.0/bin/customlauncher
visit/2.6.0/linux-intel
visit/2.6.0/darwin-x86_64
```
VisIt 2.6 introduced a new version of the launch scripts. Whereas the previous scripts were written in Perl, the new scripts are written in Python and are better structured for extensibility and customization. In the new scheme, the frontendlauncher runs the internallauncher in the same Python interpreter rather than spawning a new command.
The new internallauncher script contains various Python classes that help launch VisIt commands.
- JobSubmitter classes let VisIt submit a parallel compute engine to a job control system.
- Debugger classes help launch VisIt under a debugger.
- The MainLauncher class contains the methods used to effect a launch.
The main function of the internallauncher script is the internallauncher function, which uses a MainLauncher object (or a derived class) to go through the various steps needed to run a VisIt program.
The previous versions of internallauncher had hacks for various HPC centers strewn throughout the script. The most common pattern was to have some top level initialization for a specific site and then various hacks to MPI job submission elsewhere in the script.
The new launch system allows for a customlauncher file that contains a derived class of MainLauncher. The derived class can perform its own site-specific top-level initialization without polluting the main internallauncher script. Furthermore, since MPI launching is handled by various JobSubmitter classes, the derived MainLauncher class can return its own JobSubmitter classes that contain site-specific tweaks to MPI launching.
Here is a simple example customlauncher script:
```python
# Custom launcher
class SiteSpecificLauncher(MainLauncher):
    def __init__(self):
        super(SiteSpecificLauncher, self).__init__()

    def Customize(self):
        # ----
        # Global initialization
        # ----
        if self.sectorname() == "mycluster":
            paths = self.splitpaths(GETENV("LD_LIBRARY_PATH"))
            addedpaths = ["/usr/local/compilers/GNU/gcc-4.3.2/lib64"]
            SETENV("LD_LIBRARY_PATH", self.joinpaths(paths + addedpaths))

# Launcher creation function
def createlauncher():
    return SiteSpecificLauncher()
```
Here is a simple example that returns a custom JobSubmitter for mpirun:
```python
# Custom mpirun job submitter
class JobSubmitter_mpirun_custom(JobSubmitter_mpirun):
    def __init__(self, launcher):
        super(JobSubmitter_mpirun_custom, self).__init__(launcher)

    #
    # Override the name of the mpirun executable, give it arguments
    #
    def Executable(self):
        return ["/my/special/bin/mpirun", "-arg1", "-arg2"]

# Custom launcher
class SiteSpecificLauncher(MainLauncher):
    def __init__(self):
        super(SiteSpecificLauncher, self).__init__()

    def Customize(self):
        # ----
        # Global initialization
        # ----
        if self.sectorname() == "mycluster":
            paths = self.splitpaths(GETENV("LD_LIBRARY_PATH"))
            addedpaths = ["/usr/local/compilers/GNU/gcc-4.3.2/lib64"]
            SETENV("LD_LIBRARY_PATH", self.joinpaths(paths + addedpaths))

    def JobSubmitterFactory(self, launch):
        # Create our own "mpirun" job submitter.
        if launch == "mpirun":
            return JobSubmitter_mpirun_custom(self)
        return super(SiteSpecificLauncher, self).JobSubmitterFactory(launch)

# Launcher creation function
def createlauncher():
    return SiteSpecificLauncher()
```
The JobSubmitter class is the base class for all job submitters. Each job submitter has 2 key methods:
|Executable()||The Executable() method returns a list containing the command that is used to submit the MPI job and any default command line arguments you might want to provide.|
|CreateCommand()||The CreateCommand() method takes in a tuple of VisIt command line arguments, typically preformatted and ready to run. The CreateCommand() method's job is to reformat the arguments to run them under the specific MPI submission command as well as do any other initialization that is needed. Some job submitters set up extra environment variables or create files with commands to execute. Ultimately, this is the method that produces the command line that is run for the launch of the parallel program.|
Adding a new job submitter
Create a new class derived from JobSubmitter and implement its Executable() and CreateCommand() methods. For the job submitter to be available to the launcher, you must add it to the MainLauncher class' JobSubmitterFactory() method. When the input launcher name matches that for your launcher, return an instance of your new job submitter and the MainLauncher will handle the rest. To use your job submitter, pass -l name to the VisIt script where name is the name of your job submitter.
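The pattern above can be sketched as follows. This is a minimal, self-contained illustration: the JobSubmitter base class below is a small stand-in for VisIt's real class, the "myrun" launcher name is invented, and the real CreateCommand() signature in internallauncher may carry additional parameters.

```python
class JobSubmitter(object):
    # Stand-in for VisIt's real JobSubmitter base class, just enough
    # to make this sketch self-contained.
    def __init__(self, launcher):
        self.launcher = launcher

class JobSubmitter_myrun(JobSubmitter):
    # Hypothetical job submitter for an imagined "myrun" MPI launcher.
    def Executable(self):
        # The command used to submit the MPI job, plus default arguments.
        return ["myrun", "--quiet"]

    def CreateCommand(self, args):
        # Reformat the ready-to-run VisIt arguments so they run under
        # the submission command returned by Executable().
        return self.Executable() + list(args)
```

A factory hookup like the JobSubmitterFactory() override shown earlier would then return JobSubmitter_myrun when the user passes -l myrun.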
qsub job submitter
The JobSubmitter_qsub class is currently the most complex subclass of JobSubmitter. The complexity arises from the need to handle variation in qsub command line arguments among different computing sites. Furthermore, the variation among qsubs gives rise to many sites having custom qsub launchers. Another factor in the complexity is that qsub launching often uses a sublauncher to actually run the parallel command and this is typically done from within a shell script. So, the JobSubmitter_qsub class must build up a script that runs the VisIt programs and then submit that script using qsub with possibly different arguments, depending on the computing site.
Customizing sublauncher command
The parallel command needed to run MPI jobs varies according to which sublauncher is used. For example, a qsub/mpirun launcher uses qsub to submit the job to the batch scheduler while using mpirun within the launch script to actually run the parallel program.
The following sublaunchers are supported with qsub:
- (no sublauncher)
Each sublauncher will have a corresponding method in the JobSubmitter_qsub class that returns a list containing the command and any command line arguments that should be used to run the launcher. In addition, there is another method with the name of the sublauncher, followed by _args that returns a list that builds up the command needed to run the parallel program.
```python
def ibrun(self):
    return ["ibrun"]

def ibrun_args(self, args):
    mpicmd = self.ibrun()
    mpicmd = mpicmd + self.VisItExecutable() + args
    return mpicmd
```
If you were at a compute site where using ibrun needed to be customized, you could create a subclass of JobSubmitter_qsub and override the ibrun and ibrun_args methods to change how ibrun is used for your system.
```python
class JobSubmitter_qsub_custom(JobSubmitter_qsub):
    def __init__(self, launcher):
        super(JobSubmitter_qsub_custom, self).__init__(launcher)

    def ibrun(self):
        # Use a special path to ibrun
        return ["/my/special/ibrun"]

    def ibrun_args(self, args):
        mpicmd = self.ibrun()
        if self.parallel.np != None:
            # This is what we added
            mpicmd = mpicmd + ["-np", str(self.parallel.np)]
        mpicmd = mpicmd + self.VisItExecutable() + args
        return mpicmd
```
Custom module loading
Sometimes an HPC environment will need specific modules loaded in order for software to run as expected. This is commonly the case with programs run under a qsub launcher. You can easily override the TFileLoadModules method in your qsub JobSubmitter subclass to make it load the modules that you want in the job script that is ultimately executed under qsub.
```python
class JobSubmitter_qsub_custom(JobSubmitter_qsub):
    def __init__(self, launcher):
        super(JobSubmitter_qsub_custom, self).__init__(launcher)

    def TFileLoadModules(self, tfile):
        # Source the modules environment, then load the modules we need.
        tfile.write(". /etc/profile.d/modules.sh\n")
        tfile.write("module rm gcc\n")
        tfile.write("module load gcc/4.4.6\n")
        tfile.write("module rm openmpi\n")
        tfile.write("module load openmpi/1.6.0/gcc/4.4.6\n")
```
Customizing launch script
The command that gets executed to start parallel jobs is customized as described in the section above. However, there may be other initialization that you want to add to the script that qsub will execute. You can change how the script is constructed by overriding these methods:
|CreateFilename()||Return the filename that will be used for the script.|
|TFileLoadModules()||Add any module loading commands in this method. It is passed an open handle to the script being created, so you can write() new lines of text to the file. The default implementation just returns.|
|TFileSetup()||Writes setup commands to the file. It is passed an open handle to the script being created, so you can write() new lines of text to the file. The default implementation changes the directory and disables core files.|
Customizing qsub command
The qsub command can be highly variable among systems, at least for some command line flags. The qsub command that the JobSubmitter_qsub class creates depends on the following methods:
|SetupPPN()||One of the most variable arguments is the "-l" argument when it is used to specify the number of nodes and processors, as in "-l nodes=2:ppn=2". Since the treatment of these arguments may change from system to system, there is a separate method called SetupPPN() that handles adding these arguments to the qsub command line. You can override the SetupPPN() method on your JobSubmitter_qsub subclass if you need special handling for these arguments.|
|SetupTime()||Time is also handled by a special method called SetupTime() that adds "-l walltime=XX" arguments to the qsub command line. This method can also be overridden if you need to handle time differently.|
|AddEnvironment()||The qsub launcher for VisIt adds certain command line arguments to the qsub command line via a -v argument to qsub. These environment variables are added in the AddEnvironment() method, which you can override.|
|AssembleCommandLine()||If you need full control over how qsub command lines are assembled, you can override the AssembleCommandLine() method.|
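As an illustration of the kind of override SetupPPN() enables, here is a sketch in which a hypothetical site replaces "-l nodes=N:ppn=P" with "-l mppwidth=NP". The base class, its method signature, and the parallel state object are simplified stand-ins for VisIt's real internallauncher classes, written only to make the example self-contained.

```python
class _Parallel(object):
    # Minimal stand-in for the launcher's parallel argument state.
    def __init__(self):
        self.nn = 2     # number of nodes (hypothetical value)
        self.np = 16    # number of processors (hypothetical value)

class JobSubmitter_qsub(object):
    # Stand-in for VisIt's real JobSubmitter_qsub base class.
    def __init__(self, launcher):
        self.launcher = launcher
        self.parallel = _Parallel()

    def SetupPPN(self, args):
        # Default style: "-l nodes=N:ppn=P" arguments.
        ppn = self.parallel.np // self.parallel.nn
        return args + ["-l", "nodes=%d:ppn=%d" % (self.parallel.nn, ppn)]

class JobSubmitter_qsub_mysite(JobSubmitter_qsub):
    # Hypothetical site whose qsub wants "-l mppwidth=NP" instead.
    def SetupPPN(self, args):
        return args + ["-l", "mppwidth=%d" % self.parallel.np]
```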
The Debugger class is the base class for all debuggers. The internallauncher script currently supports the following debuggers:
The only method that gets called on a debugger object during VisIt's launch is the CreateCommand() method.
|CreateCommand()||Override the CreateCommand() method to construct the arguments that you'll need to start VisIt under a debugger. This method is applied to the visit command line once it is ready to run. This gives your debugger class an opportunity to change the command line that will be executed for VisIt in order to instead start VisIt under a debugger.|
Adding a new debugger
To add a new debugger, do the following:
- Create a new derived class of Debugger, overriding the CreateCommand() method.
- Add a new debugger name to MainLauncher.Debuggers() list of debugger names. This helps VisIt parse the debugger arguments.
- Make the MainLauncher.DebuggerFactory() method return your derived class when the appropriate debugger name is passed.
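The steps above can be sketched as follows. The Debugger base class here is a minimal stand-in so the example runs on its own, and "mydbg" is an invented debugger name; the real CreateCommand() signature in internallauncher may differ.

```python
class Debugger(object):
    # Stand-in for VisIt's real Debugger base class.
    def __init__(self, launcher):
        self.launcher = launcher

class MyDbgDebugger(Debugger):
    # Hypothetical debugger class for an imagined "mydbg" debugger.
    def CreateCommand(self, args):
        # Wrap the ready-to-run VisIt command line so that it starts
        # under the debugger instead of running directly.
        return ["mydbg", "--args"] + list(args)
```

MainLauncher.DebuggerFactory() would then return a MyDbgDebugger instance when "mydbg" is the selected debugger name.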
The MainLauncher class contains all of the methods to launch the standard VisIt programs. It makes use of various JobSubmitter classes to launch MPI jobs and it uses various Debugger classes to launch VisIt in a debugger.
The MainLauncher class uses 3 helper classes to parse command line arguments. These classes take command line argument values and set class members. The MainLauncher class contains an instance of each, holding the state for the various types of arguments. This state can be passed around to job submitters and debugger classes so that global variables are not needed. In addition to gathering command line arguments, the helper classes are also responsible for adding their particular state to the command line that is eventually used to launch VisIt programs.
The classes are:
- GeneralArguments - most command line arguments fit here
- ParallelArguments - parallel-related command line arguments
- DebugArguments - debugger-related command line arguments
The helper classes each provide 2 methods to the MainLauncher:
|ParseArguments()||Set object state based on a command line argument|
|ProduceArguments()||Produce a list of command line arguments based on object state|
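The ParseArguments()/ProduceArguments() pattern can be sketched as below. The class name, member, and method signatures are invented for illustration; the real helper classes live in internallauncher and their signatures may differ.

```python
class ExampleArguments(object):
    # Hypothetical helper class following the parse/produce pattern.
    def __init__(self):
        self.np = None

    def ParseArguments(self, arg, value):
        # Set object state from one command line argument; return True
        # if the argument was consumed by this helper.
        if arg == "-np":
            self.np = int(value)
            return True
        return False

    def ProduceArguments(self):
        # Turn the stored state back into command line arguments for
        # the VisIt program that is eventually launched.
        if self.np is not None:
            return ["-np", str(self.np)]
        return []
```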
These are some helper functions that are borrowed from frontendlauncher. They are used but are not part of the MainLauncher class.
|exit(msg, code)||Exit the internallauncher with a message and a return code|
|GETENV(var)||Return a string containing the requested environment variable|
|SETENV(var, value)||Set an environment variable to the specified value|
Rather than invoke various command line utilities in backticks and pass the results through "tr" and so on, the new launch system provides helper methods for common tasks such as getting the hostname, splitting paths, etc.
|username()||Return the user name.|
|hostname()||Return the full host name: edge83.llnl.gov|
|nodename()||Return the node name: edge83|
|sectorname()||Return the sector name: edge|
|domainname()||Return the domain name: llnl.gov|
|uname()||Return the OS name|
|splitpaths()||Split paths separated by ':' and return a unique list|
|joinpaths(paths)||Join a list of paths into a string separated by ':'|
|quoted(s)||Return a string surrounded by quotes if the string contains spaces|
|writepermission(path)||Return True if the path has write permission; False otherwise|
|call(args, stdinpipe)||Call a process given by args, which is a list of command line arguments. If stdinpipe is true then make a pipe for stdin.|
|message(msg, file)||Call when the launcher should issue a message.|
|warning(msg)||Call when the launcher should issue a warning message.|
|error(msg)||Call when the launcher should issue an error message.|
|iscomponent(name)||Return true if name is a VisIt component.|
Methods that are part of a launch
To give derived launcher classes more control over the launch process, the launch process is divided into different methods.
The main methods that most launcher subclasses will override are: Customize() and JobSubmitterFactory().
|Customize()||Method to let derived classes perform top-level initialization|
|JobSubmitterFactory()||Method that creates an instance of a JobSubmitter class to handle a given parallel launcher|
Here are some other methods that can be overridden. The methods are given in the order that the internallauncher function calls them.
|Initialize()||Initialize some important members of the object.|
|SetLogging()||Set whether logging of executables is enabled.|
|ParseArguments()||Read the command line arguments into different objects (general, parallel, debugger).|
|DetermineArchitecture()||Determine a list of supported VisIt architecture strings.|
|ConsistencyCheck()||Check the different command line arguments for consistency and alter their values if needed.|
|Customize()||Method that lets derived classes do top-level custom initialization.|
|UpdateExecutableNames()||Update the name for the executable that we'll run. This usually only affects the engine, transforming engine to engine_ser or engine_par.|
|SetupDirectoryNames()||Set up various directory names that can be used to reference different parts of the VisIt installation.|
|SetupEnvironment()||Set up the environment variables that we need to run VisIt.|
|MakeUserDirectories()||Make the user's ~/.visit plugin subdirectories.|
|PrintUsage()||Print the usage (must be called after SetupDirectoryNames)|
|PrintEnvironmentShell()||Print shell commands to replicate the environment that VisIt has set up.|
|Launch()||Launch the VisIt executable once all other preparations have been made. This method creates a command line to execute and then passes it to the call() method. The program's return value is returned from this method.|
Differences from older versions
Since the entire system of launch scripts was rewritten in Python, there are some important differences beyond the implementation changes themselves:
Where are my hacks?
The old internallauncher script contained a lot of machine-specific hacks. Those hacks have been moved into customlauncher files for various sites. These can be found in the src/resources/hosts/<site> directory where <site> is the name of a computing site.
If you are testing on a cluster and working within a source directory, you can enable your site customizations again by copying the customlauncher file into the src/bin directory.
The order of command line arguments was somewhat preserved in the original launchers. The new launcher does not preserve command line argument ordering.
The original launchers printed a command string to the console that did not necessarily match the command that was actually being executed. The launch command for running jobs under msub was a good example since it would print the cat'd contents of the constructed launch script as part of the printed command line rather than give the name of the script that contained the commands.
The new launcher prints what it executes (with the caveat that -key token arguments are filtered out as before).
The output for -norun in the new launcher is formatted such that environment variables come first, followed by the VisIt command to run. This lets you paste all of the commands in order into a command line shell.
Use of loopback interface
The old launch system often intended to use the local machine's loopback interface 127.0.0.1 but it frequently did not do so. The new script is more aggressive about using 127.0.0.1 as the host for local process launches unless -noloopback is passed.
- The shell portion of the new frontendlauncher unsets the PYTHONHOME environment variable.
- The new internallauncher then sets PYTHONHOME to point to VisIt's Python modules.
The reason for #2 is so we can transplant the VisIt CLI on to systems other than where it was built. Setting the PYTHONHOME variable is necessary to make the CLI find the Python modules.
The reason for #1 is that setting PYTHONHOME as in #2 causes the system Python to not be able to locate its own modules. The VisIt launchers are run using the system Python, which may be incompatible with VisIt's Python.
The old launcher had specific arguments to launch components under gdb (e.g. -gdb-engine, -gdb-mdserver, -gdb-viewer, -gdb-gui). These arguments were removed in the new launcher to conform to the style used by the other debuggers such as totalview. The general pattern is: -debuggername [debugger args] component.
To run the engine under GDB in a new window:
visit -gdb engine_ser -xterm
Hardware pre/post arguments
Both the new and old launch scripts have support for -hw-pre and -hw-post arguments. These arguments are intended to be commands to start up X servers and tear them down, though the commands could really be anything. The old version of internallauncher constructed commands oddly and would often have a sublauncher run the hw-pre and hw-post commands as part of the compute engine command line being run. This led to command scripts that seem like they could not possibly work as intended:
```
#!/bin/sh
cd somewhere
ulimit -c 0
srun -n 8 startx /path/to/bin/engine_par -host 127.0.0.1 -port 5600 -norun engine_par -noloopback stopx
```
Of course, srun could launch the startx command, but the subsequent command line arguments would become arguments to startx, and it does not seem that the correct command sequence would be run.
The new internallauncher will create a command script that looks like this:
```
#!/bin/sh
cd somewhere
ulimit -c 0
startx
srun -n 8 /path/to/bin/engine_par -host 127.0.0.1 -port 5600 -norun engine_par -noloopback
stopx
```