OmniComp is a system that is easy to install and use and that accelerates Monte Carlo programs through distributed execution on workstation clusters, PC farms and multiprocessor machines. It is based on omniORB, a high-performance, open-source CORBA implementation from AT&T Laboratories Cambridge. Notably, it does not require thread-safe integrand implementations.
Usage examples:
    host3> ssh -n host1 /path1/omnicomp_prog -w &
    host3> ssh -n host2 /path2/omnicomp_prog -w &
    host3> ssh -n host2 /path2/omnicomp_prog -w &
    host3> /path3/omnicomp_prog
    master process started (includes 1 worker)
    3 additional worker(s) found
    work done so far ... 0% 0% 0% 0%
Here, host1 and the local host are assumed to be single-processor machines, while host2 is assumed to be a dual-processor system. The working directory with the executable file omnicomp_prog is assumed to be shared between host1, host2 and the local host. This is convenient but not necessary (see -n option below). If ssh asks for a password on the command line try ssh -f instead of ssh -n.
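The one-worker-per-processor pattern described above can be generated rather than typed by hand. The following is a minimal sketch, not part of OmniComp: worker_commands is a hypothetical helper, and the host:executable:count argument format is an assumption made for this illustration. It only prints the ssh commands so that they can be reviewed before being run.

```shell
#!/bin/sh
# Dry-run sketch: print one ssh command per worker process.
# Each argument has the hypothetical form host:executable:count,
# where count is the number of workers (processors) on that host.
worker_commands() {
    for spec in "$@"; do
        host=${spec%%:*}      # text before the first colon
        rest=${spec#*:}       # text after the first colon
        prog=${rest%%:*}      # path of the executable on that host
        count=${rest#*:}      # number of workers to start there
        i=0
        while [ "$i" -lt "$count" ]; do
            echo "ssh -n $host $prog -w &"
            i=$((i + 1))
        done
    done
}

# One worker on host1, two on the dual-processor host2.
worker_commands host1:/path1/omnicomp_prog:1 host2:/path2/omnicomp_prog:2
```

For the hosts in the example above, this prints the same three worker commands; the master is still started separately on the local host.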
Since the master process contains its own worker, one can also start just one instance of the executable with no options, and it will run like a regular, non-distributed program:
    host> /path/omnicomp_prog
    master process started (includes 1 worker)
    no additional workers found
    work done so far ... 0%
If different executables are required on different hosts, for example because they run different operating systems, the executables can be distinguished with a ".<key>" postfix. One needs to use the -d option in this case. It causes the postfix to be disregarded when the *.workers file name is determined:
    host3> ssh -n host1 /path1/omnicomp_prog.os1 -w -d &
    host3> ssh -n host2 /path2/omnicomp_prog.os2 -w -d &
    host3> ssh -n host2 /path2/omnicomp_prog.os2 -w -d &
    host3> /path3/omnicomp_prog.os3 -d
One can disable the collocated worker in the local master process with the -m option. In this mode the master process controls the other workers, monitors progress and collects results, but does not participate in the computation itself.
If no shared directory mounted on all machines is available, the distributed computation can be bootstrapped using the naming service (-n option). After setting up the environment for the naming service as described in section 3.2.2, the server needs to be started on the host specified in omniORB.cfg by executing omniNames.
Then, for example for tcsh:
    host4> ssh -n host1 "tcsh -c '/path1/omnicomp_prog -w -n'" &
    host4> ssh -n host2 "tcsh -c '/path2/omnicomp_prog -w -n'" &
    host4> ssh -n host3 "tcsh -c '/path3/omnicomp_prog -w -n'" &
    host4> /path4/omnicomp_prog -n
If your remote shell account is not set up to access the naming service, you can instead pass the information on the command line:
    host4> ssh -n host1 /path1/omnicomp_prog -w -n -ORBInitRef NameService=corbaname::names.example.edu &
    host4> ssh -n host2 /path2/omnicomp_prog -w -n -ORBInitRef NameService=corbaname::names.example.edu &
    host4> ssh -n host3 /path3/omnicomp_prog -w -n -ORBInitRef NameService=corbaname::names.example.edu &
    host4> /path4/omnicomp_prog -n
Here, names.example.edu is the hostname of the system that provides the naming service.
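Because the same -ORBInitRef argument has to be repeated for every worker, it can help to build it in one place. The sketch below assumes a POSIX shell on the local host; naming_args is a hypothetical helper and the host:directory list is an assumption made for this illustration. Like the commands above, it only uses the -w, -n and -ORBInitRef options, and it prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch: build the naming-service arguments from a single host name.
# naming_args is a hypothetical helper, not part of OmniComp or omniORB.
naming_args() {
    echo "-n -ORBInitRef NameService=corbaname::$1"
}

NAMES_HOST=names.example.edu

# Each entry is host:directory (an illustrative format).
for spec in host1:/path1 host2:/path2 host3:/path3; do
    host=${spec%%:*}    # text before the colon
    dir=${spec#*:}      # text after the colon
    echo "ssh -n $host $dir/omnicomp_prog -w $(naming_args "$NAMES_HOST") &"
done
```

Keeping the host name in one variable also makes it harder to start workers with mismatched naming-service settings.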
It is important to use the same set of omnicomp options (except for -w and -m) in all commands; otherwise errors are likely to occur. If that happens, the naming service can be cleaned up with nameclt, an omniORB client program for inspecting and modifying the naming service registry.
In order to build OmniComp executables one needs to link with
omniORB libraries. These have been pre-built for a number of
common platforms including Intel/Linux and can be downloaded
for free at
http://www.uk.research.att.com/omniORB/omniORBForm.html
If no pre-built libraries are available for your platform, they can be built from source in a few steps:
1. Look up your platform in ./mk/platforms/ and uncomment the corresponding line in ./config/config.mk, for example:
    platform = alpha_osf1_5.0
2. In the file for your platform in ./mk/platforms/, edit the line that sets PYTHON and insert the path to your Python interpreter, e.g. /usr/local/bin/python. If Python 1.5.2 or higher is not installed on your system, follow the instructions in the file.
3. Change to ./src and run make export (ignore the warnings). This step requires about 90MB and takes about 40 minutes on a PentiumII/333MHz.
Finally, set the environment variables for the omniORB libraries and executables, for example in tcsh:
    # omniorb libraries and executables
    setenv OMNIORB_TOPDIR ${HOME}/omniorb/omni
    setenv LD_LIBRARY_PATH ${LD_LIBRARY_PATH}:${OMNIORB_TOPDIR}/lib/i586_linux_2.0_glibc2.1
    setenv PATH ${PATH}:${OMNIORB_TOPDIR}/bin/i586_linux_2.0_glibc2.1
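The setenv lines above use csh/tcsh syntax. For users of sh or bash, a rough equivalent might look like the following sketch. The paths are copied from the lines above; OMNIORB_PLATFORM is a helper variable introduced here for readability and is not read by omniORB.

```shell
#!/bin/sh
# Bourne-shell counterpart of the setenv lines (paths copied from them).
OMNIORB_TOPDIR="${HOME}/omniorb/omni"
OMNIORB_PLATFORM=i586_linux_2.0_glibc2.1   # helper variable, not read by omniORB

# ${VAR:+$VAR:} expands to the old value plus a colon only when VAR is
# non-empty, which avoids a stray leading colon if LD_LIBRARY_PATH was unset.
LD_LIBRARY_PATH="${LD_LIBRARY_PATH:+$LD_LIBRARY_PATH:}${OMNIORB_TOPDIR}/lib/${OMNIORB_PLATFORM}"
PATH="${PATH}:${OMNIORB_TOPDIR}/bin/${OMNIORB_PLATFORM}"
export OMNIORB_TOPDIR LD_LIBRARY_PATH PATH
```

The platform directory name has to match the platform the libraries were built for.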
If your omnicomp programs will be located in a shared directory that is mounted on all computers, no further steps are necessary.
Otherwise, the computation has to be bootstrapped with omniNames (option -n), and the following settings are also needed:
    # omniorb naming service (omniNames)
    setenv OMNINAMES_LOGDIR ${HOME}/omniorb/names_log
    setenv OMNIORB_CONFIG ${HOME}/omniorb/omniORB.cfg
OMNINAMES_LOGDIR specifies the log directory for omniNames and is only required in the shell that is used to start omniNames. The log directory and files are created when omniNames is started for the first time.
The file omniORB.cfg specifies the computer on which omniNames runs. Example with the default port number:
    ORBInitialHost names.example.edu
    ORBInitialPort 2809
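A configuration file in this two-line format can also be written from a script, for instance when the naming-service host changes between runs. This is a minimal sketch assuming a POSIX shell with mktemp available; write_omniorb_cfg is a hypothetical helper, and it covers only the two settings shown above.

```shell
#!/bin/sh
# Sketch: write a two-line omniORB.cfg in the format shown above.
# write_omniorb_cfg is a hypothetical helper; the port defaults to 2809.
write_omniorb_cfg() {
    cfg_file=$1
    names_host=$2
    names_port=${3:-2809}
    printf 'ORBInitialHost %s\nORBInitialPort %s\n' \
        "$names_host" "$names_port" > "$cfg_file"
}

# Demonstration with a temporary file; a real setup would write to the
# path that OMNIORB_CONFIG points to.
cfg=$(mktemp)
write_omniorb_cfg "$cfg" names.example.edu
cat "$cfg"
rm -f "$cfg"
```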