SST/macro
SST/macro is available at http://bitbucket.org/sst-ca/sstmacro. You can get SST/macro in the following ways:
Clone the repository with Mercurial.
If you're using Mercurial, you can run the command:
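As a rough sketch, the clone command would take the following form. The URL is the one given above; the variable names are illustrative only, and the command is printed for review rather than executed.

```shell
# Repository URL as given in the text above.
REPO_URL="http://bitbucket.org/sst-ca/sstmacro"

# "-r default" restricts the clone to the default (development) branch.
CLONE_CMD="hg clone -r default ${REPO_URL}"

# Print the command for review; run it with: eval "$CLONE_CMD"
echo "$CLONE_CMD"
```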
The -r default option downloads only the current development branch and can be omitted if you want to bring in the entire history. Downloading the full history can take a very long time on some systems because the "deltas" in the revision history must be generated, so you can save yourself a lot of waiting by downloading only the default revision. If you're behind a firewall, make sure the http proxy is set in your ~/.hgrc file:
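A proxy entry uses Mercurial's [http_proxy] section. The sketch below writes one to a scratch file so it can be reviewed before being merged into ~/.hgrc; the host and port are placeholders, not real values.

```shell
# Write the snippet to a scratch file first so ~/.hgrc is not touched blindly.
PROXY_SNIPPET="$(mktemp)"

# proxy.example.com:8080 is a placeholder; use your site's proxy host and port.
cat > "$PROXY_SNIPPET" <<'EOF'
[http_proxy]
host = proxy.example.com:8080
EOF

cat "$PROXY_SNIPPET"
# After review, append it with: cat "$PROXY_SNIPPET" >> ~/.hgrc
```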
If you'd like to use ssh for convenience, you'll have to modify your clone slightly by adding the "hg" username:
and also add your public key to your bitbucket user account. Also, SST/macro uses subrepos, so for using ssh you should add the following to your ~/.hgrc
so that the http requests are converted to ssh.
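The ssh setup described above might be sketched as follows. The ssh URL simply adds the "hg" username to the repository address, and the rewrite rule uses Mercurial's [subpaths] section. The exact pattern is an assumption: it supposes the subrepos live under the same sst-ca account, so check it against the subrepo URLs in your clone.

```shell
# ssh form of the clone URL, with the "hg" username added.
SSH_CLONE_CMD="hg clone -r default ssh://hg@bitbucket.org/sst-ca/sstmacro"
echo "$SSH_CLONE_CMD"

# Rewrite rule for subrepo URLs (assumed pattern); written to a scratch file for review.
SUBPATHS_SNIPPET="$(mktemp)"
cat > "$SUBPATHS_SNIPPET" <<'EOF'
[subpaths]
http://bitbucket.org/sst-ca = ssh://hg@bitbucket.org/sst-ca
EOF

cat "$SUBPATHS_SNIPPET"
# After review, append it with: cat "$SUBPATHS_SNIPPET" >> ~/.hgrc
```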
Libtool: 2.2.6 or later should work and 2.4 is known to work
(optional) Python with the argparse module installed is required to run UPC skeletons. Python 2.7 and on should have this.
For a list of known compatible systems, see the PDF manual.
Once SST/macro is extracted to a directory, we recommend the following as a baseline configuration, including building outside the source tree:
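A baseline out-of-tree configuration along these lines might be sketched as follows. The source and install paths are assumptions for illustration; --prefix is the standard autoconf option for the install location.

```shell
# Assumed locations; override by exporting SRC_DIR and PREFIX beforehand.
SRC_DIR="${SRC_DIR:-$PWD/sstmacro}"
PREFIX="${PREFIX:-$HOME/sstmacro-install}"
BUILD_DIR="${SRC_DIR}/build"

if [ -x "${SRC_DIR}/configure" ]; then
  # Configure from a separate build directory so the source tree stays clean.
  mkdir -p "$BUILD_DIR"
  (cd "$BUILD_DIR" && ../configure --prefix="$PREFIX")
else
  echo "No configure script in ${SRC_DIR}; set SRC_DIR to the extracted source." >&2
fi
```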
A complete list of options can be seen by running `../configure --help'. Some common options:
--enable-mpiparallel: Enable parallel discrete event simulation in distributed memory over MPI. See Section Parallel Simulations.
Once configuration has completed and printed a summary of what it found, simply type `make'. We recommend the `-j' option for a parallel build with as many cores as you have; otherwise the build will take quite a while.
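One way to pick the job count automatically is sketched below. It assumes getconf reports the number of online cores, which holds on most Linux and macOS systems, and falls back to a single job otherwise.

```shell
# Ask the system for the number of online cores.
JOBS="$(getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)"

# Fall back to one job if the answer is empty or not a plain number.
case "$JOBS" in
  ''|*[!0-9]*) JOBS=1 ;;
esac

if [ -f Makefile ]; then
  make -j"$JOBS"
else
  echo "No Makefile here; run this from the configured build directory." >&2
fi
```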
If the build did not succeed, check Known Issues for known issues, or contact SST/macro support for help (sstmacro-support@googlegroups.com).
If the build was successful, it is recommended to run the range of tests to make sure nothing went wrong. To do this, and also install SST/macro to the install path specified during installation, run the following commands:
Make check runs all the tests we use for development, which together exercise all functionality of the simulator. Make installcheck compiles some of the skeletons that come with SST/macro, linking against the installation.
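The test-and-install sequence can be sketched as a small helper that stops at the first failing step. The function name is illustrative only; the make targets are the ones named above.

```shell
# Run the test suite, install, then test the installation; stop at the first failure.
check_and_install() {
  make check && make install && make installcheck
}

# Only proceed when invoked from a configured build directory.
if [ -f Makefile ]; then
  check_and_install
else
  echo "No Makefile here; run this from the build directory." >&2
fi
```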
Important: After SST/macro is installed, add /path-to-install/bin to your PATH variable (we keep it in our .bashrc, .profile, etc). Applications and other code linking to SST/macro use Makefiles that use the sst++ compiler wrapper that is installed there for convenience to figure out where headers and libraries are. If you are building a skeleton (or running make installcheck) and you get errors along the lines of "can't find sst++", you probably forgot this step.
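A minimal sketch of the PATH step, suitable for a .bashrc or .profile. SSTMAC_PREFIX is a hypothetical variable name and /path-to-install is the same placeholder used above.

```shell
# Hypothetical install prefix; replace with the path used at configure time.
SSTMAC_PREFIX="${SSTMAC_PREFIX:-/path-to-install}"
export PATH="${SSTMAC_PREFIX}/bin:${PATH}"

# Report whether the compiler wrapper is now visible.
if command -v sst++ >/dev/null 2>&1; then
  echo "sst++ found at $(command -v sst++)"
else
  echo "sst++ not found; check SSTMAC_PREFIX" >&2
fi
```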
make -j: When doing a parallel compile, dependency problems can occur. There are a lot of inter-related libraries and files. Sometimes the Makefile dependency tracker gets ahead of itself and you will get errors about missing libraries and header files. If this occurs, restart the compilation. If the error vanishes, it was a parallel dependency problem. If the error persists, then it's a real bug.
Ubuntu: The Ubuntu linker performs too much optimization on dynamically linked executables. Some call it a feature. I call it a bug. In the process it throws away symbols it actually needs later. This occurs when the executable depends on libA, which depends on libB, and the executable has no direct dependence on any symbols in libB. Even if you add -lB to the LDFLAGS or LDADD variables, the linker ignores them and throws the library out. It takes a dirty hack to force the linkage. If you get weird undefined reference errors, these can usually be removed by including the header file sstmac/force_link.h. If there are issues, contact the developers at sstmacro-devel@googlegroups.com and report the problem. It can be fixed easily enough.
Compilation with clang should work, although the compiler is very sensitive. In particular, template code which is correct and compiles on several other platforms can mysteriously fail. Tread with caution.
By default, DUMPI is configured and built along with SST/macro with support for reading and parsing DUMPI traces, known as libundumpi. DUMPI binaries and libraries are also installed along with everything for SST/macro during make install. DUMPI can be used as its own library within the SST/macro source tree by changing to sstmacro/dumpi, where you can change its configuration options. It is not recommended to disable libundumpi support, which wouldn't make much sense anyway.
DUMPI can also be used as a stand-alone tool/library if you wish (e.g. for simplicity if you're only tracing). To get DUMPI by itself, either copy the sstmacro/dumpi directory somewhere else or visit bitbucket.org/sst-ca/dumpi and follow instructions similar to those for obtaining SST/macro.
To see a list of configuration options for DUMPI, run `./configure --help'. If you're trying to configure DUMPI for trace collection, use `--enable-libdumpi'. Your build process might look like this (if you're building in a separate directory from the dumpi source tree):
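A sketch of such a DUMPI build, done in a separate build directory. The source and install paths are assumptions; --enable-libdumpi is the flag named above and --prefix is the standard autoconf install option.

```shell
# Assumed locations; override by exporting DUMPI_SRC and DUMPI_PREFIX beforehand.
DUMPI_SRC="${DUMPI_SRC:-$PWD/dumpi}"
DUMPI_PREFIX="${DUMPI_PREFIX:-$HOME/dumpi-install}"
DUMPI_BUILD="${DUMPI_SRC}/build"

if [ -x "${DUMPI_SRC}/configure" ]; then
  mkdir -p "$DUMPI_BUILD"
  # Configure for trace collection, then build and install.
  (cd "$DUMPI_BUILD" \
    && ../configure --prefix="$DUMPI_PREFIX" --enable-libdumpi \
    && make \
    && make install)
else
  echo "No configure script in ${DUMPI_SRC}; set DUMPI_SRC to the dumpi source." >&2
fi
```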
Warning: The configuration process for DUMPI can take a very long time on network file systems. It basically runs through every MPI function to check its availability/status on your system. If the MPI headers and libraries are not locally available (hard drive) or cached locally, they will be brought in from storage each time these tests are compiled and run. If storage is a parallel file system, it might be slow.
When compiling on platforms with compiler/linker wrappers, e.g. the ftn (Fortran) and CC (C++) compilers at NERSC, the libtool configuration can get corrupted. The linker flags automatically added by the wrapper produce bad values for the predeps/postdeps variables in the libtool script in the top level source folder. When this occurs, the easiest fix is, unfortunately, to manually modify the libtool script. Search for predeps/postdeps and set the values to empty. This will clear all the erroneous linker flags. The compilation/linkage should still work since all necessary flags are set by the wrappers.
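The manual edit can also be scripted. The sketch below assumes the assignments appear at the start of a line as predeps="..." and postdeps="...", as they do in typical generated libtool scripts; the function name is illustrative, and a .bak backup is kept. It is demonstrated on a scratch file rather than a real libtool script.

```shell
# Empty every predeps/postdeps assignment in the given file, keeping a .bak backup.
clear_libtool_deps() {
  sed -i.bak \
    -e 's/^predeps=.*$/predeps=""/' \
    -e 's/^postdeps=.*$/postdeps=""/' \
    "$1"
}

# Demonstration on a scratch file standing in for the real libtool script.
DEMO_LIBTOOL="$(mktemp)"
printf '%s\n' 'predeps="-lbad1"' 'postdeps="-lbad2 -lbad3"' 'other="kept"' > "$DEMO_LIBTOOL"
clear_libtool_deps "$DEMO_LIBTOOL"
cat "$DEMO_LIBTOOL"
```

For the real script, the call would be clear_libtool_deps libtool, run from the folder that contains it.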
The GUI depends on Qt 5.0 or greater, which can be easily downloaded from the Qt website. To configure SST/macro for compiling the GUI, an additional flag must be added:
The variable $QMAKE must point to the qmake executable. If qmake is in $PATH, only `--with-qt' needs to be added. The GUI is compiled independently from SST/macro. In the build directory, just invoke:
You will see the Qt compilation output followed by output from the source code parser. Keyword input to the GUI is automatically generated from the source code. Once parsing is complete, the GUI is ready to use. The executable is found in the qt-gui folder. On Mac, an application is generated, which can be run:
On Linux, a simple executable is generated.
To demonstrate how an application is run in SST/macro, we'll use a very simple send-recv program located in sstmacro/tutorials/sendrecv_c. We will take a closer look at the actual code in Section Basic MPI Program. After SST/macro has been installed and your PATH variable set correctly, run:
You should see some output that tells you 1) the estimated total (simulated) runtime of the simulation, and 2) the wall-time that it took for the simulation to run. Both of these numbers should be small since it's a trivial program.
This is how simulations generally work in SST/macro: you build skeleton code and link it with the simulator to produce a binary. Then you run that binary and pass it a parameter file which describes the machine model to use.
We recommend structuring the Makefile for your project like the one seen in tutorials/sendrecv_c/Makefile :
The sstmacro-config script is built by the SST/macro configuration process and installed into the bin folder. More linker and include flags can be added for different source trees. For advanced usage in projects built with automake and autoconf, the sstmacro-config script can be invoked in configure.ac following the usage above.
The three `sendrecv' skeletons in sstmacro/tutorials show the different usage of C and C++ linking against SST/macro: C, C++ but with a C-style main, and a C++ class that inherits from sstmac::sw::mpiapp. Using C++ inheritance (such as in the sendrecv_cxx2 folder) will give you the most flexibility, including the ability to run more than one named application in a single simulation (see Section SST/macro Parameter files for more info).
There are only a few basic command-line arguments you'll ever need to use with SST/macro, listed below:
-r [run number] - for a parameter file that is enabled for a parameter sweep, run only a specific parameter combination. See Section Parameter Sweeping with fork().
SST/macro supports running a parallel discrete event simulation (PDES) in distributed memory over MPI. First, you must configure the simulator with the `--enable-mpiparallel' flag. Configure will check for MPI and ensure that you're using the standard MPI compilers. Your configure should look something like:
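A configure line of this kind might be sketched as follows. The paths are assumptions, and mpicc/mpicxx are the wrapper names commonly shipped with Open MPI and MPICH; substitute whatever your MPI installation provides.

```shell
# Assumed locations; override by exporting SRC_DIR and PREFIX beforehand.
SRC_DIR="${SRC_DIR:-$PWD/sstmacro}"
PREFIX="${PREFIX:-$HOME/sstmacro-install}"

# MPI compiler wrappers (assumed names).
MPI_CC="${MPI_CC:-mpicc}"
MPI_CXX="${MPI_CXX:-mpicxx}"

if [ -x "${SRC_DIR}/configure" ]; then
  mkdir -p "${SRC_DIR}/build"
  (cd "${SRC_DIR}/build" \
    && ../configure --prefix="$PREFIX" --enable-mpiparallel CC="$MPI_CC" CXX="$MPI_CXX")
else
  echo "No configure script in ${SRC_DIR}; set SRC_DIR to the extracted source." >&2
fi
```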
SST/macro also requires METIS for partitioning the workload amongst parallel processes, although in future versions this may no longer be a strict dependency. Make sure you have gpmetis installed somewhere and the binary is in your PATH. METIS can be found at http://glaros.dtc.umn.edu/gkhome/metis/metis/download.
SST/macro is run exactly like the serial version, but is spawned like any other MPI parallel program. Use your favorite MPI launcher to run, e.g. for OpenMPI
or for MPICH
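As a sketch of both launch styles: Open MPI and MPICH each accept "-n <count>" on their launcher. The simulator command below is a hypothetical placeholder for your own binary and its arguments, and the launch line is printed for review rather than executed.

```shell
# Hypothetical simulator command line; substitute your own binary and arguments.
SIM_CMD="${SIM_CMD:-./my_simulation}"
NPROCS="${NPROCS:-4}"

# Pick whichever launcher is on PATH; both accept "-n <count>".
if command -v mpirun >/dev/null 2>&1; then
  LAUNCHER="mpirun"
elif command -v mpiexec >/dev/null 2>&1; then
  LAUNCHER="mpiexec"
else
  LAUNCHER="mpiexec"   # launcher name suggested by the MPI standard
  echo "No MPI launcher found on PATH" >&2
fi

LAUNCH_CMD="${LAUNCHER} -n ${NPROCS} ${SIM_CMD}"
echo "$LAUNCH_CMD"
```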
A few special configuration options are needed:
Set SSTMAC_PARALLEL=mpi. By default, SST/macro assumes SSTMAC_PARALLEL=serial.
Add runtime = mpi to the input file. Again, by default, SST/macro assumes runtime = serial.
Add event_manager = clock_cycle_parallel. By default, SST/macro uses a serial event map. Currently, only one form of parallelism is supported: conservative, clock-cycle parallelism. More types of parallel event managers may be available in future versions. If running on a single processor, clock_cycle_parallel is functionally equivalent to a serial event map.
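Taken together, these settings might be applied as in the sketch below. Treating SSTMAC_PARALLEL as an environment variable is an assumption, and the parameter file defaults to a scratch file here; point PARAM_FILE at your real input file to use it.

```shell
# Assumption: SSTMAC_PARALLEL is read from the environment.
export SSTMAC_PARALLEL=mpi

# Defaults to a scratch file; set PARAM_FILE to your real input file.
PARAM_FILE="${PARAM_FILE:-$(mktemp)}"

# Append the two input-file settings named above.
cat >> "$PARAM_FILE" <<'EOF'
runtime = mpi
event_manager = clock_cycle_parallel
EOF

cat "$PARAM_FILE"
```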
Even if you compile for MPI parallelism, the code can still be run in serial with the same configuration options. SST/macro will notice the total number of ranks is 1 and ignore any parallel options.
When launched with multiple MPI ranks, SST/macro will automatically figure out how many partitions (MPI processes) you are using, partition the network topology into contiguous blocks, and start running in parallel. If running an MPI program, you should probably be safe and use the `mpicheck' debug flag to ensure the simulation runs to completion. The mpicheck flag ensures MPI_Finalize is called and the simulation did not "deadlock." While the PDES implementation should be stable, it's best to treat it as Beta++ to ensure program correctness.
Parallel simulation may not speed up SST/macro for certain test cases. Most events are scheduled farther into the future than link (synchronization) latency. Since we use the conservative null-message technique, there will be a lot of overhead in synchronizing the clocks. Parallel simulation is most likely to be useful when memory is the constraint, since it expands the maximum memory footprint. For simulations with serious congestion or heavy interconnect traffic, you may observe speedups, but they will be far from ideal or linear given the synchronization overheads.