SST/macro
Call Graph Visualization

Generating call graphs requires a special build of SST/macro.

build$ ../configure --prefix=$INSTALL_PATH --enable-graphviz

The --enable-graphviz flag defines an instrumentation macro throughout the SST/macro code. This instrumentation must be compiled into SST/macro; the default build leaves it out because instrumentation can carry a high overhead. SST/macro instruments only a select group of the most important functions, however, so the overhead should be only 10-50%. After installing the instrumented version of SST/macro, a call graph is collected by adding a single boolean parameter to the parameter file.

call_graph = true

After running, a callgrind.out file should appear in the folder.
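
Before opening the file in a GUI, it can be useful to check that it contains sensible data. The following is a minimal sketch, not part of SST/macro, that sums the self cost recorded for each function. It assumes the output follows the standard Callgrind profile format that KCachegrind reads (fn= records followed by cost lines), with a single position column and the first event column taken as the cost. The helper names self_costs, resolve, and report are hypothetical.

```python
import re
from collections import defaultdict


def resolve(spec, names):
    """Expand a possibly compressed name such as '(3) foo' or '(3)'."""
    m = re.match(r"\((\d+)\)\s*(.*)", spec)
    if not m:
        return spec                      # plain, uncompressed name
    ident, name = m.groups()
    if name:
        names[ident] = name              # first use defines the mapping
    return names.get(ident, f"({ident})")


def self_costs(lines):
    """Sum the self cost per function from Callgrind-format text lines.

    Assumption: one position column, first event column is the cost.
    """
    names = {}                           # compressed id -> function name
    costs = defaultdict(int)
    current = None
    skip_next = False                    # line after calls= is inclusive cost
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("fn="):
            current = resolve(line[3:], names)
        elif line.startswith("cfn="):
            resolve(line[4:], names)     # callee names share the id table
        elif line.startswith("calls="):
            skip_next = True
        elif line[0].isdigit() or line[0] in "+-*":
            if skip_next:
                skip_next = False        # cost of the call, not self cost
                continue
            fields = line.split()
            if current is not None and len(fields) >= 2:
                costs[current] += int(fields[1])
        # header lines (version:, events:, ...) and fl= records are ignored
    return dict(costs)


def report(costs):
    """Print functions sorted by their share of the total self cost."""
    total = sum(costs.values())
    for name, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
        pct = 100.0 * cost / total if total else 0.0
        print(f"{pct:6.2f}%  {name}")


# Example usage (assumes callgrind.out is in the current directory):
#     with open("callgrind.out") as f:
#         report(self_costs(f))
```

The sketch reports self cost only, so its percentages will differ from any inclusive figures a viewer displays, which also count time spent in called functions.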

To visualize the call graph, you must download KCachegrind: http://kcachegrind.sourceforge.net/html/Download.html. KCachegrind is built on the KDE environment, which is simple to build on Linux but can be very tedious on Mac. The download also includes a QCachegrind subfolder, which provides the same functionality built on top of Qt. QCachegrind is highly recommended for Mac users.



Figure 13: QCachegrind GUI



The basic QCachegrind GUI is shown in Figure 13. On the left, a sidebar lists all instrumented functions along with the percentage of total execution time spent in each. The center pane shows the call graph. To navigate the call graph, use the small window in the bottom right corner to change the region shown in the view pane.

Zooming into one region (Figure 14), we see a set of MPI functions (Barrier, Scan, Allgatherv). Each of these functions enters a polling loop, which dominates the total execution time. A small portion of the polling loop calls the "Handle Socket Header" function. Double-clicking this node unrolls more details in the call graph (Figure 15). Here we see that the function splits its execution time between buffering messages (memcpy) and posting headers (Compute Time).



Figure 14: QCachegrind Call Graph of MPI Functions





Figure 15: QCachegrind Expanded Call Graph of Eager 0 Function