How to use Charm++'s Projections
This page explains how to use Projections, Charm++'s performance analysis tool, with Xyst.
How to analyze Xyst performance with Charm++'s Projections tool
To enable performance analysis of Xyst with Charm++, clone and build Xyst with tracing enabled:
git clone https://codeberg.org/xyst/xyst.git && cd xyst
mkdir build && cd build
cmake -DCMAKE_C_COMPILER=mpicc -DCMAKE_CXX_COMPILER=mpicxx -GNinja -DCHARM_OPTS="-DTRACING=true -DTRACING_COMMTHREAD=true" -Wno-dev -DRUNNER_ARGS="--bind-to none -oversubscribe" -DPOSTFIX_RUNNER_ARGS=+setcpuaffinity -DEXTRA_LINK_ARGS="-tracemode projections" ../src
ninja
The above builds Charm++ with performance tracing enabled (via CHARM_OPTS) and passes an extra link argument, -tracemode projections, to the Xyst executables (via EXTRA_LINK_ARGS). This instructs Charm++ to record information about all Charm++ events, e.g., entry method calls and message packing, during the execution of Xyst executables.
Once the build succeeds, performance can be analyzed by first collecting some data with a traced run:
./charmrun +p32 Main/inciter -i ../../tmp/problems/sedov/sedov01.exo -c ../../tmp/problems/sedov/sedov_riecg.q
Running as 32 OS processes: Main/inciter -i ../../tmp/problems/sedov/sedov01.exo -c ../../tmp/problems/sedov/sedov_riecg.q
charmrun> /usr/bin/setarch x86_64 -R mpirun -np 32 Main/inciter -i ../../tmp/problems/sedov/sedov01.exo -c ../../tmp/problems/sedov/sedov_riecg.q
Charm++> Running on MPI version: 3.1
Charm++> level of thread support used: MPI_THREAD_SINGLE (desired: MPI_THREAD_SINGLE)
Charm++> Running in non-SMP mode: 32 processes (PEs)
Converse/Charm++ Commit ID: fa84486
Charm++: Tracemode Projections enabled.
Trace: traceroot: Main/inciter
Isomalloc> Synchronized global address space.
CharmLB> Load balancer assumes all CPUs are same.
Xyst> Load balancing off
Charm++> Running on 1 hosts (2 sockets x 16 cores x 1 PUs = 32-way SMP)
Charm++> cpu topology info is gathered in 0.002 seconds.
...
This run will produce log and sts files in the build folder where the executable resides. Projections can then be used to analyze performance data in detail. Example screenshots are displayed below.
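Before opening Projections, it can help to confirm that the traced run actually left its output next to the executable. The following is a minimal sketch, not part of Xyst or Charm++: check_traces is a hypothetical helper, and it assumes the usual Projections naming, in which the trace root reported in the run output (Main/inciter above) is used as a prefix for one sts file and one log file per PE, for example Main/inciter.sts and Main/inciter.0.log. If your Charm++ build names its trace files differently, adjust the patterns accordingly.

```shell
#!/bin/sh
# check_traces is a hypothetical helper, not part of Xyst or Charm++.
# Assumption: a traced run writes <traceroot>.sts plus one
# <traceroot>.<PE>.log (or .log.gz) per PE, e.g. Main/inciter.0.log.
check_traces() {
  traceroot=$1
  # The sts file summarizes the run; without it there is nothing to open.
  if [ ! -f "$traceroot.sts" ]; then
    echo "missing $traceroot.sts"
    return 1
  fi
  nlogs=0
  for f in "$traceroot".*.log "$traceroot".*.log.gz; do
    # An unmatched glob stays literal, so test that the file exists.
    if [ -f "$f" ]; then
      nlogs=$((nlogs + 1))
    fi
  done
  echo "found $nlogs log file(s) for $traceroot"
  # Succeed only if at least one per-PE log file is present.
  [ "$nlogs" -gt 0 ]
}
```

After sourcing or pasting the function, it could be called from the build folder as: check_traces Main/inciter. For the non-SMP run above with +p32, one would expect it to report one log file per PE, although the sketch only requires that at least one exists.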