Valgrind

Valgrind is a Linux tool most often used for finding memory errors and leaks. It is in fact a suite that also contains tools for profiling and other kinds of debugging, but this page primarily discusses debugging memory errors.

Valgrind's home page is located at http://valgrind.org/. It supports x86, x86-64, and to some degree PPC versions of Linux. Note that the name Valgrind is pronounced with a short "i"; the word is from Norse mythology.

Running VisIt under Valgrind

Valgrind is a runtime debugging tool, so its usage is comparable to that of a debugger like GDB: you do not need to build your code any differently to use it, but its reports are far more useful if the binary includes debugging symbols. This means it is most useful when you are debugging a version of the code you built from source.

Typical usage of Valgrind for most programs is to simply prepend your command line with "valgrind". However, because VisIt comprises several processes with some environment variables needed for proper operation, it is often simplest to use command-line options to the visit program to launch the desired component under Valgrind.

For example, the following commands will launch the various components under Valgrind:

  • visit -valgrind gui
  • visit -valgrind viewer
  • visit -valgrind mdserver
  • visit -valgrind engine_ser
  • visit -valgrind engine_par

Useful valgrind arguments

You can pass additional arguments to valgrind to control its behavior. You need only add the valgrind arguments after -valgrind and before the name of the VisIt program being run under valgrind.

visit -valgrind <extra arguments> engine_ser

Making valgrind show up in a new window:

visit -valgrind engine_ser -xterm

Making valgrind output go to a log file:

visit -valgrind --log-file=engine-valgrind.log engine_ser

Making valgrind record memory leaks:

visit -valgrind --leak-check=full engine_ser
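Once a run has written a log, the lines that matter most for triage can be pulled out with a small shell helper. This is a minimal sketch, not part of VisIt or Valgrind: the function name summarize_valgrind_log is hypothetical, and it assumes the log came from Memcheck (Valgrind's default tool), which ends each process's report with an "ERROR SUMMARY:" line and, when leak checking is enabled, prints "definitely lost:" and "indirectly lost:" totals.

```shell
# Hypothetical helper: print the summary lines of a Memcheck log.
# $1 is the path to a log written with --log-file=...
summarize_valgrind_log() {
    # Memcheck closes each process's report with an "ERROR SUMMARY:" line.
    grep 'ERROR SUMMARY:' "$1"
    # The leak summary reports totals as "definitely lost: N bytes in M blocks".
    # The trailing colon keeps individual loss records out of the output.
    grep -E '(definitely|indirectly) lost: ' "$1"
    # Finding no leak lines is not a failure, so always return success.
    return 0
}
```

For example, after running with --log-file=engine-valgrind.log, call: summarize_valgrind_log engine-valgrind.log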

Known warnings which can be ignored/suppressed

Some of the libraries that VisIt links against produce spurious warnings that can safely be ignored or suppressed:

  • Source and destination overlap in strcpy, in lite_PD_ls in libESiloDatabase_ser.so
  • Source and destination overlap in strcpy, in lite_SC_firsttok in libESiloDatabase_ser.so
  • Syscall param write() points to uninitialised byte(s) in /usr/X11R6/lib64/libX11.so
  • Syscall param writev() points to uninitialised byte(s) in /usr/X11R6/lib64/libX11.so
  • Syscall param socketcall.sendto() points to uninitialised byte(s) in SocketConnection::Flush in SocketConnection.C
  • Conditional jump or move depends on uninitialised value(s) in AttributeGroup::WriteType in AttributeGroup.C
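Warnings like these can be hidden with a Valgrind suppressions file. The fragment below is an illustrative sketch covering only the two strcpy overlap warnings; the file name visit-valgrind.supp and the entry labels are made up for this example, and the exact stack frames (for instance, how strcpy is named) may differ on your system.

```
# visit-valgrind.supp (hypothetical file name)
# The first line of each entry is an arbitrary label.
{
   silo_lite_PD_ls_strcpy_overlap
   Memcheck:Overlap
   fun:strcpy
   fun:lite_PD_ls
}
{
   silo_lite_SC_firsttok_strcpy_overlap
   Memcheck:Overlap
   fun:strcpy
   fun:lite_SC_firsttok
}
```

Such a file is passed with Valgrind's --suppressions option, for example: visit -valgrind --suppressions=/path/to/visit-valgrind.supp engine_ser. Because hand-written entries are easy to get subtly wrong, it is usually more reliable to run once with --gen-suppressions=all and copy the entries Valgrind prints for the warnings you want to hide.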

Running valgrind under the test suite

Sometimes it is useful to check whether a case in the test suite causes memory errors in VisIt. Fortunately, it is possible to run the engine under valgrind from within the test suite.

./runtest -e /path/to/visit/bin/visit --vargs "-valgrind --log-file=/path/to/valgrind_results.log engine_ser"

Parallel

Valgrind can also be run on the compute engine when it runs in parallel. This usually requires telling the test suite how to launch parallel jobs and selecting a parallel testing mode. The command below tells valgrind to write all of its output to a log file called valgrind_results.log, so there is no need to rely on the test suite's text output. Every line in the valgrind log is prefixed with a string containing the pid of the engine_par rank that produced it.

./runtest -e /path/to/visit/bin/visit --vargs "-valgrind --log-file=/path/to/valgrind_results.log engine_par" \
          --parallel-launch srun -m scalable,parallel,icet
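Because every line carries the pid of the rank that produced it, a combined log can be separated into one file per rank after the run. The helper below is a sketch under stated assumptions: the function name split_valgrind_log_by_pid and the output file naming are hypothetical, and it relies on Valgrind's default line prefix of the form "==<pid>==".

```shell
# Hypothetical helper: split a combined Valgrind log into one file per pid.
# $1 is the combined log, $2 is the directory to write per-pid files into.
# Output is appended, so use a fresh directory for each run.
split_valgrind_log_by_pid() {
    mkdir -p "$2"
    # Splitting on "==" makes field 2 the pid of a line like "==12345== text".
    # Lines without the "==<pid>==" prefix are skipped.
    awk -F'==' -v dir="$2" '
        /^==[0-9]+==/ {
            file = dir "/valgrind." $2 ".log"
            print >> file
            close(file)   # avoid holding one open file per rank
        }
    ' "$1"
}
```

For example: split_valgrind_log_by_pid /path/to/valgrind_results.log /path/to/per_rank_logs. Valgrind also expands %p in the --log-file name to the process ID, so a name such as valgrind_results.%p.log should give one file per rank directly; whether that passes cleanly through the --vargs quoting has not been verified here.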