Gdb

GDB is the GNU Project Debugger (GDB home). It is installed on many Linux and other platforms alongside the gcc compiler. Because it is free and widely available, gdb is a good option for debugging VisIt.

Running VisIt under GDB

For most programs, it is sufficient to run gdb program at the command prompt to load the program into gdb; a simple run command then starts it. VisIt is more involved: it is composed of several components that talk to one another over sockets, and those components use shared libraries that are located through the LD_LIBRARY_PATH set up by the visit script that launches them. To simplify matters, VisIt's launcher script supports launching VisIt components under gdb with all of the requisite environment variables properly set up, so the VisIt developer does not have to know the details.

Note -- In order to run VisIt under gdb, you must have built VisIt from source yourself because all debugging symbols are removed from the binary VisIt distributions.

GDB-related command line arguments

VisIt's launch script takes care of the extra tedium of running under GDB by setting up environment variables, telling GDB where to find the VisIt source code, etc. See the table below for the arguments.

Command Line Argument   Description
-gdb gui                Run the gui under gdb.
-gdb cli                Run the cli under gdb.
-gdb viewer             Run the viewer under gdb.
-gdb mdserver           Run the mdserver under gdb.
-gdb engine_ser         Run the serial compute engine under gdb.
-break <funcname>       Add the specified breakpoint in GDB. Multiple -break arguments can be passed on the VisIt command line. The value for <funcname> is a C++ function name or class-qualified method name from the program being run. Note that in general you cannot set breakpoints directly in plugin methods; you must first break in another method that is part of the program being debugged. Once the plugins have been loaded, you can set a breakpoint on a plugin method by name.
-xterm                  Run gdb in a new xterm window. This can be helpful so your gdb session does not have to share the console that launched VisIt.


Example: Run the gui and break when the code enters the QvisGUIApplication constructor, which lets us stop before we initialize the GUI's most important object.

visit -debug 5 -gdb gui -xterm -break QvisGUIApplication::QvisGUIApplication
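
Because multiple -break arguments can be passed, the command line can get long. The following is a minimal sketch of a shell helper that assembles such a command line. The name visit_gdb_cmd is hypothetical (it is not part of VisIt), and MyClass::MyMethod is a placeholder for a function in the component being debugged. The helper only prints the command, so you can inspect it before running it.

```shell
# Hypothetical helper, not part of VisIt: print a visit command line that
# runs one component under gdb with one -break argument per function name.
visit_gdb_cmd() {
    component=$1
    shift
    cmd="visit -debug 5 -gdb $component -xterm"
    for func in "$@"; do
        cmd="$cmd -break $func"
    done
    printf '%s\n' "$cmd"
}

# Print (do not run) the command for the mdserver with two breakpoints.
# MyClass::MyMethod is a placeholder name.
visit_gdb_cmd mdserver MDServerConnection::GetDatabase MyClass::MyMethod
# prints: visit -debug 5 -gdb mdserver -xterm -break MDServerConnection::GetDatabase -break MyClass::MyMethod
```

To launch VisIt, paste the printed line at the prompt.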

Useful GDB commands

GDB is a command line tool so you will need to know at least a handful of commands to get by. Often using the first letter of a command is enough to identify the command to gdb.

Command   Description
run       Runs the program that has been loaded into GDB. The GDB script created by VisIt usually does this for you.
quit      Quits gdb.
break     Sets a breakpoint. Example: break my_function_name
where     Prints the call stack, which is the path that the program took through the code to reach the point where it hit a breakpoint or crashed. When the program crashes inside of the gdb debugger, typing where can be very useful because it helps you isolate where your code is failing. In most cases, the call stack includes the function name and a line number.
c         Continue running. This is useful when you want to resume running the program after hitting a breakpoint and single-stepping for a while.
n         Execute the next line of code, stepping over any function calls.
s         Execute the next line of code, stepping into a function call if there is one.
p         Print the value of a variable from the code. Example: p foo (assuming your function has a variable called foo)
up        Move up the stack one frame, to the caller of the current function. You can keep using up to go all the way back up to the main function.
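
The commands above can also be collected in a gdb command file and passed to gdb with its -x option. The following is a rough sketch for a standalone program. The file name session.gdb and the program ./my_program are hypothetical, and my_function_name and foo are the placeholder names used in the table. When you go through VisIt's launcher, it creates its own gdb script, so this sketch only shows how the commands fit together.

```shell
# Write a small gdb command file using the commands from the table.
# my_function_name and foo are placeholder names, as in the table.
cat > session.gdb <<'EOF'
break my_function_name
run
where
p foo
n
c
EOF

# Show how the file would be handed to gdb; ./my_program is hypothetical,
# so the command is only printed here, not executed.
echo "gdb -x session.gdb ./my_program"
```

gdb executes the lines in order: it sets the breakpoint, runs to it, prints the call stack and the variable, steps one line, and then continues.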

How to debug...

This section provides more concrete examples of how to debug certain common cases in gdb.

A VisIt component is crashing

It can be very frustrating when one of your changes causes one of VisIt's components to crash. Generally, when a VisIt component crashes, you will be notified which one crashed, either in a window or in a message on the command line. If you are running with the -debug 5 command line arguments, you will also have a set of debug logs to check. Depending on the nature of the crash, the debug logs of the failed component will often contain signal handling information related to the crash, such as SEGV for a segmentation violation, which happens with bad pointers.

Once you have determined the failed component, run it inside of the GDB debugger using VisIt's gdb command line arguments. Example: the viewer is crashing...

visit -debug 5 -gdb viewer -xterm
  1. Go through the steps needed to make the viewer crash. These could be scripted in Python if you'll need to repeat them many times.
  2. Once the viewer crashes in the debugger, type where at the GDB prompt to print the call stack.
  3. If the most recent item on the call stack is in your code then you can take the function name and line number of the failure and open up your source and look for possible flaws.
  4. If the most recent item on the call stack is down in some library that you call then you can move up the stack by typing the up command until you are in a stack frame containing code that you own. Look for bad arguments being passed to the library function that you're calling.
  5. You can insert printf statements or debug streams into your code to print out values, then recompile and run again. Alternatively, you could run again under gdb with a breakpoint set at the start of the failed function.
  6. Assuming that you've set a breakpoint and run again, use a combination of the n and s commands to advance through the code to the point where your program last crashed. You could also set another breakpoint interactively from within gdb using the line number of the place where your program last crashed (e.g. break 2314 to break on line 2314 of the current source file).
  7. Don't forget to print values along the way using the p command. Printing values interactively while the program is being run one line at a time will often let you find the problem that caused your program to crash (for example, a NULL pointer).

Debugging database plugins

If your database plugin is crashing the mdserver or the compute engine then you can probably just debug as you would for any VisIt component that is crashing (described above). However, if your plugin is not returning the right data or if it is not reading your file correctly then you may want to step through it using gdb. As was mentioned before, it is not generally possible to set breakpoints for plugin method names until the plugins are actually loaded. Plugins are loaded at different times on the mdserver and the compute engine. This means that you'll have to employ a slightly different technique for each.

In the mdserver

The mdserver does not call all database plugin methods. Typically, most of them will be called, except for the GetMesh, GetVar, and GetVectorVar methods. Common errors occur in the database plugin constructor and in the PopulateDatabaseMetaData method.

visit -debug 5 -gdb mdserver -break MDServerConnection::GetDatabase
  1. Go through the steps needed to open your file. This will cause gdb to hit the breakpoint that you set on the command line.
  2. Set a new breakpoint for a method in your plugin: break funcname
  3. Type c to continue until execution arrives at your new breakpoint.
  4. Use n and s to execute lines of code.

In the compute engine

The compute engine calls all database plugin methods at one time or another.

visit -debug 5 -gdb engine_ser -break NetworkManager::EndNetwork -xterm
  1. Go through the steps needed to open your file and draw a plot. This will cause gdb to hit the breakpoint that you set on the command line.
  2. Set a new breakpoint for a method in your plugin: break funcname
  3. Type c to continue until execution arrives at your new breakpoint.
  4. Use n and s to execute lines of code.
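
Step 2 in both procedures takes a class-qualified method name. As a small sketch, the helper below prints the break commands for a plugin class so they can be pasted at the gdb prompt once the first breakpoint has been hit. The names plugin_breaks and avtMyFormatFileFormat are hypothetical; substitute the class that implements your plugin.

```shell
# Hypothetical helper: print one gdb break command per plugin method.
# Usage: plugin_breaks ClassName Method1 [Method2 ...]
plugin_breaks() {
    class=$1
    shift
    for method in "$@"; do
        printf 'break %s::%s\n' "$class" "$method"
    done
}

# Example for the compute engine, where GetMesh and GetVar are called.
# avtMyFormatFileFormat is a placeholder class name.
plugin_breaks avtMyFormatFileFormat GetMesh GetVar
# prints:
#   break avtMyFormatFileFormat::GetMesh
#   break avtMyFormatFileFormat::GetVar
```

For the mdserver, PopulateDatabaseMetaData is a more useful method to pass, since GetMesh and GetVar are typically not called there.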

Parallel

For parallel debugging of the compute engine using gdb, it is best to run normally and then attach to a particular rank using gdb. Once you have attached to the rank that you want to debug, run VisIt until you can get the crash that you want to examine. The rank that you are debugging will provide a stack in gdb that may shed light on the nature of the failure.
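
Attaching is done with gdb -p followed by the process ID of the rank. The sketch below turns a list of process IDs into attach commands. The function name attach_cmds is hypothetical and the PIDs are made up. In practice the list could come from a tool such as pgrep, assuming the parallel engine executable is named engine_par on your system.

```shell
# Hypothetical helper: read process IDs (one per line) from stdin and
# print the gdb command that would attach to each of them.
attach_cmds() {
    while read -r pid; do
        if [ -n "$pid" ]; then
            printf 'gdb -p %s\n' "$pid"
        fi
    done
}

# Example with made-up PIDs. A real list might come from:
#   pgrep -f engine_par    (engine_par is an assumed executable name)
printf '%s\n' 4242 4243 | attach_cmds
# prints:
#   gdb -p 4242
#   gdb -p 4243
```

Run the printed command for the rank you want to debug in its own terminal on the node where that rank is running.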

Parallel Simulations

Simulations that crash can often be debugged by attaching with a debugger. However, that is sometimes not possible if the simulation crashes during or soon after startup. To deal with that situation, you can run the simulation in gdb, spawned by MPI. This way, the offending rank will be caught in one of the xterm windows and you can get a stack trace.

# Copy the gdb run command to a command file.
echo "run -dir ../../.. -groupsize 2 -trace foo -maxcycles 3 -debug" > commands.txt

# Run each rank of the simulation under GDB in its own xterm window
mpirun -np 4 xterm -e gdb batch_par -x commands.txt

[Figure: ParallelGDBInXterms.png]