Building on BlueGeneP

This page is obsolete but has been retained for historical purposes.

In June 2009, VisIt 1.12 was ported to the BlueGene/P platform to run large-scale scaling experiments. BlueGene/P consists of PowerPC login nodes running Linux and back-end compute nodes running a limited version of Linux called "Compute Node Kernel" (CNK). Programs that run on the compute nodes must be cross-compiled on the login nodes using special "bg" compilers. Since VisIt is made of several cooperating programs, some programs run on the login nodes while the parallel compute engine must run on the compute nodes. This means that pieces of VisIt must be compiled for what are essentially different platforms.

Building on the login nodes is straightforward and the build_visit script was enhanced to build a Linux/PowerPC version of VisIt. Building for the compute nodes requires a lot of special hand-holding because VisIt's 3rd party libraries must be built in non-standard configurations. For example, the compute nodes have no X11 library so building with Qt is out of the question. This means that VisIt's gui and viewer cannot be built. VisIt's build system was enhanced to provide an engine-only build that builds only support libraries, the compute engine, and its plugins. This reduced VisIt build is used to build the pieces of VisIt needed for the BlueGene/P compute nodes.

  • VisIt 1.12 was the only version of VisIt to be built on BlueGene/P. Since then VisIt's build system has been replaced with cmake.
  • Code produced by xlC had unresolved runtime errors.
    • There was no time to figure out the root cause at the time, so g++/gcc were used to build all of the code.
    • Advanced tuning/hardware features were not explored.

Building Mesa

Mesa support for building on BlueGene/P was VERY poor at the time. I don't know if it has improved. VisIt was using Mesa 5.0 but I had to use 6.4.2 with some extra changes to remove all X11 dependencies. I ended up building Mesa by hand. It is important to note that we do not create a mangled Mesa as we do in all other cases when building VisIt. We don't mangle because we use Mesa as OpenGL in the VTK build since BlueGene/P does not have OpenGL.

Edit the configs/linux file.

1. Replace the compilers with:
CC = /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/gcc
CXX = /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/g++
2. Set X11_INCLUDES to nothing
3. Comment out EXTRA_LIB_PATH
4. Add this to the end of the file to override defaults:
SRC_DIRS = mesa glu
DRIVER_DIRS = osmesa
GL_LIB_DEPS = $(EXTRA_LIB_PATH) -lm -lpthread
OSMESA_LIB_DEPS = 
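
Step 4 above could also be scripted. The following is a minimal sketch, not what was actually used for the port: the function name append_bgp_overrides is hypothetical, and it only covers the override block from step 4 (steps 1 through 3 still need to be done by hand). It appends the block once and does nothing if the block is already present.

```shell
# Hypothetical helper: append the step-4 override block to a Mesa config
# file exactly once. Defaults to configs/linux in the current directory.
append_bgp_overrides() {
    cfg=${1:-configs/linux}
    # Skip if the osmesa-only driver line is already there.
    grep -q '^DRIVER_DIRS = osmesa$' "$cfg" && return 0
    cat >> "$cfg" <<'EOF'
SRC_DIRS = mesa glu
DRIVER_DIRS = osmesa
GL_LIB_DEPS = $(EXTRA_LIB_PATH) -lm -lpthread
OSMESA_LIB_DEPS =
EOF
}
```

Run it from the top of the Mesa source tree as append_bgp_overrides configs/linux. The heredoc delimiter is quoted so that $(EXTRA_LIB_PATH) is written literally for make to expand.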

I also edited src/mesa/Makefile and changed the definition of STAND_ALONE_SOURCES to:

STAND_ALONE_SOURCES = \
	$(CORE_SOURCES) \
	$(ASM_SOURCES)

Then I edited line 165 so that it includes $(COMMON_DRIVER_OBJECTS) $(OSMESA_DRIVER_OBJECTS) instead of using $(OSMESA16_OBJECTS). This is needed to keep the osmesa library small so that it does not include the rest of Mesa.
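
Since the goal of these changes is a Mesa with no X11 dependencies, it may be worth checking the resulting shared libraries before moving on to VTK. The sketch below is an assumption about how one might do that, not part of the original port: the function name needs_x11 is hypothetical, and the list of X11 library names is only a guess at the likely candidates. It reads the output of readelf -d, which can inspect cross-compiled ELF files on the login node, and reports whether any NEEDED entry names an X11 library.

```shell
# Hypothetical check: read `readelf -d` output on stdin and succeed
# (exit status 0) if any NEEDED entry refers to an X11-related library.
needs_x11() {
    grep 'NEEDED' | grep -E -q 'lib(X11|Xext|Xt|Xmu|Xi)\.so'
}

# Example use on a login node (the library path is illustrative):
#   readelf -d lib/libGL.so | needs_x11 && echo "libGL.so still depends on X11"
```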

Building VTK

There were special steps to get VTK to build.

  • We're going to replace vtkXOpenGLRenderWindow with a special OSMesa version of vtkXOpenGLRenderWindow that does offscreen OpenGL stuff using Mesa. Copy ~/Development/vtkXOpenGLRenderWindow* into VTK's Rendering directory.
  • We're not going to need vtkXRenderWindowInteractor since it uses X11 and will cause compilation problems for us. Open Rendering/vtkGraphicsFactory.cxx and comment out all lines related to that class. See lines: 57, 171-174.
  • I had to also change a few things in CMakeLists.txt. Since I want to use our Mesa as the OpenGL that will be used in VTK, edit the "Configure OpenGL support" section around line 976 so it looks like:
#-----------------------------------------------------------------------------
# Configure OpenGL support.
IF(VTK_USE_RENDERING)
#  INCLUDE(${CMAKE_ROOT}/Modules/FindOpenGL.cmake)
  SET(BGP_MESADIR /g/g19/whitlocb/Development/visit/mesa/6.4.2/linux-ppc64_bgp_gcc-4.1.2)
  SET(OPENGL_FOUND       1)
  SET(OPENGL_XMESA_FOUND 0)
  SET(OPENGL_GLU_FOUND   1)
  SET(OPENGL_INCLUDE_DIR ${BGP_MESADIR}/include)
  SET(OPENGL_LIBRARIES   -lGL)
  SET(OPENGL_LIBRARY ${OPENGL_LIBRARIES}) # Needed to set VTK_USE_OPENGL_LIBRARY

  SET(OPENGL_gl_LIBRARY  ${BGP_MESADIR}/lib/libGL.so)
  SET(OPENGL_glu_LIBRARY ${BGP_MESADIR}/lib/libGLU.so)

  MARK_AS_ADVANCED(
    OPENGL_INCLUDE_DIR
    OPENGL_xmesa_INCLUDE_DIR
    OPENGL_glu_LIBRARY
    OPENGL_gl_LIBRARY
  )
ENDIF(VTK_USE_RENDERING)
  • In the "Determine GUI" section around line 226, change SET(VTK_USE_X_FORCE ${VTK_USE_RENDERING}) to SET(VTK_USE_X_FORCE 0)
  • I created a toolchain file (in ~/Development) for cmake and ran the following:
CMAKE_BIN="/g/g19/whitlocb/Development/visit/cmake/2.4.5/linux-ppc64_gcc-4.1.2/bin/cmake"
MESA_VERSION="6.4.2"
VISITDIR="/g/g19/whitlocb/Development/visit"
CXX_OPT_FLAGS="-fPIC"
C_OPT_FLAGS="-fPIC"
CXX_COMPILER="/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/g++"
C_COMPILER="/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/gcc"
VISITARCH="linux-ppc64_bgp_gcc-4.1.2"
SO_EXT="so"
${CMAKE_BIN} \
        -DCMAKE_TOOLCHAIN_FILE="/g/g19/whitlocb/Development/ToolChain-BlueGeneP.cmake"\
        -DBUILD_SHARED_LIBS:BOOL=ON\
        -DBUILD_TESTING:BOOL=OFF\
        -DUSE_ANSI_STD_LIB:BOOL=ON\
        -DVTK_USE_HYBRID:BOOL=ON\
        -DVTK_USE_VOLUMERENDERING:BOOL=OFF\
        -DCMAKE_CXX_FLAGS:STRING="${CXX_OPT_FLAGS}"\
        -DCMAKE_CXX_COMPILER:STRING=${CXX_COMPILER}\
        -DCMAKE_C_FLAGS:STRING="${C_OPT_FLAGS}"\
        -DCMAKE_C_COMPILER:STRING=${C_COMPILER}\
        -DCMAKE_VERBOSE_MAKEFILE:BOOL=TRUE \
        .

Note that later on in the Rendering library, there's a custom command that the build tries to run in order to parse the OpenGL extensions. That command is cross-compiled and won't run. You'll have to copy a version from somewhere else that is compiled to run on the login node.

Also, VTK is configured to try and build vtkXOpenGLRenderWindow but we've disabled VTK from looking for X11 since it probably does not exist on CNK. I've created a new implementation that is a copy of my vtkOSMesaRenderWindow class that provides an offscreen render window. With this change, the VTK rendering library compiles.

The build then continues on to 95% and fails trying to build the VTK Volume rendering library, which we don't use anyway. The libraries that we do need seem to have been created.

Example toolchain file

# the name of the target operating system
SET(CMAKE_SYSTEM_NAME BlueGeneP)

# set the compiler
set(CMAKE_C_COMPILER /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/gcc )
set(CMAKE_CXX_COMPILER /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/g++ )

# set the search path for the environment coming with the compiler
# and a directory where you can install your own compiled software
set(CMAKE_FIND_ROOT_PATH
    /bgsys/drivers/ppcfloor
    /bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin
)

# adjust the default behaviour of the FIND_XXX() commands:
# search headers and libraries in the target environment, search 
# programs in the host environment
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)

Building SZIP

I ran build_visit to start building the rest of the I/O libraries and szip immediately failed. It seems that the library built okay but the example program does not and it causes the build to fail. I went into the szip-2.1 directory and did make;make install and got it to install.

Here's my build_visit line:

env C_COMPILER=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/gcc \
   CXX_COMPILER=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/g++ \
   VISITARCH=linux-ppc64_bgp_gcc-4.1.2 \
   build_visit --console --makeflags -j4 --szip --no-thirdparty --no-visit

Building HDF5

I started with this build_visit line but had to start building by hand due to problems.

env C_COMPILER=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/gcc \
   CXX_COMPILER=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/g++ \
   VISITARCH=linux-ppc64_bgp_gcc-4.1.2 \
   build_visit --console --makeflags -j4 --szip --hdf5 --no-thirdparty --no-visit

When I got down to building the src directory, the H5Detect program could not be linked successfully against the szip library. It doesn't really matter because the program, being cross-compiled, would not be able to run on the login node anyway. I just built H5Detect with the login node's gcc and omitted the libraries. That let it run and generate code. I think the data type sizes should be compatible since we're building 32-bit code...
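
The H5Detect workaround might look roughly like the sketch below. This is a reconstruction under assumptions, not the exact commands that were run: it assumes the detection source in HDF5's src directory is named H5detect.c and that its output is normally captured as H5Tinit.c, so check both against your HDF5 tree. The helper names run and build_h5detect_on_host and the HOST_CC and DRY_RUN variables are hypothetical.

```shell
# Sketch only: run from HDF5's src directory on a login node.
# HOST_CC is the login-node compiler, not the bg cross-compiler.
HOST_CC=${HOST_CC:-gcc}

# Print the command instead of running it when DRY_RUN is set.
run() { if [ -n "$DRY_RUN" ]; then echo "$@"; else "$@"; fi; }

build_h5detect_on_host() {
    # Build the detection program natively, without the szip library.
    run "$HOST_CC" -I. -o H5detect H5detect.c || return 1
    # Run it on the login node to generate the type-information source.
    if [ -n "$DRY_RUN" ]; then
        echo "./H5detect > H5Tinit.c"
    else
        ./H5detect > H5Tinit.c
    fi
}
```

Setting DRY_RUN=1 prints the two steps without executing them. The generated code describes the login node's data types, so the caveat above about type sizes matching the compute nodes still applies.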

Ultimately, another program failed to build but "make install" worked well enough to get the includes and library to install before failing again.

Building NETCDF

NETCDF built okay.

env C_COMPILER=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/gcc \
   CXX_COMPILER=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/g++ \
   VISITARCH=linux-ppc64_bgp_gcc-4.1.2 \
   build_visit --console --makeflags -j4 --netcdf --no-thirdparty --no-visit

Building CGNS

CGNS did not present any problems with this build_visit line:

env C_COMPILER=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/gcc \
   CXX_COMPILER=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/g++ \
   VISITARCH=linux-ppc64_bgp_gcc-4.1.2 \
   build_visit --console --makeflags -j4 --cgns --no-thirdparty --no-visit
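
The build_visit lines in the SZIP, HDF5, NETCDF, and CGNS sections differ only in the library flags. A small wrapper can factor out the repeated environment. This is a convenience sketch: the function name bgp_build_visit and the BGP_GNU and DRY_RUN variables are hypothetical, while the compiler paths, VISITARCH value, and build_visit options are the ones used on this page.

```shell
# Hypothetical wrapper around the build_visit invocations shown on this page.
BGP_GNU=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin

bgp_build_visit() {
    # "$@" carries the library flags, e.g. --szip --hdf5.
    # With DRY_RUN set, the command is printed instead of executed.
    ${DRY_RUN:+echo} env \
        C_COMPILER="$BGP_GNU/gcc" \
        CXX_COMPILER="$BGP_GNU/g++" \
        VISITARCH=linux-ppc64_bgp_gcc-4.1.2 \
        build_visit --console --makeflags -j4 "$@" --no-thirdparty --no-visit
}
```

For example, bgp_build_visit --cgns reproduces the CGNS line above, and DRY_RUN=1 bgp_build_visit --szip --hdf5 prints the HDF5 line without running it.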

Building Silo

Silo's configure chokes trying to detect H5open or something like that, so Silo had to be built by hand. Things may be better now with Silo 4.7.2 but this was with Silo 4.6.2 and 4.7. I edited Silo's configure and added notfound="" before all of the HDF5-related tests for that variable to help the logic go into the areas that I wanted. I also passed --disable-silex --disable-hzip on the configure command line.

setenv PREFIX /g/g19/whitlocb/Development/visit
./configure \
    CXX=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/g++ \
    CC=/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/gcc \
    CFLAGS=-fPIC \
    CXXFLAGS=-fPIC \
    --prefix=$PREFIX/silo/4.6.2/linux-ppc64_bgp_gcc-4.1.2 \
    --with-hdf5=$PREFIX/hdf5/1.8.1/linux-ppc64_bgp_gcc-4.1.2/include,$PREFIX/hdf5/1.8.1/linux-ppc64_bgp_gcc-4.1.2/lib \
    --with-szlib=$PREFIX/szip/2.1/linux-ppc64_bgp_gcc-4.1.2 \
    --without-readline --disable-fortran --disable-browser --disable-silex --disable-hzip

After all of this, the code built and I was able to do "make install".

I ran into problems with some files and Silo 4.6.2 so I'm trying Silo 4.7. I edited configure once again, adding notfound="" at lines 26451 and 26716. Once I configured, I also had to edit config.h and make sure that HDF5-related variables were set, including HAVE_LIBHDF5.
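
Because the config.h edit is easy to lose when configure is re-run, a quick check before running make can help. The sketch below is an assumption about how to do that: the function name silo_has_hdf5 is hypothetical, and it only looks at HAVE_LIBHDF5, not the other HDF5-related variables.

```shell
# Hypothetical check: succeed only if the given config.h (default: ./config.h)
# contains an active "#define HAVE_LIBHDF5" line. A commented-out
# "/* #undef HAVE_LIBHDF5 */" line does not count.
silo_has_hdf5() {
    grep -E -q '^[[:space:]]*#[[:space:]]*define[[:space:]]+HAVE_LIBHDF5([[:space:]]|$)' "${1:-config.h}"
}

# Example:
#   silo_has_hdf5 config.h || echo "HAVE_LIBHDF5 is not defined; edit config.h"
```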

Building VisIt

Old compile line:

configure CXXFLAGS=-g BUILD_MODE=ComputeNode --enable-engine-only --disable-glew \
    --enable-parallel --without-x --disable-select --disable-nospin-bcast

config-site.conf

This is obsolete and something I'd need to port to cmake but this is what I used to configure VisIt's old build system for building on BlueGene/P:

##
## Set the VISITHOME environment variable.
##
VISITHOME=/g/g19/whitlocb/Development/visit

##
## Compiler flags.
##
if test "$BUILD_MODE" = "ComputeNode"; then
    # My current configure line is:
    # configure CXXFLAGS=-g BUILD_MODE=ComputeNode --enable-engine-only --disable-glew --enable-parallel --without-x --disable-select --disable-nospin-bcast

    CC="/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/gcc"
    CXX="/bgsys/drivers/ppcfloor/gnu-linux/powerpc-bgp-linux/bin/g++"
    CFLAGS="-O2 -fPIC"
    CXXFLAGS="-O2 -fPIC"
    LDFLAGS="-g"
    EXE_LDFLAGS="-Wl,-call_shared"
    VISITARCH="linux-ppc64_bgp_gcc-4.1.2"
    QTARCH="linux-ppc64_gcc-4.1.2"
    XML2MAKEFILE_CXX="g++"

    MESA_VERSION="6.4.2"
    SILO_VERSION="4.7"

    # Set the parallel options
    BGP="/bgsys/drivers/ppcfloor"
    MPI_LDFLAGS="-L$BGP/comm/lib -Wl,-rpath,$BGP/comm/default/lib -L$BGP/comm/sys/lib -Wl,-rpath,$BGP/comm/sys/lib -L$BGP/runtime/SPI -Wl,-rpath,$BGP/runtime/SPI $LDFLAGS"
    CPPFLAGS="-DVISIT_BLUE_GENE_P -DMPICH_IGNORE_CXX_SEEK -I$BGP/comm/include $CPPFLAGS"
    MPI_LIBS="$MPI_LDFLAGS -lcxxmpich.cnk -lmpich.cnk -ldcmfcoll.cnk -ldcmf.cnk -lpthread -lrt -L$BGP/runtime/SPI -lSPI.cna"

    # Enable Ice-T
    with_icet_includedir="$VISITHOME/icet/0.5.4/$VISITARCH/include"
    with_icet_libdir="$VISITHOME/icet/0.5.4/$VISITARCH/lib"
    enable_icet="yes"
elif test "$BUILD_MODE" = "ComputeNodeXLC"; then
    # My current configure line is:
    # configure CXXFLAGS=-g BUILD_MODE=ComputeNodeXLC --enable-engine-only --disable-glew --enable-parallel --without-x --disable-select --disable-nospin-bcast

    CC=bgxlc
    CXX=bgxlC
    CFLAGS="-g -qpic=large -qnocommon -qarch=450d"
    CXXFLAGS="-g -qpic=large -qnocommon -qarch=450d"
    LDFLAGS="-g"
    EXE_LDFLAGS="-qnostaticlink"
    VISITARCH="linux-ppc64_bgp_xlc"
    QTARCH="linux-ppc64_gcc-4.1.2"
    XML2MAKEFILE_CXX="g++"

    MESA_VERSION="6.4.2"
    SILO_VERSION="4.7"

    # Set the parallel options
    BGP="/bgsys/drivers/ppcfloor"
    MPI_LDFLAGS="-L$BGP/comm/lib -Wl,-rpath,$BGP/comm/default/lib -L$BGP/comm/sys/lib -Wl,-rpath,$BGP/comm/sys/lib -L$BGP/runtime/SPI -Wl,-rpath,$BGP/runtime/SPI $LDFLAGS"
    CPPFLAGS="-DVISIT_BLUE_GENE_P -DMPICH_IGNORE_CXX_SEEK -I$BGP/comm/include $CPPFLAGS"
    MPI_LIBS="$MPI_LDFLAGS -lcxxmpich.cnk -lmpich.cnk -ldcmfcoll.cnk -ldcmf.cnk -lpthread -lrt -L$BGP/runtime/SPI -lSPI.cna"
else
# These are the options that are for xlc. They are right but there's an
# error somewhere that prevents libvisit_vtk from loading.
#    CC="xlc"
#    CXX="xlC"
#    CFLAGS="-O2 -qpic=large"
#    CXXFLAGS="-O2 -qpic=large"
#    VISITARCH="linux-ppc64_xlc"

# In the meantime, use gcc
    CC="gcc"
    CXX="g++"
    CFLAGS="-O2 -fPIC"
    CXXFLAGS="-O2 -fPIC"
    VISITARCH="linux-ppc64_gcc-4.1.2"
    QTARCH="linux-ppc64_gcc-4.1.2"

    MESA_VERSION="5.0"
    SILO_VERSION="4.6.2"
fi

##
## Specify the location of the mesa include files and libraries.
##
MESA=$VISITHOME/mesa/$MESA_VERSION/$VISITARCH

##
## Specify the location of the vtk include files and libraries.
##
VTK=$VISITHOME/vtk/5.0.0c/$VISITARCH

##
## Specify the location of cmake.
##
DEFAULT_CMAKE=$VISITHOME/cmake/2.4.5/$VISITARCH/bin/cmake

##
## Specify the location of the qt include files and libraries.
##
QT_BIN=$VISITHOME/qt/3.3.8/$QTARCH/bin
QT_INCLUDE=$VISITHOME/qt/3.3.8/$QTARCH/include
QT_LIB=$VISITHOME/qt/3.3.8/$QTARCH/lib

##
## Specify the location of the python include and libraries.
##
PYDIR=$VISITHOME/python/2.5/$VISITARCH
PYVERSION=python2.5

##
## Database reader plugin support libraries
##
##############################################################

##
## CGNS
##
DEFAULT_CGNS_INCLUDE=$VISITHOME/cgns/2.4/$VISITARCH/include
DEFAULT_CGNS_LIB=$VISITHOME/cgns/2.4/$VISITARCH/lib

##
## HDF5
##
DEFAULT_HDF5_LIBLOC=$VISITHOME/hdf5/1.8.1/$VISITARCH
DEFAULT_HDF5_LIBDEP=-L$VISITHOME/szip/2.1/$VISITARCH/lib,-lsz

##
## NetCDF
##
DEFAULT_NETCDF_INCLUDE=$VISITHOME/netcdf/3.6.3/$VISITARCH/include
DEFAULT_NETCDF_LIB=$VISITHOME/netcdf/3.6.3/$VISITARCH/lib

##
## SZIP
##
DEFAULT_SZIP_LIB=$VISITHOME/szip/2.1/$VISITARCH/lib

##
## Silo
##
DEFAULT_SILO_LIBLOC=$VISITHOME/silo/$SILO_VERSION/$VISITARCH
DEFAULT_SILO_LIBDEP=-L$DEFAULT_HDF5_LIBLOC/lib,-lhdf5,$DEFAULT_HDF5_LIBDEP

Concerns for 2.0

VisIt 2.0 has many changes from VisIt 1.12 and has so far not been built on BlueGene/P. There are no doubt some new problems that would require additional porting:

  1. VisIt now gets all of its GL functions via a special GLEW library so it can switch between OpenGL and Mesa on the fly. Since there is only Mesa on BlueGene/P, it's unknown what this change would do, especially because of its reliance on dynamic symbol loading. Expect some porting in this area.
  2. VisIt now uses cmake for its build system. Care was taken to add an engine-only build mode and flags to disable select calls and so on. Not all of those options may have gone into the cmake build since a BGP build has not been attempted under the new build system.