Westgrid just acquired a couple of new IBM pSeries 595 compute nodes
and a pSeries 550 head node. Even though my thesis is coming down to the wire, I
just had to try Phred out on these machines. Unfortunately, IBM's
VisualAge C++ compiler still doesn't get along with Boost, so I had to
build Phred with GCC. I think it may be possible to build everything
except the Python parts with VisualAge, and the Python parts with GCC.
In any case, I had to build Python, boost.python, netCDF,
etc. Since the machines are 64-bit, I wanted 64-bit binaries, but by
default everything compiles as 32-bit. Here's how to
build 64-bit Python and boost.python using GCC on AIX:
- export OBJECT_MODE=64
- export CFLAGS=-maix64
- export CXXFLAGS=-maix64
- Untar the Python source code tarball, cd into the resulting
directory.
- Run configure: ./configure --prefix=/whereever/python2.4_64 --disable-ipv6 --enable-shared --with-gcc=gcc --without-threads
- Edit the Makefile. Find the place where the OPT variable is set,
and add -maix64 to it. configure doesn't seem to put it there, as it
should.
- make; make install
- Now, download and untar the boost source file.
- Install bjam somewhere if you don't already have it.
- Run bjam "-sBUILD=<address-model>64 <cxxflags>-maix64 <linkflags>-maix64" -sTOOLS=gcc --prefix=/whereever/boost64 --layout=system install
- Done!
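The Python half of the steps above can be collected into a small script. This is only a sketch: add_maix64_to_opt and build_python64 are names made up here, the sed expression assumes the generated Makefile sets OPT on a single line beginning with OPT=, and the install prefix is whatever you pass in.

```shell
# 64-bit settings from the steps above.
export OBJECT_MODE=64
export CFLAGS=-maix64
export CXXFLAGS=-maix64

# Append -maix64 to the OPT line of a Makefile (hypothetical helper).
# Avoids "sed -i", which AIX sed does not have.
add_maix64_to_opt() {
    makefile=$1
    # Nothing to do if the flag is already on the OPT line.
    if grep '^OPT[[:space:]]*=' "$makefile" | grep -q -e '-maix64'; then
        return 0
    fi
    sed 's/^\(OPT[[:space:]]*=.*\)$/\1 -maix64/' "$makefile" > "$makefile.tmp" &&
        mv "$makefile.tmp" "$makefile"
}

# Configure, patch and build Python (hypothetical wrapper, not called here).
# Run it from inside the untarred Python source directory.
build_python64() {
    prefix=$1
    ./configure --prefix="$prefix" --disable-ipv6 --enable-shared \
        --with-gcc=gcc --without-threads &&
        add_maix64_to_opt Makefile &&
        make && make install
}
```

Calling build_python64 with the install prefix as its only argument runs the configure, edit and make steps in order, stopping at the first one that fails.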
Building Phred is also kind of interesting, because of some silly
things Boost.python needs to know about. Also, getting all the 64 bit
flags right was tricky.
Since I built Python with threads, it was also necessary to build
Phred with the threaded MPI libraries. GCC on AIX can be told to link
in the MPI libraries with the -mpe switch, which has much the same
result as using an mpCC wrapper script. The problem is that GCC
assumes you'll be using the non-threaded versions of the libraries. To
correct this, it is necessary to create a new spec file.
- Run g++ -dumpspecs > gcc_specs. This writes the current
spec file to disk.
- Edit gcc_specs. Search for mpe in the file. There should
be -lmpi and -lvtd nearby. Change these to
-lmpi_r and -lvtd_r.
- Tell gcc to use the new spec file: export CXXFLAGS="$CXXFLAGS -specs=/path_to/gcc_specs"
and export LDFLAGS="$LDFLAGS -specs=/path_to/gcc_specs"
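The hand edit of the spec file can probably be scripted with sed. The sketch below is an assumption about what the file looks like (something along the lines of %{mpe:... -lmpi -lvtd}); the function name use_threaded_mpi_libs is made up, and it rewrites every -lmpi and -lvtd in the file rather than only the ones next to mpe, so check the result by eye.

```shell
# Rewrite -lmpi/-lvtd to their _r (threaded) variants in a GCC spec file.
# Hypothetical helper: reads the dumped specs from $1, writes the result to $2.
use_threaded_mpi_libs() {
    in=$1
    out=$2
    # Only touch the bare library names; -lmpi_r and -lvtd_r are left alone,
    # so running this twice does no harm.
    sed -e 's/-lmpi\([^_[:alnum:]]\)/-lmpi_r\1/g' -e 's/-lmpi$/-lmpi_r/' \
        -e 's/-lvtd\([^_[:alnum:]]\)/-lvtd_r\1/g' -e 's/-lvtd$/-lvtd_r/' \
        "$in" > "$out"
}

# Typical use, following the steps above:
#   g++ -dumpspecs > gcc_specs.orig
#   use_threaded_mpi_libs gcc_specs.orig gcc_specs
```

Writing to a second file keeps the original dump around for a diff, which is a cheap way to confirm that only the two library names changed.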
The next problem is that on AIX, Python has to tell the linker
about a set of symbols it exports. This list of symbols is stored in a
file somewhere like lib/python2.4/config/python.exp. Python's
distutils package can be used to get the necessary flags to pass the
compiler, except the value it returns doesn't contain the full path to
the file. As a stopgap, I created a directory called Modules
in the directory configure was being run from, and softlinked the
python.exp file into it. The same has to be done inside the src
directory that configure creates: make a Modules directory there and
softlink python.exp into it.
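As a sketch, the softlink workaround looks something like this. The function name link_python_exp is made up, and the location of python.exp depends on where Python was installed, so the path in the usage comment is only a placeholder.

```shell
# Create DIR/Modules/python.exp as a softlink for each DIR given.
# Hypothetical helper. $1 is the path to python.exp and should be absolute,
# otherwise the links will dangle; the remaining arguments are directories.
link_python_exp() {
    python_exp=$1
    shift
    for dir in "$@"; do
        mkdir -p "$dir/Modules" &&
            ln -sf "$python_exp" "$dir/Modules/python.exp" || return 1
    done
}

# From the directory configure is run in (src is created by configure):
#   link_python_exp /path_to/lib/python2.4/config/python.exp . src
```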
- GCC doesn't appear to use threading by default, but it's required
for everything to work correctly. Set export CXXFLAGS="$CXXFLAGS -pthread"
- Create a directory outside the source tree to build the source
in. This allows multiple binary trees (with different compiler options,
for instance) with a single source tree. I generally use two binary
directories, debug-phred and opt-phred.
- Run configure: ../phred/configure --with-cxx="g++" --with-boost=/home/mch/opt/boost64 --enable-64bit
- make
- Run a test to see if everything works: src/phred -t M
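Put together, the Phred build steps above look roughly like the following. Both function names are made up for this sketch, and the directory arguments are placeholders; the flags themselves are the ones listed above.

```shell
# Add the custom spec file and -pthread to whatever flags are already set
# (for example -maix64 from the Python build). Hypothetical helper.
setup_phred_env() {
    specs=$1
    CXXFLAGS="${CXXFLAGS:-} -specs=$specs -pthread"
    LDFLAGS="${LDFLAGS:-} -specs=$specs"
    export CXXFLAGS LDFLAGS
}

# Configure and build in a separate binary directory, then run the test.
# Runs in a subshell so the caller's working directory is untouched.
# $1: absolute path to the Phred source tree, $2: build directory,
# $3: Boost install prefix. Hypothetical wrapper, not called here.
build_phred() (
    src_dir=$1
    build_dir=$2
    boost_prefix=$3
    mkdir -p "$build_dir" && cd "$build_dir" || exit 1
    "$src_dir/configure" --with-cxx="g++" --with-boost="$boost_prefix" \
        --enable-64bit &&
        make &&
        src/phred -t M
)
```

With this, a debug tree and an optimized tree are just two calls to build_phred with different build directories and different CXXFLAGS.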
Phred seems to perform pretty well on the 550 and 595s. I haven't
tweaked any settings yet, but on a single 1.9 GHz POWER5 processor it
manages about 16 million nodes per second (MNPS) with a binary built
with GCC. Compare that to a single 3.0 GHz Intel Pentium 4, which
manages about 12 MNPS with a binary built with Intel's compiler
(~12% faster than GCC-built binaries on x86 machines). Tweaking the
compiler options (assume no aliasing) boosted performance on the
Pentiums to about 19 MNPS, so I expect a similar increase or better on
the POWER5, especially if I can get the VisualAge compiler and GCC to
play nice together.
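For a rough sense of what the clock speeds hide, the untuned figures above can be normalized per GHz. The helper name mnps_per_ghz is made up for this sketch, and the two numbers come from different compilers, so this is not an apples-to-apples comparison.

```shell
# Print throughput per GHz to one decimal place (hypothetical helper).
# $1: million nodes per second, $2: clock speed in GHz.
mnps_per_ghz() {
    awk -v mnps="$1" -v ghz="$2" 'BEGIN { printf "%.1f\n", mnps / ghz }'
}

mnps_per_ghz 16 1.9   # POWER5, GCC build
mnps_per_ghz 12 3.0   # Pentium 4, Intel compiler build
```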