Wednesday 19 August 2009

Compiling bleeding edge SciPy on Mac OS X

I do most of my number crunching with SciPy these days, having basically kicked the MATLAB habit apart from occasional use of legacy libraries. SciPy is a joy to work with but a huge pain to build from source, thanks to nasty dependencies (mostly Fortran things) and some system-specific hardware acceleration trickiness. Thankfully most users can download one of many prebuilt packages, perhaps the best being Enthought's. If you've ever wanted to see what SciPy is all about, this is the easiest way to do so.

That said, one of the great things about using SciPy instead of MATLAB is that it's Python. Except that all the prebuilt binaries (to my knowledge anyway) use Python 2.5 at newest. Again, probably not a problem for most, but I use the nice socket library (amongst other things) that's been improved significantly in Python 2.6. So for a while I had my SciPy Python and my everything-else Python, and every so often I would make another attempt at building SciPy for py2.6 on my Mac to integrate the two, and every time it would defeat me.

Until yesterday.

So I'm going to attempt to fully document the process, as I've now done it on 2 similar machines (home & lab), and now that I've figured out the tricky bits it seems fairly easy to reproduce. These instructions were followed on 1-2 year old Intel-based Macs running 10.5.8. (Note that these instructions don't touch on installing the other pieces of a standard SciPy setup, IPython and pylab/matplotlib, as I've never had much trouble getting these to build. I believe the easy_install process works for both, mostly.)

  1. (optional) If you don't want to build universal Python modules, remove "-arch ppc -arch i386" from the BASECFLAGS and the LDFLAGS in the Python library Makefile, which should live somewhere around here: /Library/Frameworks/Python.framework/Versions/Current/lib/python2.6/config/Makefile
  2. If you don't already have Xcode 3.1.3 and the associated developer tools, you need to get them for Apple's custom build of gcc 4.2 (it's not the version that comes with most boxed copies of 10.5). Download and install a fresh copy of the Apple Developer Tools. You can get SciPy to compile with other variants of gcc 4.2 or greater (from MacPorts, for instance), but they don't support the Apple-specific options, which are very helpful in other situations.
  3. Download and install gfortran as a patch to the Apple gcc from AT&T Research. Why Apple doesn't leave gfortran in gcc I don't know, but they don't, and we need it. It's critical that you use this Fortran compiler, as other variants of gfortran or g77 seem to cause errors.
  4. Download and install UMFPACK and AMD from SuiteSparse. The easiest way I've gotten through this is to download the entire SuiteSparse and then do the following:
    1. Modify the package-wide config makefile found at SuiteSparse/UFconfig/UFconfig.mk by uncommenting the Macintosh options (currently lines 299 - 303).
    2. In order to compile only the 2 packages, we also need to modify the high-level makefile (SuiteSparse/Makefile) by commenting out the references to the other packages under the default call (currently lines 10, 12-17, 19-24).
    3. Run make while in the SuiteSparse dir.
    4. Because it would be too easy if SuiteSparse actually had an install routine, we have to install the just-compiled libs ourselves. This is how I did it, though you can stick all these bits wherever you like as long as the Python compiler and linker will see them:
      $sudo install UMFPACK/Include/* /usr/local/include/
      $sudo install AMD/Include/* /usr/local/include/
      $sudo install UMFPACK/Lib/* /usr/local/lib/
      $sudo install AMD/Lib/* /usr/local/lib/
      $sudo install UFconfig/UFconfig.h /usr/local/include/
  5. Grab a bleeding edge copy of SciPy and NumPy via their subversion repositories:
    $svn co http://svn.scipy.org/svn/numpy/trunk numpy-from-svn
    $svn co http://svn.scipy.org/svn/scipy/trunk scipy-from-svn
  6. Build and install NumPy:
    $cd numpy-from-svn
    $sudo python setup.py build --fcompiler=gnu95 install
  7. Test NumPy to make sure it's not broken (note that the tests need to be run from outside the build directory, hence the cd ..):
    $cd ..
    $python -c "import numpy;numpy.test()"

    Make sure NumPy doesn't fail any of the tests (known fails and skips are okay) or the next bit may not work.
  8. Similar to step 6, build and install SciPy:
    $cd scipy-from-svn
    $sudo python setup.py config_fc --fcompiler=gfortran install
  9. Similar to step 7, move out of the build directory and run the built-in tests:
    $cd ..
    $python -c "import scipy;scipy.test()"

    You're going to get some fails and maybe some errors. You're going to have to use your own judgement as to whether these errors and fails are substantial. Most of the troubles I've encountered are trivial, things like a type being dtype('int32') instead of 'int32', which are actually the same; the test just needs to be updated to reflect newer NumPy.
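
The optional edit in step 1 can also be scripted rather than done by hand. What follows is a minimal Python sketch, not part of the procedure I actually ran: the function name strip_universal_flags and the sample Makefile lines are made up for illustration, and it assumes BASECFLAGS and LDFLAGS are each assigned with a plain "=" on a single line. It only transforms text, so reading and writing the real Makefile (after backing it up) is left to you.

```python
import re

# The two flags step 1 removes; "-arch ppc64" and friends are left alone.
ARCH_FLAGS = re.compile(r"\s*-arch\s+(?:ppc|i386)\b")

# Only these two assignments are touched, as in step 1.
TARGET_VARS = ("BASECFLAGS", "LDFLAGS")


def strip_universal_flags(makefile_text):
    """Return makefile_text with '-arch ppc' and '-arch i386' removed
    from the BASECFLAGS and LDFLAGS assignments only."""
    out = []
    for line in makefile_text.splitlines(True):  # True keeps line endings
        name = line.split("=", 1)[0].strip()
        if "=" in line and name in TARGET_VARS:
            line = ARCH_FLAGS.sub("", line)
        out.append(line)
    return "".join(out)


# Hypothetical excerpt; the real Makefile has many more lines and flags.
sample = (
    "BASECFLAGS=\t-arch ppc -arch i386 -fno-strict-aliasing\n"
    "LDFLAGS=\t-arch ppc -arch i386 -g\n"
    "OPT=\t\t-DNDEBUG -g -O3\n"
)
print(strip_universal_flags(sample))
```

Working on a string instead of editing the file in place makes it easy to eyeball the result before overwriting anything under /Library/Frameworks.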

And now you have a nice SciPy build for whatever flavor of Python you're working with on your Mac. Note that I have no idea how well this will work in anything other than Python 2.6 on Mac OS X 10.5.8, though it will probably mostly work with other variants. Also, for completeness, I most recently compiled these versions: NumPy-r7303, SciPy-r5893. At some point I'm going to give it a go with Python 3.x, but that will be a whole new kind of pain, I suspect. Anyway, if anyone uses these instructions and they don't quite work, or you don't understand part of them, please let me know and I'll try to clarify or help as best I can. I'd really love to build a definitive set of instructions for building SciPy on a Mac, but I can only verify these instructions on my machines.
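
If you do write in about a problem, it helps to know exactly what you built against. Here is a small Python sketch that gathers the basics in one go. The helper name build_report is hypothetical, and it relies only on the standard library plus the __version__ attribute that packages conventionally expose, so treat it as a starting point rather than a complete diagnostic.

```python
import platform
import sys


def build_report(modules=("numpy", "scipy")):
    """Collect interpreter, OS, and package versions as a list of lines."""
    lines = [
        "python: " + sys.version.split()[0],
        "platform: " + platform.platform(),
    ]
    for name in modules:
        try:
            # __import__ is used so the sketch also runs on older Pythons.
            mod = __import__(name)
            version = getattr(mod, "__version__", "unknown")
        except ImportError:
            # Covers both a missing package and one that fails to load.
            version = "not importable"
        lines.append("%s: %s" % (name, version))
    return lines


if __name__ == "__main__":
    print("\n".join(build_report()))
```

Pasting the output alongside the failing test names should be enough for me to tell whether we are even looking at the same build.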