St. Olaf Beowulf Blog

Wednesday, June 28, 2006

Booting Supplemental CD from a PXE Server

Useful Sun Link:

http://docs.sun.com/source/819-3721-11/AppD.html

Tuesday, June 27, 2006

New Sun x2100s arrived!

Wondering why I haven't updated the blog all day yet today?  Our new Sun x2100 servers arrived!  All 16 of them.

Unfortunately, the first one I opened was DOA.  Sad.  Right now, I'm network installing Fedora Core 5 x86_64 onto one.  More tomorrow!

More MPI v1.1 Issues

I got MPI v1.1 to install correctly now, but there appear to be a few bugs in it.  Several applications that depend on MPI now fail.  The problem is that values are ever so slightly off.  Hopefully this will be fixed in a forthcoming update.

Monday, June 26, 2006

BLACS Compile Issues

When compiling BLACS, it gave me some errors about mpif-common.h and mpif-config.h not being openable.

To solve this, go to the SRC/MPI/INTERNAL directory, and make a symbolic link to the two files (they should be in /usr/local/include)
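A minimal sketch of that fix, assuming the headers really are in /usr/local/include and the BLACS tree sits under /usr/local/BLACS (adjust both to your layout).  link_mpif_headers is a hypothetical helper name, not something BLACS or Open MPI ships:

```shell
#!/bin/sh
# Sketch: symlink the two MPI Fortran headers into the BLACS source tree.
# link_mpif_headers is a hypothetical helper; both paths are assumptions.
link_mpif_headers() {
    inc_dir=$1     # where MPI installed its headers, e.g. /usr/local/include
    blacs_dir=$2   # top of the BLACS tree, e.g. /usr/local/BLACS
    for header in mpif-common.h mpif-config.h; do
        if [ ! -f "$inc_dir/$header" ]; then
            echo "missing: $inc_dir/$header" >&2
            return 1
        fi
        # -s makes a symbolic link; -f replaces a stale one from an earlier try
        ln -sf "$inc_dir/$header" "$blacs_dir/SRC/MPI/INTERNAL/$header"
    done
}

# Example: link_mpif_headers /usr/local/include /usr/local/BLACS
```

The helper refuses to create a dangling link, so if it prints "missing", the headers are somewhere other than where you pointed it.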

BLACS working config file

This file will make BLACS install correctly:
#=============================================================================
#====================== SECTION 1: PATHS AND LIBRARIES =======================
#=============================================================================
#  The following macros specify the name and location of libraries required by
#  the BLACS and its tester.
#=============================================================================
#  --------------------------------------
#  Make sure we've got a consistent shell
#  --------------------------------------
   SHELL = /bin/sh

#  -----------------------------
#  The top level BLACS directory
#  -----------------------------
   BTOPdir = /usr/local/BLACS

#  ---------------------------------------------------------------------------
#  The communication library your BLACS have been written for.
#  Known choices (and the machines they run on) are:
#
#     COMMLIB   MACHINE
#     .......   ..............................................................
#     CMMD      Thinking Machine's CM-5
#     MPI       Wide variety of systems
#     MPL       IBM's SP series (SP1 and SP2)
#     NX        Intel's supercomputer series (iPSC2, iPSC/860, DELTA, PARAGON)
#     PVM       Most unix machines; See PVM User's Guide for details
#  ---------------------------------------------------------------------------
   COMMLIB = MPI

#  -------------------------------------------------------------
#  The platform identifier to suffix to the end of library names
#  -------------------------------------------------------------
   PLAT = LINUX

#  ----------------------------------------------------------
#  Name and location of the BLACS library.  See section 2 for
#  details on BLACS debug level (BLACSDBGLVL).
#  ----------------------------------------------------------
   BLACSdir    = $(BTOPdir)/LIB
   BLACSDBGLVL = 0
   BLACSFINIT  = $(BLACSdir)/blacsF77init_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a
   BLACSCINIT  = $(BLACSdir)/blacsCinit_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a
   BLACSLIB    = $(BLACSdir)/blacs_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL).a

#  -------------------------------------
#  Name and location of the MPI library.
#  -------------------------------------
   MPIdir = /usr/local
   MPIdev =
   MPIplat = LINUX
   MPILIBdir =
   MPIINCdir = $(MPIdir)/include
   MPILIB =

#  -------------------------------------
#  All libraries required by the tester.
#  -------------------------------------
   BTLIBS = $(BLACSFINIT) $(BLACSLIB) $(BLACSFINIT) $(MPILIB)

#  ----------------------------------------------------------------
#  The directory to put the installation help routines' executables
#  ----------------------------------------------------------------
   INSTdir = $(BTOPdir)/INSTALL/EXE

#  ------------------------------------------------
#  The name and location of the tester's executable
#  ------------------------------------------------
   TESTdir = $(BTOPdir)/TESTING/EXE
   FTESTexe = $(TESTdir)/xFbtest_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL)
   CTESTexe = $(TESTdir)/xCbtest_$(COMMLIB)-$(PLAT)-$(BLACSDBGLVL)
#=============================================================================
#=============================== End SECTION 1 ===============================
#=============================================================================

#=============================================================================
#========================= SECTION 2: BLACS INTERNALS ========================
#=============================================================================
#  The following macro definitions set preprocessor values for the BLACS.
#  The file Bconfig.h sets these values if they are not set by the makefile.
#  User's compiling only the tester can skip this entire section.
#  NOTE: The MPI defaults have been set for MPICH.
#=============================================================================

#  -----------------------------------------------------------------------
#  The directory to find the required communication library include files,
#  if they are required by your system.
#  -----------------------------------------------------------------------
   SYSINC =

#  ---------------------------------------------------------------------------
#  The Fortran 77 to C interface to be used.  If you are unsure of the correct
#  setting for your platform, compile and run BLACS/INSTALL/xintface.
#  Choices are: Add_, NoChange, UpCase, or f77IsF2C.
#  ---------------------------------------------------------------------------
   INTFACE = -DAdd_

#  ------------------------------------------------------------------------
#  Allows the user to vary the topologies that the BLACS default topologies
#  (TOP = ' ') correspond to.  If you wish to use a particular topology
#  (as opposed to letting the BLACS make the choice), uncomment the
#  following macros, and replace the character in single quotes with the
#  topology of your choice.
#  ------------------------------------------------------------------------
#  DEFBSTOP   = -DDefBSTop="'1'"
#  DEFCOMBTOP = -DDefCombTop="'1'"

#  -------------------------------------------------------------------
#  If your MPI_Send is locally-blocking, substitute the following line
#  for the empty macro definition below.
#  SENDIS = -DSndIsLocBlk
#  -------------------------------------------------------------------
   SENDIS =

#  --------------------------------------------------------------------
#  If your MPI handles packing of non-contiguous messages by copying to
#  another buffer or sending extra bytes, better performance may be
#  obtained by replacing the empty macro definition below with the
#  macro definition on the following line.
#  BUFF = -DNoMpiBuff
#  --------------------------------------------------------------------
   BUFF =

#  -----------------------------------------------------------------------
#  If you know something about your system, you may make it easier for the
#  BLACS to translate between C and fortran communicators.  If the empty
#  macro defininition is left alone, this translation will cause the C
#  BLACS to globally block for MPI_COMM_WORLD on calls to BLACS_GRIDINIT
#  and BLACS_GRIDMAP.  If you choose one of the options for translating
#  the context, neither the C or fortran calls will globally block.
#  If you are using MPICH, or a derivitive system, you can replace the
#  empty macro definition below with the following (note that if you let
#  MPICH do the translation between C and fortran, you must also indicate
#  here if your system has pointers that are longer than integers.  If so,
#  define -DPOINTER_64_BITS=1.)  For help on setting TRANSCOMM, you can
#  run BLACS/INSTALL/xtc_CsameF77 and BLACS/INSTALL/xtc_UseMpich as
#  explained in BLACS/INSTALL/README.
   TRANSCOMM =
#
#  If you know that your MPI uses the same handles for fortran and C
#  communicators, you can replace the empty macro definition below with
#  the macro definition on the following line.
#  TRANSCOMM = -DCSameF77
#  -----------------------------------------------------------------------
#  TRANSCOMM =

#  --------------------------------------------------------------------------
#  You may choose to have the BLACS internally call either the C or Fortran77
#  interface to MPI by varying the following macro.  If TRANSCOMM is left
#  empty, the C interface BLACS_GRIDMAP/BLACS_GRIDINIT will globally-block if
#  you choose to use the fortran internals, and the fortran interface will
#  block if you choose to use the C internals.  It is recommended that the
#  user leave this macro definition blank, unless there is a strong reason
#  to prefer one MPI interface over the other.
#  WHATMPI = -DUseF77Mpi
#  WHATMPI = -DUseCMpi
#  --------------------------------------------------------------------------
   WHATMPI =

#  ---------------------------------------------------------------------------
#  Some early versions of MPICH and its derivatives cannot handle user defined
#  zero byte data types.  If your system has this problem (compile and run
#  BLACS/INSTALL/xsyserrors to check if unsure), replace the empty macro
#  definition below with the macro definition on the following line.
#  SYSERRORS = -DZeroByteTypeBug
#  ---------------------------------------------------------------------------
   SYSERRORS =

#  ------------------------------------------------------------------
#  These macros set the debug level for the BLACS.  The fastest
#  code is produced by BlacsDebugLvl 0.  Higher levels provide
#  more debug information at the cost of performance.  Present levels
#  of debug are:
#  0 : No debug information
#  1 : Mainly parameter checking.
#  ------------------------------------------------------------------
   DEBUGLVL = -DBlacsDebugLvl=$(BLACSDBGLVL)

#  -------------------------------------------------------------------------
#  All BLACS definitions needed for compile (DEFS1 contains definitions used
#  by all BLACS versions).
#  -------------------------------------------------------------------------
   DEFS1 = -DSYSINC $(SYSINC) $(INTFACE) $(DEFBSTOP) $(DEFCOMBTOP) $(DEBUGLVL)
   BLACSDEFS = $(DEFS1) $(SENDIS) $(BUFF) $(TRANSCOMM) $(WHATMPI) $(SYSERRORS)
#=============================================================================
#=============================== End SECTION 2 ===============================
#=============================================================================

#=============================================================================
#=========================== SECTION 3: COMPILERS ============================
#=============================================================================
#  The following macros specify compilers, linker/loaders, the archiver,
#  and their options.  Some of the fortran files need to be compiled with no
#  optimization.  This is the F77NO_OPTFLAG.  The usage of the remaining
#  macros should be obvious from the names.
#=============================================================================
   F77            = gfortran
   F77NO_OPTFLAGS =
   F77FLAGS       = $(F77NO_OPTFLAGS) -O
   F77LOADER      = $(F77)
   F77LOADFLAGS   =
   CC             = gcc
   CCFLAGS        = -O4
   CCLOADER       = $(CC)
   CCLOADFLAGS    =

#  --------------------------------------------------------------------------
#  The archiver and the flag(s) to use when building an archive (library).
#  Also the ranlib routine.  If your system has no ranlib, set RANLIB = echo.
#  --------------------------------------------------------------------------
   ARCH      = ar
   ARCHFLAGS = r
   RANLIB    = ranlib

#=============================================================================
#=============================== End SECTION 3 ===============================
#=============================================================================


Note that you may need to create /usr/local/BLACS and /usr/local/BLACS/LIB first, along with the other directories the config file names (INSTALL/EXE and TESTING/EXE).
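Something like the following would create them in one go.  The three subdirectories are taken from the BLACSdir, INSTdir, and TESTdir macros in the config file above; make_blacs_dirs is just a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: create the output directories the BLACS config file expects.
# make_blacs_dirs is a hypothetical helper; pass it the value of BTOPdir.
make_blacs_dirs() {
    btop=$1
    # LIB         <- BLACSdir (the .a files)
    # INSTALL/EXE <- INSTdir  (installation help routines)
    # TESTING/EXE <- TESTdir  (tester executables)
    mkdir -p "$btop/LIB" "$btop/INSTALL/EXE" "$btop/TESTING/EXE"
}

# Example (as root): make_blacs_dirs /usr/local/BLACS
```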

MPI v1.1 Update

As I mentioned earlier (I think...it is Monday after all), MPI v1.1 came out last Friday and I promptly installed it.  Unfortunately, I'm getting a few errors when I run mpicc on the head node, and runtime errors when running applications on the nodes.  I get the impression that one must retain the MPI installation directory for the 'make uninstall' command, so that's what I'm trying right now.

The head node error, to be more specific, mentions a parsing error in the 'keyvalue parser', and it references some file.  The thing is, that file is identical to the file on the child nodes, and yet the child nodes aren't spewing an error.

Gotta love troubleshooting.

Friday, June 23, 2006

More distcc

So, add the path stuff to each user's .bash_profile

and, add this line
DISTCC_DIR=~/.distcc
to /etc/profile and be sure to export the new variable as well

Also, to make things more automagical, add a hosts file to ~/.distcc/hosts
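For the hosts file, a rough sketch: distcc reads a whitespace-separated list of machines from the hosts file in DISTCC_DIR.  The node names below are only examples borrowed from this cluster, and write_distcc_hosts is a hypothetical helper:

```shell
#!/bin/sh
# Sketch: write a distcc hosts file.
# write_distcc_hosts is a hypothetical helper; the host names are examples.
write_distcc_hosts() {
    distcc_dir=$1   # normally the value of DISTCC_DIR, e.g. ~/.distcc
    shift           # everything left is a host name
    mkdir -p "$distcc_dir"
    # "$*" joins the remaining arguments with single spaces
    printf '%s\n' "$*" > "$distcc_dir/hosts"
}

# Example:
#   export DISTCC_DIR=~/.distcc
#   write_distcc_hosts "$DISTCC_DIR" localhost wolf001 wolf002
```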

distcc path

Correction -- do NOT put PATH=/usr/lib/distcc/bin:$PATH into /etc/profile, as it will cause problems with distcc (because of problems relating to the masquerading stuff).  Instead, manually put it into root's and wolf's ~/.bash_profile

Open MPI v1.1

Open MPI v1.1 was just released, so I'm going to go ahead and compile and install that.

I ran into issues with distcc last night that I hope to fix this morning :: cross fingers ::

Editing path for distcc

After setting up distcc in 'masquerade' mode, you'll want to edit the system-wide path settings.  The easiest way to accomplish this is to edit /etc/profile and add this line before PATH gets exported:
PATH=/usr/lib/distcc/bin:$PATH

That'll ensure that the symbolic links created during the masquerade process get found before the 'official' versions of the gcc suite.

Installing distcc

distcc comes with very incomplete install directions.  Also, embedded in the contrib/redhat directory is an init script for distccd.  Here's my modified copy:
#!/bin/sh
#
# Init file for Distccd - A distributed compilation front-end.
# WARNING: Don't enable on untrusted networks
#
# Written by Dag Wieers .
#
# chkconfig: - 80 20
# description: Distccd - distributed compilation front-end (daemon) \
#               WARNING: Don't enable on untrusted networks
#
# processname: distccd
#
# config: /etc/sysconfig/distccd


source /etc/init.d/functions
source /etc/sysconfig/network

### Check that networking is up.
[ "${NETWORKING}" == "no" ] && exit 0

[ -x "/usr/local/bin/distccd" ] || exit 1

### Default variables
SYSCONFIG="/etc/sysconfig/distccd"
OPTIONS="--allow=0.0.0.0/0"
USER="nobody"
DISTCCPATH="$PATH"

### Read configuration
[ -r "$SYSCONFIG" ] && source "$SYSCONFIG"

RETVAL=0
prog="distccd"
desc="Distributed Compiler daemon"

start() {
        echo -n $"Starting $desc ($prog): "
        PATH="$DISTCCPATH" daemon --user "$USER" $prog --daemon --log-file="/var/log/distccd.log" $OPTIONS
        RETVAL=$?
        echo
        [ $RETVAL -eq 0 ] && touch /var/lock/subsys/$prog
        return $RETVAL
}

stop() {
        echo -n $"Shutting down $desc ($prog): "
        killproc $prog
        RETVAL=$?
        echo
        [ $RETVAL -eq 0 ] && rm -f /var/lock/subsys/$prog
        return $RETVAL
}

restart() {
        stop
        start
}

case "$1" in
  start)
        start
        ;;
  stop)
        stop
        ;;
  restart|reload)
        restart
        ;;
  condrestart)
        [ -e /var/lock/subsys/$prog ] && restart
        RETVAL=$?
        ;;
  status)
        status $prog
        RETVAL=$?
        ;;
  *)
        echo $"Usage: $0 {start|stop|restart|condrestart|status}"
        RETVAL=1
        ;;
esac

exit $RETVAL

The differences between this and the default script include the --allow=0.0.0.0/0 part and the location of the binary (/usr/bin/distccd to /usr/local/bin/distccd)

Thursday, June 22, 2006

Updating autoinstall script

Every time the image is updated, you must either run:

si_mkautoinstallscript --image firstfc5 --force --ip-assignment DHCP --post-install reboot

or tell the script that you do NOT want to remake the autoinstall script (which happens with the si_getimage command)

SystemImager things to start

SystemImager's documentation does not tell you to turn on the startup scripts on the imaging server (or what they do).

So do this on the head server:
chkconfig systemimager-server-monitord on
chkconfig systemimager-server-netbootmond on
chkconfig systemimager-server-rsyncd on

rsyncd makes the rsync daemon work.
netbootmond, if you have NET_BOOT_DEFAULT = local in /etc/systemimager/systemimager.conf, will make an always-netboot client boot from the hard drive on its next boot.
monitord keeps logs of everything.

Restart after doing this

Shared Libraries

It's important for each node to know where shared libraries exist.  /lib and /usr/lib are trusted lib directories, but anything else needs to be manually added.  One 'safe' approach is to add a file whatever.conf to /etc/ld.so.conf.d/ -- whatever.conf would contain the directory of additional shared libraries.
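Roughly, that looks like the sketch below.  add_lib_dir is a hypothetical helper, and /usr/local/lib is only an assumed example of a directory you might need to add; ldconfig is the standard tool that rebuilds the linker cache afterwards:

```shell
#!/bin/sh
# Sketch: register an extra shared-library directory with the dynamic linker.
# add_lib_dir is a hypothetical helper; the example directory is an assumption.
add_lib_dir() {
    conf_dir=$1   # normally /etc/ld.so.conf.d
    name=$2       # file name without the .conf suffix
    lib_dir=$3    # directory that holds the .so files
    printf '%s\n' "$lib_dir" > "$conf_dir/$name.conf"
}

# Example (as root), followed by a cache rebuild so the change takes effect:
#   add_lib_dir /etc/ld.so.conf.d local /usr/local/lib && ldconfig
```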

You can also use something like LD_LIBRARY_PATH, but I've heard it's insecure (for whatever reason).

Wednesday, June 21, 2006

Passwordless SSH

I keep forgetting to mention this, but it is really important to enable passwordless ssh logins on each of the node machines with the wolf login.  Instructions can be found here.
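In case that link goes stale, here is a rough sketch of the idea, not a copy of those instructions.  The step that matters is getting the wolf user's public key into ~/.ssh/authorized_keys on each node with tight permissions.  authorize_key is a hypothetical helper that does that step for a local directory; over the network, ssh-copy-id does the same job:

```shell
#!/bin/sh
# Sketch: add a public key to an authorized_keys file, once, with safe modes.
# authorize_key is a hypothetical helper; the paths in the example are assumptions.
authorize_key() {
    pubkey_file=$1   # e.g. ~/.ssh/id_rsa.pub
    ssh_dir=$2       # e.g. ~/.ssh of the account you want to log in to
    key=$(cat "$pubkey_file") || return 1
    mkdir -p "$ssh_dir" && chmod 700 "$ssh_dir"
    touch "$ssh_dir/authorized_keys"
    # Append only if this exact key line is not already present
    if ! grep -qxF -e "$key" "$ssh_dir/authorized_keys"; then
        printf '%s\n' "$key" >> "$ssh_dir/authorized_keys"
    fi
    # sshd ignores the file if it is writable by anyone else
    chmod 600 "$ssh_dir/authorized_keys"
}

# Typical use as the wolf user on the head node:
#   ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa   # key pair with an empty passphrase
#   ssh-copy-id wolf@wolf001                   # repeat for each node
```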

Change default runlevel; imaging and labels

To change the default runlevel in linux, open up the /etc/inittab file, and look for this line:
id:5:initdefault:

Change the 5 to a 3 to boot into an all-text interface, which is useful for nodes in a cluster.
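If you would rather script the change than open an editor on every node, a sketch along these lines should do it.  set_default_runlevel is a hypothetical helper; it keeps a .bak copy of the file it edits:

```shell
#!/bin/sh
# Sketch: rewrite the initdefault line in an inittab file.
# set_default_runlevel is a hypothetical helper name.
set_default_runlevel() {
    inittab=$1   # normally /etc/inittab
    level=$2     # 3 = multi-user text mode, 5 = graphical login
    case $level in
        [0-6]) ;;
        *) echo "runlevel must be a single digit 0-6" >&2; return 1 ;;
    esac
    # -i.bak edits in place and leaves the original as <file>.bak
    sed -i.bak "s/^id:[0-9]:initdefault:/id:$level:initdefault:/" "$inittab"
}

# Example (as root): set_default_runlevel /etc/inittab 3
```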

Also, when doing imaging, make sure the /etc/fstab file references things by device and NOT by label; SystemImager does not support labels.  This caused the swap file on imaged machines to not work (yet strangely, /boot and / still worked even though they were using labels.  Oh well).

Looks like I got all of the imaging kinks worked out, so now I'm going to try and finalize the image.

Oh yeah, you also have to *manually* edit the hosts file on the golden node.  Strangely, this is not done when creating a golden image.  It's not too hard though, just edit the /etc/hosts file to look similar to that of wolfhead.

First Machine being Imaged

Success!  I finally got a machine to image!  One problem I ran into was that these seemingly identical machines were not in fact identical: the hard drive would sometimes show up as hda, and other times as hdc.  Very annoying.  However, this was fixed by digging inside the machines and rewiring them.

One thing to note: the final instructions mention downloading something called PXE, compiling it, and using that.  I found that to not work at all.  So, I just enabled the tftp server, put my pxelinux.0 file into the /tftpboot directory, made the /tftpboot/pxelinux.cfg config directory (instructions can be found on the web; I just used 'default'), and copied the standard i386 kernel there (to be found on SystemImager's SourceForge site).
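For reference, PXELINUX looks for its configuration in a pxelinux.cfg directory next to pxelinux.0, and falls back to a file named default.  Here is a bare-bones sketch of generating one.  write_pxe_default, the "imaging" label, and the kernel file name in the example are all placeholders of mine; the real boot arguments depend on the SystemImager kernel you downloaded, so they are left as an optional parameter:

```shell
#!/bin/sh
# Sketch: write a minimal pxelinux.cfg/default under a tftp root.
# write_pxe_default and the "imaging" label are hypothetical names.
write_pxe_default() {
    tftp_root=$1    # e.g. /tftpboot
    kernel=$2       # kernel file name, relative to tftp_root
    append=${3:-}   # optional kernel arguments (left empty here on purpose)
    mkdir -p "$tftp_root/pxelinux.cfg"
    {
        echo "DEFAULT imaging"
        echo "LABEL imaging"
        echo "    KERNEL $kernel"
        if [ -n "$append" ]; then
            echo "    APPEND $append"
        fi
    } > "$tftp_root/pxelinux.cfg/default"
}

# Example (as root): write_pxe_default /tftpboot kernel
```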

Tuesday, June 20, 2006

More Imaging

Even this new imaging software has issues.  Man, documentation in the open source world is just horrid.  But anyway, here's how I got it to work:
Use the RPMs to install the server software; follow these directions.  You may need to manually install the server RPM after installing the XML::Simple stuff by doing rpm -Uhv --nodeps nameofserverstuff -- don't ask me why this is needed, it didn't make much sense to me either.

Now install the client software on the 'golden machine.'  Before initiating the golden machine, you must install the platform-specific initrd scripts (found here) on it.  They don't tell you this anywhere.

This is where I stand.

SLIM Issues

So, SLIM had lots of issues.  It assumed lots of things that it shouldn't, and was broken in many areas.  That's what you get when you use a piece of software that's not being actively worked on . . .

. . . so now I'm moving on to SystemImager, which I probably should have done to begin with.  It offers everything that SLIM does, and perhaps more, at the expense of being a bit more complicated.

Monday, June 19, 2006

SLIM

Just found an awesome imaging system that uses PXE.

SLIM

If this works, it'll make imaging a whole heck of a lot easier for us.

Network Booting

Taking a break from the ScaLAPACK stuff, as it is currently driving me insane.  Instead, I am now working on the PXE booting stuff.  I just successfully booted a Linux kernel off of the network.  Hooray!  Here are some URLs that helped me along the way:

http://www.linux-sxs.org/internet_serving/pxeboot.html
http://dev.brantleyonline.com/wiki/index.php/PXE_Booting_-_Fedora_Core
http://dev.brantleyonline.com/wiki/index.php/General_Network_%28PXE%29_Booting

Now let's figure out how to bring across an image . . .

Friday, June 16, 2006

ScaLAPACK Issues

Many of the ScaLAPACK tests are failing, and I have no idea why, and the ScaLAPACK 'troubleshooting' resources are extremely lacking.  There is a binary available, so I'm going to download that to see if the tests work on that build.

ScaLAPACK

Just built ScaLAPACK and am now building the testing tools.  Compiling this was a bit hairy due to a few issues talked about here:
http://math-atlas.sourceforge.net/errata.html#LINK
Point 1: http://www.lam-mpi.org/MailArchives/lam/2005/04/10330.php

Also, getting this line right in SLMake.inc (in ScaLAPACK) is important:
BLASLIB       = /usr/local/ATLAS/lib/Linux_PIIISSE1/liblapack.a /usr/local/ATLAS/lib/Linux_PIIISSE1/libf77blas.a /usr/local/ATLAS/lib/Linux_PIIISSE1/libcblas.a /usr/local/ATLAS/lib/Linux_PIIISSE1/libatlas.a

That is also referenced in Point 1 of the post above.

Getting close to getting this up and running . . .

Setting up MPI to use the proper NIC

You have to edit /usr/local/etc/openmpi-mca-params.conf and put in:
btl_tcp_if_include = eth1
btl_tcp_if_exclude = lo, eth0

Compiled BLACS and other things this morning.  That software can be a little tricky to install . . . now I'm about to test it.

Thursday, June 15, 2006

RScaLAPACK

I am now going through the process of putting RScaLAPACK on the head node.  This is no easy, quick process: RScaLAPACK depends on R and ScaLAPACK, ScaLAPACK depends on LAPACK and some other stuff . . . and so forth.  It gets really hairy.  This image explains the setup:


In addition, I am installing ATLAS, which supposedly does some of the BLAS and LAPACK routine installation.  We'll find out.

Blogger Widget

To facilitate easy updating of this blog, Google (who owns Blogger/Blogspot) released a Mac OS X widget for exactly that.  Now, posting updates here is quite easy (in fact, I posted this using the widget).

Wednesday, June 14, 2006

DHCP stuff

This afternoon I set up a DHCP server on wolfhead.


  • Installed dhcpd from FC5 CD
  • Edited /etc/sysconfig/dhcpd to DHCPDARGS=eth1
  • Changed /etc/dhcpd.conf to:
ddns-update-style interim;
ignore client-updates;

subnet 192.168.0.0 netmask 255.255.255.0 {

# --- default gateway
option routers 192.168.0.254;
option subnet-mask 255.255.255.0;

option domain-name-servers 130.71.128.8;

option time-offset -18000; # Eastern Standard Time
# option ntp-servers 192.168.1.1;
# option netbios-name-servers 192.168.1.1;
# --- Selects point-to-point node (default is hybrid). Don't change this unless
# -- you understand Netbios very well
# option netbios-node-type 2;

range 192.168.0.128 192.168.0.252;
default-lease-time 21600;
max-lease-time 43200;

use-host-decl-names on;

host wolfhead {
hardware ethernet 00:03:47:9D:B2:76;
fixed-address 192.168.0.254;
}
host rockhead {
hardware ethernet 00:03:47:9D:B2:77;
fixed-address 192.168.0.253;
}
host wolf001 {
hardware ethernet 00:03:47:9B:93:82;
fixed-address 192.168.0.1;
}
host wolf002 {
hardware ethernet 00:03:47:9D:B2:7D;
fixed-address 192.168.0.2;
}
host wolf003 {
hardware ethernet 00:03:47:9C:99:59;
fixed-address 192.168.0.3;
}
host wolf004 {
hardware ethernet 00:03:47:9C:C2:E9;
fixed-address 192.168.0.4;
}
host wolf005 {
hardware ethernet 00:03:47:9C:AB:3B;
fixed-address 192.168.0.5;
}
}

  • Set up some of the clients to use DHCP instead of static
And perhaps some other stuff which has escaped my mind.
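Since more nodes are on the way, here is a small sketch for stamping out additional host entries in the same shape as the ones above.  dhcp_host_entry is a hypothetical helper, and the MAC address in the example is a made-up placeholder, not a real node:

```shell
#!/bin/sh
# Sketch: print one dhcpd.conf host stanza in the style used above.
# dhcp_host_entry is a hypothetical helper name.
dhcp_host_entry() {
    name=$1   # host name, e.g. wolf006
    mac=$2    # NIC hardware address
    ip=$3     # fixed address to hand out
    printf 'host %s {\nhardware ethernet %s;\nfixed-address %s;\n}\n' \
        "$name" "$mac" "$ip"
}

# Example (placeholder MAC); paste the output inside the subnet block:
#   dhcp_host_entry wolf006 00:00:00:00:00:00 192.168.0.6
```

Afterwards, dhcpd -t will syntax-check the config file before you restart the service.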