Wandboard Cluster
The work done here contributes to the Massive Affordable Computing Project based out of the
University of the Witwatersrand
and the
University of Cape Town
physics departments, both located in South Africa.
The purpose of this research is to explore ARM computing in the context of high energy physics such as that done at
CERN
. For more details and blogs about the project visit
http://hep.wits.ac.za/research/mac.php
.
This section describes select steps taken to configure a Wandboard Quad cluster from Freescale,
http://www.wandboard.org/
.
It is an ongoing project and so some sections are incomplete.
A good portion of the steps apply to Ubuntu 12.04 LTS, however Ubuntu 13.10 and
Archlinux ARM
were also used. Assume the steps are for 12.04 LTS unless otherwise specified.
Glusterfs
GlusterFS lets you share a filesystem between the server and the nodes.
From the server (MAC1), as root do:
-
apt-get install glusterfs-server
-
gluster volume create UCTMAC_Volume serverIPaddress:/data
This sets up the directory
/data
on the server as the brick that backs the volume.
Then start the volume:
-
gluster volume start UCTMAC_Volume
-
gluster volume info
To allow nodes access:
-
gluster volume set UCTMAC_Volume auth.allow nodeIP1,nodeIP2
Then on the nodes, as root do:
-
mkdir /mnt/glusterfs
-
mount.glusterfs ServerIPAddress:/UCTMAC_Volume /mnt/glusterfs
-
mount
A simple
df -h
should show that there is a filesystem there.
Since I am using a USB drive as my file-sharing device, I mount it on the server at /data.
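As a complement to the df -h check, here is a minimal sketch of a scripted test that a mount point really is a GlusterFS mount. It assumes the FUSE client reports the filesystem type fuse.glusterfs in /proc/mounts; the function name is_gluster_mount is my own and not part of GlusterFS.

```shell
#!/bin/sh
# Succeeds if the given mount point is a GlusterFS (FUSE) mount.
# Usage: is_gluster_mount MOUNTPOINT [MOUNTS_FILE]
# MOUNTS_FILE defaults to /proc/mounts; the second argument only exists
# so the function can be tried against a saved copy of the table.
is_gluster_mount() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    # /proc/mounts columns: device, mount point, filesystem type, options...
    awk -v mp="$mountpoint" '
        $2 == mp && $3 == "fuse.glusterfs" { found = 1 }
        END { exit (found ? 0 : 1) }
    ' "$mounts_file"
}

# Example: report on the mount point used on the nodes.
if is_gluster_mount /mnt/glusterfs; then
    echo "/mnt/glusterfs is a GlusterFS mount"
else
    echo "/mnt/glusterfs is not mounted via GlusterFS"
fi
```

Running it on each node (for example through cluster ssh) gives a quick yes/no answer without reading the df output by eye.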
Mpich
I installed the latest version of MPICH from
here
(currently 3.0.3). Follow the README files to install it; it's pretty straightforward. Remember
--enable-shared
and
--prefix=
pointing to your shared filesystem when you configure. Add the bin folder of the MPICH installation to your PATH. After following the README, do
which mpicc
. It should point to the new location.
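The PATH step can be sketched as below. MPICH_PREFIX is a placeholder for whatever you passed to --prefix (the default path here is only an assumption), and prepend_path is a helper name of my own.

```shell
#!/bin/sh
# Prepend a directory to PATH unless it is already listed.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;                 # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# MPICH_PREFIX is a placeholder for the --prefix given to configure.
MPICH_PREFIX=${MPICH_PREFIX:-/mnt/glusterfs/mpich-install}
prepend_path "$MPICH_PREFIX/bin"

# Report which mpicc the shell now resolves, if any.
command -v mpicc || echo "mpicc not found under $MPICH_PREFIX/bin"
```

Putting these lines in .bashrc on the shared filesystem should make every node pick up the same MPICH build.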
Benchmarks so far
Get the tar file from their website and untar it. In the new directory create a folder corresponding to your system, e.g. archlinux.
Then do
-
sudo make XCFLAGS=" -DMULTITHREAD=4 -DUSE_FORK=1" PORT_DIR=thedirectory/
If compiling in Archlinux using
-DUSE_PTHREAD
add
-lpthread
Play around with some of the flags. More flags can be added in the new folder's .mak file.
HPL
Follow the INSTALL file. You also need to do
sudo apt-get install gcc-multilib
. I found that if a compile failed with errors, the way out was to copy the Make.
file outside the top hpl directory, delete the whole directory, unzip a new one, paste the Make file in there and try again. You also need the ATLAS base files:
sudo apt-get install libatlas-base-dev
installs libcblas.a and libatlas.a in /usr/lib/atlas-base
For Archlinux:
There are no ATLAS base files in the pacman repository, so you have to build them yourself. Get the source at the bottom of this page, untar it and create the directory test
in the new untarred folder. cd to test
and do
-
../configure -Si archdef 0
-
make build
-
make check
If there are cpu throttling errors, change the values in /sys/devices/system/cpu/cpu#/cpufreq/scaling_governor
. If the problem persists and you are sure you have turned off cpu throttling add the flag -Si cputhrchk 0
to the configure step.
The build and check steps can take a long time. The libatlas.a and libcblas.a files will be in the test/include
or test/lib
directories.
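The governor change mentioned above can be scripted for all cores at once. This is a minimal sketch: the function names are my own, and using the performance governor to stop throttling is my assumption about which value to write.

```shell
#!/bin/sh
# Write a cpufreq governor to every CPU found under a sysfs CPU directory.
# Usage: set_governor GOVERNOR [CPU_ROOT]
# CPU_ROOT defaults to /sys/devices/system/cpu; writing there needs root.
set_governor() {
    governor=$1
    cpu_root=${2:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] || continue          # glob matched nothing
        echo "$governor" > "$f" || return 1
    done
}

# Print the current governor of each CPU, one per line.
show_governors() {
    cpu_root=${1:-/sys/devices/system/cpu}
    for f in "$cpu_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -e "$f" ] && echo "$f: $(cat "$f")"
    done
    return 0
}

# Typical use on a board (as root):
#   set_governor performance && show_governors
```

The setting does not survive a reboot, so it has to be repeated before each ATLAS build.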
PARKBENCH
Unzip the tarball and make the changes your system needs in make.local.def
. Then run make all.mpi
STREAM
Stream 5.10 files can be found at http://www.cs.virginia.edu/stream/FTP/Code/
.
Correct the Fortran and C compilers in the Makefile if necessary and compile the code with make
. Change the array sizes appropriately. There is no need to run it in parallel on other nodes, because the result should scale directly with the number of nodes.
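For choosing the size, the comments in stream.c ask for each array to be at least 4 times the size of the available cache. A minimal sketch of that arithmetic follows, assuming the default 8-byte element type and a 1 MiB last-level cache; the cache size is my assumption, so check it for your board.

```shell
#!/bin/sh
# Smallest STREAM array length (in elements) for a given last-level cache
# size in bytes, using the "4x the cache" guideline and 8-byte doubles.
stream_min_elements() {
    cache_bytes=$1
    echo $(( cache_bytes * 4 / 8 ))
}

# L2_BYTES is an assumption: 1 MiB. Check the value for your own SoC.
L2_BYTES=${L2_BYTES:-1048576}
N=$(stream_min_elements "$L2_BYTES")
echo "minimum STREAM_ARRAY_SIZE: $N"

# The size is set at compile time, for example:
#   gcc -O2 -DSTREAM_ARRAY_SIZE=$N stream.c -o stream
```

Larger arrays are fine as long as the three arrays still fit in RAM.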
Parallel Memory Bandwidth Benchmark (pmbw)
Get the files from here
. The instructions on the site are clear to follow.
Setting up "ssh-able" IP
You need to register your computer's MAC address and the port, depending on the rules of your institution. Then in /etc/network/interfaces
:
auto lo eth1
iface lo inet loopback
iface eth1 inet static
address xxx
netmask xxx
gateway xxx
broadcast xxx
dns-nameservers xxx
If you are given time server details, put them in /etc/openntpd/ntpd.conf
.
To change the default SSH port go to /etc/ssh/sshd_config
. In there change Port 22 to whatever you like.
To be able to ssh without having to fill in a port number each time, create a file ~/.ssh/config
. In there, for each host/node, add
host _name_
hostname _ip address_
port _port number_
...
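With several nodes it can be easier to generate these entries than to type them. A minimal sketch, where the node names, the addresses (taken from a documentation range) and the port 2222 are all placeholders of mine:

```shell
#!/bin/sh
# Print one ~/.ssh/config stanza.
# Usage: ssh_config_entry NAME ADDRESS PORT
ssh_config_entry() {
    printf 'Host %s\n    HostName %s\n    Port %s\n\n' "$1" "$2" "$3"
}

# Hypothetical node list: "name address" pairs, all on the same port.
SSH_PORT=2222
while read -r name address; do
    [ -n "$name" ] || continue
    ssh_config_entry "$name" "$address" "$SSH_PORT"
done <<'EOF'
node1 192.0.2.11
node2 192.0.2.12
EOF
```

The script only prints to the terminal; redirect its output with >> ~/.ssh/config once the entries look right.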
Installing ROOT on Ubuntu and Archlinux
For Ubuntu
The USB drive must be formatted as ext4. Mount it at /media/usb/
,
-
sudo mount /dev/sdb1 /media/usb
(to unmount, sudo umount /media/usb
, or more forcibly, umount -f -l /media/usb
)
Unpack the ROOT tarball onto the USB drive. Then, as the root user, do
-
./configure --prefix=/data/ROOT/root_install --etcdir=/data/ROOT/root_install --enable-explicitlink --enable-rpath --enable-soversion --all
-
make
-
make install
In your .bashrc file add the line
-
source /data/ROOT/root/bin/thisroot.sh
For Arch Linux.
To enable PyROOT you have to change the name of the executable in /usr/bin from python2
to python
.
To enable Cintex you have to patch your code so that it will work on the ARM architecture. Although Cintex is not crucial, if you want to install ATHENA later on you have to have it (see further down).
Get the patch from here
, or use my slightly modified one attached to this page. Paste it in your root folder and do
-
sudo patch < root-5.34.05-cintex-armv7a-port.patch
There may be errors but as long as something updates it should be ok.
The packages needed from pacman are libxpm and libxft. There are also a couple of optional ones, such as mysql.
Then configure with
-
./configure linuxarm --prefix=/home/archlinux/ROOT/Root_Install
-
sudo make
-
sudo make install
If there is a "freetype/freetype.h" error, which there probably will be, cd to /usr/include/freetype2
and create a symlink named freetype that points back to the freetype2 folder, which has freetype.h in it:
-
sudo ln -s /usr/include/freetype2/ freetype
Once configure has finished, look at the bottom of the configure output and see if python and cintex are there.
A dedicated Proof Cluster
This applies to Ubuntu 14.04. We have an ARM cluster of 7/8 wandboards sharing a filesystem through an NFS server.
Overview: You will need to install ROOT as well as xrootd. Xrootd will be installed directly from resources provided in the ROOT download once it is untarred.
Once ROOT and xrootd are installed, you start the xrootd daemon by using the command to start the xproofd daemon (which has been installed automatically). Because of the way the configure files for xrootd are set up, and the way you initialise the xproofd daemon, it automatically starts the xrootd daemon. I know... confusing.
Everything was put in a shared directory (/home/josh), which makes life easier. Make sure the nodes can reach each other over ssh without requiring passwords.
Doing it for one node should be ok since they share the directory.
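A minimal sketch of the passwordless setup, assuming key-based ssh login with the key pair kept in the shared home directory. The helper authorize_key is my own name; it only appends the public key if it is not already listed, so it is safe to run more than once.

```shell
#!/bin/sh
# Append a public key to an authorized_keys file unless it is already there.
# Usage: authorize_key PUBKEY_FILE AUTHORIZED_KEYS_FILE
authorize_key() {
    pubkey_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    chmod 600 "$auth_file"     # sshd in strict mode rejects loose permissions
    # -F: fixed string, -x: whole line, -q: quiet
    grep -qxF -- "$(cat "$pubkey_file")" "$auth_file" ||
        cat "$pubkey_file" >> "$auth_file"
}

# Typical use, run once in the shared home directory:
#   ssh-keygen -t rsa -N "" -f "$HOME/.ssh/id_rsa"
#   authorize_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"
```

Because every board sees the same ~/.ssh, one key pair and one authorized_keys file should cover all of them.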
Then, on all boards do
sudo apt-get install git dpkg-dev make g++ gcc binutils libx11-dev libxpm-dev libxft-dev libxext-dev python-dev gfortran libboost-all-dev
(I recommend using cluster ssh for this.)
Once ROOT has been untarred go into build/unix/
and do
-
./installXrootd.sh /home/josh/xrootd
Now you need to build ROOT.
Here is the full configure command I used:
./configure linuxarm --with-xrootd-incdir=/home/josh/xrootd/xrootd-3.2.7/include/xrootd --with-xrootd-libdir=/home/josh/xrootd/xrootd-3.2.7/lib \
--with-x11-libdir=/usr/lib/arm-linux-gnueabihf --with-xpm-libdir=/usr/lib/arm-linux-gnueabihf --with-xft-libdir=/usr/lib/arm-linux-gnueabihf \
--with-xext-libdir=/usr/lib/arm-linux-gnueabihf --with-python-libdir=/usr/lib/arm-linux-gnueabihf --prefix=/home/josh/ROOT/root-install \
--etcdir=/home/josh/ROOT/root-install/ --enable-rpath --enable-explicitlink --enable-soversion --all
Then,
-
make && sudo make install
Add this to your .bashrc:
source /home/josh/ROOT/root-install/bin/thisroot.sh
source /home/josh/ROOT/root-install/bin/setxrd.sh /home/josh/xrootd/xrootd-3.2.7/
export PROOFCONFIG=/home/josh/ROOT/root-install/proof
Now go into ROOT/root-install/proof
. The two files we need are xpd.cf
and proof.conf
. I've provided mine at the bottom of the page, but there are also samples in the directory. Once configured to your liking (good luck) do:
-
xproofd -c xpd.cf -b -l /tmp/xpd.log
on the master and all the workers (use cssh). This is a shared location, so it's the same place for all boards, but each board needs to have the daemon started on it. The /tmp
location is not shared, so each board will have its own xproofd/xrootd log file stored there.
To start your proof cluster do
-
TProof *p = TProof::Open("wanboard1")
-
p->Print("a")
To run benchmarks do
-
gROOT->Time()
This turns on timing of every command entered.
-
TProofBench pb("wanboard1")
-
pb.SetDebug(kTRUE)
This saves more results into the ROOT tree
For CPU benchmarks you have:
-
pb.RunCPU()
-
pb.RunCPUx()
-
pb.GetPerfSpecs()
(after running one of the CPU tests)
I/O benchmarks:
-
pb.MakeDataSet()
(essential)
-
pb.RunDataSet()
-
pb.RunDataSetx()
Each command can take a lot of arguments, so look here
if you want to use something other than the defaults. The section titled "Getting the performance specs" is helpful.
Just a note: a PROOF Lite session is very easy to start, and no setup is necessary other than installing ROOT.
To start a PROOF Lite process open ROOT and type
-
TProof *plite = TProof::Open("lite://")
Or
-
TProofBench pb("lite://")
To specify the number of workers, if different from the default:
-
TProof *proof = TProof::Open("workers=10")
Running Proof on lxplus worker
Log into lxplus. Then source a version of root:
-
source /afs/cern.ch/sw/lcg/app/releases/ROOT/5.34.18/x86_64-slc6-gcc48-opt/root/bin/thisroot.sh
Source the matching gcc version:
-
source /afs/cern.ch/sw/lcg/contrib/gcc/4.8.1/x86_64-slc6-gcc48-opt/setup.sh
Set up the xrootd environment:
-
source /afs/cern.ch/sw/lcg/app/releases/ROOT/5.34.18/x86_64-slc6-gcc48-opt/root/bin/setxrd.sh /afs/cern.ch/sw/lcg/external/xrootd/3.2.7/x86_64-slc6-gcc48-opt/
Making SSD rootfs without Serial Cable
Create a tar file of the rootfs on the wandboard. As root, in /
-
tar -zcvf Archlinux-rootfs.tar.gz *
Mount the newly formatted SSD partition (make sure it's ext4) on a directory and unpack the tar.gz file into it:
-
sudo mount /dev/sda1 /SATA
-
sudo tar -zxvf Archlinux-rootfs.tar.gz -C /SATA
A good source is http://archlinuxarm.org/forum/viewtopic.php?f=45&t=5995 , which may give a clearer explanation.
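An alternative to the intermediate tar.gz file is to pipe one tar into another. This is a minimal sketch and not the method used above: copy_tree is my own helper name, and it relies on the --one-file-system option of GNU tar to keep /proc, /sys and the mounted SSD itself out of the copy.

```shell
#!/bin/sh
# Copy a directory tree with tar, staying on one filesystem and keeping
# permissions. Usage: copy_tree SRC_DIR DEST_DIR
copy_tree() {
    src=$1
    dest=$2
    mkdir -p "$dest"
    # --one-file-system skips other mounts found below SRC_DIR
    tar -C "$src" --one-file-system -cpf - . | tar -C "$dest" -xpf -
}

# Typical use (as root, with the SSD partition mounted on /SATA):
#   copy_tree / /SATA
```

This avoids needing free space on the SD card for the archive.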
Get the u-boot package:
-
sudo pacman -S uboot-wandboard-quad
Then in /boot
create the file uEnv.txt
. Copy and paste this into it
baudrate=115200
boot_fdt=no
bootcmd=mmc dev ${mmcdev};if mmc rescan; then echo SD/MMC found on device ${mmcdev};if run loadbootenv; then run importbootenv;fi;echo Checking if uenvcmd is set ...;if test -n $uenvcmd; then echo Running uenvcmd ...;run uenvcmd;fi;echo Running default loaduimage ...;if run loaduimage; then run mmcboot;fi;fi;
bootdelay=1
bootscript=echo Running bootscript from mmc ...; source
console=ttymxc0
ethact=FEC
ethaddr=00:1f:7b:b0:00:04
ethprime=FEC
fdt_addr=0x11000000
fdt_file=/boot/dtbs/imx6s-wandboard.dtb
fdt_high=0xffffffff
importbootenv=echo Importing environment from mmc (uEnv.txt)...; env import -t $loadaddr $filesize
initrd_high=0xffffffff
ip_dyn=yes
loadaddr=0x12000000
loadbootenv=load mmc ${mmcdev}:${mmcpart} ${loadaddr} /boot/uEnv.txt
loadbootscript=load mmc ${mmcdev}:${mmcpart} ${loadaddr} ${script};
loadfdt=load mmc ${mmcdev}:${mmcpart} ${fdt_addr} ${fdt_file}
loaduimage=load mmc ${mmcdev}:${mmcpart} ${loadaddr} ${uimage}
mmcargs=setenv bootargs console=${console},${baudrate} ${optargs} root=${mmcroot} rootfstype=${mmcrootfstype} video=${video}
mmcboot=echo Booting from mmc ...; run mmcargs; if test ${boot_fdt} = yes || test ${boot_fdt} = try; then if run loadfdt; then bootz ${loadaddr} - ${fdt_addr}; else if test ${boot_fdt} = try; then bootz ${loadaddr}; else echo WARN: Cannot load the DT; fi; fi; else bootm ${loadaddr}; fi;
mmcdev=0
mmcpart=1
mmcroot=/dev/sda1 rw
mmcrootfstype=ext4 rootwait
netargs=setenv bootargs console=${console},${baudrate} root=/dev/nfs ip=dhcp nfsroot=${serverip}:${nfsroot},v3,tcp
netboot=echo Booting from net ...; run netargs; if test ${ip_dyn} = yes; then setenv get_cmd dhcp; else setenv get_cmd tftp; fi; ${get_cmd} zImage; if test ${boot_fdt} = yes || test ${boot_fdt} = try; then if ${get_cmd} ${fdt_addr} ${fdt_file}; then bootz ${loadaddr} - ${fdt_addr}; else if test ${boot_fdt} = try; then bootz ${loadaddr}; else echo WARN: Cannot load the DT; fi; fi; else bootz ${loadaddr}; fi;
script=/boot/boot.scr
uimage=/boot/uImage
update_sd_firmware=if test ${ip_dyn} = yes; then setenv get_cmd dhcp; else setenv get_cmd tftp; fi; if mmc dev ${mmcdev}; then if ${get_cmd} ${update_sd_firmware_filename}; then setexpr fw_sz ${filesize} / 0x200; setexpr fw_sz ${fw_sz} + 1; mmc write ${loadaddr} 0x2 ${fw_sz}; fi; fi
update_sd_firmware_filename=u-boot-quad.imx
Restart the wandboard and check which device the root filesystem is on.
It should say your root folder is now /dev/sda1
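One way to check after the reboot is to read the root= argument the kernel was booted with, which comes from mmcroot in uEnv.txt. A minimal sketch, with a function name of my own:

```shell
#!/bin/sh
# Print the value of root= from a kernel command line.
# Usage: kernel_root_arg [CMDLINE_FILE]   (defaults to /proc/cmdline)
kernel_root_arg() {
    cmdline_file=${1:-/proc/cmdline}
    # One argument per line, then keep only what follows "root=".
    tr ' ' '\n' < "$cmdline_file" | sed -n 's/^root=//p'
}

kernel_root_arg
```

If this still prints the SD card device, the new uEnv.txt was probably not picked up by u-boot.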
AFS
What follows applies to Arch Linux. Get the header files, sudo pacman -S linux-wandboard-headers
which is here
Get an OpenAFS source tar from here
. Unzip it and cd
into the new folder. Then
-
sudo ./configure --enable-transarc-paths --enable-checking --enable-debug --with-linux-kernel-headers=/usr/src/linux-3.0.35-5-ARCH/
-
make
-
make dest
-
cd /dest
-
sudo mkdir /usr/afs
-
sudo mkdir /usr/vice
-
sudo mkdir /usr/vice/etc
-
sudo cp -p -r root.server/usr/afs/* /usr/afs
-
sudo cp -p -r root.client/usr/vice/etc/* /usr/vice/etc
-
sudo mkdir /usr/vice/cache
Copy all the files from an lxplus machine in /usr/vice/etc
to /usr/local/etc/openafs
. (This was just easier for me.) Change the cache size in cacheinfo
. I have it set to 1000000.
Then copy the lxplus /etc/krb5.conf
file to /etc/
on the wandboard. Then in /usr/vice/etc/
do
-
sudo ./afs.rc start
-
sudo afsd -verbose -debug
Now do kinit afsusername@CERN.CH
(note the capitals) and fill in your CERN password. If no error occurs, success! Then do klist
, which should give something like:
Valid starting Expires Service principal
31/01/2014 13:13 01/02/2014 13:13 krbtgt/CERN.CH@CERN.CH
Now do aklog
then tokens
. This should output something similar to
Tokens held by the Cache Manager:
User's (AFS ID 51759) tokens for afs@cern.ch [Expires Mar 12 07:55]
--End of list--
That should give you permission to view afs in /afs
. You cannot cd to /afs
and ask it to list everything. I'm not sure why. Instead cd to /afs/cern.ch
and from there everything should be like normal afs.
To make sure the daemons begin at boot, go to /etc/systemd/system/
and create a file afs.service
. In it put
[Unit]
Description=Enable AFS Capabilities
[Service]
Type=oneshot
RemainAfterExit=yes
ExecStart=/usr/vice/etc/afs.rc start
ExecStart=/usr/vice/etc/afsd
[Install]
WantedBy=multi-user.target
Then do
-
sudo systemctl enable afs.service
LCG Software
This section has been moved to BuildAthenaArm
Gaudi
This section has been moved to BuildAthenaArm
Tips
Ubuntu 12.04LTS
To enable startup without gui:
-
edit /etc/init/lightdm.conf
Comment out from
start on ((..
to just before
stop on ..
to change username and password:
To get ssh working:
-
sudo apt-get install openssh-server
This seems to be the only way to restart the Ethernet port on the wandboard:
-
nohup sh -c "ifdown eth1 && ifup eth1"
After flashing an image, remember to change /etc/hosts
, /etc/hostname
and /etc/network/interfaces
accordingly, otherwise the network configuration will not work. Also look in /etc/udev/rules.d/70-persistent-net.rules
. It may rename eth1
to eth2
, which will cause problems. Delete the entry with the old MAC address and rename the remaining interface.
To fix the very annoying setlocale: LC_ALL: cannot change locale (en_US.UTF-8)
problem, as root do:
-
locale-gen en_US.UTF-8
-
dpkg-reconfigure locales
To execute startup script in ubuntu
A great place for a motd script (you can execute commands here)
Arch Linux
After a fresh install, get the essential packages:
-
sudo pacman -S base-devel
To allow X11 forwarding:
Enable the AllowTcpForwarding option in sshd_config
on the server.
Enable the X11Forwarding option in sshd_config
on the server.
Set the X11DisplayOffset option in sshd_config
on the server to 10.
Enable the X11UseLocalhost option in sshd_config
on the server.
Then add systemctl start sshd
to /etc/profile
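The four sshd_config settings above can be checked with a small script. This is a minimal sketch: sshd_option_is is my own helper, and it accepts any uncommented matching line, whereas sshd itself uses the first occurrence of a keyword.

```shell
#!/bin/sh
# Succeeds if OPTION is set to VALUE on an uncommented line of an
# sshd_config-style file. Keywords are matched case-insensitively.
# Usage: sshd_option_is FILE OPTION VALUE
sshd_option_is() {
    awk -v opt="$2" -v val="$3" '
        /^[ \t]*#/ { next }                               # skip comments
        tolower($1) == tolower(opt) && $2 == val { ok = 1 }
        END { exit (ok ? 0 : 1) }
    ' "$1"
}

# Check the four settings listed above; CONFIG can be overridden.
CONFIG=${CONFIG:-/etc/ssh/sshd_config}
for pair in AllowTcpForwarding=yes X11Forwarding=yes \
            X11DisplayOffset=10 X11UseLocalhost=yes; do
    opt=${pair%%=*}
    val=${pair#*=}
    if [ -r "$CONFIG" ] && sshd_option_is "$CONFIG" "$opt" "$val"; then
        echo "ok:      $opt $val"
    else
        echo "missing: $opt $val"
    fi
done
```

Anything reported as missing needs to be added or uncommented in sshd_config, followed by a restart of sshd.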
If you can only ping as root do
-
setcap cap_net_raw=ep /usr/bin/ping
To automatically login create a file /etc/systemd/system/getty@tty1.service.d/autologin.conf
in there put
[Service]
ExecStart=
ExecStart=-/usr/bin/agetty --autologin username --noclear %I 38400 linux
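When several boards need the same autologin file, it can be written by a script. A minimal sketch follows; write_autologin and its optional ROOT argument are my own additions, the latter so the file can be written into a mounted image instead of the running system.

```shell
#!/bin/sh
# Write the getty@tty1 autologin drop-in for USER under ROOT.
# Usage: write_autologin USER [ROOT]   (ROOT defaults to /, which needs root)
write_autologin() {
    user=$1
    root=${2:-}
    dir="$root/etc/systemd/system/getty@tty1.service.d"
    mkdir -p "$dir" || return 1
    cat > "$dir/autologin.conf" <<EOF
[Service]
ExecStart=
ExecStart=-/usr/bin/agetty --autologin $user --noclear %I 38400 linux
EOF
}

# Typical use (as root), with your own username:
#   write_autologin username
```

The empty ExecStart= line is kept on purpose: it clears the command from the stock getty unit before the new one is set.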
Network settings are in /etc/netctl
. Look in samples
and adapt the eth0 entry to your network.
Useful screen commands
Create new window: ctrl-a c
Close current split region: ctrl-a X
Split screen: ctrl-a |
(vertical, side by side), ctrl-a shift-s
(horizontal, stacked)
Move around split screens: ctrl-a tab