Diskless cluster and Salt

StepByStep

Preparation of the master node

Install VirtualBox on the host machine, and install the VirtualBox Extension Pack.

Create a CD/DVD with a CentOS-7-minimal ISO, and use this to create the virtual instance of what will be the "master" (to be called gandalf) in the cluster environment. Turn off kdump during the installation process. Alternatively, just pick a VirtualBox appliance of the minimal install and import it.

Ensure the network is up (if this has not been done during the installation): set ONBOOT=yes in the relevant network configuration file (ifcfg-enp0s3 in my case) and run systemctl restart network. Do yum update and reboot the instance once the update is completed. Remove the old kernel (yum erase kernel-3.10.0-327.el7.x86_64 in my case). If behind a firewall, it might be necessary to put the line:

proxy=http://xxx.xxx.xx:8080
in /etc/yum.conf (with the x:es substituted with the relevant values).
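Collected in one place, the commands for this step look roughly as follows (a sketch; the interface name enp0s3 and the kernel version are from this particular install and will differ elsewhere):

sed -i 's/^ONBOOT=no/ONBOOT=yes/' /etc/sysconfig/network-scripts/ifcfg-enp0s3
systemctl restart network
yum update
reboot
# after the reboot, remove the old kernel
yum erase kernel-3.10.0-327.el7.x86_64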

It might also be good to install the VirtualBox Guest Additions (in which case bzip2, kernel-devel and gcc need to be installed).
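A sketch of that installation, assuming the Guest Additions ISO has been inserted via the "Devices" menu of VirtualBox (the mount point is arbitrary):

yum install bzip2 kernel-devel gcc
mkdir -p /mnt/cdrom
mount /dev/cdrom /mnt/cdrom
sh /mnt/cdrom/VBoxLinuxAdditions.run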

Also, create a second interface for the instance in the VirtualBox management interface, and choose Host-Only here. For it to work properly, an interface on the host machine needs to be created; give it the address 192.168.50.253 (this is done under "File/Preferences/Network" in the VirtualBox interface).
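The host-side interface can also be created with VBoxManage instead of through the GUI; a sketch (vboxnet0 is the name VirtualBox typically gives the first Host-Only interface):

VBoxManage hostonlyif create
VBoxManage hostonlyif ipconfig vboxnet0 --ip 192.168.50.253 --netmask 255.255.255.0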

Run

grubby --update-kernel=ALL --args=net.ifnames=0
which forces the old "nic-naming scheme" to be used. Also modify the relevant "ifcfg-files" under /etc/sysconfig/network-scripts accordingly (both content and names) if necessary, and reboot to ensure this is working properly.
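As an example of the ifcfg adjustments, assuming the original interface was enp0s3 as above (a sketch, to be adapted to the actual interface names):

cd /etc/sysconfig/network-scripts
mv ifcfg-enp0s3 ifcfg-eth0
sed -i 's/enp0s3/eth0/g' ifcfg-eth0
reboot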

Create/modify /etc/sysconfig/network-scripts/ifcfg-eth1:

ONBOOT=yes
BOOTPROTO=static
IPADDR=192.168.50.254
TYPE=Ethernet
NAME=eth1
DEVICE=eth1
Reboot and check that the interface comes up OK. If this is working properly, it is possible to reach the system from the host machine through ssh root@192.168.50.254.
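A quick check after the reboot (the first command on the guest, the second from the host machine):

ip addr show eth1                 # should list 192.168.50.254
ssh root@192.168.50.254 hostname  # run from the host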

Once the instance is up again, set up the necessary Salt repositories according to this link from saltstack.com, but don't install anything except the repo at this point.

Add the line

192.168.50.254          gandalf salt
to /etc/hosts.

Some of the packages used are "epel-based"; set up the necessary repositories with yum install epel-release.

Also, in this setup we are using some extensions to torque: a valid "project name" is required in order to submit jobs (see the file /etc/sysconfig/approved_projects, to be installed by salt below), and access to the nodes is only granted to users having received resources from torque. The files required to achieve this are stored under a "non-standard" directory, which has to be created manually:

mkdir -p /usr/local/opt/accounting_torque
Not putting this in salt serves as a reminder that this could be handled in a better way...

Install the salt-minion (yum install salt-minion) and ensure that the minion is started during the boot process: systemctl enable salt-minion. Note that on the master (gandalf) salt-master should also be installed, but this is done at a later stage.

Since salt-master should not be installed on the other systems in the environment, it is convenient at this stage to create the base of the root file system.

Execute the sync with:

mkdir -p /netNodes/templ
yum install rsync
yum clean all
rsync -av --progress --exclude="/proc/*" --exclude="/sys/*" --exclude=/netNodes / /netNodes/templ/.
The files created by the last command above represent the root file system for a minimal CentOS-7 installation (modified with the adjustments done above). In the context of the process outlined here, it should be considered a template, not to be modified.

Now stuff specific to the master can be handled, e.g., renaming it to "gandalf": hostnamectl set-hostname gandalf.

Install salt-master, yum install salt-master and start and enable the service if necessary:

systemctl start salt-master
systemctl enable salt-master
Add the key: salt-key -a gandalf (if gandalf is not found, add the line id: gandalf to /etc/salt/minion and run systemctl restart salt-minion) and check that the connection is working: salt '*' test.ping.
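If everything is in place, the ping should come back with something like:

gandalf:
    True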

At this point, all remaining configuration should be handled by salt: create the directory /srv/saltstack and unpack the files in saltFiles.tar there.

Note that the kernel and ramdisk included in the tar file above are for a specific kernel version, and if the root file system built above (using the rsync command) does not correspond to the same kernel version, some modifications in the "salt-tree" have to be made:

The ramdisk can be built with dracut -f /root/netboot7.img `uname -r`; for this to work, it seems this command has to be preceded by:
yum groupinstall "NFS file server"
systemctl restart nfs
The same kernel as the one used on gandalf should be used (found under /boot). These changes are all made under the "salt-tree", /srv/saltstack/salt/states/pxelinux: the new ramdisk and kernel should be placed in the directory files, and the m00? files as well as init.sls should be updated accordingly.
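A sketch of those changes (the exact file names under files are assumptions and should match whatever the m00? files and init.sls refer to):

cp /boot/vmlinuz-$(uname -r) /srv/saltstack/salt/states/pxelinux/files/
cp /root/netboot7.img /srv/saltstack/salt/states/pxelinux/files/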

Before salt can be used, /etc/salt/master needs to be modified to reflect the structure used here. Change the relevant lines to:

file_roots:
  base:
    - /srv/saltstack/salt
pillar_roots:
  base:
    - /srv/saltstack/pillar
and restart salt-master.

Here we are using maui as the scheduler, and it needs to be installed (use maui-3.3.1). This requires two things: (i) yum install gcc and (ii) an installation of torque, salt-call state.sls states.torque. The last step is necessary for maui to know that it should use torque/pbs as the resource manager. Also, it might be necessary to run salt-call saltutil.sync_all before states.torque.
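Collected, the preparations before building maui amount to:

yum install gcc
salt-call saltutil.sync_all        # may be needed before the torque state
salt-call state.sls states.torque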

Install maui by:

./configure
make
make install

Now, by the command

salt 'gandalf' state.highstate
all packages and all configuration required to set up the environment for diskless boots should be installed on gandalf.

Note that the torque-server won't start; this is going to be fixed below.

Creation of the root file system for the nodes

Again salt will be used to build up the root file system for the nodes. The steps we are about to follow are: make a copy of the "template root file system" for the first node (m001) and adjust its fstab; bring up m001 over the network, accept its salt key and run state.highstate on it; turn m001 into a node template (templ_merry) and create the root file system for m002 from it; bring up m002 and add it to salt; and finally initialize torque and maui. These points are described in more detail below.

Make a copy of the "template root file system" (note, in the example presented here netNodes/templ is not used after netNodes/m001 has been created - but in a more realistic case it would be useful to keep it in order to build "template root file systems" for computers in other roles):

cat /dev/null > /netNodes/templ/etc/hostname
rm /netNodes/templ/etc/salt/minion_id
rm /netNodes/templ/etc/salt/pki/minion/*
cp -a /netNodes/templ /netNodes/m001
and modify /netNodes/m001/etc/fstab:
192.168.50.254:/netNodes/m001/  /               nfs     defaults        0 0
none                            /tmp            tmpfs   defaults        0 0
tmpfs                           /dev/shm        tmpfs   defaults        0 0
sysfs                           /sys            sysfs   defaults        0 0
proc                            /proc           proc    defaults        0 0
Bring up the virtual node - m001 - through the VirtualBox interface. Ensure that it boots over the network (in the VirtualBox interface), and that it is equipped with a Host-Only interface. Furthermore, the MAC address of the Host-Only interface should be entered at the relevant place in the dhcpd.conf file in the salt-structure - and obviously salt '*' state.highstate should be run in order to effectuate this change. The node should also have a NAT network adapter, to ensure that the node can get access to the external repos (but the Host-Only interface should be the first). I experienced problems with the naming of the node (m001 in this case); this should be handled through dhcp, but it failed to work on VirtualBox-5.0.16. Since this is only an issue at the stage where the root file tree is created (when the node needs access to external repositories), it was solved with hostnamectl set-hostname m001.
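For reference, the host entry in dhcpd.conf typically looks something like the sketch below (the MAC and IP addresses are placeholders; the actual entry lives in the salt-managed dhcpd.conf and must match the node's Host-Only adapter):

host m001 {
    hardware ethernet 08:00:27:xx:xx:xx;   # MAC of the node's Host-Only adapter (placeholder)
    fixed-address 192.168.50.1;            # address handed to m001 (placeholder)
    option host-name "m001";
}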

Once m001 is up, salt-key -a m001 on the master should bring it into salt. A salt 'm001*' state.highstate should finalize the setup of it. Once this is done, the node no longer needs access to external repositories, and the second network adapter can be removed from the VirtualBox configuration. Shut the node down, and prepare the node template and the root file system for the next node.
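A sketch of those steps, mirroring the procedure used for templ above (the directory names follow the templ_merry naming used below; the exact commands are assumptions):

# clean out identity files before turning m001 into the node template
cat /dev/null > /netNodes/m001/etc/hostname
rm /netNodes/m001/etc/salt/minion_id
rm /netNodes/m001/etc/salt/pki/minion/*
cp -a /netNodes/m001 /netNodes/templ_merry
# create the tree for the next node and point its root at the right export
cp -a /netNodes/templ_merry /netNodes/m002
sed -i 's|/netNodes/m001/|/netNodes/m002/|' /netNodes/m002/etc/fstab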

Again, ensure that the correct MAC address for m002 is listed in dhcpd.conf (using salt), and that the instance is configured to boot over the Host-Only interface.

In the process of bringing up m002 I experienced some problems with the name of the node. Somehow the name of m001 was lingering, and I had to resolve it by explicitly setting the name; hostnamectl set-hostname m002. This should be resolved in a better way - this was probably fixed by cleaning out templ_merry/etc/hostname (a fix introduced later).

Observe that at this point ONLY the Host-Only interface is necessary; there should be no need to reach the repositories. If there is a need to, for instance, install some other packages, it should be done on one of the nodes, and the template (templ_merry) should be updated according to the procedure outlined above.

Salt could be useful for submitting commands (even though at this stage it won't be used for installing stuff, as pointed out above), so do a salt-key -a m002.

Initialize torque with pbs_server -t create (and explicitly kill the pbs_server process after some seconds). Initialize the server and queue by running qmgr < torque.conf, where torque.conf has the content:

###################################################################################
# Create and define queue merry
#
create queue merry
set queue merry queue_type = Execution
set queue merry enabled = True
set queue merry started = True
#
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = gandalf
set server log_events = 511
set server mail_from = adm
set server query_other_jobs = True
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server next_job_number = 1
###################################################################################
Bring up the nodes by starting qmgr and running:
create node m001
create node m002
active node m001
active node m002

Once this is done, restart pbs_server and maui:
systemctl restart pbs_server
skill maui
/usr/local/maui/sbin/maui
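
A few commands to verify that the batch system is alive (run on gandalf; these are standard torque/maui tools, not part of the setup itself):

pbsnodes -a   # both nodes should be listed
qstat -q      # the queue merry should show up
showq         # maui's view of the queue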