Diskless cluster and Salt

Structure and some Salt implementation details

The Salt implementation used here relies on giving different computers different roles. Obviously I wanted to supply the definitions of the roles in only one place, and initially I tried to use pillar - but since I also wanted the role-based pillar to define other pillar data, and that does not work, I opted to use grains instead.

Recently, however, the Pillarstack contribution has made it possible to use pillar-defined roles in the way described above. At the time Pillarstack was announced the grain-based solution was already working, so I kept it; a pillar-based solution (e.g., Pillarstack) would probably be preferable if the shared information is sensitive.

In order to explain the layout used, the example below is provided. Note, however, that in "Step by step" a much more limited example is used.

Struct:
  ----------
  WS:
      ----------
      bree:
          - butterbur
          - grof
      shire:
          - cotton
          - sam
  hall:
      ----------
      cluster:
          ----------
          merry:  
              - m001
              - m002
          pippin: 
              - p001
              - p002
      master:
          - gandalf
      sharedMem:  
          ----------
          forest: 
              - thranduil
              - galadriel
          riven:  
              - elrond
      storage:
          ----------
          stor1:  
              - dain01
              - dain02
          stor2:  
              - gloin01
              - gloin02

The dictionary "Struct" contains other dictionaries, and at "the bottom" there are lists of the actual computers. In this example WS signifies "WorkStations", bree and shire are separate groups, and butterbur, grof, cotton and sam are the machines used by the persons in the different groups.

The same holds for, e.g., the machine m001; it belongs to the cluster merry, which is placed in hall. Here only two nodes are listed in each cluster (merry and pippin) - in reality there would generally be many more nodes.

The remaining nodes follow the same principle; master signifies the Salt master, forest and riven are different types of multi-user shared-memory machines (e.g., for mesh generation and CAD), and stor1 and stor2 are separate parallel file systems for storage. All of these are also placed in hall (as opposed to the workstations, which are not placed in the computer hall and hence do not have the role hall).

The purpose of all this is to be able to "lump together" functionality within the different roles; e.g., all machines placed in hall can have common requirements, as can all storage nodes, all sharedMem machines belonging to the group forest, etc.

Now, since Salt is basically "machine based", the information supplied in Struct has to be transformed into a "roles" list for each machine.

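For the node m001 in the example above, the resulting grain could look something like the following (a sketch of the idea only; the exact grain name and contents depend on my_grains.py):

roles:
  - hall
  - cluster
  - merry
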
In my implementation this transformation is handled by the script my_grains.py in the directory salt/_grains. Note that if changes are made in this script, the command salt '*' saltutil.sync_grains propagates these changes to all the machines.
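
To give an idea of how the roles are then consumed, a state top file can match on the grain. The sketch below is only an illustration (it assumes the grain is called roles; the state names common, hall and storage are made up, and this is not necessarily how the supplied top.sls looks):

# salt/top.sls (sketch) - targeting on the 'roles' grain
base:
  '*':
    - common            # applied to every machine
  'roles:hall':
    - match: grain      # everything placed in the computer hall
    - hall
  'roles:storage':
    - match: grain      # all storage nodes, regardless of stor1/stor2
    - storage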

Having put some effort into designing the structure and the roles, I have tried to follow a few guiding principles; one of them is to follow a similar structure for pillar.
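
As an illustration (again only a sketch, with made-up pillar file names apart from pillars/users/dept01, which is shown further below), the pillar top file can follow the same role-based pattern:

# pillar/top.sls (sketch)
base:
  '*':
    - pillars.common
  'roles:merry':
    - match: grain
    - pillars.users.dept01
  'roles:storage':
    - match: grain
    - pillars.repos.local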

Again, in the actual example provided in the supplied tar-file, the structure is not as extensive as the one given above.

In the Salt structure used, the least straightforward parts are the ones handling users, the host file and the repositories.

I will describe these in some detail below; the remaining parts should be fairly easy to follow.

Users

I have defined all user data as pillars, and they are defined according to the roles. In this example I have NOT considered the case where users are defined in several roles (a reasonable strategy would be to merge the users from the different roles in which they are defined), so in my case users should only be defined in ONE of the roles for each computer.

Since the users are defined as pillars, the state that actually does the work and adds the users is applied to all machines with the glob '*'; i.e., if no users are defined (in pillar) for a certain machine, no users are added to that particular machine.
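
A minimal sketch of such a state could look like the one below (the file name users/init.sls is an assumption, as is the exact handling of the individual fields; the state in the supplied tar-file may differ, and the primary group and the ssh-keys would be handled in addition to this, e.g., with group.present, ssh_auth.present and file.managed states):

# users/init.sls (sketch) - applied with the glob '*'; the loop is empty,
# and hence does nothing, on machines without a 'users' pillar
{% for username, user in salt['pillar.get']('users', {}).items() %}
{{ username }}:
  user.present:
    - fullname: {{ user.fullname }}
    - uid: {{ user.uid }}
    - shell: {{ user.shell }}
    - password: '{{ user.shadow }}'   # the hashed password from the pillar
    - groups: {{ user.groups }}
{% endfor %}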

All users are defined under pillar/pillars/users. In many (most?) cases it is convenient to group different users together, and here this is done under the same directory. As an example, john.sls:

base_john: &base_john
  john:
    name: john
    fullname: John John
    email: john@lorien.com
    uid: 501
    primary_group: clerk
    shell: /bin/bash
    groups:
      - archery
      - mining
    shadow: $6$adas$clwBNkon2kqggFWInRUrBbg/ZJ1
    sshpriv: |
      -----BEGIN RSA PRIVATE KEY-----
      MIIEogIBAAKCAQEAy8tcwcWhOkV2AmgA+AjXUWOQq7fMKZsFonItmp8qD/cqZKzU
      SW+8Pi7G8Xy1XhtIXiOOZpsE6tX/HS3D67Ko5FUbJyYO1FiiJQcX2UEVU/8gY8UE
      -----END RSA PRIVATE KEY-----
    sshpub: 
      ssh-rsa AAAABAQDLy1zBxaE6LsbxfLVeGrFt+594diCKd+7F3JqQaOrZp

Records of this type are then used to group together different users, e.g., dept01.sls:

# Found this way of doing it on stackoverflow, by mway
{% include 'pillars/users/john.sls' %}
{% include 'pillars/users/ann.sls' %}
users:
  <<: *base_john
  <<: *base_ann
where "john" and "ann" are put together as a unit to be added to suitable machines.

The ssh-keys should (obviously) be different for the different users.

Host file

I also wanted some flexibility in the handling of /etc/hosts. One way would be to use different static files, defined to suit the corresponding machines. However, that approach is not overly flexible, and I instead opted to use the "host" state in Salt. Together with two sets of pillars, host_add and net_data, this provides a reasonably flexible way to build up /etc/hosts on the different computers. In my case host_add lists the nodes which should be added, and net_data provides the network data for groups of computers (e.g., pippin, merry, stor1, riven, etc.).
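
To make this concrete, a sketch along the following lines could be used; note that the exact layout of the host_add and net_data pillars assumed below (host_add as a list of groups, net_data as a name-to-address mapping per group) is only for illustration and not necessarily the one used in the tar-file:

# hosts/init.sls (sketch)
{% for group in salt['pillar.get']('host_add', []) %}
{%   for hostname, ip in salt['pillar.get']('net_data:' ~ group, {}).items() %}
{{ hostname }}:
  host.present:
    - ip: {{ ip }}
{%   endfor %}
{% endfor %}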

Repositories

Here I have used the Salt state "pkgrepo", combined with pillars defined in pillar/pillars/repos, and the state salt/states/repos.
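
A sketch of such a state could look as follows (the name and layout of the repos pillar below are only for illustration):

# repos/init.sls (sketch)
{% for repo, cfg in salt['pillar.get']('repos', {}).items() %}
{{ repo }}:
  pkgrepo.managed:
    - humanname: {{ cfg.humanname }}
    - baseurl: {{ cfg.baseurl }}
    - gpgcheck: 0
    - enabled: 1
{% endfor %}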

When working with local repositories it can happen that non-local entries are added to /etc/yum.repos.d. In the supplied code, some logic has therefore been added to remove all files in /etc/yum.repos.d that are not handled by Salt.