# RHCS Mechanics
## Contents
- [Acronyms](#acronyms)
- [Configuration Files](#configuration-files)
- [Filesystem Locations](#filesystem-locations)
- [Operational Commands](#operational-commands)
- [Cluster Components](#cluster-components)
- [Operational Examples](#operational-examples)
- [Configuration Validation](#configuration-validation)
- [Status Check](#status-check)
- [Service Manipulation](#service-manipulation)
- [Configuration Examples](#configuration-examples)
- [Standard LVM and PgSQL Initscript](#standard-lvm-and-pgsql-initscript)
- [HA-LVM and MySQL Object](#ha-lvm-and-mysql-object)
- [Standard LVM, MySQL script and NFS](#standard-lvm-mysql-script-and-nfs)
- [HA-LVM and NFS Object](#ha-lvm-and-nfs-object)
- [References](#references)
## Acronyms
- **AIS**: Application Interface Specification
- **AMF**: Availability Management Framework
- **CCS**: Cluster Configuration System
- **CLM**: Cluster Membership
- **CLVM**: Cluster Logical Volume Manager
- **CMAN**: Cluster Manager
- **DLM**: Distributed Lock Manager
- **GFS2**: Global File System 2
- **GNBD**: Global Network Block Device
- **STONITH**: Shoot The Other Node In The Head
- **TOTEM**: Group communication algorithm for reliable group messaging among cluster members
## Configuration Files
- `/etc/cluster/cluster.conf` - The main cluster configuration file
- `/etc/lvm/lvm.conf` - The LVM configuration file - typically `locking_type` and a `filter` are configured here (see the sketch below)
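A sketch of the two `lvm.conf` settings mentioned above; the device patterns are illustrative and not taken from any example below:
```
# /etc/lvm/lvm.conf (excerpt) - values are illustrative
# locking_type 1 = local file-based locking, 3 = clustered locking via clvmd (CLVM)
locking_type = 1

# Accept local disks and SAN multipath devices, reject everything else
filter = [ "a|^/dev/sda|", "a|^/dev/mapper/mpath|", "r|.*|" ]
```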
## Filesystem Locations
- `/usr/share/cluster/` - The main directory of code used for cluster objects
- `/var/log/cluster/` - The main logging directory (**RHEL6**)
## Operational Commands
**Graphical Cluster Configuration**
- `luci` - Cluster Management Web Interface primarily used with **RHEL6**
- `system-config-cluster` - Cluster Management X11/Motif Interface primarily used with **RHEL5**
**RGManager** - Resource Group Manager
- `clustat` - Command used to display the status of the cluster, including node membership and services running
- `clusvcadm` - Command used to manually enable, disable, relocate, and restart user services in a cluster
- `rg_test` - Debug and test services and resource ordering
**CCS** - Cluster Configuration System
- `ccs_config_validate` - Verify a configuration; can validate the running config or a named file (**RHEL6**)
- `ccs_config_dump` - Tool to generate XML output of running configuration (**RHEL6**)
- `ccs_sync` - Synchronize the cluster configuration file to one or more machines in a cluster (**RHEL6**)
- `ccs_update_schema` - Update the cluster relaxng schema that validates cluster.conf (**RHEL6**)
- `ccs_test` - Diagnostic and testing command used to retrieve information from configuration files via **ccsd**
- `ccs_tool` - Used to make online updates of CCS configuration files - **considered obsolete**
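The running configuration can be dumped and pretty-printed for inspection; the `xmllint` pipe is only a convenience, not required:
```
# Dump the in-memory configuration as XML and format it for reading
ccs_config_dump | xmllint --format -
```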
**CMAN** - Cluster Manager
- `cman_tool` - The administrative front end to CMAN, starts and stops CMAN infrastructure and can perform changes
- `group_tool` - Used to get a list of groups related to fencing, DLM, GFS, and getting debug information
- `fence_XXXX` - Fence agent for the XXXX type of device; for example `fence_drac` (Dell DRAC), `fence_ipmilan` (IPMI) and `fence_ilo` (HP iLO)
- `fence_check` - Test the fence configuration for each node in the cluster
- `fence_node` - A program which performs I/O fencing on a single node
- `fence_tool` - A program to join and leave the fence domain
- `dlm_tool` - Utility for the `dlm` and `dlm_controld` daemon
- `gfs_control` - Utility for the `gfs_controld` daemon
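A few quick checks with these tools; `fence_check` (run here with no arguments) contacts every node's fence devices, so it is best run during a quiet period. A sketch:
```
# List the fence domain members and the active DLM lockspaces
fence_tool ls
dlm_tool ls
# Check that every node's configured fence devices and methods respond
fence_check
```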
**GFS2** - Global File System 2
- `mkfs.gfs2` - Creates a GFS2 file system on a storage device
- `mount.gfs2` - Mount a GFS2 file system; normally not used by the user directly
- `fsck.gfs2` - Repair an unmounted GFS2 file system
- `gfs2_grow` - Grows a mounted GFS2 file system
- `gfs2_jadd` - Adds journals to a mounted GFS2 file system
- `gfs2_quota` - Manage quotas on a mounted GFS2 file system
- `gfs2_tool` - Configures, tunes and gather information on a GFS2 file system
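A minimal creation sketch; the lock table name must be `<clustername>:<fsname>` matching `cluster.conf`, the journal count usually equals the number of nodes that will mount the filesystem, and the device and names here are illustrative:
```
# DLM locking, lock table cluster1:gfs00, two journals for two nodes
mkfs.gfs2 -p lock_dlm -t cluster1:gfs00 -j 2 /dev/vgsan00/lvgfs00
```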
**Quorum Disk**
- `mkqdisk` - Cluster Quorum Disk Utility
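The quorum disk is labelled once on shared storage and then referenced from `cluster.conf`; a sketch with an illustrative device:
```
# Label a small shared LUN as a quorum disk, then list the quorum disks found
mkqdisk -c /dev/sdc1 -l qdisk01
mkqdisk -L
```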
## Cluster Components
**RGManager** - Resource Group Manager
- `rgmanager` - Daemon used to handle user service requests including service start, service disable, service relocate, and service restart; **RHEL6**
- `clurgmgrd` - Daemon used to handle user service requests including service start, service disable, service relocate, and service restart; **RHEL5**
- `cpglockd` - Utilizes the extended virtual synchrony features of Corosync to implement a simplistic, distributed lock server for rgmanager
**CLVM** - Cluster Logical Volume Manager
- `clvmd` - The daemon that distributes LVM metadata updates around a cluster. Requires `cman` to be running first
**CCS** - Cluster Configuration System
- `ricci` - CCS daemon that runs on all cluster nodes and provides configuration file data to the cluster software; **RHEL6**
- `ccsd` - CCS daemon that runs on all cluster nodes and provides configuration file data to the cluster software; **RHEL5**
**CMAN** - Cluster Manager
- `cman` - Cluster initscript used to start/stop all the CMAN daemons
- `corosync` - Corosync cluster communications infrastructure daemon using TOTEM; **RHEL6**
- `aisexec` - OpenAIS cluster communications infrastructure daemon using TOTEM; **RHEL5**
- `fenced` - Fences cluster nodes that have failed (fencing generally means rebooting)
- `dlm_controld` - Daemon that configures dlm according to cluster events
- `gfs_controld` - Daemon that coordinates GFS mounts and recovery
- `groupd` - Compatibility daemon for `fenced`, `dlm_controld` and `gfs_controld`
- `qdiskd` - Talks to CMAN and provides a mechanism for determining node-fitness in a cluster environment
- `cmannotifyd` - Talks to CMAN and provides a mechanism to notify external entities about cluster changes
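On RHEL6 these components are managed through initscripts with a fixed ordering; a typical start sequence is shown below (the `clvmd` and `gfs2` steps apply only when those pieces are in use, and stopping reverses the order):
```
service cman start
service clvmd start
service gfs2 start
service rgmanager start
```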
## Operational Examples
The man pages for `clustat` and `clusvcadm` contain more in-depth explanations of all the shown options; more options exist than are shown here.
### Configuration Validation
RHEL5 does not ship the `ccs_config_validate` utility, but the configuration can still be validated against the cluster schema with `xmllint`:
```
xmllint --relaxng /usr/share/system-config-cluster/misc/cluster.ng /etc/cluster/cluster.conf
```
When run, it prints the well-formed XML followed by a final message stating whether the file validates.
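On RHEL6 the bundled tool does the same job; run with no arguments it checks the running configuration, and a named file can be checked instead (the `-f` flag shown here is recalled from its man page, so verify locally):
```
# Validate the running configuration
ccs_config_validate
# Validate an edited copy before putting it in place (path is illustrative)
ccs_config_validate -f /root/cluster.conf.new
```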
### Status Check
Use the `clustat` command to check the cluster status:
```
# clustat
Cluster Status for cluster1 @ Fri Jan 17 16:49:45 2014
Member Status: Quorate
Member Name                             ID   Status
------ ----                             ---- ------
node1                                      1 Online, Local, rgmanager
node2                                      2 Online, rgmanager

Service Name                   Owner (Last)                   State
------- ----                   ----- ------                   -----
service:pgsql-svc              node1                          started
```
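Quorum and membership can also be confirmed at the CMAN layer, independent of rgmanager:
```
# Vote counts, quorum state and cluster generation as seen by CMAN
cman_tool status
# Node membership as seen by CMAN (compare with the clustat member list)
cman_tool nodes
```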
### Service Manipulation
Use the `clusvcadm` command to manipulate the services:
```
# Restart PostgreSQL in place on the same server
clusvcadm -R pgsql-svc
# Relocate PostgreSQL to a specific node
clusvcadm -r pgsql-svc -m <node name>
# Disable PostgreSQL
clusvcadm -d pgsql-svc
# Enable PostgreSQL
clusvcadm -e pgsql-svc
# Freeze PostgreSQL on the current node
clusvcadm -Z pgsql-svc
# Unfreeze PostgreSQL after it was frozen
clusvcadm -U pgsql-svc
```
## Configuration Examples
### Standard LVM and PgSQL Initscript
This example uses a single standard LVM mount from SAN (as opposed to HA-LVM) and a normal initscript to start the service. An IP on a secondary backup network is included, as are the Dell DRAC fencing devices, which sit on the same VLAN as the nodes.
```
/etc/hosts
127.0.0.1 localhost localhost.localdomain
10.11.12.10 pgdb1.example.com pgdb1
10.11.12.11 pgdb2.example.com pgdb2
10.11.12.20 pgdb1-drac
10.11.12.21 pgdb2-drac
/etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="18" name="pgdbclus1">
<cman expected_votes="1" two_node="1"/>
<fence_daemon post_fail_delay="5" post_join_delay="15"/>
<clusternodes>
<clusternode name="pgdb1" nodeid="1">
<fence>
<method name="drac">
<device name="pgdb1-drac"/>
</method>
</fence>
</clusternode>
<clusternode name="pgdb2" nodeid="2">
<fence>
<method name="drac">
<device name="pgdb2-drac"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice agent="fence_drac5" cmd_prompt="admin1" ipaddr="10.11.12.20" login="root" module_name="pgdb1" name="pgdb1-drac" passwd="calvin"/>
<fencedevice agent="fence_drac5" cmd_prompt="admin1" ipaddr="10.11.12.21" login="root" module_name="pgdb2" name="pgdb2-drac" passwd="calvin"/>
</fencedevices>
<rm>
<failoverdomains>
<failoverdomain name="pgsql-fd" nofailback="1" restricted="1">
<failoverdomainnode name="pgdb1"/>
<failoverdomainnode name="pgdb2"/>
</failoverdomain>
</failoverdomains>
<resources>
<ip address="10.11.12.25" monitor_link="1" sleeptime="10"/>
<ip address="10.9.8.7" monitor_link="0" sleeptime="5"/>
<fs device="/dev/vgsan00/lvdata00" fsid="64301" fstype="ext4" mountpoint="/var/lib/pgsql/" name="pgsql-fs" options="noatime"/>
<script file="/etc/init.d/postgresql-9.3" name="pgsql-srv"/>
</resources>
<service domain="pgsql-fd" name="pgsql-svc" recovery="relocate">
<ip ref="10.11.12.25">
<fs ref="pgsql-fs">
<script ref="pgsql-srv"/>
</fs>
</ip>
<ip ref="10.9.8.7"/>
</service>
</rm>
</cluster>
```
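The resource tree in a configuration like this can be exercised outside of rgmanager with `rg_test`; run it on one node while the service is not active there. A sketch, using the service name from this example:
```
# Show the resource tree rgmanager would build from the file
rg_test test /etc/cluster/cluster.conf
# Start, then stop, the pgsql-svc resource tree in the foreground
rg_test test /etc/cluster/cluster.conf start service pgsql-svc
rg_test test /etc/cluster/cluster.conf stop service pgsql-svc
```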
### HA-LVM and MySQL Object
This example uses a single HA-LVM mount from SAN (activated with exclusive locks) and an RHCS-provided service object to start MySQL. An IP on a secondary backup network is included, as are the Dell DRAC fencing devices, which sit on the same VLAN as the nodes.
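HA-LVM also needs matching `/etc/lvm/lvm.conf` settings on every node; which values apply depends on the variant in use, so treat the following as a sketch rather than part of this example (`vg_root` is a hypothetical local volume group):
```
# CLVM-based HA-LVM: clustered locking, clvmd running, and the lvm
# resource agent activates vgsan00/data00 exclusively on the service owner
locking_type = 3

# Tag-based HA-LVM alternative: keep local locking and restrict boot-time
# auto-activation to local VGs plus this host's tag, leaving vgsan00 out
#locking_type = 1
#volume_list = [ "vg_root", "@mydb1" ]
```
The tag-based variant additionally requires rebuilding the initramfs afterwards so the `volume_list` restriction applies at boot.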
```
/etc/hosts
127.0.0.1 localhost localhost.localdomain
10.11.12.10 mydb1.example.com mydb1
10.11.12.11 mydb2.example.com mydb2
10.11.12.20 mydb1-drac
10.11.12.21 mydb2-drac
/etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="18" name="mydbclus1">
<cman expected_votes="1" two_node="1"/>
<fence_daemon post_fail_delay="5" post_join_delay="15"/>
<clusternodes>
<clusternode name="mydb1" nodeid="1">
<fence>
<method name="drac">
<device name="mydb1-drac"/>
</method>
</fence>
</clusternode>
<clusternode name="mydb2" nodeid="2">
<fence>
<method name="drac">
<device name="mydb2-drac"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice agent="fence_drac5" cmd_prompt="admin1" ipaddr="10.11.12.20" login="root" module_name="mydb1" name="mydb1-drac" passwd="calvin"/>
<fencedevice agent="fence_drac5" cmd_prompt="admin1" ipaddr="10.11.12.21" login="root" module_name="mydb2" name="mydb2-drac" passwd="calvin"/>
</fencedevices>
<rm>
<failoverdomains>
<failoverdomain name="mysql-fd" nofailback="1" restricted="1">
<failoverdomainnode name="mydb1"/>
<failoverdomainnode name="mydb2"/>
</failoverdomain>
</failoverdomains>
<resources>
<ip address="10.11.12.25" monitor_link="1" sleeptime="10"/>
<ip address="10.9.8.7" monitor_link="0" sleeptime="5"/>
<lvm lv_name="data00" name="mysql-lv" vg_name="vgsan00"/>
<fs device="/dev/vgsan00/lvdata00" force_fsck="0" force_unmount="0" fsid="64301" fstype="ext4" mountpoint="/var/lib/mysql/" name="mysql-fs" options="noatime" self_fence="0"/>
<mysql config_file="/etc/my.cnf" listen_address="10.11.12.25" mysqld_options="" name="mysql" shutdown_wait="600"/>
</resources>
<service domain="mysql-fd" name="mysql-svc" recovery="relocate">
<ip ref="10.11.12.25"/>
<lvm ref="mysql-lv"/>
<fs ref="mysql-fs"/>
<mysql ref="mysql"/>
<ip ref="10.9.8.7"/>
</service>
</rm>
</cluster>
```
### Standard LVM, MySQL script and NFS
This example uses two standard LVM mounts from DAS, a MySQL initscript and RHCS-provided NFS objects to run two services. An IP on a secondary backup network is included for each service, as are the Dell DRAC fencing devices, which sit on the same VLAN as the nodes. Notice that we also set failover domain priorities (which node each service prefers) so the cluster self-balances from a cold start: each node runs one service.
```
/etc/hosts
127.0.0.1 localhost localhost.localdomain
10.11.12.10 node1.example.com node1
10.11.12.11 node2.example.com node2
10.11.12.20 node1-drac
10.11.12.21 node2-drac
/etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="18" name="cluster1">
<cman expected_votes="1" two_node="1"/>
<fence_daemon post_fail_delay="5" post_join_delay="15"/>
<clusternodes>
<clusternode name="node1" nodeid="1" votes="1">
<fence>
<method name="drac">
<device modulename="" name="node1-drac"/>
</method>
</fence>
</clusternode>
<clusternode name="node2" nodeid="2" votes="1">
<fence>
<method name="drac">
<device modulename="" name="node2-drac"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice agent="fence_drac5" cmd_prompt="admin1" ipaddr="10.11.12.20" login="root" name="node1-drac" passwd="calvin"/>
<fencedevice agent="fence_drac5" cmd_prompt="admin1" ipaddr="10.11.12.21" login="root" name="node2-drac" passwd="calvin"/>
</fencedevices>
<rm log_facility="local4" log_level="7">
<failoverdomains>
<failoverdomain name="mysql-fd" ordered="1" restricted="1">
<failoverdomainnode name="node1" priority="1"/>
<failoverdomainnode name="node2" priority="2"/>
</failoverdomain>
<failoverdomain name="nfs-fd" ordered="1" restricted="1">
<failoverdomainnode name="node1" priority="2"/>
<failoverdomainnode name="node2" priority="1"/>
</failoverdomain>
</failoverdomains>
<resources>
<ip address="10.11.12.25" monitor_link="1" sleeptime="10"/>
<ip address="10.11.12.26" monitor_link="1" sleeptime="10"/>
<ip address="10.9.8.7" monitor_link="0" sleeptime="5"/>
<ip address="10.9.8.6" monitor_link="0" sleeptime="5"/>
<fs device="/dev/vgdas00/mysql00" force_fsck="0" force_unmount="0" fsid="14404" fstype="ext3" mountpoint="/das/mysql-fs" name="mysql-fs" options="noatime" self_fence="0"/>
<fs device="/dev/vgdas01/nfs00" force_fsck="0" force_unmount="1" fsid="31490" fstype="ext3" mountpoint="/das/nfs-fs" name="nfs-fs" options="noatime" self_fence="0"/>
<script file="/etc/init.d/mysqld" name="mysql-script"/>
<nfsexport name="nfs-res"/>
<nfsclient name="nfs-export" options="rw,no_root_squash" path="/das/nfs-fs" target="10.11.12.0/24"/>
</resources>
<service autostart="1" domain="mysql-fd" name="mysql-svc">
<ip ref="10.11.12.25">
<fs ref="mysql-fs">
<script ref="mysql-script"/>
</fs>
</ip>
<ip ref="10.9.8.7"/>
</service>
<service autostart="1" domain="nfs-fd" name="nfs-svc" nfslock="1">
<ip ref="10.11.12.26">
<fs ref="nfs-fs">
<nfsexport ref="nfs-res">
<nfsclient ref="nfs-export"/>
</nfsexport>
</fs>
</ip>
<ip ref="10.9.8.6"/>
</service>
</rm>
</cluster>
```
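Changes such as these priorities are applied by editing `cluster.conf` on one node, incrementing `config_version`, and pushing the file to the other members; on RHEL6 a common sequence is the one below (it relies on `ricci` running on all nodes):
```
# Validate the edited file, then distribute it and make CMAN reread it
ccs_config_validate
cman_tool version -r
```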
### HA-LVM and NFS Object
This example uses a single HA-LVM mount from SAN (activated with exclusive locks) and an RHCS-provided service object to start NFS. An IP on a secondary backup network is included, as are the Dell DRAC fencing devices, which sit on the same VLAN as the nodes.
```
/etc/hosts
127.0.0.1 localhost localhost.localdomain
10.11.12.10 nfs1.example.com nfs1
10.11.12.11 nfs2.example.com nfs2
10.11.12.20 nfs1-drac
10.11.12.21 nfs2-drac
/etc/cluster/cluster.conf
<?xml version="1.0"?>
<cluster config_version="18" name="nfsclus1">
<cman expected_votes="1" two_node="1"/>
<fence_daemon post_fail_delay="5" post_join_delay="15"/>
<clusternodes>
<clusternode name="nfs1" nodeid="1">
<fence>
<method name="drac">
<device name="nfs1-drac"/>
</method>
</fence>
</clusternode>
<clusternode name="nfs2" nodeid="2">
<fence>
<method name="drac">
<device name="nfs2-drac"/>
</method>
</fence>
</clusternode>
</clusternodes>
<fencedevices>
<fencedevice agent="fence_drac5" cmd_prompt="admin1" ipaddr="10.11.12.20" login="root" module_name="nfs1" name="nfs1-drac" passwd="calvin"/>
<fencedevice agent="fence_drac5" cmd_prompt="admin1" ipaddr="10.11.12.21" login="root" module_name="nfs2" name="nfs2-drac" passwd="calvin"/>
</fencedevices>
<rm>
<failoverdomains>
<failoverdomain name="nfs-fd" nofailback="1" restricted="1">
<failoverdomainnode name="nfs1"/>
<failoverdomainnode name="nfs2"/>
</failoverdomain>
</failoverdomains>
<resources>
<ip address="10.11.12.25" monitor_link="1" sleeptime="10"/>
<ip address="10.9.8.7" monitor_link="0" sleeptime="5"/>
<lvm lv_name="data00" name="nfs-lv" vg_name="vgsan00"/>
<fs device="/dev/vgsan00/lvdata00" force_fsck="0" force_unmount="0" fsid="64301" fstype="ext4" mountpoint="/san/nfs-fs" name="nfs-fs" options="noatime" self_fence="0"/>
<nfsserver name="nfs-srv" nfspath=".clumanager/nfs"/>
<nfsclient allow_recover="on" name="nfsclient1" options="rw,no_root_squash,no_subtree_check" target="10.11.12.0/24"/>
</resources>
<service domain="nfs-fd" name="nfs-svc" recovery="relocate">
<lvm ref="nfs-lv"/>
<fs ref="nfs-fs">
<nfsserver ref="nfs-srv">
<ip ref="10.11.12.25"/>
<ip ref="10.9.8.7"/>
<nfsclient ref="nfsclient1"/>
</nfsserver>
</fs>
</service>
</rm>
</cluster>
```
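Once the service is running, the export can be checked from a client on the allowed subnet; a sketch assuming the floating IP and mountpoint from this example (the exported path is taken to follow the `fs` mountpoint, since the `nfsclient` resource sets no explicit path):
```
# List the exports offered via the service IP, then mount the share
showmount -e 10.11.12.25
mount -t nfs 10.11.12.25:/san/nfs-fs /mnt
```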
## References
- <https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/index.html>
- <http://www.sourceware.org/cluster/conga/>
- <https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/5/html/Logical_Volume_Manager_Administration/LVM_Cluster_Overview.html>
- <http://en.wikipedia.org/wiki/Fencing_%28computing%29>
- <http://en.wikipedia.org/wiki/Distributed_lock_manager>
- <http://en.wikipedia.org/wiki/STONITH>
- <http://en.wikipedia.org/wiki/Network_block_device>