In this article, we will see how to configure a two-node Red Hat cluster using pacemaker & corosync on RHEL 7.2. Once you have installed the necessary packages, you need to enable the cluster services at system start-up and start them before kicking off the cluster configuration. The "hacluster" user is created automatically during the package installation with a disabled password. pcs uses this user to authenticate the nodes, synchronize the cluster configuration, and start and stop the cluster on the cluster nodes.
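If the cluster packages are not yet in place, they can usually be pulled in from the RHEL 7 High Availability channel with yum. This is only a sketch; the package names below are the usual ones for the HA add-on, so adjust them to your repositories:

# Install the pcs tooling, cluster stack and fence agents on both nodes
# (pcs pulls in pacemaker and corosync as dependencies on most setups)
yum install -y pcs pacemaker corosync fence-agents-all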
Environment:
- Operating System: Red Hat Enterprise Linux 7.2
- Type of Cluster: Two-node failover cluster
- Nodes: UA-HA & UA-HA2 (assuming the packages have been installed on both nodes)
- Cluster Resource: KVM guest (VirtualDomain) – see the next article.
Hardware configuration:
- CPU – 2
- Memory – 4GB
- NFS – For shared storage
Enable & Start the Services on both the Nodes:
1. Log in to both cluster nodes as the root user.
2. Enable the pcsd daemon on both nodes so that it starts automatically at boot. pcsd is the pacemaker/corosync configuration daemon (not a cluster service).
[root@UA-HA ~]# systemctl start pcsd.service
[root@UA-HA ~]# systemctl enable pcsd.service
Created symlink from /etc/systemd/system/multi-user.target.wants/pcsd.service to /usr/lib/systemd/system/pcsd.service.
[root@UA-HA ~]# systemctl status pcsd.service
● pcsd.service - PCS GUI and remote configuration interface
   Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
   Active: active (running) since Sun 2015-12-27 23:22:08 EST; 14s ago
 Main PID: 18411 (pcsd)
   CGroup: /system.slice/pcsd.service
           ├─18411 /bin/sh /usr/lib/pcsd/pcsd start
           ├─18415 /bin/bash -c ulimit -S -c 0 >/dev/null 2>&1 ; /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb
           └─18416 /usr/bin/ruby -I/usr/lib/pcsd /usr/lib/pcsd/ssl.rb

Dec 27 23:22:07 UA-HA systemd[1]: Starting PCS GUI and remote configuration interface...
Dec 27 23:22:08 UA-HA systemd[1]: Started PCS GUI and remote configuration interface.
[root@UA-HA ~]#
3. Set a new password for the cluster user "hacluster" on both nodes.
[root@UA-HA ~]# passwd hacluster
Changing password for user hacluster.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
[root@UA-HA ~]#

[root@UA-HA2 ~]# passwd hacluster
Changing password for user hacluster.
New password:
Retype new password:
passwd: all authentication tokens updated successfully.
[root@UA-HA2 ~]#
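If you are scripting the node preparation, the same password can be set non-interactively with the --stdin option of passwd. The password below is just a placeholder:

# Non-interactive password assignment (run on each node)
echo "MyClusterPass123" | passwd --stdin hacluster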
Configure corosync & create a new cluster:
1. Log in to either cluster node and authenticate the "hacluster" user.
[root@UA-HA ~]# pcs cluster auth UA-HA UA-HA2
Username: hacluster
Password:
UA-HA: Authorized
UA-HA2: Authorized
[root@UA-HA ~]#
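The same authentication can also be done non-interactively by passing the credentials on the command line, which is handy for automation (the password is a placeholder):

[root@UA-HA ~]# pcs cluster auth UA-HA UA-HA2 -u hacluster -p MyClusterPass123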
2. Create a new cluster using the pcs command.
[root@UA-HA ~]# pcs cluster setup --name UABLR UA-HA UA-HA2
Shutting down pacemaker/corosync services...
Redirecting to /bin/systemctl stop pacemaker.service
Redirecting to /bin/systemctl stop corosync.service
Killing any remaining services...
Removing all cluster configuration files...
UA-HA: Succeeded
UA-HA2: Succeeded
Synchronizing pcsd certificates on nodes UA-HA, UA-HA2...
UA-HA: Success
UA-HA2: Success
Restaring pcsd on the nodes in order to reload the certificates...
UA-HA: Success
UA-HA2: Success
[root@UA-HA ~]#
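The setup command also generates /etc/corosync/corosync.conf on both nodes. On a pcs version of this vintage it typically looks roughly like the sketch below; treat it as an illustration, since the exact contents depend on your environment:

totem {
    version: 2
    secauth: off
    cluster_name: UABLR
    transport: udpu
}

nodelist {
    node {
        ring0_addr: UA-HA
        nodeid: 1
    }
    node {
        ring0_addr: UA-HA2
        nodeid: 2
    }
}

quorum {
    provider: corosync_votequorum
    two_node: 1
}

logging {
    to_syslog: yes
}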
3. Check the cluster status.
[root@UA-HA ~]# pcs status
Error: cluster is not currently running on this node
[root@UA-HA ~]#
You see this error because the cluster services have not been started yet.
4. Start the cluster using the pcs command. The "--all" option starts the cluster on all configured nodes.
[root@UA-HA ~]# pcs cluster start --all
UA-HA2: Starting Cluster...
UA-HA: Starting Cluster...
[root@UA-HA ~]#
Behind the scenes, the "pcs cluster start" command triggers the following commands on each cluster node.
# systemctl start corosync.service
# systemctl start pacemaker.service
5. Check the cluster services status.
[root@UA-HA ~]# systemctl status corosync
● corosync.service - Corosync Cluster Engine
   Loaded: loaded (/usr/lib/systemd/system/corosync.service; disabled; vendor preset: disabled)
   Active: active (running) since Sun 2015-12-27 23:34:31 EST; 11s ago
  Process: 18994 ExecStart=/usr/share/corosync/corosync start (code=exited, status=0/SUCCESS)
 Main PID: 19001 (corosync)
   CGroup: /system.slice/corosync.service
           └─19001 corosync

Dec 27 23:34:31 UA-HA corosync[19001]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Dec 27 23:34:31 UA-HA corosync[19001]: [VOTEQ ] Waiting for all cluster members. Current votes: 1 expected_votes: 2
Dec 27 23:34:31 UA-HA corosync[19001]: [QUORUM] Members[1]: 1
Dec 27 23:34:31 UA-HA corosync[19001]: [MAIN  ] Completed service synchronization, ready to provide service.
Dec 27 23:34:31 UA-HA corosync[19001]: [TOTEM ] A new membership (192.168.203.131:1464) was formed. Members joined: 2
Dec 27 23:34:31 UA-HA corosync[19001]: [QUORUM] This node is within the primary component and will provide service.
Dec 27 23:34:31 UA-HA corosync[19001]: [QUORUM] Members[2]: 2 1
Dec 27 23:34:31 UA-HA corosync[19001]: [MAIN  ] Completed service synchronization, ready to provide service.
Dec 27 23:34:31 UA-HA systemd[1]: Started Corosync Cluster Engine.
Dec 27 23:34:31 UA-HA corosync[18994]: Starting Corosync Cluster Engine (corosync): [ OK ]
[root@UA-HA ~]# systemctl status pacemaker
● pacemaker.service - Pacemaker High Availability Cluster Manager
   Loaded: loaded (/usr/lib/systemd/system/pacemaker.service; disabled; vendor preset: disabled)
   Active: active (running) since Sun 2015-12-27 23:34:32 EST; 15s ago
 Main PID: 19016 (pacemakerd)
   CGroup: /system.slice/pacemaker.service
           ├─19016 /usr/sbin/pacemakerd -f
           ├─19017 /usr/libexec/pacemaker/cib
           ├─19018 /usr/libexec/pacemaker/stonithd
           ├─19019 /usr/libexec/pacemaker/lrmd
           ├─19020 /usr/libexec/pacemaker/attrd
           ├─19021 /usr/libexec/pacemaker/pengine
           └─19022 /usr/libexec/pacemaker/crmd

Dec 27 23:34:33 UA-HA crmd[19022]: notice: pcmk_quorum_notification: Node UA-HA2[2] - state is now member (was (null))
Dec 27 23:34:33 UA-HA crmd[19022]: notice: pcmk_quorum_notification: Node UA-HA[1] - state is now member (was (null))
Dec 27 23:34:33 UA-HA stonith-ng[19018]: notice: Watching for stonith topology changes
Dec 27 23:34:33 UA-HA crmd[19022]: notice: Notifications disabled
Dec 27 23:34:33 UA-HA crmd[19022]: notice: The local CRM is operational
Dec 27 23:34:33 UA-HA crmd[19022]: notice: State transition S_STARTING -> S_PENDING [ input=I_PENDING cause=C_FSA_INTERNAL origin=do_started ]
Dec 27 23:34:33 UA-HA attrd[19020]: warning: Node names with capitals are discouraged, consider changing 'UA-HA2' to something else
Dec 27 23:34:33 UA-HA attrd[19020]: notice: crm_update_peer_proc: Node UA-HA2[2] - state is now member (was (null))
Dec 27 23:34:33 UA-HA stonith-ng[19018]: warning: Node names with capitals are discouraged, consider changing 'UA-HA2' to something else
Dec 27 23:34:34 UA-HA stonith-ng[19018]: notice: crm_update_peer_proc: Node UA-HA2[2] - state is now member (was (null))
[root@UA-HA ~]#
Verify Corosync configuration:
1. Check the corosync communication status.
[root@UA-HA ~]# corosync-cfgtool -s
Printing ring status.
Local node ID 1
RING ID 0
        id      = 192.168.203.134
        status  = ring 0 active with no faults
[root@UA-HA ~]#
In my setup, the first ring uses the interface "br0".
[root@UA-HA ~]# ifconfig br0
br0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.203.134  netmask 255.255.255.0  broadcast 192.168.203.255
        inet6 fe80::84ef:2eff:fee9:260a  prefixlen 64  scopeid 0x20
        ether 00:0c:29:2d:3f:ce  txqueuelen 0  (Ethernet)
        RX packets 15797  bytes 1877460 (1.7 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 7018  bytes 847881 (828.0 KiB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
[root@UA-HA ~]#
We can have multiple rings to provide redundancy for the cluster communication (the equivalent of LLT links in VCS).
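If you want a second ring, pcs lets you supply an alternate address per node at cluster setup time, and it should then configure the redundant ring protocol for you. A hedged sketch, assuming the "-hb" hostnames are resolvable names on a second heartbeat interface (they are hypothetical here):

# Create the cluster with two ring addresses per node
[root@UA-HA ~]# pcs cluster setup --name UABLR UA-HA,UA-HA-hb UA-HA2,UA-HA2-hb --transport udpu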
2. Check the membership and quorum API’s.
[root@UA-HA ~]# corosync-cmapctl | grep members
runtime.totem.pg.mrp.srp.members.1.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.1.ip (str) = r(0) ip(192.168.203.134)
runtime.totem.pg.mrp.srp.members.1.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.1.status (str) = joined
runtime.totem.pg.mrp.srp.members.2.config_version (u64) = 0
runtime.totem.pg.mrp.srp.members.2.ip (str) = r(0) ip(192.168.203.131)
runtime.totem.pg.mrp.srp.members.2.join_count (u32) = 1
runtime.totem.pg.mrp.srp.members.2.status (str) = joined
[root@UA-HA ~]#

[root@UA-HA ~]# pcs status corosync
Membership information
----------------------
    Nodeid      Votes Name
         2          1 UA-HA2
         1          1 UA-HA (local)
[root@UA-HA ~]#
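The quorum state can also be checked directly with corosync-quorumtool, which shows the vote counts and quorum decision in a single view:

[root@UA-HA ~]# corosync-quorumtool -s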
Verify Pacemaker Configuration:
1. Check the running pacemaker processes.
[root@UA-HA ~]# ps axf |grep pacemaker
19324 pts/0    S+     0:00  |       \_ grep --color=auto pacemaker
19016 ?        Ss     0:00 /usr/sbin/pacemakerd -f
19017 ?        Ss     0:00  \_ /usr/libexec/pacemaker/cib
19018 ?        Ss     0:00  \_ /usr/libexec/pacemaker/stonithd
19019 ?        Ss     0:00  \_ /usr/libexec/pacemaker/lrmd
19020 ?        Ss     0:00  \_ /usr/libexec/pacemaker/attrd
19021 ?        Ss     0:00  \_ /usr/libexec/pacemaker/pengine
19022 ?        Ss     0:00  \_ /usr/libexec/pacemaker/crmd
2. Check the cluster status.
[root@UA-HA ~]# pcs status
Cluster name: UABLR
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Sun Dec 27 23:44:44 2015
Last change: Sun Dec 27 23:34:55 2015 by hacluster via crmd on UA-HA
Stack: corosync
Current DC: UA-HA (version 1.1.13-10.el7-44eb2dd) - partition with quorum
2 nodes and 0 resources configured

Online: [ UA-HA UA-HA2 ]

Full list of resources:

PCSD Status:
  UA-HA: Online
  UA-HA2: Online

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
[root@UA-HA ~]#
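If you prefer a live view instead of one-off snapshots, crm_mon refreshes the same status output in place (exit with Ctrl+C); with "-1" it prints a single snapshot and returns:

[root@UA-HA ~]# crm_mon -1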
3. You can see that corosync & pacemaker are active now but disabled at system boot. If you would like the cluster to start automatically after a reboot, enable the services using the systemctl command.
[root@UA-HA2 ~]# systemctl enable corosync
Created symlink from /etc/systemd/system/multi-user.target.wants/corosync.service to /usr/lib/systemd/system/corosync.service.
[root@UA-HA2 ~]# systemctl enable pacemaker
Created symlink from /etc/systemd/system/multi-user.target.wants/pacemaker.service to /usr/lib/systemd/system/pacemaker.service.
[root@UA-HA2 ~]# pcs status
Cluster name: UABLR
WARNING: no stonith devices and stonith-enabled is not false
Last updated: Sun Dec 27 23:51:30 2015
Last change: Sun Dec 27 23:34:55 2015 by hacluster via crmd on UA-HA
Stack: corosync
Current DC: UA-HA (version 1.1.13-10.el7-44eb2dd) - partition with quorum
2 nodes and 0 resources configured

Online: [ UA-HA UA-HA2 ]

Full list of resources:

PCSD Status:
  UA-HA: Online
  UA-HA2: Online

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled
[root@UA-HA2 ~]#
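Alternatively, pcs can enable both corosync and pacemaker on every configured node in one shot:

[root@UA-HA ~]# pcs cluster enable --all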
4. When the cluster starts, it automatically records the number and details of the nodes in the cluster, as well as which stack is being used and the version of Pacemaker being used. To view the cluster configuration (Cluster Information Base – CIB) in XML format, use the following command.
[root@UA-HA2 ~]# pcs cluster cib
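The same XML can also be dumped to a file for review or backup, and a modified copy can be pushed back into the cluster later. The filename here is only an example:

# Save the CIB to a file, then push it back after editing
[root@UA-HA2 ~]# pcs cluster cib /tmp/cluster_cib.xml
[root@UA-HA2 ~]# pcs cluster cib-push /tmp/cluster_cib.xml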
5. Verify the cluster information base using the following command.
[root@UA-HA ~]# crm_verify -L -V
   error: unpack_resources: Resource start-up disabled since no STONITH resources have been defined
   error: unpack_resources: Either configure some or disable STONITH with the stonith-enabled option
   error: unpack_resources: NOTE: Clusters with shared data need STONITH to ensure data integrity
Errors found during check: config not valid
[root@UA-HA ~]#
By default, pacemaker enables STONITH (Shoot The Other Node In The Head) / fencing in order to protect the data. Fencing is mandatory when you use shared storage, to avoid data corruption.
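For reference, a STONITH device is added later with "pcs stonith create". A hypothetical IPMI-based example is shown below; the device name, IP address and credentials are all placeholders, and the agent and its parameters depend on your hardware:

# Hypothetical fence device for UA-HA2 using an IPMI management interface
[root@UA-HA ~]# pcs stonith create fence_UA_HA2 fence_ipmilan pcmk_host_list="UA-HA2" ipaddr="192.168.203.200" login="admin" passwd="password" op monitor interval=60s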
For the time being, we will disable STONITH and configure it later.
6. Disable STONITH (fencing).
[root@UA-HA ~]# pcs property set stonith-enabled=false
[root@UA-HA ~]#
[root@UA-HA ~]# pcs property show stonith-enabled
Cluster Properties:
 stonith-enabled: false
[root@UA-HA ~]#
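If you want to see every cluster property, including the ones still at their defaults, pcs can list them all:

[root@UA-HA ~]# pcs property list --all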
7. Verify the cluster configuration again. The errors should now have disappeared.
[root@UA-HA ~]# crm_verify -L -V
[root@UA-HA ~]#
We have successfully configured a two-node Red Hat cluster on RHEL 7.2 with the new components pacemaker and corosync. Hope this article is informative to you.