Oracle 12c introduced Flex Clusters, which use a hub-and-spoke architecture. This allows a cluster to scale well beyond pre-12c clusters because it requires:

  • Fewer network interactions between nodes in the cluster, and
  • Less contention for key Clusterware resources such as the OCR and voting disks.

A Flex Cluster has two types of nodes: Hub Nodes and Leaf Nodes.

Hub Nodes

  • These nodes are essentially the same as conventional nodes in pre-12c clusters and form the core of the cluster.
  • Each Hub Node is connected to the other Hub Nodes via the private network for peer-to-peer communication.
  • Each Hub Node can access the shared storage and hence the OCR and voting disks residing on it.
  • A Hub Node may host an ASM instance, database instance(s) and applications.
  • Each cluster must have at least one Hub Node and can have up to 64 Hub Nodes.

Leaf Nodes

  • Leaf Nodes are more loosely coupled to the cluster than Hub Nodes and are not connected among themselves.
  • Each Leaf Node is connected to the cluster through a Hub Node, through which it requests data.
  • Although Leaf Nodes do not require direct access to shared storage, they may be given access so that they can be converted to Hub Nodes in the future.
  • They run a lightweight version of the Clusterware.
  • They cannot host database or ASM instances.
  • Leaf Nodes can host different types of applications, e.g. Fusion Middleware, EBS, IDM, etc. The applications on a Leaf Node can fail over to a different node if the Leaf Node fails.
  • There may be zero or more Leaf Nodes in a Flex Cluster.
  • All Leaf Nodes are on the same public and private network as the Hub Nodes.

Hub Nodes can run in an Oracle Flex Cluster configuration without any Leaf Nodes as cluster members, but for Leaf Nodes to be part of a cluster, the cluster must have at least one Hub Node. When Clusterware is started on a Leaf Node, the Leaf Node automatically uses GNS to discover the Hub Nodes and connects to the cluster through one of them. One Hub Node may be associated with zero or more Leaf Nodes. The Hub Node periodically exchanges heartbeat messages with its associated Leaf Nodes, so that the Leaf Nodes can participate in the cluster.

A Standard Cluster can be changed to a Flex Cluster, but a Flex Cluster cannot be changed back to a Standard Cluster without reconfiguration.

What happens when a Hub Node ceases to be a part of the cluster?

A Hub Node can be removed from the cluster as a result of:

  • Getting evicted
  • Server shutdown
  • Manually stopping the Oracle Clusterware

In such a scenario, the Leaf Nodes associated with that Hub Node fail over to one of the surviving Hub Nodes in the cluster.

In this article, I will demonstrate:

  • Identification of the Hub Node a Leaf Node is connected to
  • Failover of a Leaf Node following removal of the associated Hub Node from the cluster

Current Scenario:

For the purpose of this demonstration, I have set up a 12.1.0.2 Flex Cluster with the following nodes:

  • Hub Nodes
    • host01
    • host02
    • host03
  • Leaf Nodes
    • host04
    • host05

Demonstration

Let us verify that Hub Node host01 and Leaf Node host04 are currently active:

[root@host01 log]# crsctl get node role status -all
Node 'host01' active role is 'hub'
Node 'host04' active role is 'leaf'

Since host01 is currently the only active Hub Node in the cluster, Leaf Node host04 is associated with host01. This can be verified by looking at the trace file of the ocssdrim process on host04.

[root@host04 trace]# export ORACLE_BASE=/u01/app/grid
[root@host04 ~]# cat $ORACLE_BASE/diag/crs/host04/crs/trace/ocssdrim.trc | grep 'Sending a ping msg to' | tail -1
2016-05-04 10:50:15.315138 :    CSSD:1085761856: clssbnmc_PeriodicPing_CB: Sending a ping msg to host host01, number 1, using handle (0x137e5d0) last msg to hub at 4294684730, connection timeout at 4294714730, current time 4294686330
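
The lookup above can be wrapped in a small helper. This is only a sketch: the function name current_hub is hypothetical, and it assumes the ping message keeps the format shown in the excerpt above ("Sending a ping msg to host <name>, number ...").

```shell
# current_hub TRACE_FILE
# Hypothetical helper: print the Hub Node named in the most recent
# "Sending a ping msg to host" line of an ocssdrim trace file.
# Assumes the message format shown in the trace excerpt above.
current_hub() {
    local trace_file="$1"
    grep 'Sending a ping msg to host' "$trace_file" \
        | tail -1 \
        | sed -n 's/.*Sending a ping msg to host \([^,]*\),.*/\1/p'
}
```

On host04 it could be called as current_hub "$ORACLE_BASE/diag/crs/host04/crs/trace/ocssdrim.trc"; for the trace line shown above it would print host01.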

Let us start another Hub Node host02 and Leaf Node host05.

[root@host01 log]# crsctl get node role status -all
Node 'host01' active role is 'hub'
Node 'host02' active role is 'hub'
Node 'host04' active role is 'leaf'
Node 'host05' active role is 'leaf'
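
When a cluster has many nodes, it can be handy to filter this output by role. The following is a minimal sketch; the function name nodes_with_role is hypothetical, and it assumes the output format shown above (Node '<name>' active role is '<role>').

```shell
# nodes_with_role ROLE
# Hypothetical helper: read `crsctl get node role status -all` output on
# stdin and print the names of the nodes whose active role equals ROLE
# (hub or leaf). Fields are split on single quotes, so for a line such as
#   Node 'host01' active role is 'hub'
# $2 is the node name and $4 is the role.
nodes_with_role() {
    local role="$1"
    awk -v role="$role" -F"'" \
        '$3 ~ /active role is/ && $4 == role { print $2 }'
}
```

For example, crsctl get node role status -all | nodes_with_role leaf would list only the Leaf Nodes, one per line.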

To find the Hub Node associated with Leaf Node host05, we will look at the trace file of the ocssdrim process on host05:

[root@host05 trace]# export ORACLE_BASE=/u01/app/grid
[root@host05 ~]# cat $ORACLE_BASE/diag/crs/host05/crs/trace/ocssdrim.trc | grep 'Sending a ping msg to' | tail -1
2016-05-04 11:12:01.008283 :    CSSD:1086187840: clssbnmc_PeriodicPing_CB: Sending a ping msg to host host01, number 1, using handle (0x14055d0) last msg to hub at 4294948750, connection timeout at 11454, current time 4294951260

We can see that Leaf Node host05 is also connected to Hub Node host01.

Let’s stop Oracle Clusterware on host01 to verify that both of the Leaf Nodes fail over to the only other surviving Hub Node in the cluster, i.e. host02.

[root@host01 log]# crsctl stop crs
[root@host02 ~]# crsctl get node role status -all
Node 'host02' active role is 'hub'
Node 'host04' active role is 'leaf'
Node 'host05' active role is 'leaf'

Verify that host04 has failed over to host02:

[root@host04 ~]# cat  $ORACLE_BASE/diag/crs/host04/crs/trace/ocssdrim.trc |grep 'Destroying connection' | tail -1
2016-05-04 11:17:31.932770 :    CSSD:1085761856: clssbnmConnDestroy: Destroying connection object (0x1061200) for host host01
[root@host04 ~]# cat $ORACLE_BASE/diag/crs/host04/crs/trace/ocssdrim.trc | grep 'Sending a ping msg to' | tail -1
2016-05-04 11:18:21.860771 :    CSSD:1085761856: clssbnmc_PeriodicPing_CB: Sending a ping msg to host host02, number 2, using handle (0x17e2fe0) last msg to hub at 1404044, connection timeout at 1434044, current time 1405324

Verify that host05 has also failed over to host02:

[root@host05 ~]# cat  $ORACLE_BASE/diag/crs/host05/crs/trace/ocssdrim.trc |grep 'Destroying connection' | tail -1
2016-05-04 11:17:31.873993 :    CSSD:1086187840: clssbnmConnDestroy: Destroying connection object (0x16979f0) for host host01
[root@host05 ~]# cat $ORACLE_BASE/diag/crs/host05/crs/trace/ocssdrim.trc | grep 'Sending a ping msg to' | tail -1
2016-05-04 11:17:36.751628 :    CSSD:1086187840: clssbnmc_PeriodicPing_CB: Sending a ping msg to host host02, number 2, using handle (0x10950b0) last msg to hub at 318184, connection timeout at 348184, current time 319664
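
Instead of checking the "Destroying connection" and "Sending a ping msg to" lines separately, the whole sequence of Hub Nodes a Leaf Node has pinged can be listed in one pass. The sketch below is an illustration only: hub_history is a hypothetical name, and it relies on the ping message format seen in the trace excerpts above.

```shell
# hub_history TRACE_FILE
# Hypothetical helper: print every Hub Node the Leaf Node has pinged, in
# trace order, collapsing consecutive repeats with uniq. A failover then
# shows up as a change from one host name to the next.
hub_history() {
    local trace_file="$1"
    sed -n 's/.*Sending a ping msg to host \([^,]*\),.*/\1/p' "$trace_file" \
        | uniq
}
```

Run against the ocssdrim.trc of host04 or host05 after the steps above, a trace containing only the pings shown in this article would yield host01 followed by host02.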

Summary:

  • Oracle 12c has introduced Flex Clusters which have two types of nodes: Hub Nodes and Leaf Nodes.
  • Whereas each Hub Node can access the shared storage, Leaf Nodes do not require direct access to shared storage and are connected to the cluster through Hub Nodes.
  • When Clusterware is started on a Leaf Node, the Leaf Node automatically uses GNS to discover the Hub Nodes and gets connected to the cluster through one of the Hub Nodes.
  • If a Hub Node ceases to be part of the cluster, the Leaf Nodes associated with it fail over to one of the surviving Hub Nodes in the cluster.
