
Master node and Node eviction

Master Node :

The master node has the lowest node id in the cluster. Normally the node that joins the cluster first becomes the master node.
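
For example, the node name to node number mapping can be listed with olsnodes (the hostnames and numbers below are only illustrative):

[grid@host01 ~]$ olsnodes -n                                   -- lists node name and node number
host01  1
host02  2
host03  3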


Sometimes on node eviction the cluster gets divided into two sub-clusters; the sub-cluster containing the smaller number of nodes is evicted.
If both sub-clusters have the same number of nodes, the sub-cluster containing the master node survives and the other sub-cluster is evicted.
If the master node is evicted or rebooted, another node becomes the master.


The master node can be identified by:

- scanning the ocssd logs from the various nodes
- scanning the crsd logs from the various nodes
- identifying the node that takes the backup of the OCR
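
A rough sketch of these checks (paths follow the examples later in this page; the grep patterns are assumptions, since the exact log message text varies by version):

[grid@host01 ~]$ grep -i "master node" /u01/app/11.2.0/grid/log/host01/cssd/ocssd.log | tail -1
[grid@host01 ~]$ grep -i "OCR MASTER" /u01/app/11.2.0/grid/log/host01/crsd/crsd.log | tail -1
[root@host01 ~]# ocrconfig -showbackup                        -- the node shown against the automatic backups is the master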




Node eviction :

Cluster integrity and cluster membership are governed by ocssd using the private interconnect and the voting disk.


Node eviction is reported as ORA-29740 in the database alert log.

In a 2-node cluster, the node with the lowest node number is the node that survives the eviction. In an n-node cluster, the biggest sub-cluster should survive (based on votes).

11gR2 change -> Important: in 11gR2, fencing (eviction) does not necessarily reboot the node; the clusterware stack is restarted instead (rebootless fencing).



Causes of node eviction 

1. Missing network heartbeat : problem with the private interconnect (check with the RACcheck tool; use the oswnetstat and oswprvtnet OS Watcher data, netstat and ifconfig) -- see the sketch after this list
                               crsctl get css misscount   [ default timeout is 30 seconds for Linux/Unix ]

2. Missing disk heartbeat    : voting disk communication problem; the node is not able to access the minimum number of voting files (CRS-1606 in the alert log)
                               crsctl get css disktimeout [ default value is 200 seconds (disk I/O) ]

3. Hardware                  : CPU starvation issues,
                               RAM/swap shortage (check free swap and RAM in /var/log/messages, /var/log/syslog or /var/adm/messages)

4. Hanging cluster processes : e.g. a hung ocssd.bin process

5. All NICs on all nodes should have the same name and be in the same subnet; verify that the switch is used only for the interconnect

6. Growth in the size of the control file has caused instability
   Solution : set the control_file_record_keep_time parameter to 1 and increase the online redo log file size

7. Nodes not on the same configuration or at different patch levels
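
A minimal set of checks for the heartbeat timeouts and the interconnect configuration (the -priv hostname is an assumed private-interconnect alias):

[root@host01 ~]# crsctl get css misscount                     -- network heartbeat timeout (default 30 seconds)
[root@host01 ~]# crsctl get css disktimeout                   -- disk heartbeat timeout (default 200 seconds)
[root@host01 ~]# oifcfg getif                                 -- interface name, subnet and role must match on all nodes
[root@host01 ~]# ping -c 3 host02-priv                        -- basic connectivity test over the private network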



Solution :

Database Resource Manager : provides more control over hardware resources and their allocation; on Exadata, the DBA can use IORM to set up I/O resource allocation.

Check for "Polling" and "Diskpingout" messages in the ocssd.log and crsd.log files.


Rebootless Node Fencing or Eviction

- Clusterware will restart only the offending processes on the node
- The clusterware stack is stopped and restarted: the OHASD daemon remains up and restarts the CRS daemons
- The node may still be rebooted if the required Oracle or I/O related processes cannot be stopped
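
To confirm whether a node was really rebooted or only had its stack restarted, a quick sketch of OS-level and clusterware checks on the fenced node:

[root@host03 ~]# uptime                                       -- uptime keeps counting across a rebootless eviction
[root@host03 ~]# last reboot | head -3                        -- no new reboot entry if only the stack was restarted
[root@host03 ~]# crsctl check crs                             -- OHAS stays online while CSS/CRS are being restarted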


Scenarios :

NODE EVICTION DUE TO MISSING DISK HEARTBEAT
[root@host03 ~]# service iscsi stop
HOST03:
[root@host03 ~]# tailf /u01/app/11.2.0/grid/log/host03/alerthost03.log
...An I/O error occured for voting file: ORCL:ASMDISK02
...Stale mount point ‘/u01/app/oracle/acfsmount/11.2.0/sharedhome’ was not recovered.
...No I/O has completed after 90% of the maximum interval. Voting file ORCL:ASMDISK03 will be considered not functional
...voting file is offline: ORCL:ASMDISK01
...voting file is offline: ORCL:ASMDISK02
...voting file is offline: ORCL:ASMDISK03
...The number of voting files available, 0, is less than the minimum number of voting files required, 2, resulting in CSSD termination to ensure data integrity;

[root@host03 ~]# tailf /u01/app/11.2.0/grid/log/host03/cssd/ocssd.log
...clssgmFenceClient: fencing client
...clssgmFenceCompletion
...clssnmvDiskAvailabilityChange: voting file ORCL:ASMDISK01 now offline
...clssnmvDiskAvailabilityChange: voting file ORCL:ASMDISK02 now offline
...clssnmvDiskAvailabilityChange: voting file ORCL:ASMDISK03 now offline


HOST01 :
[root@host01 host01]# tailf /u01/app/11.2.0/grid/log/host01/alerthost01.log
...CRS-8011:reboot advisory message from host: host03
...Network communication with node host03 (3) missing for 90% of timeout interval.  Removal of this node from cluster in 2.040 seconds.
...Node host03 is being removed from the cluster in cluster incarnation
...CSSD Reconfiguration complete. Active nodes are host01 host02

[root@host01 cssd]# tailf /u01/app/11.2.0/grid/log/host01/cssd/ocssd.log
...host03, node(3) connection failed,

HOST02 :
[root@host02 ~]# tailf /u01/app/11.2.0/grid/log/host02/alerthost02.log
...Resource 'ora.orcl.db' has failed on server 'host03'.
...Resource 'ora.acfs.dbhome_1.acfs' has failed on server 'host03'.
...CRS-8011:reboot advisory message from host: host03
...Network communication with node host03 (3) missing for 90% of timeout interval.  Removal of this node from cluster in 2.040 seconds.
...CSSD Reconfiguration complete. Active nodes are host01 host02
...CRS-5504:Node down event reported for node 'host03'.
...Server 'host03' has been removed from pool 'Generic'.
...Server 'host03' has been removed from pool 'ora.orcl'.

[root@host02 ~]# tailf /u01/app/11.2.0/grid/log/host02/cssd/ocssd.log
...host03, node(3) connection failed
.. node host03 (3) at 50% heartbeat fatal

The following happens on host03 due to rebootless fencing (check its ocssd log and clusterware alert log):
...Starting clean up of CRSD resources.
...Clean up of CRSD resources finished successfully.
crsctl check crs shows that Oracle High Availability Services is still online on host03
crsctl stat res -t -init shows that the cssd, crsd and HAIP resources are down on host03
[root@host02 ~]# olsnodes -s
host01  Active
host02  Active
host03  Inactive

[root@host03 ~]# service iscsi start
…CRS-1714:Unable to discover any voting files, retrying discovery in 15 seconds
... CSSD voting file is online:
…CRS-1601:CSSD Reconfiguration complete. Active nodes are host01 host02 host03.
crsctl stat res -t -init now shows that the cssd, crsd and HAIP resources are ONLINE again on host03
[root@host02 ~]# olsnodes -s
host01  Active
host02  Active
host03  Active



NODE EVICTION DUE TO MISSING NETWORK HEARTBEAT

[root@host02 ~]# oifcfg getif                                  -- identify the private interconnect interface (eth1 here)
[root@host02 ~]# ifdown eth1                                   -- bring down the interconnect on host02
Check the ocssd log and the clusterware alert log on host02:
...Starting clean up of CRSD resources.
...Clean up of CRSD resources finished successfully.

crsctl check crs shows that Oracle High Availability Services is still online on host02
crsctl stat res -t -init shows that the cssd, crsd and HAIP resources are down on host02


[root@host02 ~]# ifup eth1                                     --Restart private interconnect on host02

…CSSD daemon is started in clustered mode
…CSSD Reconfiguration complete. Active nodes are host01 host02 .
…EVMD started on node host02.
…CRSD started on node host02.
crsctl stat res -t -init now shows that the cssd, crsd and HAIP resources are ONLINE again on host02

[root@host02 ~]# olsnodes -s
host01  Active
host02  Active

NODE EVICTION DUE TO CSSDAGENT STOPPING

[root@host02 lastgasp]# ps -ef |grep cssd |grep -v grep

root      5085     1  0 09:45 ?        00:00:00 /u01/app/11.2.0/grid/bin/cssdmonitor
root      5106     1  0 09:45 ?        00:00:00 /u01/app/11.2.0/grid/bin/cssdagent
grid      5136     1  0 09:45 ?        00:00:02 /u01/app/11.2.0/grid/bin/ocssd.bin


[root@host02 ~]# kill -STOP 5106; sleep 40; kill -CONT 5106   -- freeze the cssdagent process (PID from the listing above) for 40 seconds, then resume it



NODE EVICTION DUE TO MEMBER KILL ESCALATION

[root@host02 ~]# ps -ef | grep ora_ | grep orcl2 | awk '{print $2}' | while read PID; do kill -STOP $PID; done   -- suspend all background processes of instance orcl2

[root@host01 ~]# tailf /u01/app/oracle/diag/rdbms/orcl/orcl1/trace/alert_orcl1.log

The LMS process requests CSSD to kill the members of instance 2; since the member kill cannot complete (the processes are suspended), the request is escalated to a node eviction.
...LMS0 (ospid: 31771) has detected no messaging activity from instance 2
...LMS0 (ospid: 31771) issues an IMR to resolve the situation
...Remote instance kill is issued with system inc 30
...LMON received an instance eviction notification from instance 1
...Beginning instance recovery of 1 threads

[root@host01 ~]# tailf /u01/app/11.2.0/grid/log/host01/alerthost01.log
...Node host02 is being evicted in cluster incarnation
...Node down event reported for node 'host02'.
...Node 2 joins the cluster
...CSSD Reconfiguration complete. Active nodes are host01 host02









Errors:


[ohasd(162723)]CRS-8011:reboot advisory message from host: oranode02,

[cssd(14493)]CRS-1612:Network communication with node oranode02 (2) missing for 50% of timeout interval

DB Log :
LMS0 (ospid: 31771) has detected no messaging activity from instance 2
LMS0 (ospid: 31771) issues an IMR to resolve the situation
Remote instance kill is issued with system inc 30
Remote instance kill map (size 1) : 2
LMON received an instance eviction notification from instance 1

CRS Log :
CRS-1607:Node oranode02 is being evicted in cluster incarnation 267943929.


CRS-1606:The number of voting files available, 0, is less than the minimum number of voting files required, 1, resulting in CSSD termination to ensure data integrity;

CRS-1652:Starting clean up of CRSD resources.



Log location :

From 11.2 onwards, the clusterware alert log mentions the process responsible for the eviction.

The clusterware alert log  in  <GRID_HOME>/log/<nodename>
The cssdagent logs         in  <GRID_HOME>/log/<nodename>/agent/ohasd/oracssdagent_root
The cssdmonitor logs       in  <GRID_HOME>/log/<nodename>/agent/ohasd/oracssdmonitor_root
The ocssd logs             in  <GRID_HOME>/log/<nodename>/cssd
The OCLSKD log             in  <GRID_HOME>/log/<nodename>/client/oclskd.log
The lastgasp logs          in  /etc/oracle/lastgasp/cssagent_<node>.lg   or   /var/opt/oracle/lastgasp
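
A quick way to pull the eviction-related messages out of these logs (a sketch; adjust the GRID_HOME path and node name to the environment):

[root@host01 ~]# export GRID_HOME=/u01/app/11.2.0/grid
[root@host01 ~]# grep -i "evict\|removal\|CRS-160" $GRID_HOME/log/host01/alerthost01.log | tail -20
[root@host01 ~]# grep -i "heartbeat\|voting\|fatal" $GRID_HOME/log/host01/cssd/ocssd.log | tail -50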

Cluster Health Monitor & OS Watcher data
Message Files
– Linux: /var/log/messages
– Sun: /var/adm/messages
– HP-UX: /var/adm/syslog/syslog.log
– IBM: /bin/errpt -a > messages.out




CRS log files (releases 10.2 and above)
=============================================
1. $ORACLE_CRS_HOME/log/<nodename>/crsd/crsd.log
2. $ORACLE_CRS_HOME/log/<nodename>/cssd/ocssd.log
3. $ORACLE_CRS_HOME/log/<nodename>/evmd/evmd.log
4. $ORACLE_CRS_HOME/log/<nodename>/alert<nodename>.log
5. $ORACLE_CRS_HOME/log/<nodename>/client/cls*.log (not all files, only the latest files matching the timestamp of the node reboot)
6. $ORACLE_CRS_HOME/log/<nodename>/racg/ (check for files and directories matching the timestamp of the reboot; copy them only if found, otherwise not required)
7. The latest .oprocd.log file from /etc/oracle or /var/opt/oracle/oprocd (Solaris)

Note: We can use $ORACLE_CRS_HOME/bin/diagcollection.pl to collect the above files, but it does not collect OPROCD log files, OS log files or OS Watcher log files, and it may take a lot of time to run and consume resources, so it is better to copy the files manually.

OS log files
====================================================
1. /var/log/syslog
2. /var/adm/messages
3. errpt -a > error_.log (AIX only)

OS Watcher log files (these get overwritten, so they need to be copied soon)
=======================================================
Check the crontab to find where OS Watcher is installed, go to that directory's archive folder, and collect from each subdirectory the files matching the timestamp of the node reboot.
1. OS_WATCHER_HOME/archive/oswtop
2. OS_WATCHER_HOME/archive/oswvmstat
3. OS_WATCHER_HOME/archive/oswmpstat
4. OS_WATCHER_HOME/archive/oswnetstat
5. OS_WATCHER_HOME/archive/oswiostat
6. OS_WATCHER_HOME/archive/oswps
7. OS_WATCHER_HOME/archive/oswprvtnet
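
A rough manual collection sketch, assuming OS Watcher is installed under /opt/oswbb and Linux message files (the paths and the date pattern YY.MM.DD are assumptions to adapt):

[root@host01 ~]# export GRID_HOME=/u01/app/11.2.0/grid
[root@host01 ~]# tar czf /tmp/host01_crs_logs.tar.gz $GRID_HOME/log/host01               -- clusterware logs
[root@host01 ~]# cp /var/log/messages* /tmp/                                             -- OS message files (Linux)
[root@host01 ~]# tar czf /tmp/host01_osw.tar.gz /opt/oswbb/archive/*/*15.05.30*          -- OS Watcher files for the reboot date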



http://db.geeksinsight.com/2012/12/27/oracle-rac-node-evictions-11gr2-node-eviction-means-restart-of-cluster-stack-not-reboot-of-node/

http://www.dbas-oracle.com/2013/06/Top-4-Reasons-Node-Reboot-Node-Eviction-in-Real-Application-Cluster-RAC-Environment.html






Node eviction :

Check /var/log/messages to see whether free swap (and RAM) was running low.

Check the oswnetstat and oswprvtnet OS Watcher data.

Run OS Watcher : download it and start it with ./startOSW.sh 20 24 gzip  (snapshot every 20 seconds, keep 24 hours of archive, compress with gzip)


Voting disk not reachable :   a CRS-1606 error is reported in the clusterware alert log and details are in /u01/app/11.2.0/grid/log/<nodename>/cssd/ocssd.log

  Check crsctl query css votedisk
  Check voting disk permission
  check OS, SAN, and storage logs
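
A minimal voting disk check sketch (the ASMLib device path is an assumption; adapt it to the storage in use):

  [grid@host01 ~]$ crsctl query css votedisk                   -- lists the voting files and their state
  [grid@host01 ~]$ ls -l /dev/oracleasm/disks/ASMDISK01        -- verify device ownership and permissions
  [root@host01 ~]# dd if=/dev/oracleasm/disks/ASMDISK01 of=/dev/null bs=1M count=1       -- basic read test of the disk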


Disk  : check the database alert log and the ASM alert log


Missed Network Heartbeat (NHB) :
   Check OS statistics from the evicted node.
   Check communication over the private network (see the sketch below).
   Use the RACcheck tool to check the RAC OS/network settings.
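
A sketch of basic interconnect checks (the interface name eth1 and the -priv hostname are assumptions):

   [root@host02 ~]# oifcfg getif                               -- confirm which interface/subnet is registered as cluster_interconnect
   [root@host02 ~]# ifconfig eth1                              -- look for errors/dropped packets on the private NIC
   [root@host02 ~]# ping -s 1500 -c 5 host01-priv              -- large-packet ping across the interconnect
   [root@host02 ~]# netstat -s | grep -i reassembl             -- IP reassembly failures point to fragmentation/MTU problems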
