In the $ORA_CRS_HOME/bin/racgvip script, CHECK_TIMES determines the number of seconds within which a response must arrive. If the gateway is slow to respond to a ping request, racgvip assumes that the interface is down and fails over the VIP.
Action plan:
./runcluvfy.sh stage -post crsinst -n all -verbose
./runcluvfy.sh stage -pre crsinst -n all -verbose
or
cluvfy stage -post crsinst -n all -verbose
cluvfy stage -pre crsinst -n all -verbose
1. Please upload the following logs from both nodes:
$CRS_HOME/log/nodename
$CRS_HOME/log/nodename/crsd/*.log
$CRS_HOME/log/nodename/cssd/*.log
$CRS_HOME/log/nodename/racg/*.log -- logfiles for VIP and ONS
$CRS_HOME/log/nodename/client/*.log
$CRS_HOME/log/nodename/evmd/*.log
/etc/oracle/oprocd/*.log.* or /var/opt/oracle/oprocd/*.log.* (if present)
$ crs_stat -t
$ crsctl check crs
$ crsctl check boot
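The log locations listed above can be gathered in one pass. The following is a minimal sketch, not an Oracle-supplied tool: the function names crs_log_paths and collect_crs_logs and the archive argument are made up for illustration, and it assumes the directory paths contain no whitespace.

```shell
#!/bin/sh
# Sketch only: function names and the archive path are illustrative.

# Print the per-node Clusterware log directories named in the list above.
crs_log_paths() {
    crs_home=$1
    node=$2
    for sub in crsd cssd racg client evmd; do
        printf '%s/log/%s/%s\n' "$crs_home" "$node" "$sub"
    done
}

# Archive whichever of those directories exist on this node.
collect_crs_logs() {
    crs_home=$1
    node=$2
    archive=$3
    existing=""
    for dir in $(crs_log_paths "$crs_home" "$node"); do
        if [ -d "$dir" ]; then
            existing="$existing $dir"
        fi
    done
    if [ -z "$existing" ]; then
        echo "no log directories found under $crs_home/log/$node" >&2
        return 1
    fi
    # Unquoted on purpose so each directory becomes its own argument;
    # this assumes the paths contain no whitespace.
    tar -cf "$archive" $existing
}
```

It would be run once per node, for example as collect_crs_logs "$CRS_HOME" "$(hostname)" with an archive path of your choosing as the third argument. The oprocd logs and the crs_stat / crsctl output still have to be captured separately.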
VIP relocated to the second server:
Sometimes you may find that the VIP has relocated to the second server.
In this case, ping <vip> may work, but ifconfig on Server1 will not show the VIP address.
oracle@Server1> crsctl status resource -t
NAME TARGET STATE SERVER STATE_DETAILS Local Resources
----------------------------------------------------------------------------------------------------------
ora.svrb1hr.vip 1 ONLINE INTERMEDIATE svrb2hr FAILED OVER
In the listener log, or while starting the listener, you may see:
TNS-00515: Connect failed because target host or object does not exist
Linux Error: 99: Cannot assign requested address
In the OS logs, you may see:
kernel: igb: eth2 NIC Link is Down
In the CRS log, you may see:
Received state change for ora.net1.network exadb01 1 [old state = ONLINE, new state = OFFLINE]
CRS-0215: Could not start resource 'ora.net1.network'.
Set State Details to [FAILED OVER] from [ ] for [ora.exadb01.vip 1 1]
CRS-2676: Start of 'ora.exadb01.vip' on 'server2' succeeded
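To look for these symptoms quickly, the sample messages above can be turned into a single search. This is only a sketch: the function name vip_failover_signatures is hypothetical, and the patterns are copied from the examples shown here, so other releases or NIC drivers may word the messages differently.

```shell
#!/bin/sh
# Sketch only: search one or more log files for the failover symptoms
# quoted above. The pattern list comes from those sample lines.
vip_failover_signatures() {
    grep -n -E 'NIC Link is Down|new state = OFFLINE|CRS-0215|FAILED OVER' "$@"
}
```

For example, pass it the OS message file and the crsd log of the node that lost its VIP. Matching lines are printed with their line numbers so the timestamps around them can be compared.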
Solution:
oracle@Server2>$CRS_HOME/bin/crs_relocate ora.grac1.vip
or
Check current VIP status:
Server1 > $ crsctl status resource ora.grac1.vip
NAME=ora.grac1.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=INTERMEDIATE on grac2
Stop the VIP resource:
Server1 >$ crsctl stop resource ora.grac1.vip
CRS-2673: Attempting to stop 'ora.grac1.vip' on 'grac2'
CRS-2677: Stop of 'ora.grac1.vip' on 'grac2' succeeded
Start the VIP resource:
Server1 >$ crsctl start resource ora.grac1.vip
CRS-2672: Attempting to start 'ora.grac1.vip' on 'grac1'
CRS-2676: Start of 'ora.grac1.vip' on 'grac1' succeeded
Verify VIP resource:
Server1 > $ crsctl status resource ora.grac1.vip
NAME=ora.grac1.vip
TYPE=ora.cluster_vip_net1.type
TARGET=ONLINE
STATE=ONLINE on grac1
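The final verification step can also be scripted. The sketch below reads the NAME=/STATE= style output shown above on standard input and succeeds only when the VIP is ONLINE on the node it belongs to. The function name vip_is_home is hypothetical, and the check assumes the STATE line has exactly the form shown in this example.

```shell
#!/bin/sh
# Sketch only: reads `crsctl status resource <vip>` output on stdin.
# Returns 0 when the STATE line reads "ONLINE on <expected_node>",
# non-zero otherwise (for example "INTERMEDIATE on grac2").
vip_is_home() {
    expected_node=$1
    state=$(sed -n 's/^STATE=//p')
    [ "$state" = "ONLINE on $expected_node" ]
}
```

Typical use would be to pipe the output of crsctl status resource ora.grac1.vip into vip_is_home grac1 and act on the exit status.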
Data Collection
Please consult your sysadmin and make sure that the gateway is pingable at all times.
1- test the gateway on every node
Ask your sysadmin to create a Unix shell script, started from crontab, that pings the
gateway of your public interface at a short interval (every 2 seconds, for example) and
spools the result to /tmp/test_gw_
Ping your gateway and upload the ping log.
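Because cron cannot schedule anything more often than once a minute, the 2-second spacing has to come from a loop inside the script. The following is a minimal sketch of such a logger. The function name log_gateway_ping and its arguments are illustrative, and no ping timeout option is set because those options differ between platforms.

```shell
#!/bin/sh
# Sketch only: append one timestamped OK/FAIL line per probe.
#   $1 gateway address of the public interface
#   $2 log file to append to
#   $3 number of probes (0 = run until interrupted), default 0
#   $4 seconds between probes, default 2
log_gateway_ping() {
    gateway=$1
    logfile=$2
    count=${3:-0}
    interval=${4:-2}
    i=0
    while [ "$count" -eq 0 ] || [ "$i" -lt "$count" ]; do
        if ping -c 1 "$gateway" >/dev/null 2>&1; then
            status=OK
        else
            status=FAIL
        fi
        printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$gateway" "$status" >> "$logfile"
        i=$((i + 1))
        sleep "$interval"
    done
}
```

Start it once per node, for example from an @reboot crontab entry or under nohup, with the gateway address and the chosen log file as arguments. A run of FAIL lines shortly before a VIP failover would point to the gateway rather than to the cluster.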
2- increase the tracing level of the vip resource as root user
# cd $ORA_CRS_HOME/bin
# crsctl debug log res
3- restart the clusterware
4- execute this test on both nodes at the same time
$ script /tmp/testvip_
$ cd $ORA_CRS_HOME/bin
$ hostname
$ date
$ cat /etc/hosts
$ ifconfig -a
$ oifcfg getif
$ netstat -rn
$ oifcfg iflist
$ srvctl config nodeapps -n
$ crs_stat -t
$ exit
5- reset the tracing level of the vip resource as root user
# cd $ORA_CRS_HOME/bin
# crsctl debug log res
# crsctl debug log res
Upon the next occurrence, please upload the following information from all nodes:
a- /tmp/test_gw_
c- the crsd log
d- the racg resource logs from
$ORA_CRS_HOME/log/
e- the racgvip script from
$ORA_CRS_HOME/bin/racgvip
f- RDA from all the nodes
g- the OS message file (from 11gR2 onward, OS logs are collected by diagcollection/TFA on Linux, Solaris, and HP-UX)
IBM: /bin/errpt -a > messages.out
Linux: /var/log/messages
Solaris: /var/adm/messages
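When the same collection script runs on mixed platforms, the choice above can be keyed on uname -s. This is a sketch with a hypothetical function name, os_message_source. It covers only the three platforms listed here and reports anything else as unknown.

```shell
#!/bin/sh
# Sketch only: map the platform name to the OS message source listed above.
# With no argument, the platform is taken from `uname -s`.
os_message_source() {
    case "${1:-$(uname -s)}" in
        AIX)   echo "/bin/errpt -a > messages.out" ;;
        Linux) echo "/var/log/messages" ;;
        SunOS) echo "/var/adm/messages" ;;
        *)     echo "unknown"; return 1 ;;
    esac
}
```

Note that Solaris reports itself as SunOS, and that on AIX the result is a command to run rather than a file to copy.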
RAC11G - Change VIP IP
Current Status
[root@rac1 bin]# ./srvctl config nodeapps -n rac1 -a
VIP exists.:rac1
VIP exists.: /rac1-vip/192.168.2.111/255.255.255.0/eth0
[root@rac1 bin]# ./srvctl config nodeapps -n rac2 -a
VIP exists.:rac2
VIP exists.: /rac2-vip/192.168.2.112/255.255.255.0/eth0
[oracle@rac1 ~]$ srvctl config nodeapps -a
VIP exists.:rac1
VIP exists.: /rac1-vip/192.168.2.111/255.255.255.0/eth0
VIP exists.:rac2
VIP exists.: /rac2-vip/192.168.2.112/255.255.255.0/eth0
Stop only the VIP, or stop all nodeapps
[oracle@rac1 ~]$ srvctl stop vip -n rac1 -f
PRCC-1017 : rac1-vip was already stopped on rac1
or
srvctl stop database -d TEST
srvctl stop nodeapps -n rac1
srvctl stop nodeapps -n rac2
crs_stat -t
Modification
[root@rac1 bin]# ./srvctl modify nodeapps -n rac1 -A 192.168.0.73/255.255.255.0/eth0
[root@rac1 bin]# ./srvctl modify nodeapps -n rac2 -A 192.168.0.74/255.255.255.0/eth0
Verification
[root@rac1 bin]# ./srvctl config nodeapps -n rac1 -a
VIP exists.:rac1
VIP exists.: /rac1-vip/192.168.0.73/255.255.255.0/eth0
[root@rac1 bin]# ./srvctl config nodeapps -n rac2 -a
VIP exists.:rac2
VIP exists.: /rac2-vip/192.168.0.74/255.255.255.0/eth0
[root@rac1 bin]# ./srvctl config nodeapps -a
VIP exists.:rac1
VIP exists.: /rac1-vip/192.168.0.73/255.255.255.0/eth0
VIP exists.:rac2
VIP exists.: /rac2-vip/192.168.0.74/255.255.255.0/eth0
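To compare the configured addresses before and after the change without reading the output by eye, the VIP lines can be reduced to name and address pairs. This sketch assumes the "VIP exists.: /name/ip/netmask/interface" layout shown above. The function name vip_addresses is hypothetical.

```shell
#!/bin/sh
# Sketch only: reads `srvctl config nodeapps -a` output on stdin and
# prints "<vip-name> <ip-address>" for each VIP line in the layout above.
vip_addresses() {
    awk -F/ '$1 == "VIP exists.: " && NF >= 3 { print $2, $3 }'
}
```

Saving the result before and after srvctl modify nodeapps and running diff on the two files shows exactly which VIPs changed.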
[root@rac1 bin]# ./crsctl stat res -t
--------------------------------------------------------------------------------
NAME TARGET STATE SERVER STATE_DETAILS
--------------------------------------------------------------------------------
Local Resources
--------------------------------------------------------------------------------
ora.DATA.dg
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.LISTENER.lsnr
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.asm
ONLINE ONLINE rac1 Started
ONLINE ONLINE rac2 Started
ora.eons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.gsd
OFFLINE OFFLINE rac1
OFFLINE OFFLINE rac2
ora.net1.network
ONLINE ONLINE rac1
ONLINE ONLINE rac2
ora.ons
ONLINE ONLINE rac1
ONLINE ONLINE rac2
--------------------------------------------------------------------------------
Cluster Resources
--------------------------------------------------------------------------------
ora.LISTENER_SCAN1.lsnr
1 ONLINE ONLINE rac1
ora.oc4j
1 OFFLINE OFFLINE
ora.rac.db
1 OFFLINE OFFLINE
2 OFFLINE OFFLINE
ora.rac1.vip
1 ONLINE INTERMEDIATE rac2 FAILED OVER
ora.rac2.vip
1 OFFLINE OFFLINE
ora.scan1.vip
1 ONLINE OFFLINE
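In a long listing like the one above, a failed-over VIP is easy to miss. The sketch below picks out every resource that has an instance in the INTERMEDIATE state. It assumes the tabular layout shown here, where the resource name is on its own line starting with "ora." and the state lines follow it. The function name intermediate_resources is hypothetical.

```shell
#!/bin/sh
# Sketch only: reads `crsctl stat res -t` output on stdin and prints
# "<resource>: <state line>" for every instance in INTERMEDIATE state.
intermediate_resources() {
    awk '
        $1 ~ /^ora\./  { name = $1; next }          # remember current resource
        /INTERMEDIATE/ { $1 = $1; print name ": " $0 }  # $1 = $1 squeezes whitespace
    '
}
```

Against the listing above it would report ora.rac1.vip, which is still running on rac2 at this point in the procedure.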
If only the VIP was stopped earlier, stop and restart the listener resource:
[root@rac1 bin]# ./crsctl stop resource ora.LISTENER.lsnr
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac1'
CRS-2673: Attempting to stop 'ora.LISTENER.lsnr' on 'rac2'
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac2' succeeded
CRS-2677: Stop of 'ora.LISTENER.lsnr' on 'rac1' succeeded
[root@rac1 bin]# ./crsctl stop resource ora.rac1.vip
CRS-2673: Attempting to stop 'ora.rac1.vip' on 'rac2'
CRS-2677: Stop of 'ora.rac1.vip' on 'rac2' succeeded
[root@rac1 bin]# ./crsctl stop resource ora.rac2.vip
CRS-2500: Cannot stop resource 'ora.rac2.vip' as it is not running
CRS-4000: Command Stop failed, or completed with errors.
[root@rac1 bin]# ./crsctl start resource ora.LISTENER.lsnr
CRS-2672: Attempting to start 'ora.rac2.vip' on 'rac2'
CRS-2672: Attempting to start 'ora.rac1.vip' on 'rac1'
CRS-2676: Start of 'ora.rac2.vip' on 'rac2' succeeded
CRS-2672: Attempting to start 'ora.LISTENER.lsnr' on 'rac2'
CRS-2676: Start of 'ora.rac1.vip' on 'rac1' succeeded
CRS-2672: Attempting to start 'ora.LISTENER.lsnr' on 'rac1'
CRS-2676: Start of 'ora.LISTENER.lsnr' on 'rac2' succeeded
CRS-2676: Start of 'ora.LISTENER.lsnr' on 'rac1' succeeded
[root@rac1 bin]# ./srvctl start vip -n rac1
PRKO-2420 : VIP is already started on node(s): rac1
[root@rac1 bin]# ./srvctl start vip -n rac2
PRKO-2420 : VIP is already started on node(s): rac2
If nodeapps were stopped, restart the cluster services:
# crsctl stop crs
# crsctl start crs
[root@rac1 bin]# ./crsctl stat res -t
Note 330358.1 - CRS 10gR2/ 11gR1/ 11gR2 Diagnostic Collection Guide
Note 298895.1 - Modifying the default gateway address used by the Oracle 10g VIP
Note 399213.1 - VIP Going Offline Intermittantly - Slow Response from Default Gateway
Note 401783.1 - Changes in Oracle Clusterware after applying 10.2.0.3 Patchset