Wednesday, June 24, 2009

Test the failover of a NIC within a MultiNICA resource in Veritas Cluster

The MultiNICA agent represents a set of network interfaces and provides failover capabilities between them.

Each interface in a MultiNICA resource has a base IP address, which can be the same or different. The MultiNICA agent configures one interface at a time. If it does not detect activity on the configured interface, it configures a new interface and migrates IP aliases to it.

It's always a good idea to test the resource after configuring it and before putting the cluster into a production environment.

To illustrate the correct and incorrect way to test a MultiNICA resource, let's consider the following example with a simple service group:

group test_multiNIC (
SystemList = { node1 = 1, node2 }
)

IPMultiNIC IPMNIC (
Address = "192.200.99.5"
MultiNICResName = MNIC
)

MultiNICA MNIC (
Device @node1 = { hme0 = "192.200.100.1", qfe0 = "192.200.100.1" }
Device @node2 = { hme0 = "192.200.100.2", qfe0 = "192.200.100.2" }
)

IPMNIC requires MNIC


// resource dependency tree
//
// group test_multiNIC
// {
// IPMultiNIC IPMNIC
// {
// MultiNICA MNIC
// }
// }


Correct way to test the NIC failover:

1. Bring the service group ONLINE:
# hagrp -online test_multiNIC -sys node1

2. Verify that the primary NIC (first NIC in the Device attribute) is properly set up:
# ifconfig -a (on node1)

The output should include two entries for hme0 (the base IP address on hme0 and the virtual IP address on hme0:1), looking like:

hme0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 12
inet 192.200.100.1 netmask ffffff00 broadcast 192.200.100.255
ether 8:0:20:b0:a8:1f
hme0:1: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 1500 index 12
inet 192.200.99.5 netmask ffffff00 broadcast 192.200.99.255

3. In a shell window, launch a command to monitor the state of the nodes, service groups and resources:
# hastatus

4. In another shell window, launch a command to monitor the main VCS log file:
# tail -f /var/VRTSvcs/log/engine_A.log

5. Pull the cable from hme0, or power off the hub or switch that this NIC is attached to (provided the other NIC in the Device attribute is not connected to that same network component).

The MultiNICA agent will perform the NIC failover (from hme0 to qfe0) after a 2-3 minute interval. This delay occurs because the MultiNICA agent tests the failed NIC several times before doing the NIC failover.

6. Check the engine_A.log log file to see the failover occurring. You should see lines like:

TAG_C 2002/01/31 09:26:27 (node1) VCS:136502:monitor:MNIC:MultiNICA: Device hme0 FAILED
TAG_C 2002/01/31 09:26:27 (node1) VCS:136503:monitor:MNIC:MultiNICA: Acquired a WRITE Lock
TAG_C 2002/01/31 09:26:27 (node1) VCS:136504:monitor:MNIC:MultiNICA: Bringing down IP addresses
TAG_C 2002/01/31 09:26:27 (node1) VCS:136505:monitor:MNIC:MultiNICA: Trying to online Device qfe0
TAG_C 2002/01/31 09:26:29 (node1) VCS:136506:monitor:MNIC:MultiNICA: Sleeping 5 seconds
TAG_C 2002/01/31 09:26:34 (node1) VCS:136507:monitor:MNIC:MultiNICA: Pinging Broadcast address 192.200.100.255 on Device qfe0, iteration 1
TAG_C 2002/01/31 09:26:34 (node1) VCS:136514:monitor:MNIC:MultiNICA: Migrated to Device qfe0
TAG_C 2002/01/31 09:26:34 (node1) VCS:136515:monitor:MNIC:MultiNICA: Releasing Lock

7. In the meantime, verify that the hastatus output hasn't changed. The test_multiNIC service group should still be ONLINE on node1, and no resources should have been affected. That is the expected behavior.
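
As a rough aid for steps 2 and 7, the check of which interface carries the virtual IP can be scripted. The sketch below assumes Solaris-style "ifconfig -a" output like the sample shown in step 2; iface_for_ip is a hypothetical helper name, not a VCS command.

```shell
#!/bin/sh
# iface_for_ip IP   (hypothetical helper, not part of VCS)
# Reads Solaris-style `ifconfig -a` output on stdin and prints the name of
# the physical or logical interface that carries the given IPv4 address.
# Example use on a live node (assumption): ifconfig -a | iface_for_ip 192.200.99.5
iface_for_ip() {
    awk -v ip="$1" '
        # Interface header, e.g. "hme0:1: flags=..."; remember its name
        # without the trailing colon.
        $1 ~ /:$/ && $2 ~ /^flags=/ { name = $1; sub(/:$/, "", name) }
        # Address line, e.g. "inet 192.200.99.5 netmask ...".
        $1 == "inet" && $2 == ip { print name }
    '
}
```

Before the cable is pulled, the virtual IP 192.200.99.5 should be reported on hme0:1; after the NIC failover it should be reported on a logical interface of qfe0 instead.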

Incorrect way to test the NIC failover:

It can be tempting to unplumb the NIC from the command line to test the MultiNICA failover.

If you unplumb a NIC from the command line (for example: "ifconfig hme0 down unplumb"), VCS will notice that the MultiNICA and IPMultiNIC resources went down without VCS having taken them down itself. In other words, these resources will become OFFLINE unexpectedly, rather than as a result of the agent's own monitor and failover procedures.

The engine_A.log log file shows:

TAG_D 2002/01/31 09:32:53 (node1) VCS:13067:Agent is calling clean for resource(IPMNIC) because the resource became OFFLINE unexpectedly, on its own.
TAG_D 2002/01/31 09:32:54 (node1) VCS:13068:Resource(IPMNIC) - clean completed successfully.
TAG_E 2002/01/31 09:32:55 VCS:10307:Resource IPMNIC (Owner: unknown Group: test_multiNIC) is offline on node1
(Not initiated by VCS.)

Then the IPMultiNIC resource gets faulted on node1 and, as it is a critical resource, the whole service group will fail over. That is obviously not the expected behavior.
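
The two outcomes can be told apart from the engine log. The sketch below is only an illustration: it assumes the log messages contain the same phrases as the excerpts quoted in this post ("Migrated to Device", "became OFFLINE unexpectedly", "Not initiated by VCS"), and classify_failover is a hypothetical helper name.

```shell
#!/bin/sh
# classify_failover   (hypothetical helper, not part of VCS)
# Reads an engine_A.log excerpt on stdin and prints one of:
#   unexpected-offline   a resource went offline outside VCS control
#   nic-failover:DEVICE  the MultiNICA agent migrated to DEVICE
#   no-event             neither pattern was found
# The matched phrases are taken from the log excerpts in this post.
classify_failover() {
    awk '
        /MultiNICA: Migrated to Device/ { migrated = $NF }
        /Not initiated by VCS/ || /became OFFLINE unexpectedly/ { unexpected = 1 }
        END {
            if (unexpected)          print "unexpected-offline"
            else if (migrated != "") print "nic-failover:" migrated
            else                     print "no-event"
        }
    '
}
```

Feeding it the log lines from the cable-pull test should report a NIC failover to qfe0, while the lines from the unplumb test should report an unexpected offline.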

A sample service group in main.cf containing MultiNICA and IPMultiNIC resources is shown below:

group mnic_test ( SystemList = { csvcs3 = 1, csvcs4 } )

IPMultiNIC mIP (
Address = "166.98.21.173"
MultiNICResName = mnic
)

MultiNICA mnic (
Device @csvcs3 = { qfe2 = "166.98.21.197", qfe3 = "166.98.21.197" }
Device @csvcs4 = { qfe2 = "166.98.21.198", qfe3 = "166.98.21.198" }
ArpDelay = 5
IfconfigTwice = 1
PingOptimize = 0
HandshakeInterval = 10
)

mIP requires mnic

// resource dependency tree
//
// group mnic_test
// {
// IPMultiNIC mIP
// {
// MultiNICA mnic
// }
// }

Plumb qfe2 on each machine with its respective base IP. In the example above, the base IP on csvcs3 is 166.98.21.197, while that on csvcs4 is 166.98.21.198. The virtual IP is 166.98.21.173, as shown in the IPMultiNIC resource. Then create the mnic_test group as shown above.

In the sample configuration given above, the following additional attributes are set for the MultiNICA resource, as suited to a very active network. ArpDelay is set to 5 seconds, to induce a 5-second sleep between configuring an interface and sending out a broadcast to inform routers about the base IP address. The default is 1 second. IfconfigTwice is set to 1, which causes the IP address to be plumbed up twice, using an ifconfig up-down-up sequence. This increases the probability of gratuitous ARPs (local broadcasts) reaching the clients. The default is 0 (not set).

The following MultiNICA attributes can be set to decrease the agent's detection and failover time:

PingOptimize is set to 0 to perform a broadcast ping in each monitor cycle and detect the inactive interface within that cycle. The default value of 1 requires 2 monitor cycles.
HandshakeInterval is set to its minimum value of 10, down from the default value of 90. This makes the agent attempt only once (as opposed to 9 times with the default), either to ping a host (from the NetworkHosts attribute) or to ping the default broadcast address, depending on which attribute is configured, when it fails over to a new NIC.


Note that setting PingOptimize and HandshakeInterval to the above values improves the response time, but also increases the chance of spurious failovers. It is essentially a tradeoff between performance and reliability.
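
The post sets these values directly in main.cf. As an alternative sketch, the same two attributes could be changed on a running cluster with the standard haconf and hares commands. The helper below only prints the command sequence for review and does not execute anything; print_tuning_cmds is a hypothetical name, and HandshakeInterval is the attribute name as spelled in the VCS bundled agents documentation.

```shell
#!/bin/sh
# print_tuning_cmds RESOURCE [PING_OPTIMIZE] [HANDSHAKE_INTERVAL]
# Hypothetical dry-run helper: prints (does not run) the VCS commands that
# would set the two detection-time attributes discussed above.
# Defaults are the values used in the sample configuration (0 and 10).
print_tuning_cmds() {
    res=$1
    ping_opt=${2:-0}
    handshake=${3:-10}
    echo "haconf -makerw"                                   # open the configuration read-write
    echo "hares -modify $res PingOptimize $ping_opt"
    echo "hares -modify $res HandshakeInterval $handshake"
    echo "haconf -dump -makero"                             # save main.cf and return to read-only
}
```

Running "print_tuning_cmds mnic" prints four commands that can be reviewed and then pasted into a root shell on one cluster node.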


To test the configuration, pull the cable from qfe2 on csvcs3. The base IP, along with the virtual IP, will fail over to qfe3 on the same node. Then pull the cable from qfe3. After a 2-3 minute interval, the mIP resource on csvcs3 will become faulted and the whole mnic_test group will go online on csvcs4. This delay occurs because the MultiNICA agent tests the NIC several times before marking the resource offline.

