RHEL/SLES default routes with multiple network interfaces (DHCP)

November 4, 2010

Most enterprise NICs nowadays come with two ports, including the one that my company makes. On our test systems, we use one port on the onboard (Intel) NIC to keep the test system connected to the world, while the other NIC may be on, off, or unstable depending on the test.

We prefer DHCP by default for all IPs, and we want the onboard NIC to use eth[01] while our test/development NIC uses eth[23]. All DHCP transactions provide a gateway address.

However, operating systems differ in how they choose the default route when there are multiple interfaces. RHEL tends to use the last activated interface. SLES seems to use the first one, but on our older set of chassis, the DHCP transaction for our onboard NIC (the one with the desired default route) doesn’t complete “in a timely manner” (40 seconds). It gets backgrounded, and the network configurator uses the next “working interface” for its default gateway.
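To see which interface actually ended up with the default route after boot, something like the following can help. This is a minimal sketch, assuming iproute2 is installed; the helper name `default_route_devs` is made up for illustration and is not part of any distribution.

```shell
# Print the interface name of every default route found on stdin.
# Expects lines in the format printed by `ip route show`, e.g.
#   default via 192.168.1.1 dev eth0
default_route_devs() {
    awk '$1 == "default" {
        for (i = 2; i < NF; i++)
            if ($i == "dev") print $(i + 1)
    }'
}

# Typical use on a live system (requires iproute2):
#   ip route show | default_route_devs
```

If this prints eth2 or eth3 on one of the test systems described above, the default route landed on the test NIC rather than the onboard one.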

For RHEL, we simply specified GATEWAYDEV=eth0 in /etc/sysconfig/network, and that solved our issues while letting us keep DHCP. No DEFROUTE= in each /etc/sysconfig/network-scripts/ifcfg-eth* file or anything.
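For reference, the RHEL side might look roughly like this. Only the GATEWAYDEV line comes from the setup described here; the NETWORKING and HOSTNAME entries are typical contents shown for context, and the hostname is a made-up placeholder.

```shell
# /etc/sysconfig/network (RHEL)
# NETWORKING and HOSTNAME are illustrative context, not part of the fix.
NETWORKING=yes
HOSTNAME=testbox.example.com
# Tie the default route to eth0, whichever interface comes up last.
GATEWAYDEV=eth0
```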

Unfortunately, such a simple solution doesn’t apply to SLES. For SLES 10, I set WAIT_FOR_INTERFACES="60" and added the “slow” NIC to MANDATORY_INTERFACES, both in /etc/sysconfig/network/config. (I also set FIREWALL="no" for good measure, but that is likely irrelevant.) Enabling DEBUG in the config and dhcp files in that directory didn’t seem to help at all. For SLES 11, I do the same thing, though the style of entry in MANDATORY_INTERFACES is a bit different, and it doesn’t really seem to work well.
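As a rough sketch, the SLES settings described above would look something like this. The variable names are taken from this post as written, and the "eth0" value is a placeholder. The stock SUSE file appears to call the mandatory list MANDATORY_DEVICES, and the accepted entry format differs between SLES 10 and SLES 11, so check the comments in your own /etc/sysconfig/network/config before copying anything.

```shell
# /etc/sysconfig/network/config (SLES), sketch only
# Wait up to 60 seconds for the mandatory interfaces to come up.
WAIT_FOR_INTERFACES="60"
# The "slow" onboard NIC. Placeholder value; the entry format is
# release-specific, and the variable name should be verified against
# the comments in the shipped file.
MANDATORY_INTERFACES="eth0"
# Set "for good measure"; likely irrelevant to the routing problem.
FIREWALL="no"
```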

In the end, the root cause turned out to be spanning tree on the switch, which was keeping the link from passing traffic for about 30 seconds. That is why we saw the DHCPDISCOVER coming down the OS stack but not going out on the wire until 30+ seconds later. It really shouldn’t take that long, and the settings above are a hacked-up workaround rather than a fix, but meh…