Another project complete! This one stretched one of my weakest skills - networking.

The Problem:

So, every server previously had two network connections - one dedicated to the management network and one dedicated to the host network. Both were 10Gbit connections via SFP+ fiber transceivers. However, if a transceiver, cable, NIC, or even the entire switch crapped out, that host was toast, because both connections needed to stay up at all times. This scenario happened twice in recent memory:

  • An unexpected switch reboot caused a complete loss of connectivity (because I forgot to write the running switch config to memory, so the switch came back up with a stale config - see the note right after this list)
  • An unexpected NIC failure caused host isolation on ESXi2.
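
A quick aside on that first failure: FastIron keeps the running config in RAM until you explicitly commit it to flash, so a single command after any config change would have prevented the whole mess. Roughly (the prompt name is just a placeholder for whatever your switch is called):

    SSH@ICX6610-A# write memory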

The way to fix this is to ensure every single network component is redundant, and with the redundant LAN project now complete, that's exactly what we have. Every host has four 10Gbit SFP+ ports connected like so:

  • NIC1: Management Net A
  • NIC2: Host Net A
  • NIC3: Management Net B
  • NIC4: Host Net B

The A NICs go to the primary switch and the B NICs go to the secondary switch, so every host now has two independent paths into my network. So how did we accomplish this?

The Solution:

If you recall the Lain.la v2.3 diagram (the only part of it not yet complete is redundant WAN), you can sorta see what this looks like visually:

This shows that each hypervisor has at least four connections and each storage server has at least two, split across the two ICX 6610 switches I own, which are uplinked to each other via 40Gbit QSFP+ stack cables (beefy!). Before this project, everything hung off a single switch. I'll go over some of the parts and the config I did.

Parts:

  • Brocade ICX-6610-48P (Original primary switch - The P means PoE but I don't use it)
  • Brocade ICX-6610-48 (This is the secondary switch I bought for the project)
  • Brocade RPS-16E PSUs (PoE PSUs run quieter and more efficiently due to their MASSIVE capacity of 1kW)
  • Brocade E40G-QSFP-4SFP-AOC-1001 QSFP+ to SFP+ Breakout Cables
  • Brocade 10G-SFPP-SR 57-0000075-01 SFP+ Transceivers
  • Generic OM3 Fiber Patch Cables
  • Broadcom 57810 Dual SFP+ NICs (FYI: loud fan. I don't recommend them - I'm currently testing fanless Intel X520-DA2 dual-port NICs instead)
  • 40G QSFP+ Direct Attach Cables from FS.com

A quick note: the ICX-6610-48 series switches require licensing for the 10Gbit SFP+ ports. Contact me if you need assistance with this; I know a good "reseller" :^)

The switches are NOT set up in a stack; rather, I just use a static link aggregation across the two QSFP+ ports between them. See here:

That connects the two with redundant links. One transceiver dies, oh well. There's another. Two paths for everything. Beyond that, the only other switch-side work was replicating the config from the primary switch onto the secondary.
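
For reference, here's roughly what that static LAG looks like in the FastIron CLI. Treat it as a sketch rather than my literal config - the LAG name is made up, and the rear QSFP+ port numbers may not match your unit:

    ICX6610-A# configure terminal
    ICX6610-A(config)# lag isl static id 1
    ICX6610-A(config-lag-isl)# ports ethernet 1/2/1 ethernet 1/2/6
    ICX6610-A(config-lag-isl)# primary-port 1/2/1
    ICX6610-A(config-lag-isl)# deploy
    ICX6610-A(config-lag-isl)# end
    ICX6610-A# write memory

The same LAG gets configured on the secondary switch, and "show lag" should then report both ports as up.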

Side note: Look how COOL this web interface panel looks. Too bad it's basically useless AND wrong.

For redundancy on the server side, I used FreeBSD LAGGs on TrueNAS, and Active/Passive uplinks in the vSwitch configuration on the hypervisors, like so:
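
If you'd rather script that than click through the vSphere UI, the same active/standby setup can be done with esxcli. The vSwitch and vmnic names below are placeholders - swap in whichever A/B NIC pair feeds that vSwitch:

    # attach the B-side NIC as a second uplink on the existing vSwitch
    esxcli network vswitch standard uplink add --uplink-name=vmnic2 --vswitch-name=vSwitch0
    # keep the A-side NIC active and park the B-side NIC in standby
    esxcli network vswitch standard policy failover set --vswitch-name=vSwitch0 --active-uplinks=vmnic0 --standby-uplinks=vmnic2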

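On the TrueNAS boxes the lagg gets built through the web UI, but underneath it's a plain FreeBSD lagg - presumably failover mode rather than LACP, since the two ports land on two independent, non-stacked switches. A minimal /etc/rc.conf sketch of the by-hand equivalent (interface names and the address are placeholders):

    # physical members, one cabled to each switch
    ifconfig_bxe0="up"
    ifconfig_bxe1="up"
    # failover lagg: traffic uses bxe0 until it drops, then bxe1 takes over
    cloned_interfaces="lagg0"
    ifconfig_lagg0="laggproto failover laggport bxe0 laggport bxe1 192.168.10.10/24"
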
So this makes my network much more resilient to a single component failure. Hooray! Hope to have more cool things soon.

-7666