As I mentioned a little while ago, I had a network upgrade. It took place Tuesday night. What we were doing should not have been a big deal. In fact, I had done this before on a smaller scale. My coworker Rob and I were upgrading a network with some newer equipment and moving routing down to the Distribution layer. Maybe I should attempt to explain some of this…
Typically in a network design you have three layers for host traffic: Access, Distribution, and Core. The Access layer is the host or client layer. Devices at this layer are typically switches (in older networks you might see hubs) and access points, and they give hosts a layer 2 connection to the network. Layer 2 connections use the MAC addresses of devices to send traffic along.
The Distribution layer interconnects Access layers. It can be used for physical separation of Access layers, or to route traffic between Access layers and up to the Core layer. Typically devices here are switches and have a layer 2 connection up to the Core layer. But more and more networks are now creating layer 3 connections at the Distribution layer because of the advantages it offers. This is what we were doing with our upgrade.
The Core layer is used to route traffic around the higher levels of the network and typically to the outbound pipes to the Internet. Devices at this layer are typically routers, but layer 3 switches usually work great until you need special services like VPN tunnels and outbound connections. Traffic leaving the Core is typically layer 3, which means it traverses devices based on IP addresses.
So when I say we are moving routing to the Distribution layer, it means devices at this layer will route traffic up to the Core devices. There are many advantages to doing this. One of the big ones is segmenting VLANs. VLANs (virtual LANs) are a networking tool that allows us to split up traffic on switches. You create a VLAN for whatever traffic going through a switch you want to group together, then apply that VLAN to any switchport you want. Traffic within a VLAN stays in that VLAN, which gives us our segmentation.
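To make the segmentation idea a bit more concrete, here is a minimal sketch of a toy switch in Python. It is not how any real switch is implemented; the `Switch` class, the port names, and the VLAN numbers are all made up for illustration, and it only models access ports (one VLAN per port).

```python
from dataclasses import dataclass, field


@dataclass
class Switch:
    """Toy switch: each port is assigned to exactly one VLAN (hypothetical model)."""

    port_vlan: dict = field(default_factory=dict)  # port name -> VLAN id

    def assign(self, port: str, vlan: int) -> None:
        """Apply a VLAN to a switchport."""
        self.port_vlan[port] = vlan

    def flood_targets(self, ingress_port: str) -> list:
        """Ports that would see a broadcast arriving on ingress_port.

        Only ports in the same VLAN qualify, which is the segmentation.
        """
        vlan = self.port_vlan[ingress_port]
        return sorted(
            port
            for port, port_vlan in self.port_vlan.items()
            if port_vlan == vlan and port != ingress_port
        )


# Made-up ports and VLAN ids, purely for illustration.
sw = Switch()
sw.assign("Gi0/1", 50)
sw.assign("Gi0/2", 50)
sw.assign("Gi0/3", 60)

print(sw.flood_targets("Gi0/1"))  # only the other VLAN 50 port
print(sw.flood_targets("Gi0/3"))  # nothing else is in VLAN 60
```

A broadcast from the VLAN 50 port only reaches the other VLAN 50 port, and the lone VLAN 60 port reaches nobody, even though all three ports are on the same physical switch.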
So in a scenario where we have multiple sites in a geographic region, we can create layer 3 Distribution layers at the edge of each site to do routing between them and the Core. This means that the VLAN traffic at each site stays local. So VLAN 50 traffic that might exist at site 1 will stay local to that site. When traffic from VLAN 50 leaves through the layer 3 Distribution device, it is routed, so at that point the VLAN it came from is irrelevant.
What we get from this is that traffic at site 1 will remain at site 1 unless it needs to go out to another network or VLAN that doesn’t reside at site 1. Compare that with a layer 2 device at the Distribution layer and maybe 10 VLANs at site 1. For clients in two different VLANs to communicate with each other, their traffic would have to leave site 1 and go up to the Core layer, where the gateway of each VLAN would exist, even if the clients sat right next to each other. This is why having a layer 3 device at the Distribution layer can be a good thing: the gateway for each VLAN exists closer to the point of origin for traffic.
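The difference between the two designs can be sketched as a tiny path calculation. This is a simplified model, not a simulation of real routing: the function names are hypothetical, it assumes both hosts sit on the same access switch at the same site, and it assumes the Core lives off-site.

```python
def inter_vlan_path(src_vlan: int, dst_vlan: int, gateway_layer: str) -> list:
    """Layers a packet crosses between two hosts at the same site.

    gateway_layer is where the VLAN gateways live: "distribution" or "core".
    Assumption: both hosts hang off the same access switch.
    """
    if gateway_layer not in ("distribution", "core"):
        raise ValueError("gateway_layer must be 'distribution' or 'core'")
    if src_vlan == dst_vlan:
        # Same VLAN: switched at layer 2, no gateway involved.
        return ["access"]
    if gateway_layer == "distribution":
        # Gateways are at the site edge, so routing happens locally.
        return ["access", "distribution", "access"]
    # Gateways are at the Core, so traffic goes all the way up and back.
    return ["access", "distribution", "core", "distribution", "access"]


def leaves_site(path: list) -> bool:
    """Assumption: the Core is off-site, so touching it means leaving the site."""
    return "core" in path


# Two neighbors in different (made-up) VLANs, under each design.
for layer in ("distribution", "core"):
    path = inter_vlan_path(50, 60, layer)
    print(f"gateway at {layer}: {' -> '.join(path)} (leaves site: {leaves_site(path)})")
```

With the gateway at the Distribution layer the path turns around at the site edge; with the gateway at the Core the same two neighbors make a round trip off-site.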
So back to the story…
I took down our main campus. I was connected directly to our cores at the World Wide Headquarters and connected to the wireless infrastructure. I applied my scripts, and as soon as I did, my wireless dropped. I thought, “Hmmm, that’s strange.” I called Rob, who was at our Tech Center site doing some work in our closet there. I asked him if he had finished hooking up the Distribution switches. He said he had, but we had no connectivity from me to him.
Once we started looking into it, I got a call from Ops (operations). There is only one thing I think of when my work cell phone rings: “SHIT!!” The only folks who call it are the ones from Ops, so when it rings I have the Pavlovian response of, “It must be Ops!” Sure enough, it was Ops, and our Enterprise-wide security team was down. This is a problem.
After looking into it, we found I had two lines of code that took things down. Normally those two lines would not have been a big deal. This time they were, because they were blocking the VLANs that were used to pass routing information between the cores and our major routers. Oops! This meant those two lines took down our ENTIRE World Wide Headquarters campus and security for the entire Enterprise.
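For what it's worth, the kind of pre-change sanity check that might catch this sort of mistake can be sketched in a few lines of Python. This is only an illustration under assumptions: the function name, the VLAN numbers, and the descriptions are all hypothetical, and it presumes you already know which VLANs a link will allow after the change and which VLANs are critical.

```python
def blocked_critical_vlans(allowed_after_change: set, critical: dict) -> dict:
    """Return the critical VLANs that a change would stop passing.

    allowed_after_change: VLAN ids still permitted on the link after the change.
    critical: VLAN id -> reason it must keep passing (both hypothetical here).
    """
    return {
        vlan: reason
        for vlan, reason in critical.items()
        if vlan not in allowed_after_change
    }


# Made-up values for illustration only; not the VLANs from the actual outage.
critical = {
    900: "routing information between cores and routers",
    901: "routing information between cores and routers",
}
allowed_after_change = {10, 20, 50}

problems = blocked_critical_vlans(allowed_after_change, critical)
for vlan, reason in sorted(problems.items()):
    print(f"VLAN {vlan} would be blocked: {reason}")
```

If the result is not empty, the change would cut off something the rest of the network depends on, and it is worth stopping before applying the script.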
Well, after a slap on the wrist the next day, I have learned my lesson and know what I did wrong. What a night!