Sunday, March 1, 2015

Going #VIRL

I've been playing with a new tool in my networking toolkit lately: Cisco VIRL. It stands for Virtual Internet Routing Lab, and it's network simulation software that runs Cisco VMs of several networking platforms. There are plenty of other simulation (often emulation) solutions for modeling networks, and they are quite helpful for studying, but VIRL brings some key differences to the table.

First, it's designed and published by Cisco, for Cisco. It was formerly an internal testing tool, and it is now available to the public at http://virl.cisco.com. This means that it is running true, current Cisco operating systems through virtualization, as opposed to emulating older versions of routing hardware. While the VIRL Personal Edition software is community-supported, there is an enterprise branch called Cisco Modeling Labs that is fully TAC supported. How does this impact VIRL? We'll have to see, but the fact that a similar software suite is targeted for enterprise customers may mean that VIRL could benefit from the work being done for the enterprise.

Second, it's able to virtualize images from multiple Cisco operating systems. The most popular image, of course, is the virtual IOS. This is a current version of IOS (at time of posting, in the 15.4 train), so testing relatively new software features should not be an issue. However, what if there is a need to simulate an IOS-XE router like the ASR 1000? That's included in VIRL as well. Beyond that, even virtualizing IOS-XR is supported for the routers that are a bit more carrier focused. I'd be doing a disservice if I did not mention that VIRL can run virtualized NX-OS nodes too. I'm quite excited about the ability to fire up labs simulating Nexus 7000s running alongside IOS routers.

Third, there are some pretty fantastic features in the VIRL software that make it really nice to use. For starters, it's very easy to create a topology by dropping routers into a visual network map and then drawing connections between them. After quickly connecting some boxes together, though, there is a lot of additional baseline configuration that needs to be done when bringing up a simulation network. This means thinking through and then implementing IP addressing, enabling services, and even complex steps like setting up a routing protocol such as EIGRP or BGP.



This is where a feature set called AutoNetkit comes into play. VIRL has AutoNetkit support, meaning that through a slick menu system of variable inputs, it is fairly elementary to set up a network with appropriate IP schemas, loopbacks, point-to-point routed connections, and even have it start up already configured to run a dynamic routing protocol. This translates to consistent configuration deployment in the lab and into time savings to lab up the important stuff. VIRL will also provide multi-layer visualization of the network based on the AutoNetkit configurations. It gives the ability to change the network topology views from physical topology all the way to show OSPF area configurations or BGP peering relationships. 

Performance-wise, I have VIRL running on my Macbook Pro, and I've been running test simulations with five or so routers and it's been working fine. I'll continue to mock up some scenarios and try to find the limits that I can do running this on my laptop.

I've just started playing with this tool set, so I know that I have a lot more to learn. I'm very excited to have another tool set available with which to learn and practice. 

Thursday, August 14, 2014

And we're back.

After a very long intermission, I have decided to re-launch this blog. A lot has changed this year, including my employment. Now that I am settled in, and I'm back working down a certification path, I want to continue blogging.

I've added a disclaimer to the site as well. Please note that all content on this blog reflects my own personal views and opinions, and is not representative of my employer.

I've started on the path to achieve my CCIE Data Center certification. This will make a significant impact on the content and topics I'll be posting about, so please be aware.

I want to welcome you back, and I look forward to interacting with you through comments on my future posts!

Thursday, November 14, 2013

Quickly calculating subnets in your head

Subnetting is often a tricky stumbling point for those who are either just getting started in networking, or even for long-time admins who have not had the need to carve different-sized subnets often. Whenever I see subnetting taught, it usually involves binary math, logic gates, and lots of paper space writing down ones and zeros. When asking people who seem to just have it figured out, I often hear "if you do it enough, it just becomes easy."

I don't think any of this is wrong. When learning subnetting, learning it on paper with binary is a great way to understand the theory. I am not, by any means, advocating that this should not be the foundation upon which a network professional builds. That being said, I also don't think many network folks who are carving subnets on a daily basis are imagining all the ones and zeros, AND'ing the address with the mask to get the network ID and AND'ing it with the inverted mask to get the host ID. Part of subnetting becoming easier is finding a way to easily do the math in your head.
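For reference, the binary operation underneath all of this is small enough to show in a few lines of Python. This is only a sketch: the address and mask are arbitrary examples, and split_address is a made-up helper name. The network ID comes from AND'ing the address with the mask, and the host ID from AND'ing it with the inverted mask.

```python
import ipaddress

def split_address(address, mask):
    """Return (network ID, host ID) for a dotted-decimal address and mask."""
    addr = int(ipaddress.IPv4Address(address))
    netmask = int(ipaddress.IPv4Address(mask))
    network_id = addr & netmask                  # keep the bits under the mask's ones
    host_id = addr & (~netmask & 0xFFFFFFFF)     # keep the bits under the mask's zeros
    return (str(ipaddress.IPv4Address(network_id)),
            str(ipaddress.IPv4Address(host_id)))

# Arbitrary example: a host inside a /26
print(split_address("192.168.1.77", "255.255.255.192"))
# ('192.168.1.64', '0.0.0.13')
```

The rest of this post is about getting to the same answers without doing this bit-by-bit.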

Some may just use a subnet calculator tool, as there are many available online. However, I feel that relying on such a tool for anything other than sanity-checking can be time-consuming. When building out or checking networks, I can't imagine having to stop and look up where my subnet boundaries are, or how many hosts (or networks) I should expect to be able to fit inside.

After spending time thinking about a real answer to how I can subnet in my head, I think I've got an answer. I've found that over time, I use the same basic method of calculating subnets in my head, and it works well enough for me. I am sure this is one of those personal preferences -- but hopefully my experience will be helpful to others. Also, the old adage of "if you do it enough, it becomes easy" still applies! Even following mental shortcuts like this requires practice before it is easy and repeatable.

Step 1: Think CIDR
I find that it's easier for me to think in CIDR notation, and then convert after-the-fact into a traditional dotted decimal subnet mask. I'm not sure why, but this just makes it easier in my head. CIDR notation is the forward-slash followed by number of "high" bits in the subnet mask. For instance, the typical subnet for most home networking gear is 192.168.1.0/24. This means that the subnet mask is 255.255.255.0 (or in binary: 11111111.11111111.11111111.00000000). If you count the ones, there are 24 of them.
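If you want to check a conversion between CIDR and dotted decimal, here is a minimal Python sketch. The helper names prefix_to_mask and count_ones are my own, not from any library; only the standard ipaddress module is used.

```python
import ipaddress

def prefix_to_mask(prefix_len):
    """Convert a CIDR prefix length (0-32) to a dotted-decimal subnet mask."""
    mask_int = (0xFFFFFFFF << (32 - prefix_len)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(mask_int))

def count_ones(mask):
    """Count the "high" bits in a dotted-decimal mask."""
    return bin(int(ipaddress.IPv4Address(mask))).count("1")

print(prefix_to_mask(24))            # 255.255.255.0
print(count_ones("255.255.255.0"))   # 24
```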

Step 2: Commit the main boundaries to memory
In the example above, we used /24, which I'll call a main boundary. I consider main boundaries /8, /16, /24, and I suppose /32. Using these boundaries as reference points can help to easily move up or down from a subnetting perspective. 

To explore the concept, let's look at the number of IP's per subnet, and number of subnets per /24, for each longer mask:

/25: 128 IP's, 2 networks per /24
/26: 64 IP's, 4 networks per /24
/27: 32 IP's, 8 networks per /24
/28: 16 IP's, 16 networks per /24
/29: 8 IP's, 32 networks per /24
/30: 4 IP's, 64 networks per /24
/31: 2 IP's, 128 networks per /24
/32: 1 IP, 256 hosts* per /24 (the /32 represents an individual host - called the host mask - not a subnet)

Note that as the subnet got smaller (each one is half the size of the prior), twice as many could fit into the /24.
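The list above is just powers of two, which a short Python sketch can regenerate for any main boundary. The function name subnet_table and its parameters are hypothetical, chosen for this illustration.

```python
def subnet_table(boundary=24, longest=32):
    """For each prefix longer than `boundary`, return
    (prefix, IPs per subnet, subnets per /boundary)."""
    rows = []
    for prefix in range(boundary + 1, longest + 1):
        ips = 2 ** (32 - prefix)                 # halves with every extra mask bit
        per_boundary = 2 ** (prefix - boundary)  # doubles with every extra mask bit
        rows.append((prefix, ips, per_boundary))
    return rows

for prefix, ips, count in subnet_table():
    print(f"/{prefix}: {ips} IPs, {count} per /24")
```

Calling subnet_table(boundary=16, longest=24) gives the same doubling pattern one main boundary up, counted in subnets per /16.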

As shown above, I find it's easier to only "think" one level up or down when possible. What I mean by this is that it is easier to stay in the context of the main boundary either above or below. For instance, when I think of a /24, I think of it as 256 IP's (not all assignable, of course!). In the same respect, I think of a /16 as holding 256 /24's. At that point, if someone were to ask how much space is needed to accommodate 900 /24's, I can divide 900 by 256 to find that I need more than three, but fewer than four, /16's to cover that.

Using this technique, it becomes easier to at least work within some large windows of subnetting to get us at least in the right ballpark.

Step 3: Split the difference
If you look at the example above showing the number of IP's and networks (per /24) in the CIDR /25-/32, does any one of those look easier to remember than the others? To me, /28 sticks right out. Halfway between /24 and /32, it's a good mid-point. Importantly, also note that /28 has the same number of hosts as it has networks: 16.

The same is true for the other midpoints: /20, /12, and /4; each contains 16 subnets of its next longer boundary, and 16 of them are needed to fill its next shorter boundary. In other words, a /20 contains 16 /24's, and there are 16 /20's in a /16. This makes it relatively simple to move up or down from anywhere, without having to keep track of too much in your head.
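The "16 up, 16 down" property of a midpoint is easy to confirm with Python's standard ipaddress module. This is just a sanity check; 10.0.0.0/16 is an arbitrary example block.

```python
import ipaddress

slash16 = ipaddress.ip_network("10.0.0.0/16")

# How many /20's fit in the /16?
slash20s = list(slash16.subnets(new_prefix=20))

# How many /24's fit in the first of those /20's?
slash24s = list(slash20s[0].subnets(new_prefix=24))

print(len(slash20s), len(slash24s))   # 16 16
```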

Step 4: Shift up or down as needed
Now that we have a solid number of well-understood reference points to work from, subnetting can be done in much more digestible chunks.

Take the earlier example: suppose you were asked to provide a supernet, possibly for route summarization, from which one could carve 900 /24's for end-user subnets, starting above 10.25.0.0. We know that there are 256 /24's in a single /16. By dividing 900 by 256, we know the number is "3 point something"; in other words, we need to round up to the next power of two above 3 (counted in /16's) to accommodate this. Thanks to the simplicity of binary, we can count upwards until we pass 3: 1 /16 (/16), 2 /16's (/15), 4 /16's (/14). In order to have over 3 /16's worth of /24's, we had to move to a /14, which gives us 4 * 256 /24's, or 1024 /24's.

Alternatively, another way to solve this would be to again figure out that at least 3 /16's are necessary, so immediately jump to next memorized CIDR - /12. Knowing that a /12 will contain 16 different /16's, we can then work through longer CIDR's to hit one that holds only four /16's. A /13 holds eight /16's, and a /14 holds four /16's. Bingo!
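Either route lands on the same answer, and the rounding step can be written down in Python as a check. This is a sketch; supernet_prefix is a name I made up, and it simply rounds the subnet count up to the next power of two.

```python
def supernet_prefix(count, unit_prefix=24):
    """Prefix length of the smallest block that holds `count`
    subnets of size /unit_prefix."""
    bits_needed = (count - 1).bit_length()   # ceil(log2(count)) without floats
    return unit_prefix - bits_needed

prefix = supernet_prefix(900)
capacity = 2 ** (24 - prefix)
print(prefix, capacity)   # 14 1024
```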

Step 5: Figure out network ID and subnet mask
Following the same example, we now know that we need to assign a /14 somewhere after 10.25.0.0. We know that we could fit four /16's in a /14, so the bit boundaries for /14's are going to be in multiples of four. Starting with 10.0.0.0, the next /14 would start at 10.4.0.0, then 10.8.0.0, and so on.

Thinking of our multiples of four, we know that 24 is a multiple of 4, but 10.24.0.0/14 begins before 10.25.0.0 and we need to start after it. This means that our first /14 after 10.25.0.0 must be 10.28.0.0/14.

Now we know our network ID is 10.28.0.0, and spans through 10.31.255.255. However, we still need that dotted decimal subnet mask! 
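To double-check the boundary hunting, here is a small Python sketch that rounds a starting address up to the next block boundary. The name next_aligned_network is hypothetical, and the function returns the first network at or after the start, which here is the same as "after" because 10.25.0.0 is not itself on a /14 boundary.

```python
import ipaddress

def next_aligned_network(start, prefix_len):
    """First network of the given prefix length that begins at or after `start`."""
    size = 2 ** (32 - prefix_len)                 # addresses per block
    start_int = int(ipaddress.IPv4Address(start))
    aligned = -(-start_int // size) * size        # round up to a multiple of size
    return ipaddress.ip_network(f"{ipaddress.IPv4Address(aligned)}/{prefix_len}")

net = next_aligned_network("10.25.0.0", 14)
print(net)                      # 10.28.0.0/14
print(net.broadcast_address)    # 10.31.255.255
```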

Remember that a /16 is 255.255.0.0 (16 high bits). Taking away one high bit would leave 255.254.0.0 (/15), and then one more would leave 255.252.0.0 (/14). Now, another mental shortcut: since we're being contextual to the /16 -- knowing that four /16's fit into a /14, we can glean the subnet mask by subtracting 4 from 256 in the second octet (the octet we are cutting into by taking away "ones"): 255.(256-4).0.0 = 255.252.0.0. Play with this - it works. For instance, a /20 holds 16 /24's -- so 255.255.(256-16).0 = 255.255.240.0, which is the same as a /20. Neat, eh?
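If you would like to convince yourself that the "256 minus" shortcut holds everywhere, the Python sketch below builds masks that way and compares them with what the standard ipaddress module reports. The helper name mask_from_shortcut is made up for this illustration.

```python
import ipaddress

def mask_from_shortcut(prefix_len):
    """Build a dotted-decimal mask using the '256 minus blocks' shortcut."""
    if prefix_len == 32:
        return "255.255.255.255"                  # no octet left to cut into
    octet_index = prefix_len // 8                 # octet the mask ends in (0-3)
    next_boundary = (octet_index + 1) * 8         # main boundary just below
    blocks = 2 ** (next_boundary - prefix_len)    # e.g. four /16's in a /14
    octets = [255] * octet_index + [256 - blocks] + [0] * (3 - octet_index)
    return ".".join(str(o) for o in octets)

print(mask_from_shortcut(14))   # 255.252.0.0
print(mask_from_shortcut(20))   # 255.255.240.0

# Compare against the library for every prefix length
matches = all(
    mask_from_shortcut(p) == str(ipaddress.ip_network(f"0.0.0.0/{p}").netmask)
    for p in range(33)
)
print(matches)
```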

To summarize now, we know that to provide a supernet of 900 /24's, starting after 10.25.0.0, we will provide:

CIDR: 10.28.0.0/14
Network ID: 10.28.0.0
Subnet mask: 255.252.0.0

Step 6: Practice!
Writing all of this out, it is obvious that this is not a simple, easy answer to mental subnetting. However, with practice, these rules will help to make subnetting a manageable task that can be done without the need for calculators or tools. Again, this is not a substitute for understanding the theory of how to subnet; it is simply a way that I find it easier to run calculations without having to scratch out a bunch of notes on paper.

Whether you follow this method of making the math more bite-sized, or another method, it is important to practice it thoroughly. When the time comes to complete subnetting tasks, being able to do it in your head on the fly will not only impress others but will also save you time.

Thursday, August 1, 2013

Replacing a failed VSS supervisor

Several years ago now, Cisco introduced the Catalyst 6500 "Virtual Switching System" (VSS), and it has become a very popular deployment model. VSS allows two separate Cisco 6500 chassis to be paired together and managed as one.

I'm a fan of this technology, as it removes the loops from a redundant layer-2 topology. However, there may come a time when a supervisor module will fail, and hopefully the impact is minimal! I've seen times when a SUP720 has gone belly-up with minimal impact to the users, and other times when the impact was less than graceful. Luckily, there have been more of the former than the latter. While on the subject of failures in VSS -- I'll take a second to encourage you to make sure that some method of dual-active detection is in place!

Cisco has a published guide to replacing a failed supervisor in a VSS system, and I would definitely recommend reading it over prior to trying to replace a failed supervisor. It's entitled "Replace Supervisor Module in Cisco Catalyst 6500 Virtual Switching System 1440" (if the link eventually breaks, hopefully your favorite search engine will find it based on the title) so please read this first before proceeding. After doing this enough, I came up with a modified procedure that I feel covers all of the bases and can make this stressful process go a bit more smoothly.

The reason I have written up a different version is that there are certain aspects of the official guide that I feel could be more conservative from a risk standpoint. For instance, it recommends connecting the VSL's prior to having the replacement supervisor ready to boot in VSS mode. While this may be harmless, I'd rather avoid the situation, and with it the risk of a stale configuration impacting the active VSS member.

Without further ado, these are the steps I have followed to replace failed supervisors in 6500 VSS. I do welcome any comments or questions.

Prepare Your Laptop

  1. Copy IOS binary and running-configuration from active VSS; save to laptop
  2. Check the active VSS chassis' switch number:

    LAB-VSS-6500-0-15#switch read switch_num local
    Read SWITCH_NUMBER from Active rommon is 1
    LAB-VSS-6500-0-15#

  3. Set up static IP (10.1.1.1/30) on laptop
  4. Ensure you have some sort of file transfer server available on the laptop (SCP, FTP, TFTP, HTTP, etc.)

Prepare the New Supervisor

  1. Procure spare/new Sup720 supervisor (VS-S720-10G or such model)
  2. Remove any transceivers or cables from the failed supervisor, then remove the failed supervisor
  3. Pull any other linecards in the chassis out a few inches -- voila -- it's a "spare chassis" now!
  4. Insert spare/new supervisor and connect laptop to console port
  5. Connect laptop to copper interface on supervisor (i.e. Gi1/3) and then configure the port as a routed port with 10.1.1.2/30 as the IP.
  6. Set the VSS switch number to the opposite of the number from the active chassis (either 1 or 2, remember to make it the opposite!) - switch set switch_num 2
  7. Validate the setting via switch read switch_num local
  8. Check version of code the switch has on it, on the same filesystem as the active has its code. i.e. dir sup-bootdisk:
  9. If necessary, copy IOS binary from laptop to filesystem, then validate with "verify" command
  10. Ensure configuration register is set to 0x2102 with: show ver | inc register (if not, set it in config mode and then save config)
  11. Copy over the active supervisor's running-configuration (already saved to your laptop) to the new sup's startup-configuration
  12. Confirm show bootvar is correct - pay attention to confreg and boot image
  13. Power down the chassis!
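As a quick sanity check on the addressing used for the temporary laptop-to-supervisor link in the steps above, the two addresses are exactly the two usable hosts of a /30. A minimal Python sketch using the standard ipaddress module:

```python
import ipaddress

link = ipaddress.ip_network("10.1.1.0/30")

# A /30 has four addresses: network, two usable hosts, broadcast
usable = [str(host) for host in link.hosts()]
print(usable)   # ['10.1.1.1', '10.1.1.2']
```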

Bring up the Chassis

  1. Slide all linecards back in; insert transceivers and cables as appropriate to new supervisor. Ensure VSL's are connected!
  2. Power up the chassis
  3. On active chassis, issue these commands until satisfied everything has come back as expected
    • show switch virtual redundancy // watch for 2nd chassis to come up and enter SSO
    • show switch virtual link // validate the VSL's come up alright
    • show switch virtual dual-active pagp // check to ensure dual-active detection is enabled. If using something other than enhanced pagp for this, substitute command as appropriate
    • show logging // ensure VSS is coming up and nothing else is going wrong
    • show etherchannel summary // make sure those multichassis etherchannels fill back up

This isn't a short procedure by any means, but it is relatively straightforward. Again, I have linked the official Cisco doc for replacing a failed supervisor -- please read it! My steps here are, of course, "take at your own risk." That being said, this procedure has worked well for me on several occasions.

Thursday, July 25, 2013

Determining exact route when using ECMP

Many enterprise Cisco networks take advantage of multiple, redundant routed paths between routers or layer-3 switches. Certain routing protocols make it easy to achieve not only redundancy but also higher throughput by utilizing multiple links via equal-cost multipath (ECMP). This means that the routing table contains multiple entries, with the same administrative distance and the same metric, for the same network. The router then balances traffic across those links using CEF (Cisco Express Forwarding). OSPF, paired with a symmetrical design, lends itself well to ECMP links. EIGRP will also use equal-cost paths, and may require variance to be configured if the path metrics are not identical.

This works great, but because of this, the normal troubleshooting commands to find the exact path of a packet are somewhat ambiguous. Traditionally in a non-ECMP environment, even if there are multiple routes to a destination, the best route wins and is inserted into the routing table. Therefore, a simple "show ip route" can give you the path over which a particular flow is leaving the router. However, when there is more than one path in the routing table for a particular network, "show ip route" will show all of them. If you need to know for sure which link is being used, there is a way to do this.

In IOS, the way to determine the exact route a particular flow is taking, is to utilize the command "show ip cef exact-route". This command will take a source IP, destination IP, and give the outgoing interface from which the packet will leave. Below is an example:

TestLab-6500#show ip route 172.16.0.5
Routing entry for 172.16.0.5/32
  Known via "ospf 1", distance 110, metric 21, type intra area
  Last update from 10.0.35.74 on TenGigabitEthernet4/1, 11:16:48 ago
  Routing Descriptor Blocks:
  * 10.0.35.74, from 172.16.0.5, 11:16:48 ago, via TenGigabitEthernet1/1
      Route metric is 21, traffic share count is 1
    10.0.35.72, from 172.16.0.5, 11:16:48 ago, via TenGigabitEthernet2/1
      Route metric is 21, traffic share count is 1

TestLab-6500#
TestLab-6500#show ip cef exact-route 10.0.0.10 172.16.0.5
10.0.0.10 -> 172.16.0.5 => IP adj out of TenGigabitEthernet1/1, addr 10.0.35.74
TestLab-6500#
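To see why the same source/destination pair always lands on the same link, here is a toy Python sketch of per-flow hashing. To be clear, this is NOT the hash CEF actually uses (that is platform-specific); the CRC32 hash, the seed parameter, and the pick_path name are all assumptions made for illustration. The idea is only that the flow's addresses are hashed and the result picks one of the equal-cost paths.

```python
import ipaddress
import zlib

def pick_path(src, dst, paths, seed=0):
    """Toy per-flow path selection: hash (src, dst), then index into `paths`.
    Illustrative only; real CEF hashing differs."""
    key = ipaddress.IPv4Address(src).packed + ipaddress.IPv4Address(dst).packed
    bucket = zlib.crc32(key, seed) % len(paths)
    return paths[bucket]

paths = ["TenGigabitEthernet1/1", "TenGigabitEthernet2/1"]

# The same flow always hashes to the same outgoing interface
first = pick_path("10.0.0.10", "172.16.0.5", paths)
second = pick_path("10.0.0.10", "172.16.0.5", paths)
print(first, first == second)
```

Because the answer depends on the hash rather than on the routing table alone, the router itself has to be asked, which is exactly what the exact-route command does.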


For those of you who have found the Nexus line to be fitting for L3 switching, the same basic functionality is there in NX-OS. The command syntax has changed slightly -- "show routing hash":

N7K-R1# show ip route 172.16.0.5
IP Route Table for VRF "default"
'*' denotes best ucast next-hop
'**' denotes best mcast next-hop
'[x/y]' denotes [preference/metric]

172.16.0.5/32, ubest/mbest: 2/0
    *via 10.0.4.121, Eth2/1, [110/31], 5d05h, ospf-1, intra
    *via 10.0.8.121, Eth1/1, [110/31], 15w4d, ospf-1, intra
N7K-R1#
N7K-R1#
N7K-R1# show routing hash 10.0.0.10 172.16.0.5
Load-share parameters used for software forwarding:
load-share mode: address source-destination port source-destination
Universal-id seed: 0x9617c10
Hash for VRF "default" resulting hash: 0x01 path '>'

172.16.0.5/32 unicast forwarding path(s) 2
  *via 10.0.4.121%Ethernet2/1
> *via 10.0.8.121%Ethernet1/1

Route:
N7K-R1#


While most of the time, it's acceptable to just know that it is taking one of several redundant paths, sometimes it is necessary to know the specific path a packet is taking. Hopefully this will serve as a quick reference during such events.

Tuesday, July 9, 2013

Linux software for networking geeks

Sometimes I feel that without the ability to run Linux, my work life would be much harder. Not that I use Linux 100% of the time or anything, but there are just certain things that are simply much more efficient in Linux.

I may be biased, but I believe a number of tools stand out when it comes to network troubleshooting. While I am sure some of these are available in Windows as well, I know they're available for Linux. Heck, some of what I use most are straight up bash shell or common command-line tools. The following list is in no particular order and is definitely not all-inclusive. It's just the list of tools that are top of mind for me right now.

Also, I'm hoping that this can serve simply as an introduction/primer to these different tools. If these sound interesting to you, there are plenty of great resources online that dive deep into each of these tools, or you can take some time to experiment and play with them. Furthermore, I'd definitely appreciate others sharing their favorite network-centric Linux software in the comments -- I love to learn what else is out there!

GNU Screen (or tmux, for those screen haters)

There are very few programs that have made as big of an impact on my workflow as screen. Simply put, it allows you to run multiple persistent bash shells that can be attached and detached at will. This means that I can SSH to a Linux server from my desktop, and once connected, start screen. I can create several windows that I can switch between, without having to open a bunch of different terminal windows (or putty windows, etc).

When I get distracted from the task at hand, I feel like it is hard to get back into the same mindset. Screen does a great job of jump-starting you back into your work, as you can easily pick up right from where you left off. Let's say I'm in the midst of building a configuration, so I've got vim open in one window and I'm ssh'd to a lab router from another. I could detach from my screen session, close my SSH connection, go to a meeting, sit down at another PC, SSH back in and re-attach to screen. My vim would still be open in one window and I'd still have my router connection in the other (assuming I don't set timeouts on my lab gear).

There are other great possibilities that come from having persistent, always running sessions available via screen. Anyone still use instant messaging or IRC? There are several text-based IM/IRC clients that run fine within screen. This means that you can always be 'online' in some way, shape, or form, and then attach in every so often to catch any missed messages and/or engage in conversation.

Another smaller corner-case use of screen (sorry tmux folks, I don't believe this is supported) -- terminal emulator via serial. Sure, there's minicom and other useful tools, but screen has this function built right in. I can hop on a router console via my USB->Serial adapter by simply running 'screen /dev/ttyUSB0' on my laptop.

The following table contains the basics of using screen. I highly recommend reading man pages or some more detailed documentation on the Internet, but this should at least get you started. As with anything else, it may take some time to get used to, but it is absolutely worth it.

GNU Screen cheat sheet

// Launch screen into its default shell
user@host# screen

// Start a program in screen, i.e. irssi
user@host# screen irssi

// Check for active screen session
user@host# screen -list

// Reattach screen (detach first if needed)
user@host# screen -d -r

// Multiuser attach (aka attach without detaching other instance)
user@host# screen -x

// Multiuser attach (but detach other instance first)
user@host# screen -dx

// Control from within screen
[Ctrl-a + d] Detach from screen
[Ctrl-a + ?] Display key bindings
[Ctrl-a + c] Create new window
[Ctrl-a + A] Rename current window
[Ctrl-a + Ctrl-a] Switch to most recent window 
[Ctrl-a + #] (# = 0, 1, 2 etc) Switch to window number # (numbering starts at 0)
[Ctrl-a + p] Switch to previous window
[Ctrl-a + n] Switch to next window
[Ctrl-a + "] Bring up menu of windows to choose from
[Ctrl-a + k] Kill current window
[Ctrl-a + Esc] Enter copy mode (read up on this before using)



mtr

First called "Matt's Traceroute" (not me!) and now "My Traceroute" -- MTR is a super useful traceroute tool that beats the pants off the traditional 'traceroute/tracert' tools. Instead of just tracing the route and reporting three response times per hop, it continually pings each hop to measure latency and loss. While traceroutes are not always super accurate due to firewall filtering, this tool makes the best of it. Since mtr uses ICMP as opposed to UDP like the standard 'traceroute' tool, it often seems to return better results. Thanks to +Mike Neir for the reminder on that last bit!

blogger@bloggy:~$ mtr -n 8.8.8.8

                              My traceroute  [v0.80]
bloggy (0.0.0.0)                                       Wed Jul  3 04:25:50 2013
Keys:  Help   Display mode   Restart statistics   Order of fields   quit
                                       Packets               Pings
 Host                                Loss%   Snt   Last   Avg  Best  Wrst StDev
 1. 111.111.111.111                   0.0%    10    0.0   0.0   0.0   0.0   0.0
 2. 111.111.111.112                   0.0%    10    1.0   0.6   0.5   1.0   0.2
 3. 111.111.111.113                   0.0%    10    0.2   0.2   0.2   0.2   0.0
 4. 111.111.111.114                   0.0%    10   24.8  19.6  13.3  24.8   4.1
 5. 207.88.14.194                    55.6%    10   12.8  12.8  12.8  13.0   0.1
 6. 216.156.72.30                     0.0%    10   80.7  72.5  55.0  86.5  11.8
 7. 209.85.254.128                    0.0%    10   13.5  13.7  13.0  17.3   1.3
 8. 72.14.237.130                     0.0%    10   13.2  13.3  13.1  13.5   0.1
 9. 72.14.238.106                     0.0%    10   29.6  29.8  28.2  41.8   4.2
10. 216.239.46.191                    0.0%    10   39.4  38.0  25.9  56.3  11.8
11. ???
12. 8.8.8.8                           0.0%     9   26.0  26.0  25.9  26.0   0.0


hping3

The next tool I want to mention is somewhat of a swiss army knife of ping tools. It can operate in ICMP, TCP, UDP modes, as well as a fairly functional port scanning mode. It also gives plenty of customization options with regard to flags, spoofing addresses, etc. This is one of the more common tools I use in lab, but it can often come in handy in day-to-day troubleshooting as well. The ability to spoof source addresses makes it an invaluable tool to test firewall rules.

hping3 can also be used to test QoS policies, if policies match on DSCP values. The key to this, of course, is that the trust boundary needs to move down to the port. Otherwise the markings will be rewritten to 0 on ingress. See an example below of how to test QoS policy matching on DSCP value EF, typically used for VoIP payload. Along with the hping syntax, we'll also do some command line binary/hex/decimal conversion.

// hping3 takes ToS, not DSCP
// DSCP = 6 most significant bits of 8-bit ToS field
// Convert DSCP 46 to binary
blogger@bloggy:~# echo "obase=2; 46" | bc
101110

// Pad the DSCP out to find ToS value
blogger@bloggy:~# echo "ibase=2; 10111000" | bc
184

// Convert the ToS value to hex for use with hping3
blogger@bloggy:~# echo "obase=16; 184" | bc
B8

// Set up tshark to capture and validate DSCP from hping3
blogger@bloggy:~# tshark -c1 -i wlan0 -f 'host 8.8.8.8' -T fields -e ip.dsfield.dscp 2>/dev/null | awk '{ print strtonum($0) }'

// In another terminal, send packet with DSCP 46
blogger@bloggy:~# hping3 --udp -c 1 -o b8 -p 53 8.8.8.8 
HPING 8.8.8.8 (wlan0 8.8.8.8): udp mode set, 28 headers + 0 data bytes

--- 8.8.8.8 hping statistic ---

1 packets tramitted, 0 packets received, 100% packet loss
round-trip min/avg/max = 0.0/0.0/0.0 ms

// Check other terminal to see if captured DSCP value is 46
blogger@bloggy:~# tshark -c1 -i wlan0 -f 'host 8.8.8.8' -T fields -e ip.dsfield.dscp 2>/dev/null | awk '{ print strtonum($0) }' 
46
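The bc conversions above boil down to a two-bit shift, since DSCP occupies the six most significant bits of the ToS byte. Here is the same arithmetic as a small Python sketch; the function names dscp_to_tos and tos_to_dscp are my own.

```python
def dscp_to_tos(dscp):
    """Shift a 6-bit DSCP value into the upper bits of the 8-bit ToS byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP must be between 0 and 63")
    return dscp << 2

def tos_to_dscp(tos):
    """Recover the DSCP value from a ToS byte (drops the two low-order bits)."""
    return (tos & 0xFF) >> 2

tos = dscp_to_tos(46)           # EF
print(tos, format(tos, "x"))    # 184 b8
print(tos_to_dscp(0xB8))        # 46
```

The hex form is what gets passed to hping3's -o flag in the example above.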


tcpdump / Wireshark / tshark

The next set of tools are a must-have in any networking toolkit. These three tools all allow for the capture and displaying of network traces. I find myself using tcpdump most often from the command-line to capture packets to a file, tshark if I need to search/manipulate/trigger off the results, and Wireshark as a general capture or diagnostic tool. Wireshark has a nice graphical user interface to assist in capturing, displaying, filtering, and analyzing results of a trace. It's probably the best place to start if looking to learn how to really dig into network conversations. Of course, it is also great for advanced analysis of captures.

tcpdump is tried and true, and it is often included in the base installation of most Linux distributions. The prevalence of tcpdump makes it extremely convenient to either snag a capture or to quickly analyze the results of one. It is fairly straightforward to use; typically all it takes is specifying an interface, deciding whether or not to write the capture to a file, and optionally specifying capture filters.

tshark is a lot like tcpdump, in that it's a CLI-based packet capture utility. However, it has many of Wireshark's expansive display filtering options available as well. An example of this is seen in the hping3 command output above, as we used tshark to validate whether hping3 was truly sending the DSCP value we expected. Note the syntax of the tshark command used:
  • "-c1" means stop after one packet
  • "-i wlan0" specifies wlan0 as the capture interface
  • "-f 'host 8.8.8.8'" specifies the tcpdump-syntax capture filter to use
  • "-T fields -e ip.dsfield.dscp" tells tshark to output using only specified fields, of which only the DSCP field is specified.
Since we're discussing that command output, note that we piped the output of tshark into awk in order to convert the hexadecimal output from tshark into a decimal number. And hence, we see the immediate value of tshark in certain scenarios; by having much more granular control over the output, tshark makes exporting certain pieces of information from a large capture file much easier.

netcat

It would be hard to call any networking toolkit complete without netcat. Again, like all other tools on this list, there are plenty of in-depth tutorials out there. I highly recommend taking some time to learn this tool, as it is quite handy in connectivity testing. netcat can be used to either initiate or listen for incoming TCP or UDP connections. On the surface, this alone makes netcat pretty useful. Previous posts have shown the use of netcat as a UDP listener for syslogs, although the command has not been explained.

To start netcat as a client initiating a connection, just specify the IP address and port of the server. If you're planning to use it in an interactive mode, I'd recommend throwing the -v or even -vv flags to increase verbosity of the output. To start as a listener, you'll want to specify the -l flag for listening and the -p flag to specify the listening port. Outside of these basics, netcat can become a much more flexible tool. Since it sets up the framework for basic network connectivity, it can be used as a conduit for transferring files, passing a bash shell, etc. While there are plenty of tools that can meet these needs, netcat can often fit the bill. In the example below, we'll use netcat on one machine without tshark to send a packet capture to another machine that has tshark.

// read pcap contents into netcat listen server
blogger@bloggy:~# nc -lp 3333 < file.pcap

// on second box, run netcat to connect to bloggy
// pipe output to tshark to see results of analysis
blogger@blogged:~# nc bloggy 3333 | tshark -r - -T fields -e ip.src -e udp.srcport -e ip.dst -e udp.dstport -e frame.time_relative

10.21.177.12 55306 10.21.177.223 69 0.000000000
10.21.177.223 39720 10.21.177.12 55306 0.000756000
10.21.177.12 55306 10.21.177.223 39720 0.004289000
10.21.177.12 2834 10.21.177.223 69 38.648645000
10.21.177.223 36855 10.21.177.12 2834 38.649119000
10.21.177.12 2834 10.21.177.223 36855 38.649273000
10.21.177.223 36855 10.21.177.12 2834 38.649354000
10.21.177.12 2834 10.21.177.223 36855 38.649692000
10.21.177.223 36855 10.21.177.12 2834 38.649747000
10.21.177.12 2834 10.21.177.223 36855 38.650081000
10.21.177.223 36855 10.21.177.12 2834 38.650128000
10.21.177.12 2834 10.21.177.223 36855 38.650458000
10.21.177.223 36855 10.21.177.12 2834 38.650511000
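
To make it clearer what netcat is doing in the transfer above, here is a rough Python sketch of the same idea: one side listens on a TCP port and pushes a blob of bytes to the first client that connects, and the other side connects and reads until the sender closes. The function names are hypothetical and the demo runs both ends on the loopback address; it is only meant to illustrate the mechanism, not to replace netcat.

```python
import socket
import threading


def serve_once(payload: bytes, ready: threading.Event, port_box: list) -> None:
    """Roughly 'nc -lp PORT < file': accept one TCP client, send payload, close."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        srv.listen(1)
        port_box.append(srv.getsockname()[1])
        ready.set()  # tell the client side that the listener is up
        conn, _addr = srv.accept()
        with conn:
            conn.sendall(payload)


def fetch(host: str, port: int) -> bytes:
    """Roughly 'nc HOST PORT': connect and read until the sender closes."""
    chunks = []
    with socket.create_connection((host, port), timeout=5) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # empty read means the sender closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)


def demo_transfer(payload: bytes) -> bytes:
    """Run listener and client on loopback and return what the client received."""
    ready = threading.Event()
    port_box: list = []
    server = threading.Thread(target=serve_once, args=(payload, ready, port_box))
    server.start()
    ready.wait(timeout=5)
    received = fetch("127.0.0.1", port_box[0])
    server.join(timeout=5)
    return received
```

In the netcat version, the bytes that come out of the client side are simply piped into tshark instead of being returned from a function.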


tcpreplay / Bit-Twist

The next set of tools is fairly specialized and very good for network testing. Both tcpreplay and Bit-Twist allow you to place the contents of a packet capture back on the wire. This is not something you need on a regular basis, but when it's needed, it's very helpful. Personally, I've used these tools most often to test either firewall policies or quality of service policies.

QoS policy testing is actually a pretty neat scenario to discuss. Most of the time when building a QoS policy, it's somewhat hard to test the functionality prior to putting production traffic across it. Aside from marking test traffic with DSCP values and shifting the trust boundary, generating representative traffic is not trivial. However, with Bit-Twist or tcpreplay, you can replay previously captured traffic that includes known quantities of the traffic types that match certain QoS queues. This means that if you have class-maps that match based on IP ACLs, NBAR inspection, etc., you can actually test these features with "real" traffic.

tcpreplay, in its default mode, will replay a pcap file back onto the network with the same headers and payload as it was captured, at the same rate at which it was captured. Note that contrary to its name, it will have no problem replaying other protocols aside from TCP. There are options to increase speed and modify headers if necessary, as well.
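
The "same rate at which it was captured" idea boils down to preserving the gaps between packet timestamps. The sketch below is my own illustration of that scheduling logic, not tcpreplay's actual code; the speed parameter is a hypothetical stand-in for the tool's speed options.

```python
def replay_delays(timestamps, speed=1.0):
    """Return how long to wait before sending each packet.

    timestamps: capture times in seconds, in capture order.
    speed: hypothetical multiplier; 2.0 replays twice as fast as captured.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not timestamps:
        return []
    delays = [0.0]  # the first packet goes out immediately
    for previous, current in zip(timestamps, timestamps[1:]):
        delays.append((current - previous) / speed)
    return delays


if __name__ == "__main__":
    # Three packets captured at 0.0 s, 0.5 s and 2.0 s.
    print(replay_delays([0.0, 0.5, 2.0]))       # [0.0, 0.5, 1.5]
    print(replay_delays([0.0, 0.5, 2.0], 2.0))  # [0.0, 0.25, 0.75]
```

A real replay tool then sleeps for each delay and writes the stored frame to the interface unchanged, which is why protocols other than TCP replay just as well.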

Bit-Twist also allows for modification and replaying of packet captures, with a pretty heavy focus on the rewrite capabilities. I have used both Bit-Twist and tcpreplay several times over the years, and they both work well. I can't say that I would necessarily recommend one over the other.

Kismet and Aircrack-ng

Between these two software packages, you can gain a very good understanding of all things 802.11! Kismet is a passive wireless network scanner/sniffer that does a great job of mapping out the wireless environment. It will detect and report wireless access points, wireless clients, SSIDs, etc. It will even write out packet capture files of all the data it collects! Since most wireless cards in Linux can be placed in monitor mode, these captures will include the 802.11 control frames that would normally be invisible to a Windows wireless packet capture.

Aircrack-ng is a software suite that provides a number of tools that can be used to test the security of a wireless network. I won't go too deep into this software package, but I feel that it provides several good functions for a network admin. First, it has some basic wireless scanning functions, with control/filtering over the scanning parameters. Furthermore, especially in a lab environment, it can provide a great platform for learning more about the way wireless authentication, authorization, and encryption work.

Conclusion

I am sure there are several more tools that went without mention, but hopefully this list sparks some interest. The only other 'toolset' that I really wanted to at least mention is the Linux command line environment in general. Using bash with programs like grep, awk, sed, and xargs allows for quick and simple scripting. The more I learn, the more I find myself in bash to perform data manipulation.

Finally, it should go without saying, but many of these tools can impact your production network! Please use caution when experimenting.

Sunday, June 30, 2013

Lab: Policy-based routing (Part 2)

Introduction

This lab is the continuation of our previous PBR lab, so if you haven't read through it yet, please read Lab: Policy-based routing (Part 1) first. In Part 1, we took our OSPF topology and forced our "VoIP" traffic across a serial link, leaving the rest of the traffic to use our "point to point wireless" link. In order to accomplish this, we used a concept called Policy-Based Routing (PBR). This allowed us to cherry-pick traffic and force it across a link other than what our OSPF-driven routing table would have chosen.

In today's lab, we'll take it a few steps further. First, a quick review of why one might choose to avoid PBR: there are some definite support issues. Since we are manually forcing traffic onto a different path than the routing table would normally choose, we open ourselves up to minor issues like inefficient routing and, in worst-case scenarios, an increased likelihood of routing loops. Furthermore, when PBR is implemented, the normal diagnostic commands we are all used to (show ip route, for instance) become potentially inaccurate. Remember that PBR in effect intercepts the routing decision from the routing table, so it can be tricky to understand what routing is taking place and, in turn, to track where traffic is actually going.

We'll tackle some of these issues as we walk through the following sections:
  • Gain better visibility of traffic flow throughout our topology
  • Test failure scenarios to see how PBR reacts
  • Configure PBR to behave more dynamically
  • Re-test any failure scenarios that previously did not recover
Please reference the Part 1 lab mentioned at the top of this post for configurations up to this point. With any luck, by the time we are done with part 2, we'll have a better understanding of how we can make our PBR topology a bit more resilient and transparent.

Gaining visibility

In the previous lab, we used commands like 'show ip policy' and 'show route-map' to see what traffic would be policy-routed. However, if the policy routing is applied to multiple interfaces (as is the case with Wan2 router), the packet counts on the 'show route-map' output can quickly become useless.


One way to gain better visibility is to take advantage of a feature set IOS already has for efficient identification of traffic: Quality of Service (QoS) configuration. While this lab will not dive into QoS, we will take advantage of the QoS command framework in IOS to at least give us some better visibility into the traffic on our network. In a real-world scenario, we would likely want to use QoS to help protect important traffic regardless, so the visibility would already be in place.

We'll add the following configuration to all interfaces on Wan1, Wan2, and Branch1 to best understand where traffic is entering and exiting these routers. Since these are the three routers where we have PBR implemented, it makes the most sense to get a better view of this traffic there. The example below shows adding the QoS class-map and policy-map to the router's global config, and then applying the policy to one particular interface. Remember, we'll be applying this interface-level configuration to each non-loopback, IP-addressed interface on Wan1, Wan2, and Branch1. Also, an additional access-list is being added; it represents VoIP traffic flowing in the opposite direction from what was defined for PBR. The example below would be for the Wan1 or Wan2 side.

!
ip access-list extended VoIP-Incoming
 10 permit udp 172.16.10.0 0.0.0.255 any range 16384 32768
!
class-map match-any PBR-VoIP
 match access-group name PBR-VoIP-to-T1-ACL
 match access-group name VoIP-Incoming
!
policy-map PBR-Counters
 class PBR-VoIP
 class class-default
!
interface FastEthernet2/0
 service-policy input PBR-Counters
 service-policy output PBR-Counters
!
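
Since the same two service-policy lines have to be repeated on every routed interface of three routers, it can be handy to generate the interface stanzas rather than type them. The following is a small sketch of that; the interface lists are taken from the show output in this post and may not be complete for your copy of the lab, and the function name is hypothetical.

```python
# Interfaces as they appear in the "show policy-map interface" output in this
# post. Assumption: these are all of the non-loopback routed interfaces.
INTERFACES = {
    "Wan1": ["GigabitEthernet1/0", "FastEthernet2/0",
             "GigabitEthernet3/0.1", "GigabitEthernet3/0.5"],
    "Wan2": ["GigabitEthernet1/0", "Serial2/0",
             "GigabitEthernet3/0.1", "GigabitEthernet3/0.5"],
    "Branch1": ["FastEthernet1/0", "Serial2/0", "GigabitEthernet3/0"],
}


def render_counters_config(interfaces, policy="PBR-Counters"):
    """Build the interface-level config that attaches the counting policy
    in both directions on each named interface."""
    lines = []
    for name in interfaces:
        lines.append("!")
        lines.append(f"interface {name}")
        lines.append(f" service-policy input {policy}")
        lines.append(f" service-policy output {policy}")
    lines.append("!")
    return "\n".join(lines)


if __name__ == "__main__":
    for router, names in INTERFACES.items():
        print(f"! ------ {router} ------")
        print(render_counters_config(names))
```

The global pieces (access-list, class-map, policy-map) still need to be pasted once per router before these stanzas will be accepted.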


Now that we have this configuration in place, we can go ahead and test the generation of "VoIP" traffic from Core1, and see where it goes. Remember that for the sake of simplicity in the lab environment, we tricked IOS syslog into sending its logs to the TestPC (172.16.10.10) off Branch1, using a nonstandard UDP port that falls within the VoIP RTP range. Below, witness the "VoIP" traffic being generated by Core1 and received by the Test PC:

! ------ Initiate Traffic from Core1 ------
!
Core1#send log Testing PBR with all links up
Core1#
*Jun 18 13:30:19.047: %SYS-2-LOGMSG: Message from 0(): Testing PBR with all links up
Core1#
!
!
! ------ See Traffic at TestPC ------
!
root@bt:~# nc -luvvnp 16390
listening on [any] 16390 ...
connect to [172.16.10.10] from (UNKNOWN) [10.0.11.4] 57473
<186>25: *Jun 18 13:30:19.047: %SYS-2-LOGMSG: Message from 0(): Testing PBR with all links up
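
As an aside, the <186> at the front of the received message is the syslog priority value, which encodes facility and severity as facility * 8 + severity. Here is a short Python sketch that decodes it, along with a bare-bones UDP listener comparable to the netcat command above. The function names are mine, and the listener is only a sketch of what nc -lu does on port 16390.

```python
import re
import socket

PRI_RE = re.compile(r"^<(\d{1,3})>")


def decode_pri(message: str):
    """Return (facility, severity) from a syslog message's <PRI> header."""
    match = PRI_RE.match(message)
    if match is None:
        raise ValueError("no syslog PRI header found")
    return divmod(int(match.group(1)), 8)


def listen_once(port: int = 16390, bufsize: int = 2048) -> str:
    """Wait for a single UDP datagram on the given port and return its text."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(("0.0.0.0", port))
        data, _source = sock.recvfrom(bufsize)
    return data.decode("ascii", errors="replace")


if __name__ == "__main__":
    sample = ("<186>25: *Jun 18 13:30:19.047: %SYS-2-LOGMSG: "
              "Message from 0(): Testing PBR with all links up")
    print(decode_pri(sample))  # (23, 2)
```

Facility 23 is local7 and severity 2 is "critical", which lines up with the 2 in %SYS-2-LOGMSG.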


Now that we have seen that the traffic successfully routes (no surprise), it's time to confirm that the path is as expected. Note that nothing has really changed since the end of Part 1, so all we are accomplishing is validating the path with a different set of commands. The following output shows the "show policy-map interface" command issued on each router in succession. Note that the output is being filtered at the IOS command prompt to make it a bit more legible, and I've also cut the output for completely irrelevant interfaces. The output is a bit long, so pay attention to the color highlighting (the non-zero PBR-VoIP counters) to help pick out the important pieces.

! ------ Check QoS Stats on Wan1 ------
!
Wan1#show policy-map interface | inc /|Service|Class|,
 GigabitEthernet1/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 135 bytes
        1 packets, 135 bytes
    Class-map: class-default (match-any)
      491 packets, 46354 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      1021 packets, 101657 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 FastEthernet2/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      478 packets, 45100 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      1004 packets, 96847 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 GigabitEthernet3/0.1 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      495 packets, 46794 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 135 bytes
        1 packets, 135 bytes
    Class-map: class-default (match-any)
      569 packets, 73868 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
Wan1#
!
!
! ------ Check QoS Stats on Wan2 ------
!
Wan2#show policy-map interface | inc /|Service|Class|,
 GigabitEthernet1/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      512 packets, 48548 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      1068 packets, 94670 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 Serial2/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      498 packets, 42084 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 125 bytes
        1 packets, 125 bytes
    Class-map: class-default (match-any)
      1043 packets, 74844 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 GigabitEthernet3/0.1 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 135 bytes
        1 packets, 135 bytes
    Class-map: class-default (match-any)
      516 packets, 48976 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      596 packets, 70163 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
Wan2#
!
!
! ------ Check QoS Stats on Branch1 ------
!
Branch1#show policy-map interface | inc /|Service|Class|,
 FastEthernet1/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      811 packets, 76518 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      1702 packets, 148025 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 Serial2/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 125 bytes
        0 packets, 0 bytes
        1 packets, 125 bytes
    Class-map: class-default (match-any)
      811 packets, 68473 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      1686 packets, 115744 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 GigabitEthernet3/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      1 packets, 43 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 135 bytes
        0 packets, 0 bytes
        1 packets, 135 bytes
    Class-map: class-default (match-any)
      1692 packets, 145181 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
Branch1#


By following the blue-colored output (the non-zero PBR-VoIP counters), it is apparent which paths were used and which were not used for the "VoIP" traffic. As expected, the path matched the last section of Part 1. The packet followed OSPF from Core1 to Wan1, at which point it matched our PBR policy and, instead of following OSPF directly to Branch1, was routed across to Wan2. Again, due to inbound PBR at Wan2, the packet ignored the routing table and was routed across the T1 to Branch1.
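
Picking the non-zero counters out of that much output by eye gets old quickly, so here is a rough Python sketch that does it from pasted text. It assumes the filtered format shown above, and it treats the first "N packets" line after a class-map header as that class's total (Branch1 prints extra per-match lines underneath, which are ignored). The function name and parsing rules are my own.

```python
import re

COUNT_RE = re.compile(r"^\d+ packets, \d+ bytes$")


def voip_hits(show_output: str, class_name: str = "PBR-VoIP"):
    """Return (interface, direction, packets) for every non-zero class counter."""
    hits = []
    interface = direction = None
    want_count = False
    for raw in show_output.splitlines():
        line = raw.strip()
        if not line or line == "...":
            continue
        if line.startswith("Service-policy"):
            direction = line.split()[1].rstrip(":")  # "input" or "output"
            want_count = False
        elif line.startswith("Class-map:"):
            want_count = line.split()[1] == class_name
        elif COUNT_RE.match(line):
            if want_count:
                packets = int(line.split()[0])
                if packets > 0:
                    hits.append((interface, direction, packets))
                want_count = False  # skip the per-match sub-counters
        elif "/" in line and " " not in line:
            interface = line  # e.g. "GigabitEthernet3/0.1"
            direction = None
            want_count = False
    return hits


SAMPLE = """\
 GigabitEthernet1/0
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 135 bytes
        1 packets, 135 bytes
    Class-map: class-default (match-any)
      491 packets, 46354 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
 GigabitEthernet3/0.1
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 135 bytes
        1 packets, 135 bytes
"""

if __name__ == "__main__":
    for hit in voip_hits(SAMPLE):
        print(hit)
```

Run against each router's output in turn, the hits read as the hop-by-hop path: in on one interface, out on another.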

Failure Scenario: T1 Link Down

Now that we've tested packet traversal while all links are up and working fine, we need to test some failure scenarios to see if our traffic still routes. To create our first and most obvious failure, we will 'shut' the T1 interface from the Branch1 side, simulating loss of link on the T1 line. This is probably the best type of failure to happen, as both Wan2 and Branch1 lose the link from a physical standpoint. There are plenty of topologies where one side could fail and the other stays up. An example of this would be two routers connected via a switch. However, in this case, we luck out because a link failure will be seen immediately on both ends.

As you will soon find out, PBR during failures can be, at its best, a tricky situation to follow. Because of this, I'll walk through this one a bit more step-by-step. First, we will break the T1 connection at the "Branch1" side.

With this link down, we will next generate our test packet and then observe the results. Following the same process, we will start a listener on the Test PC at IP 172.16.10.10 hanging off the Branch1 router. Then, we will generate a syslog message from Core1 to act as our VoIP packet.

! ------ Start listener on Test PC ------
!
root@bt:~# nc -luvvnp 16390
listening on [any] 16390 ...
!
!
! ------ Generate Syslog on Core1 ------
!
Core1#send log Testing PBR with T1 in 'down' state
Core1#
*Jun 30 09:43:20.199: %SYS-2-LOGMSG: Message from 0(): Testing PBR with T1 in 'down' state
Core1#
!
!
! ------ Observe Syslog on Test PC ------
!
root@bt:~# nc -luvvnp 16390
listening on [any] 16390 ...
connect to [172.16.10.10] from (UNKNOWN) [10.0.11.4] 51054
<186>22: *Jun 30 09:43:20.199: %SYS-2-LOGMSG: Message from 0(): Testing PBR with T1 in 'down' state


Without any further analysis, we already know the most important part -- the traffic made it to its destination! This means that the policy routing is not black-holing traffic in this failure scenario. We should take a few minutes to observe the traffic path and understand why. Core1 will send its traffic to Wan1, as this is the OSPF best route, and there is no PBR applied locally on Core1. So, we will start our tracing on Wan1. Traffic is received on Wan1's Gi1/0 interface, which has an inbound PBR policy applied. That policy dictates that traffic will be sent to Wan2 via the Gi3/0.1 interface. Now, we know this is probably a bad idea since the T1 interface is down on Wan2. As Wan1 has no way of knowing this, PBR will continue as expected. Observe the counters on Wan1 below:

Wan1#show policy-map interface | inc /|Service|Class|,
 GigabitEthernet1/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 141 bytes
        1 packets, 141 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      327 packets, 31398 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
...
 GigabitEthernet3/0.1 
...
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 141 bytes
        1 packets, 141 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      385 packets, 47634 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
...


Notice that the VoIP packet came in on Gi1/0 (from Core1) as expected, and the policy routing forces the packet out Gi3/0.1 towards Wan2. This is visible in the output-direction counters above. So now, we have observed PBR still working 'as expected' and sending the packet towards Wan2, even though Wan2 no longer has the T1 link up.

It will be interesting to see what Wan2 does with the packet next. Keep in mind that Wan2 has a PBR route-map on its inbound interface from Wan1, stating that VoIP traffic coming in from Wan1 will be policy-routed out the T1. It is important to understand, though, that PBR will only execute if the next-hop interface is up/up. If this criterion is not met, PBR will instead step away and the packet will route according to the IP routing table on the router. This is critical to understand: since the T1 "next hop" is currently down, we should expect PBR to have no effect on Wan2's routing decision. Therefore, we should see the packet follow OSPF's selected route right back to Wan1:

Wan2#show policy-map interface | inc /|Service|Class|,
...
 Serial2/0 
...
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      650 packets, 46763 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 GigabitEthernet3/0.1 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 141 bytes
        1 packets, 141 bytes
    Class-map: class-default (match-any)
      340 packets, 32396 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
...
 GigabitEthernet3/0.5 
...
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 141 bytes
        1 packets, 141 bytes
    Class-map: class-default (match-any)
      337 packets, 25718 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
...


Pay careful attention to the input/output directions of the above output. Notice that, as expected from seeing Wan1's output, Wan2 shows an ingress VoIP packet on Gi3/0.1. Since Serial2/0 is in a down state, our theory proves correct that no VoIP packet is seen egressing that interface. However, we can observe that the VoIP packet leaves Wan2, destined back to Wan1 on interface Gi3/0.5.
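
The fallback behavior described above can be modeled in a few lines. This is only a simplified sketch of the decision as this post describes it (use the policy's egress interface if it is up, otherwise defer to the routing table), not a representation of how IOS implements it; the names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PbrRule:
    """Hypothetical stand-in for a route-map entry that sets an egress interface."""
    out_interface: str


def choose_egress(matches_policy: bool,
                  rule: Optional[PbrRule],
                  interface_up: dict,
                  routing_table_egress: str) -> str:
    """Pick the egress interface for one packet.

    interface_up maps interface name -> True when the interface is up/up.
    routing_table_egress is what the normal routing table would choose.
    """
    if matches_policy and rule is not None and interface_up.get(rule.out_interface, False):
        return rule.out_interface  # PBR wins
    return routing_table_egress    # PBR steps aside


if __name__ == "__main__":
    rule = PbrRule(out_interface="Serial2/0")
    # All links up: VoIP is forced out the T1.
    print(choose_egress(True, rule, {"Serial2/0": True}, "GigabitEthernet3/0.5"))
    # T1 down: VoIP follows the routing table back towards Wan1.
    print(choose_egress(True, rule, {"Serial2/0": False}, "GigabitEthernet3/0.5"))
```

Note what the model checks and what it does not: it only looks at the local interface state, so a link that stays up/up while the far end is unreachable would still be chosen.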

That is fairly interesting to observe; the packet came in on the area 0 interface, but left on the area 5 interface. Why? On the way from Wan1 to Wan2, the area 0 interface was hard-set by the policy routing. On the way back from Wan2 to Wan1, since PBR was no longer valid, Wan2 followed its IP routing table. Remember that if OSPF has a choice between going inter-area or staying intra-area to reach a destination, it will always choose intra-area. Therefore, since Wan2 and Wan1 both shared an interface in area 5 (the destination area) this is the link that is chosen.
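
The intra-area preference mentioned above is worth seeing in miniature, because it means path type is compared before cost. In the sketch below, the costs are made-up numbers purely for illustration; only the ordering of path types reflects OSPF's rule.

```python
# OSPF prefers path types in this order, regardless of cost.
PATH_TYPE_RANK = {
    "intra-area": 0,
    "inter-area": 1,
    "external-1": 2,
    "external-2": 3,
}


def best_ospf_path(candidates):
    """Return the preferred candidate: best path type first, then lowest cost."""
    return min(candidates, key=lambda c: (PATH_TYPE_RANK[c["type"]], c["cost"]))


if __name__ == "__main__":
    # Hypothetical costs: the inter-area path is cheaper but still loses.
    candidates = [
        {"via": "GigabitEthernet3/0.1", "type": "inter-area", "cost": 2},
        {"via": "GigabitEthernet3/0.5", "type": "intra-area", "cost": 3},
    ]
    print(best_ospf_path(candidates)["via"])  # GigabitEthernet3/0.5
```

That is why the packet leaves Wan2 on the area 5 subinterface even though it arrived on the area 0 one.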

Following our packet, Wan2's IP routing table has led us back to Wan1. Let's check the stats to see where it came in and left again. For those following along and thinking, "Hey, wouldn't I have seen this the first time we checked Wan1?", the answer is yes; this is one of the reasons I chose to omit portions of the output earlier. Otherwise it can be misleading and confusing. Like I said, policy routing can easily lead to confusion! We'll pick up that same show command in its entirety this time, and I'll selectively highlight the portions related to this last leg of the packet's journey to Branch1. Fair warning, there's a lot of output below.

Wan1#show policy-map interface | inc /|Service|Class|,
 GigabitEthernet1/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 141 bytes
        1 packets, 141 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      327 packets, 31398 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      685 packets, 65100 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 FastEthernet2/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      325 packets, 30978 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 141 bytes
        1 packets, 141 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      679 packets, 64857 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 GigabitEthernet3/0.1 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      328 packets, 31360 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 141 bytes
        1 packets, 141 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      385 packets, 47634 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 GigabitEthernet3/0.5 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 145 bytes
        1 packets, 145 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      325 packets, 30882 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      329 packets, 26834 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
Wan1#


Through all of that, notice the lines colored in blue: the PBR-VoIP counters inbound on Gi3/0.5 and outbound on Fa2/0. We see the same VoIP packet come back to Wan1 for the second time, this time from Wan2. Because there is no PBR route-map applied to inbound traffic on Gi3/0.5, Wan1 now forwards this packet via its IP routing table, choosing Fa2/0 for its connection to Branch1. As a memory refresher, this is the point-to-point transparent wireless bridge in our lab scenario. While there was definitely sub-optimal routing during this failure scenario, the packet still arrived where it was supposed to. So, with this basic type of failure, our policy-based routing seems to have survived.

Failure Scenario: Semi-failed Interface

This type of failure scenario could be caused by several issues, so note that our simulation efforts are only to mimic the typical symptoms. The generic issue we are tackling in this failure scenario is when a router still shows a link as up/up, but the router on the other end is in a 'down' state of some sort. One of the common examples of this is when two routers form a layer-3 relationship across a layer-2 switch. This means that the switch could drop its link to router A, but leave router B's link up. Therefore, router B still believes it has an interface up to send to a presumed-listening router A.

These partial failures are no issue for dynamic routing protocols to handle; if a neighbor becomes unresponsive for any reason, the neighborship is torn down after a time period and other routes are used. When dealing with static routes and policy-based routing, there is no dynamic protocol to manage a neighbor relationship. This means that we are susceptible to these types of failures. To demonstrate, I'll put a deny any any access-list ingress on Branch1's serial interface. This will keep the interface up, but traffic will not be able to pass from Wan2 to Branch1. Note that in order to more effectively troubleshoot these types of issues, we'll want to make the following configuration change to each router:

router ospf 1
 log-adjacency-changes detail


This change will allow the routers to log all OSPF state changes. Since the ACL blocks traffic in only one direction, we will see a different impact than the typical "dead timer expired" behavior. When the ACL is applied inbound on Branch1's serial interface, Wan2's OSPF hellos stop reaching Branch1. However, Branch1's hellos are still received by Wan2. When Branch1's dead timer expires for Wan2, it removes Wan2 from its neighbor table. Now, when Branch1 sends its next hello towards Wan2, Wan2 sees that it is no longer in Branch1's neighbor list and moves the neighbor state to Init. This is seen below:

! ------ Create and Apply ACL on Branch1 ------
!
Branch1#config t
Enter configuration commands, one per line.  End with CNTL/Z.
Branch1(config)#ip access-list extended DenyAll
Branch1(config-ext-nacl)#10 deny ip any any
Branch1(config-ext-nacl)#exit
Branch1(config)#int s2/0
Branch1(config-if)#ip access-group DenyAll in 
Branch1(config-if)#end
Branch1#
*Jun 30 19:38:49.073: %SYS-5-CONFIG_I: Configured from console by console
Branch1#
!
!
! ------ After dead timer expires... ------
!
Branch1#
*Jun 30 19:39:15.741: %OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.2 on Serial2/0 from FULL to DOWN, Neighbor Down: Dead timer expired
Branch1#
!
!
! ------ Wan2 moves Branch1 to Init ------
!
Wan2#
*Jun 30 19:39:15.561: %OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.3 on Serial2/0 from FULL to INIT, 1-Way
Wan2#
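
The one-way hello behavior described above can be sketched as a tiny state function. This is a deliberately simplified model of the OSPF neighbor state machine (it skips ExStart, Exchange, and Loading), and the 40-second dead interval is a common default that I'm assuming for this lab; the function names are hypothetical.

```python
def dead_timer_expired(seconds_since_last_hello: float,
                       dead_interval: float = 40.0) -> bool:
    """True when no hello has been heard for the whole dead interval."""
    return seconds_since_last_hello >= dead_interval


def state_after_hello(current_state: str,
                      my_router_id: str,
                      neighbors_in_hello: list) -> str:
    """Neighbor state after receiving one hello from that neighbor.

    neighbors_in_hello is the list of router IDs the sender says it can hear.
    """
    if my_router_id not in neighbors_in_hello:
        # The neighbor no longer lists us: communication is one-way.
        return "INIT"
    if current_state in ("DOWN", "INIT"):
        # The neighbor hears us too: two-way communication is established.
        return "2WAY"
    return current_state


if __name__ == "__main__":
    # Wan2 (172.16.255.2) receives a hello from Branch1 whose neighbor list
    # no longer contains Wan2, because Branch1's dead timer expired.
    print(state_after_hello("FULL", "172.16.255.2", []))  # INIT
```

This is why the two routers log different messages at nearly the same moment: Branch1 times out, while Wan2 reacts to the content of Branch1's next hello.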


Now that we're deep in our semi-failed state, it is time to revisit PBR to see how it is handling this situation. We'll redo our test VoIP packet and see where it ends up:

! ------ Start listener on Test PC ------
!
root@bt:~# nc -luvvnp 16390
listening on [any] 16390 ...
!
!
! ------ Generate Syslog on Core1 ------
!
Core1#send log Testing PBR with T1 in 'semi-failed' state
Core1#
*Jun 30 20:50:22.665: %SYS-2-LOGMSG: Message from 0(): Testing PBR with T1 in 'semi-failed' state
Core1#
!
!
! ------ Observe Syslog on Test PC ------
!
root@bt:~# nc -luvvnp 16390
listening on [any] 16390 ...


As we can observe, there is likely a routing problem here, as evidenced by the fact that the packet never made it to its destination. Knowing the behavior from the previous scenario, we can assume that the VoIP packet will be policy-routed towards Wan2 on the Gi3/0.1 interface. Let's start there in our troubleshooting.

Wan2#show policy-map interface | inc /|Service|Class|,
...
 Serial2/0
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      139 packets, 11120 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 138 bytes
        1 packets, 138 bytes
    Class-map: class-default (match-any)
      429 packets, 31256 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 GigabitEthernet3/0.1
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 148 bytes
        1 packets, 148 bytes
    Class-map: class-default (match-any)
      143 packets, 13614 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      164 packets, 19002 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 GigabitEthernet3/0.5
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      143 packets, 13502 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      141 packets, 10651 bytes
      5 minute offered rate 0 bps, drop rate 0 bps


Again, note the color-highlighted sections (the non-zero PBR-VoIP counters). We can observe that one VoIP packet arrived at Wan2 from Wan1 on the Gi3/0.1 interface (as expected due to Wan1's policy routing). It is also apparent that the packet left via the serial interface, as opposed to hairpinning back to Wan1 as in the previous link failure scenario. Because it left Wan2 over the T1, the packet met its fate in the form of the DenyAll ACL applied at Branch1. This explains why the packet never made it to the Test PC.

Again, I want to stress that while we are mimicking this behavior by using an ingress ACL on Branch1, this is not the particular scenario we are really worried about. But it does prove the point that a partial failure in communication can degrade the ability of two routers to talk, even though the physical interface stays up/up. This example shows a drawback of relying on policy-based routing for a forwarding decision.

Make PBR More Dynamic

In order to make PBR a bit more dynamic, we are going to take a slightly different approach than what is normally used for this. If you look for common ways to make static routing or policy-based routing more dynamic, there are plenty of scenarios that show how to set up IP SLA tracking. This works fine and can definitely do the job. Another method that can solve this problem is Bidirectional Forwarding Detection (BFD), which I may cover at a higher level in a different post. In this post, however, I'd like to introduce another method.

We already have a pretty high level of confidence in the neighbor's health because we're running OSPF. We rely on this neighbor state and dynamic route calculation for almost all of our routing, so why not piggyback on it for our PBR? We can take advantage of EEM scripting to do this.

! ------ Script for when Neighbor drops  ------
! Note: EEM runs actions in alphanumeric order of their labels, so "10.0"
! sorts between "1.0" and "2.0" and the syslog message is sent before "config t".

event manager applet PBR-Down
 event syslog pattern ".*%OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.3 on Serial2/0 from FULL to.*"
 action 1.0  cli command "enable"
 action 10.0 syslog msg "PBR Removed from Interfaces due to OSPF Nei State S2/0"
 action 2.0  cli command "config t"
 action 3.0  cli command "interface Gig1/0"
 action 4.0  cli command "no ip policy route-map PBR-VoIP-to-T1"
 action 5.0  cli command "interface Gig3/0.1"
 action 6.0  cli command "no ip policy route-map PBR-VoIP-to-T1"
 action 7.0  cli command "interface Gig3/0.5"
 action 8.0  cli command "no ip policy route-map PBR-VoIP-to-T1"
 action 9.0  cli command "end"
!
!
! ------ Script for when Neighbor returns  ------

event manager applet PBR-Up 
 event syslog pattern ".*%OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.3 on Serial2/0 from LOADING to FULL, Loading Done$"
 action 1.0  cli command "enable"
 action 10.0 syslog msg "PBR Enabled on Interfaces due to OSPF Nei State Full S2/0"
 action 2.0  cli command "config t"
 action 3.0  cli command "interface Gig1/0"
 action 4.0  cli command "ip policy route-map PBR-VoIP-to-T1"
 action 5.0  cli command "interface Gig3/0.1"
 action 6.0  cli command "ip policy route-map PBR-VoIP-to-T1"
 action 7.0  cli command "interface Gig3/0.5"
 action 8.0  cli command "ip policy route-map PBR-VoIP-to-T1"
 action 9.0  cli command "end"
!


By configuring the above EEM scripts on Wan2, we should now be relying on OSPF neighbor state to control the enforcement of our PBR. If this works, of course we would want to implement a similar configuration on Branch1 to avoid the reverse situation. In any case, it seems as though we are ready to test.
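
A mirrored pair of applets on Branch1 might look roughly like the sketch below. The neighbor ID 172.16.255.2, interface Serial2/0, and OSPF process 1 are taken from the adjacency messages Branch1 logs for Wan2. The route-map name BRANCH1-PBR and the LAN interface Gig1/0 are placeholders, since Branch1's PBR configuration is not covered here.

```
! ------ Sketch: mirrored scripts for Branch1 (names are placeholders)  ------
!
event manager applet PBR-Down
 event syslog pattern ".*%OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.2 on Serial2/0 from FULL to.*"
 action 1.0 cli command "enable"
 action 2.0 cli command "config t"
 action 3.0 cli command "interface Gig1/0"
 action 4.0 cli command "no ip policy route-map BRANCH1-PBR"
 action 5.0 cli command "end"
 action 6.0 syslog msg "PBR Removed from Interfaces due to OSPF Nei State S2/0"
!
event manager applet PBR-Up
 event syslog pattern ".*%OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.2 on Serial2/0 from LOADING to FULL, Loading Done$"
 action 1.0 cli command "enable"
 action 2.0 cli command "config t"
 action 3.0 cli command "interface Gig1/0"
 action 4.0 cli command "ip policy route-map BRANCH1-PBR"
 action 5.0 cli command "end"
 action 6.0 syslog msg "PBR Enabled on Interfaces due to OSPF Nei State Full S2/0"
!
```

The labels here stay in the single-digit range so that the listed order and the execution order are the same.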

Retrying Failures with Dynamic PBR

Since the partial failure scenario blew a giant hole in our PBR, we should test it again now that we have tried to make PBR a bit smarter. Let's start by applying the DenyAll ACL to Branch1's T1 interface in the inbound direction. Note the difference in behavior on Wan2 this time:

! ------ "Break" the T1 at Branch1  ------

Branch1#config t
Enter configuration commands, one per line.  End with CNTL/Z.
Branch1(config)#int s2/0
Branch1(config-if)#ip access-group DenyAll in
Branch1(config-if)#end
Branch1#
*Jun 30 22:23:55.872: %SYS-5-CONFIG_I: Configured from console by console
Branch1#
*Jun 30 22:24:25.772: %OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.2 on Serial2/0 from FULL to DOWN, Neighbor Down: Dead timer expired
Branch1#
!
!
! ------ Observe our EEM script at Wan2  ------

Wan2#
*Jun 30 22:24:27.800: %OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.3 on Serial2/0 from FULL to INIT, 1-Way
*Jun 30 22:24:27.876: %HA_EM-6-LOG: PBR-Down: PBR Removed from Interfaces due to OSPF Nei State S2/0
Wan2#
*Jun 30 22:24:28.052: %SYS-5-CONFIG_I: Configured from console by  on vty0 (EEM:PBR-Down)
Wan2#
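
If you want to double-check that the applet fired and the policy was actually removed, a few standard show commands can help. This is only a suggested checklist; the output is omitted because it was not captured in this lab.

```
! ------ Optional checks on Wan2  ------
!
! Confirm both applets are registered
show event manager policy registered
!
! See which applets have run recently
show event manager history events
!
! List the interfaces that still have a policy route-map applied
show ip policy
```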


Now that we've successfully broken the T1 again, let's set up our Test PC to listen and generate a "VoIP" test packet from Core1.

! ------ Start listener on Test PC ------
!
root@bt:~# nc -luvvnp 16390
listening on [any] 16390 ...
!
!
! ------ Generate Syslog on Core1 ------
!
Core1#send log Testing 'Smart PBR' with T1 in 'semi-failed' state
Core1#
*Jun 30 22:29:44.792: %SYS-2-LOGMSG: Message from 0(): Testing 'Smart PBR' with T1 in 'semi-failed' state
Core1#
!
!
! ------ Observe Syslog on Test PC ------
!
root@bt:~# nc -luvvnp 16390
listening on [any] 16390 ...
connect to [172.16.10.10] from (UNKNOWN) [10.0.11.4] 51054
<186>24: *Jun 30 22:29:44.792: %SYS-2-LOGMSG: Message from 0(): Testing 'Smart PBR' with T1 in 'semi-failed' state


Alright, that's a good sign! By using EEM to automatically remove the PBR policy when the "target" router loses OSPF neighborship, we have successfully forced Wan2 to hairpin the VoIP traffic right back to Wan1 (just like in the link-failure scenario). Again, the routing is sub-optimal, but it will get the job done. See the output below for confirmation.

Wan2#show policy-map interface | inc /|Service|Class|,
...
 GigabitEthernet3/0.1 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 156 bytes
        1 packets, 156 bytes
    Class-map: class-default (match-any)
      95 packets, 8930 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      109 packets, 12677 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 GigabitEthernet3/0.5 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      97 packets, 9154 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 160 bytes
        1 packets, 160 bytes
    Class-map: class-default (match-any)
      96 packets, 7688 bytes
      5 minute offered rate 0 bps, drop rate 0 bps


The last thing to check with this before calling it a day is to make sure that once we restore connectivity, the PBR kicks back in and Wan2 sends the packet out its serial interface. In the output below, we'll remove the ACL on Branch1, and then watch EEM restore our PBR on Wan2:

! ------ Fix the T1  ------

Branch1#config t
Enter configuration commands, one per line.  End with CNTL/Z.
Branch1(config)#int s2/0
Branch1(config-if)#no ip access-group DenyAll in
Branch1(config-if)#end
Branch1#
*Jun 30 22:57:26.344: %SYS-5-CONFIG_I: Configured from console by console
Branch1#
*Jun 30 22:57:29.108: %OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.2 on Serial2/0 from LOADING to FULL, Loading Done
Branch1#
!
!
! ------ Watch EEM turn on PBR  ------

Wan2#
*Jun 30 22:57:28.876: %OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.3 on Serial2/0 from INIT to 2WAY, 2-Way Received
*Jun 30 22:57:28.876: %OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.3 on Serial2/0 from 2WAY to EXSTART, AdjOK?
*Jun 30 22:57:28.880: %OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.3 on Serial2/0 from EXSTART to EXCHANGE, Negotiation Done
*Jun 30 22:57:28.896: %OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.3 on Serial2/0 from EXCHANGE to LOADING, Exchange Done
*Jun 30 22:57:28.896: %OSPF-5-ADJCHG: Process 1, Nbr 172.16.255.3 on Serial2/0 from LOADING to FULL, Loading Done
*Jun 30 22:57:28.956: %HA_EM-6-LOG: PBR-Up: PBR Enabled on Interfaces due to OSPF Nei State Full S2/0
Wan2#
*Jun 30 22:57:29.144: %SYS-5-CONFIG_I: Configured from console by  on vty0 (EEM:PBR-Up)
Wan2#
Wan2#clear counters
Clear "show interface" counters on all interfaces [confirm]
Wan2#
*Jun 30 22:58:38.944: %CLEAR-5-COUNTERS: Clear counter on all interfaces by console
Wan2#


Finally, we can generate our "VoIP" packet once again from Core1 and make sure that it not only still arrives at the Test PC in Branch1, but arrives via the T1 link. The output below will walk us through it.

! ------ Initiate Traffic from Core1  ------

Core1#send log Testing 'Smart PBR' with T1 after 'semi-failed' state
Core1#
*Jun 30 23:01:00.364: %SYS-2-LOGMSG: Message from 0(): Testing 'Smart PBR' with T1 after 'semi-failed' state
Core1#
!
!
! ------ Observe Syslog on Test PC  ------

root@bt:~# nc -luvvnp 16390
listening on [any] 16390 ...
connect to [172.16.10.10] from (UNKNOWN) [10.0.11.4] 51054
<186>25: *Jun 30 23:01:00.364: %SYS-2-LOGMSG: Message from 0(): Testing 'Smart PBR' with T1 after 'semi-failed' state
!
!
! ------ View Wan2 Policy-map Counts  ------

Wan2#show policy-map interface | inc /|Service|Class|,
...
 Serial2/0 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      143 packets, 12012 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 149 bytes
        1 packets, 149 bytes
    Class-map: class-default (match-any)
      300 packets, 20166 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
 GigabitEthernet3/0.1 
  Service-policy input: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      1 packets, 159 bytes
        1 packets, 159 bytes
    Class-map: class-default (match-any)
      143 packets, 13442 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
  Service-policy output: PBR-Counters
    Class-map: PBR-VoIP (match-any)
      0 packets, 0 bytes
        0 packets, 0 bytes
    Class-map: class-default (match-any)
      167 packets, 19082 bytes
      5 minute offered rate 0 bps, drop rate 0 bps
Wan2#


As seen above, when the OSPF neighborship came back, PBR came back on. This allowed our VoIP traffic to continue along the T1 as we had wanted.

Conclusion

This has been a long, drawn-out lab, so thanks for sticking with it. I'll take a moment to repeat my sentiments on policy-based routing: it's ugly, hard to troubleshoot, and can get out of hand quickly. However, it can be a necessity in certain scenarios. It should also be noted that the above lab does not solve every issue, especially as it essentially ignored the Branch1 PBR configuration. There are also other potential routing loops we did not address. Regardless, this lab should serve as a decent primer on why and how to implement policy-based routing on Cisco routers.