As resource demands and bandwidth in today’s network infrastructures continue to increase, many network administrators believe that NetFlow sampling is the only way to deal with the high volume of flows sent across the network. In fact, setting a NetFlow sample rate of 1 in 100 can cut flow volumes by as much as 50%.
Many high-end network equipment vendors like Cisco and Juniper may force you to sample, or at the very least strongly recommend it.
This all sounds really nice, but it is hardly the answer. When you sample flows, you only see a percentage of the traffic, so you are limiting visibility. Sampling simply doesn’t provide the full story. It’s like reading only every 100th word in a novel. At the end of the story you have a vague outline of what happened and who the characters were, but little more. Security analysts, who use flows for network security forensics and incident response, don’t want 1 out of every 100 packets. They want, and need, one out of every one.
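To make the novel analogy concrete, here is a small Python sketch of what 1-in-100 sampling can do to visibility. Everything in it is made up for illustration: the traffic mix (one bulk transfer plus 50 single-packet flows), the flow labels, and the assumption that the sampler keeps exactly every 100th packet. Real devices often sample randomly, so treat this as a toy model rather than a description of any vendor's implementation.

```python
def build_traffic():
    """Return 1000 packet labels: one bulk flow plus 50 single-packet flows."""
    packets = []
    short_id = 0
    for position in range(1, 1001):
        # Hypothetical mix: every 20th packet (offset 7) belongs to a new short flow.
        if position % 20 == 7:
            packets.append(f"short-{short_id}")
            short_id += 1
        else:
            packets.append("bulk")
    return packets


def sample_one_in_n(packets, n):
    """Keep every n-th packet, a deterministic stand-in for 1-in-n sampling."""
    return [label for position, label in enumerate(packets, start=1)
            if position % n == 0]


packets = build_traffic()
sampled = sample_one_in_n(packets, 100)

all_flows = set(packets)    # distinct flows actually on the wire
seen_flows = set(sampled)   # distinct flows the collector would hear about

print(f"packets: {len(packets)}, sampled: {len(sampled)}")
print(f"flows on the wire: {len(all_flows)}, seen after sampling: {len(seen_flows)}")
```

With this particular mix, the sampler keeps 10 packets and reports only the bulk flow; none of the 50 short flows show up at all. Short flows like these are often exactly what a security analyst is looking for.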
Let’s take a look at a way that can help you avoid NetFlow sampling.
By changing the flow aggregation method on the NetFlow cache table, you can cut down flow volume and still get 100% traffic accountability.
We often do this with customers’ NetFlow configurations on the Catalyst 6500 platform by using the different options available on the MLS FLOW command.
The problem is that as the aggregation method widens, you begin to lose individual flow and application visibility.
Enter Flexible NetFlow
Traditional NetFlow aggregates flows based on a standard tuple – Source IP, Destination IP, Source port, Destination port, Protocol, and input and output interfaces. In other words, all these elements must match to aggregate or a new flow record is created.
Flexible NetFlow lets you create your own flow records, choosing the specific key fields that govern the aggregation scheme. The problem, just like what we see on the 6500, is that the wider the key field mask is, the more individual flow visibility you lose.
But there is a pretty cool way to get around this. Export NBAR application information.
NBAR (Network Based Application Recognition) is a process by which the router looks inside the packets and identifies applications by their layer 7 names. Setting up the aggregation method to use the NBAR application name rather than the source or destination port can dramatically cut down on flows without losing any of the true traffic statistics.
The reduction in flows is largely due to the fact that we are tracking application conversations instead of connections when only using NBAR. The different key fields mean different aggregation in the flow cache.
Example: If there were 2 SSL connections from the same client to the same web server in the same minute, the NBAR record will generate 1 flow record because both connections match the same key fields. The record with source and destination ports as key fields will create 2 flow records, because each connection uses a different ephemeral source port.
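The SSL example can be sketched as a tiny flow cache in Python. This is only a model of the idea, not router code: the packet values, the field names (proto, src, dst, sport, dport, app), and the aggregate helper are all hypothetical, the interface key fields are left out for brevity, and only the client-to-server direction is shown.

```python
from collections import defaultdict

# Hypothetical packets: two SSL connections from one client to one web server,
# differing only in the client's ephemeral source port.
PACKETS = [
    {"proto": 6, "src": "10.0.0.5", "dst": "192.0.2.10", "sport": 50001, "dport": 443, "app": "ssl", "bytes": 1200},
    {"proto": 6, "src": "10.0.0.5", "dst": "192.0.2.10", "sport": 50001, "dport": 443, "app": "ssl", "bytes": 800},
    {"proto": 6, "src": "10.0.0.5", "dst": "192.0.2.10", "sport": 50002, "dport": 443, "app": "ssl", "bytes": 1500},
    {"proto": 6, "src": "10.0.0.5", "dst": "192.0.2.10", "sport": 50002, "dport": 443, "app": "ssl", "bytes": 500},
]

# Key fields loosely modeled on the two aggregation schemes discussed here.
PORT_KEYS = ("proto", "src", "dst", "sport", "dport")
NBAR_KEYS = ("proto", "src", "dst", "app")


def aggregate(packets, key_fields):
    """Build a flow cache: one entry per distinct combination of key fields."""
    cache = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        key = tuple(pkt[field] for field in key_fields)
        cache[key]["packets"] += 1
        cache[key]["bytes"] += pkt["bytes"]
    return dict(cache)


port_cache = aggregate(PACKETS, PORT_KEYS)
nbar_cache = aggregate(PACKETS, NBAR_KEYS)

for name, cache in (("port keys", port_cache), ("NBAR keys", nbar_cache)):
    total_bytes = sum(entry["bytes"] for entry in cache.values())
    print(f"{name}: {len(cache)} flow record(s), {total_bytes} bytes")
```

Keying on ports yields two cache entries, while keying on the application name yields one, and both account for the same 4000 bytes. The byte and packet counters are unchanged; only the number of records that hold them differs.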
Take a look at the drop in flow volumes from the exact same traffic streams where the 2 different aggregation methods are deployed.
This is how each flow record was set up. Notice that in the first NetFlow record we are aggregating the flow records on match statements that represent a pretty standard tuple. Any time a flow contains a different source or destination port, a new flow record entry is added to the cache table.
flow record port-test-record
 match ipv4 protocol
 match ipv4 source address
 match ipv4 destination address
 match transport source-port
 match transport destination-port
 match interface input
 match interface output
 collect transport tcp flags
 collect counter bytes
 collect counter packets
 collect timestamp sys-uptime first
 collect timestamp sys-uptime last
The NetFlow record below will yield the same traffic monitoring numbers, but because we are matching on the NBAR application name and not the individual source/destination ports, the number of aggregated cache table entries is considerably smaller.
flow record nbar-test-record
 match ipv4 protocol
 match ipv4 source address
 match ipv4 source prefix
 match ipv4 destination address
 match interface input
 match interface output
 match application name
 collect transport tcp flags
 collect counter bytes
 collect counter packets
 collect timestamp sys-uptime first
 collect timestamp sys-uptime last
This Flow volume report tells the story
In short, we are changing how the aggregation works in the flow cache. This does eliminate the ability to look at each connection in a conversation, but it may be worth it in cases where people are less interested in doing that and need full bandwidth accounting.
If you are currently sampling NetFlow and would like better network traffic accounting, we can help you with your NetFlow configuration.