I really should have written this a long time ago, but I guess sometimes inspiration is only realized when a current need slaps you in the face…
What is sFlow and how does it work?
sFlow is a sampling technology that was first introduced in 1991 by HP. Now, if there’s only one word that you need to remember in this whole blog post, please make sure it’s the word sampling. If you can remember that, everything following this paragraph will make perfect sense.
My wife is a big fan of Jelly Belly jelly beans. She loves them. So I think I will use this addiction for illustrative purposes.
Imagine you are at the mall (if you are wondering why you are at the mall in the first place, just imagine you were forced to go).
So anyway, you go to the local, over-priced candy shop, and you buy a one pound bag of assorted jelly beans. Within this one pound bag of assorted flavors, there is a total of 300 jelly beans.
When you set up your switch for sFlow, there are two settings you have to configure. The first is the polling interval, the second is the sampling rate.
Polling interval counts the jelly beans in the bag
Now the polling interval functions as the counter for a small block of time. If you set the polling interval to 60 seconds, the switch counts all of the packets that have gone through that interface in the past 60 seconds, and then exports that count. So when your switch exports these counters to your collector, it is saying, “Hey! There are 300 beans in this one pound bag!”
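If it helps to see the counting side as code, here is a minimal Python sketch. It assumes the interface keeps an ever-increasing packet counter that gets read once per polling interval. The function name and the two readings are made up for illustration and are not anything a real switch or collector exposes.

```python
POLLING_INTERVAL_SECONDS = 60  # the example interval from the text


def packets_in_interval(previous_count, current_count):
    """Packets seen between two readings of an ever-increasing counter."""
    return current_count - previous_count


# Hypothetical counter readings taken one polling interval apart.
beans_in_bag = packets_in_interval(previous_count=12_000, current_count=12_300)
packets_per_second = beans_in_bag / POLLING_INTERVAL_SECONDS

print(beans_in_bag)        # 300
print(packets_per_second)  # 5.0
```

Notice that nothing here looks inside a packet. The counter only knows how many beans are in the bag, not what flavor they are.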
Make sense so far?
Sampling your jelly beans
Okay, now if your household works like mine does… you never actually get to eat the whole bag of jelly beans. I only get to eat about one out of every 50 jelly beans. So when I grab that one jelly bean, it’s the luck of the draw. I get what I get; unless it’s the black licorice type, and I just throw those back and try again.
Me randomly grabbing that one jelly bean is much like that second setting, which is the sampling rate. With the sampling rate, you are telling the switch to sample one out of every X packets that pass through the interface.
In this illustration, my sampling rate for the jelly bean bag I bought was 1/50. Easy, right?
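Here is a rough Python sketch of what grabbing one bean in 50 looks like. It is a simplification: it assumes each bean is kept independently with probability 1/50, which is one easy way to model random 1-in-N sampling and not necessarily how any particular switch implements it. The names `sample_packets` and `bag` are invented for the example.

```python
import random


def sample_packets(packets, sampling_rate, rng):
    """Keep each packet with probability 1/sampling_rate."""
    return [p for p in packets if rng.random() < 1 / sampling_rate]


rng = random.Random(42)  # fixed seed so the run is repeatable
bag = [f"bean-{i}" for i in range(300)]

grabbed = sample_packets(bag, sampling_rate=50, rng=rng)

# About 6 on average across many bags; any single bag can give more or fewer.
print(len(grabbed))
```

Because the draw is random, you will not get exactly six beans every time. Over many bags it averages out to one in 50.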
What can you learn from sampling?
Consider this: If my sampling rate is 1/50, I’m only getting six jelly beans out of the full 300. (grumble grumble)
But let me tell you about the six jelly beans I did get.
Out of the six I grabbed, I got two cherry, one kiwi and three buttered popcorn.
Looking at the jelly beans that I did get, what conclusions can you come to?
Judging by the samples that I took, can you tell me exactly how many of the 300 jelly beans are black licorice? No.
Can you tell me exactly how many of the 300 are kiwi flavored? No, you can’t.
However, judging by the fact that out of the six samples that I took, three of them were popcorn, you could speculate that there may be quite a few popcorn jelly beans in that bag. Maybe the majority are popcorn flavored. However, you can never be 100% certain of the full content of that bag, without trying each and every one individually.
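To put rough numbers on that speculation, a collector can multiply each sampled count by the sampling rate. The Python sketch below does exactly that with my six beans. The function name `estimate_totals` is made up for the example, and the results are estimates, not a census of the bag.

```python
from collections import Counter

SAMPLING_RATE = 50  # one bean grabbed out of every 50

# The six beans from the story.
samples = ["cherry", "cherry", "kiwi", "popcorn", "popcorn", "popcorn"]


def estimate_totals(samples, sampling_rate):
    """Scale each sampled count by the sampling rate to estimate the full bag."""
    counts = Counter(samples)
    return {flavor: count * sampling_rate for flavor, count in counts.items()}


estimates = estimate_totals(samples, SAMPLING_RATE)
print(estimates)  # {'cherry': 100, 'kiwi': 50, 'popcorn': 150}
```

Black licorice never shows up in the estimate at all, even though there may well be a few in the bag. With only six samples, a flavor that is rare (or that I threw back) is simply invisible.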
…and that is the difference between sFlow and Cisco NetFlow.
With Cisco NetFlow, you know that there are 300 jelly beans in the bag. You also get the luxury of eating them all, so you know exactly what kinds of jelly beans you have.
With sFlow, you will always know how much traffic is being generated, much like you know there are 300 beans in the bag; but since you are only sampling 1/50 of the packets, you will only see the content of 1/50th of those packets. You won’t truly know how much of that traffic is HTTP, SMTP or HTTPS based. However, if a lot of your samples happen to be HTTP traffic, much like those buttered popcorn jelly beans, then it can give you a hint that there could be a lot of HTTP traffic on that interface.
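How strong is that hint? A rule of thumb often quoted in packet-sampling write-ups (it is not from this post, so treat it as an outside assumption) says that with roughly 95% confidence the percentage error of an estimate is at most 196 × sqrt(1/c), where c is the number of samples you saw for that kind of traffic. A quick Python sketch:

```python
import math


def percent_error_bound(samples_in_class):
    """Rule-of-thumb upper bound on the estimate's error, in percent.

    Assumes the commonly quoted approximation 196 * sqrt(1 / c),
    where c is the number of samples seen for one class of traffic.
    """
    return 196 * math.sqrt(1 / samples_in_class)


print(round(percent_error_bound(3)))       # 113 (three popcorn beans)
print(round(percent_error_bound(10_000)))  # 2
```

Three popcorn beans give you an estimate that could be off by more than 100%, which is why six jelly beans are only a hint. A busy interface that produces thousands of HTTP samples gives you an estimate you can lean on much harder.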
When using Scrutinizer to monitor your sFlow switch, remember that your port utilization numbers are accurate, because Scrutinizer is aware that there are 300 beans in that bag. The statistics for Top Hosts, Top Conversations and Top Protocols, on the other hand, are all based on that sampled traffic.
You didn’t think you were gonna get to eat all the jelly beans, did you?