Occasionally we hear talk of bidirectional flows, deduplication and flow stitching, and sometimes questions about RFC 5103 spring up. Today I’ll outline what these technologies are, as well as the good and bad aspects of each. Our NetFlow analyzer supports all of them.
Most NetFlow or IPFIX implementations are unidirectional, meaning a TCP connection between two hosts results in two flows (i.e. A to B and B to A). This sounds sort of inefficient, but here’s why it works the way it does. Flows are generally only metered inbound (ingress) on interfaces, and the responding flow is captured on a different interface. Hence two flows.
NOTE: Enabling both ingress and egress on all interfaces of a router results in 4 flows for one bidirectional TCP conversation. Generally only ingress is needed, and the reasons to enable egress NetFlow are covered in my other blog.
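The flow counts above can be checked with a few lines of Python. This is only a rough sketch: the function name and its parameters are made up for illustration, and it assumes every router on the path meters each direction of the conversation once per enabled mode (ingress, egress, or both).

```python
def exported_flows(routers=1, ingress=True, egress=False):
    """Count the unidirectional flow records exported for one two-way TCP conversation.

    Assumption (not from any vendor documentation): each router meters each
    direction of the conversation once for every enabled mode.
    """
    directions = 2                                # A to B and B to A
    metering_points = int(ingress) + int(egress)  # enabled modes per router
    return routers * directions * metering_points


print(exported_flows())             # ingress only on one router
print(exported_flows(egress=True))  # ingress and egress on one router
```

With the defaults this gives the two flows described earlier, and turning on egress as well gives the four flows mentioned in the note. The `routers` parameter is there because the same conversation is usually exported by every NetFlow-enabled router it crosses.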
Bidirectional flows should be implemented according to RFC 5103, where a single flow represents both A to B and B to A. Obviously the size of each flow nearly doubles; however, it can cut the volume of flows sent back to a high volume NetFlow collector nearly in half. SonicWALL, with its IPFIX configuration, is the only vendor we have seen implement this according to RFC 5103. Bidirectional flows from the Cisco ASA NetFlow export are not RFC 5103 compliant and have generally led to confusion.
Some vendors take matters into their own hands and merge the two flows into one. This process, in which the collector itself creates one flow from two, is called Flow Stitching. It is done to effectively achieve what RFC 5103 defines. Lancope, with StealthWatch, is one company that claims to be stitching flows back together:
Their strategy involves stitching all flows, whether they will ever be looked at or not, and it doesn’t work across collectors, which means it only provides partial accuracy. Unlike StealthWatch, Scrutinizer stitches across collectors while providing a much more scalable architecture. See Scrutinizer Vs Stealthwatch.
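To make the idea of flow stitching concrete, here is a minimal sketch in Python. It is not how StealthWatch or Scrutinizer does it. The `Flow` tuple, the `stitch` function and the sample addresses are all hypothetical. The sketch assumes at most one flow per direction for each conversation and treats whichever direction arrives first as the forward one, where a real collector would more likely compare timestamps.

```python
from collections import namedtuple

# Hypothetical, simplified unidirectional flow record.
Flow = namedtuple("Flow", "src dst sport dport proto bytes packets")


def _merge(fwd, rev=None):
    """Build one stitched record; reverse counters stay 0 if no twin was found."""
    return {
        "src": fwd.src, "dst": fwd.dst,
        "sport": fwd.sport, "dport": fwd.dport, "proto": fwd.proto,
        "bytes": fwd.bytes, "packets": fwd.packets,
        "obytes": rev.bytes if rev else 0,
        "opackets": rev.packets if rev else 0,
    }


def stitch(flows):
    """Pair each flow with the flow whose addresses and ports are swapped."""
    pending = {}   # flows still waiting for their reverse direction
    stitched = []
    for f in flows:
        key = (f.src, f.dst, f.sport, f.dport, f.proto)
        reverse_key = (f.dst, f.src, f.dport, f.sport, f.proto)
        if reverse_key in pending:
            stitched.append(_merge(pending.pop(reverse_key), f))
        else:
            pending[key] = f
    # Flows that never saw a response are kept as one-sided records.
    stitched.extend(_merge(f) for f in pending.values())
    return stitched


flows = [
    Flow("10.0.0.1", "10.0.0.2", 51000, 443, 6, 1200, 10),
    Flow("10.0.0.2", "10.0.0.1", 443, 51000, 6, 48000, 40),
]
print(stitch(flows))
```

The two sample flows collapse into a single record that carries the request counters in `bytes` and `packets` and the response counters in `obytes` and `opackets`.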
Scrutinizer will stitch the flows together during the reporting process to show the flows in both directions:
You can change the report type above and gain even deeper details:
Deduplication or Deduplicated NetFlow
Deduplication is the process of taking a flow that traversed two or more routers and saving it as one flow, with a reference to the routers it was seen on. Because 4 flows are involved with TCP connections, the packets and bytes are averaged. Flow Analytics does this in near real time in order to display the top hosts, applications, protocols, etc. across hundreds of routers and switches. Although the benefits are clear, averaged data is not as reliable for forensics, detailed reporting or NetFlow billing. This is why keeping the original flows is critical.
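The deduplication step described above could look roughly like this in Python. This is a sketch under stated assumptions, not the Flow Analytics implementation. The record fields, the `exporter` name and the `deduplicate` function are hypothetical, and copies are matched on the 5-tuple only.

```python
from collections import defaultdict


def deduplicate(records):
    """Collapse copies of one flow seen on several routers into a single record.

    Assumption: two records are copies of the same flow when their 5-tuple
    (addresses, ports, protocol) matches. Byte and packet counts are averaged.
    """
    groups = defaultdict(list)
    for r in records:
        key = (r["src"], r["dst"], r["sport"], r["dport"], r["proto"])
        groups[key].append(r)

    result = []
    for key, copies in groups.items():
        result.append({
            "key": key,
            # Reference to the routers the flow was seen on.
            "exporters": sorted({c["exporter"] for c in copies}),
            "bytes": sum(c["bytes"] for c in copies) / len(copies),
            "packets": sum(c["packets"] for c in copies) / len(copies),
        })
    return result


records = [
    {"exporter": "r1", "src": "10.0.0.1", "dst": "10.0.0.2",
     "sport": 51000, "dport": 443, "proto": 6, "bytes": 1000, "packets": 10},
    {"exporter": "r2", "src": "10.0.0.1", "dst": "10.0.0.2",
     "sport": 51000, "dport": 443, "proto": 6, "bytes": 1040, "packets": 10},
]
print(deduplicate(records))
```

Note that the averaged byte count in the output matches neither router's original value, which is exactly why the averaged record is less useful for forensics or billing than the flows it was built from.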
As you can see below, the DSCP value, bytes and possibly the flags could all be different as the flow traverses the network and because of this, some flows can’t be deduplicated.
Because of this, a few NetFlow and IPFIX collectors deduplicate for the reasons outlined above but also save the original flow data for forensics and detailed, reliable NetFlow reporting. Sometimes deduplication is combined with Flow Stitching.
Above you can see the out bytes (obytes) added to the flow during the process of flow stitching. This really isn’t the whole picture because the columns opackets, oflags, oTos, etc. all need to be added to the saved flow. Many vendors will never implement bidirectional flows because a single biflow is nearly the size of two unidirectional flows combined. In other words, some vendors claim that there is no advantage to biflows.
A single bidirectional TCP connection can traverse multiple routers, so if it is seen on 4 different NetFlow exporting routers, a total of 8 flows could be exported. If deduplication and flow stitching are performed on these flows, the 8 flows could conceivably be boiled down to 1 flow with more fields. Ultimately this would result in faster reporting and less disk space consumed. In practice, deduplication and stitching have their benefits. The Scrutinizer system stitches flows together when the user runs a “bidirectional report”, even across collectors. Apart from stitching and deduplication, forensic NetFlow investigations for threat detection generally require the original unaltered flows.