Cisco ASR 1001-X overloading QFP

jeff

Whether you work primarily on the networking side of the house or the security side, you’ll need to ingest metadata into at least one of the tools in your toolbox. In my experience, the same data sets are often generated multiple times from the same raw packet, once for each tool that consumes them. This can put an extreme load on your exporting device. This happened recently with a customer, and in the process, we uncovered a bug specific to Cisco’s ASR 1001-X platform running IOS XE 3.16.x. I’ll discuss what this bug was, the issues it caused, and how it can be alleviated in the future.

Summary of the issue with ASR 1001-X

The customer was beginning to experience performance degradation on both of their WAN routers: input errors, overruns, and unknown protocol drops on the WAN interface, along with higher-than-average latency across that interface.

After opening a TAC case and collecting some debug output, the end user and the Cisco engineer determined that at times of high utilization (which was still less than half of the link speed), they were seeing backpressure on the QFP (99% QFP utilization). This indicated that they were overloading the QFP, resulting in overruns. According to the Cisco engineer, this can occur when the QFP is unable to pull packets from the interface buffer.

In the end, the issue was caused by having multiple NetFlow configurations defined on the router. This particular user removed the flow configurations that were no longer needed, and that reduced the QFP utilization tremendously. Fortunately, that was a sufficient workaround in this case, but it certainly wouldn’t be acceptable in cases where the additional flow configurations are a necessity.

I should mention that Cisco has since resolved this issue in the Denali 16.x software release (16.6 was recommended). Is there anything else we could do, though? Of course!

Plixer’s Replicator

The Replicator was developed for exactly the kind of use case we have with the Cisco ASR 1001-X bug. At a very high level, the Replicator is a UDP forwarder. Upon receiving a single stream of UDP data, the Replicator can duplicate it and forward it to multiple destinations while spoofing the original source address (or not, the option is up to you!).
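To make the idea of a UDP forwarder concrete, here is a minimal Python sketch of the fan-out step. It is not the Replicator’s implementation: the function names, addresses, and ports are made up for illustration, and it does not spoof the original source address (that would require raw sockets and elevated privileges), so collectors would see the forwarder’s own IP.

```python
import socket
from typing import Iterable, Optional, Tuple

Address = Tuple[str, int]  # (ip, port)


def fan_out(sock: socket.socket, payload: bytes, collectors: Iterable[Address]) -> int:
    """Send one received datagram to every collector; return the number of copies sent."""
    sent = 0
    for collector in collectors:
        sock.sendto(payload, collector)
        sent += 1
    return sent


def serve(listen: Address, collectors: Iterable[Address],
          max_datagrams: Optional[int] = None) -> None:
    """Receive datagrams on `listen` and duplicate each one to all collectors.

    Hypothetical helper: runs forever unless max_datagrams is given.
    """
    targets = list(collectors)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.bind(listen)
        handled = 0
        while max_datagrams is None or handled < max_datagrams:
            # 65535 bytes is large enough for any single UDP datagram.
            payload, _exporter = sock.recvfrom(65535)
            fan_out(sock, payload, targets)
            handled += 1


if __name__ == "__main__":
    # Example values only: listen on UDP 2055 and copy to two collectors.
    serve(("0.0.0.0", 2055), [("192.0.2.10", 2055), ("192.0.2.20", 9995)])
```

The point of the sketch is that the router exports each flow record once, and the duplication work happens on a separate box instead of on the QFP.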

While the Replicator would have absolutely alleviated the issue this user was experiencing, it’s also a valuable tool for any UDP data: NetFlow, syslogs, event logs, or SNMP traps. Because each device exports once and the Replicator handles the duplication, the burden on your exporting devices drops sharply when multiple destinations are a requirement.

Beyond duplicating UDP data, the Replicator is also a great tool for managing your exports during data center migrations or a hardware refresh. By exporting your UDP data to a single IP, you can manage the destination(s) through the UI without having to touch all of your devices individually.

[Screenshot: Replicator profiles]

The Replicator can contain any number of profiles, and each profile can contain any number of exporting devices (controlled by policies) and any number of collectors. Each profile is configured with a separate listening port and transmitting port, as well as a policy to include or exclude exporting devices. The policies are defined in CIDR notation. In the screenshot above, you’ll see you can define them as an explicit /32, or you could broaden them to include a particular subnet.
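As a rough illustration of how CIDR-based include/exclude policies behave, here is a small Python sketch using the standard ipaddress module. The Profile class, its field names, and the sample addresses are hypothetical and do not reflect the Replicator’s actual data model. It also assumes that an exclude match takes priority over an include match, which is a choice made for this example only.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Profile:
    """Hypothetical model of a profile: ports, CIDR policies, and collectors."""
    name: str
    listen_port: int
    send_port: int
    include: List[str] = field(default_factory=list)   # CIDR blocks to accept
    exclude: List[str] = field(default_factory=list)   # CIDR blocks to reject
    collectors: List[Tuple[str, int]] = field(default_factory=list)

    def accepts(self, exporter_ip: str) -> bool:
        """Return True if this exporter's datagrams should be forwarded."""
        addr = ipaddress.ip_address(exporter_ip)
        # Assumption for this sketch: exclude policies win over include policies.
        if any(addr in ipaddress.ip_network(cidr) for cidr in self.exclude):
            return False
        return any(addr in ipaddress.ip_network(cidr) for cidr in self.include)


netflow = Profile(
    name="netflow",
    listen_port=2055,
    send_port=2055,
    include=["10.1.1.1/32", "192.168.0.0/24"],  # one explicit host, one subnet
    exclude=["192.168.0.5/32"],
    collectors=[("192.0.2.10", 2055), ("192.0.2.20", 2055)],
)

for ip in ("10.1.1.1", "10.1.1.2", "192.168.0.7", "192.168.0.5"):
    print(ip, netflow.accepts(ip))
```

A /32 entry matches exactly one exporter, while a wider prefix such as /24 covers every device in that subnet with a single policy line.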

The Replicator can be deployed as either a virtual appliance or dedicated hardware, and the only limitation to throughput is the packet line rate.

If you’re interested in learning more about our Replicator appliances, don’t hesitate to reach out to us. You can also always download our virtual appliance and enjoy a free and fully supported 30-day evaluation!