While investigating our SD-WAN value proposition with customers, I worked with one client who runs Cisco IWAN across 250 branches and uses Scrutinizer to monitor it all. The customer told me they needed the following SD-WAN performance reports:
- Whether the SD-WAN is moving connections appropriately between active connections and properly balancing the load
- Reporting Threshold Crossing Alerts (TCAs) when one of four metrics reaches a threshold:
  - Packet Loss
  - Byte Loss
- Which application or DSCP value is seeing the most TCAs
- Which SD-WAN device is triggering Immitigable Events (IMEs). An IME occurs when a TCA triggers a request for a traffic reroute, but nothing can be done because all alternate connections are seeing performance issues in one or more of the four metrics listed above.
- Which interfaces at the branches have the most congestion alarms (TCAs)
- Specifically, which applications are most affected by the TCAs?
- Which of the four metrics above are causing those TCAs for each application?
- Is the spike in the metric caused by:
  - The local SD-WAN router?
  - One or two specific users or all users?
  - The service provider (i.e., all users of the application behind the branch are suffering)?
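As a rough illustration of the reports above, here is a minimal Python sketch that counts TCAs per application and per interface, breaks them down by metric, and flags an IME. The `TcaRecord` shape, the field names, and the sample data are all assumptions made for this example; they are not taken from Scrutinizer or from any vendor export.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TcaRecord:
    # Hypothetical record shape; real exports use vendor-specific fields.
    device: str
    interface: str
    application: str
    metric: str  # e.g. "packet_loss" or "byte_loss"
    user_ip: str


def tcas_per_key(records, key):
    """Count TCAs grouped by one attribute, most frequent first."""
    return Counter(getattr(r, key) for r in records).most_common()


def metrics_per_application(records):
    """For each application, count which metric raised its TCAs."""
    result = defaultdict(Counter)
    for r in records:
        result[r.application][r.metric] += 1
    return {app: dict(counts) for app, counts in result.items()}


def is_immitigable(alternate_paths):
    """True when every alternate path has at least one metric in violation.

    alternate_paths maps a path name to the list of violated metrics.
    With no alternates at all, a reroute is also impossible.
    """
    return all(len(violations) > 0 for violations in alternate_paths.values())


# Made-up sample data for illustration only.
records = [
    TcaRecord("branch-rtr-1", "Gi0/0", "voice", "packet_loss", "10.1.1.10"),
    TcaRecord("branch-rtr-1", "Gi0/0", "voice", "packet_loss", "10.1.1.11"),
    TcaRecord("branch-rtr-1", "Gi0/1", "video", "byte_loss", "10.1.1.10"),
    TcaRecord("branch-rtr-2", "Gi0/0", "voice", "byte_loss", "10.2.1.20"),
]

by_application = tcas_per_key(records, "application")
by_interface = tcas_per_key(records, "interface")
by_metric = metrics_per_application(records)
ime = is_immitigable({"mpls": ["packet_loss"], "internet": ["byte_loss"]})

print(by_application)
print(by_interface)
print(by_metric)
print(ime)
```

Grouping on a single attribute name keeps the same helper usable for the device, interface, and application reports. Separating "one or two users" from "all users" would then be a matter of counting distinct `user_ip` values per application.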
Providing detailed reports on the above depends on the hardware’s ability to send or make available the necessary statistical information. That information may not arrive in the form of flows. Plixer is known for NetFlow, but we have done a lot of work with other technologies. Working with Viptela, for example, isn’t a problem because Viptela exports JSON, which our platform can ingest.
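To make the idea of ingesting non-flow data concrete, here is a small Python sketch that parses a JSON export and maps it onto a fixed set of report columns. The payload and its keys (`device`, `app`, `loss_pct`) are invented for illustration; they are not Viptela's actual schema, and this is not how our platform is implemented.

```python
import json

# Hypothetical export payload; the keys are illustrative and are not
# taken from any vendor's actual schema.
payload = """
[
  {"device": "branch-rtr-1", "app": "voice", "loss_pct": 2.5},
  {"device": "branch-rtr-1", "app": "video", "loss_pct": 0.1},
  {"device": "branch-rtr-2", "app": "voice"}
]
"""


def normalize(raw):
    """Parse a JSON array and map each object onto fixed report columns."""
    rows = []
    for item in json.loads(raw):
        rows.append({
            "device": item.get("device", "unknown"),
            "application": item.get("app", "unknown"),
            # A missing statistic stays None rather than being guessed.
            "loss_pct": item.get("loss_pct"),
        })
    return rows


rows = normalize(payload)
for row in rows:
    print(row)
```

Normalizing every source into the same columns is what would let flow and non-flow data feed the same reports.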
But reporting on the above metrics isn’t enough. How can we add even more value? The answer largely involves the ability to pivot when investigating an issue. For example:
- Knowing which application (i.e. DSCP) is struggling is helpful. We can add metadata to display how many users are affected.
- Having the IP address of the users impacted is good. Correlating IPs with usernames is much better.
- Knowing the IP address involved is good. The ability to home in on the exact ingress router saves troubleshooting time.
- Knowing the DSCP value is helpful. Knowing all the applications that are marked with that value is better.
- Cross-vendor support allows the partner to tout excellent support for existing customer hardware investments. This improves end-to-end visibility.
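The pivots in the list above can be sketched in a few lines of Python. Starting from a DSCP value, the example finds every application marked with it and the users affected, substituting usernames for IP addresses where a mapping exists. The lookup table, the flow records, and the `pivot_on_dscp` helper are hypothetical; in practice the username mapping might come from authentication logs.

```python
# Hypothetical lookup table and flow records, for illustration only.
ip_to_user = {"10.1.1.10": "alice", "10.1.1.11": "bob"}

flows = [
    {"src_ip": "10.1.1.10", "dscp": 46, "application": "rtp"},
    {"src_ip": "10.1.1.11", "dscp": 46, "application": "sip"},
    {"src_ip": "10.1.1.12", "dscp": 46, "application": "rtp"},
    {"src_ip": "10.1.1.10", "dscp": 0, "application": "http"},
]


def pivot_on_dscp(flows, dscp, ip_to_user):
    """List the applications and users seen with a given DSCP value."""
    applications = set()
    users = set()
    for flow in flows:
        if flow["dscp"] != dscp:
            continue
        applications.add(flow["application"])
        # Fall back to the raw IP address when no username is known.
        users.add(ip_to_user.get(flow["src_ip"], flow["src_ip"]))
    return {
        "applications": sorted(applications),
        "users": sorted(users),
        "user_count": len(users),
    }


summary = pivot_on_dscp(flows, 46, ip_to_user)
print(summary)
```

Falling back to the IP address keeps unidentified hosts in the user count instead of silently dropping them.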
Customers investing in SD-WAN solutions often want to use their existing monitoring investments. SD-WAN vendors, however, are likely to push an in-house solution. The industry is moving toward open solutions, and thankfully there are lots of SD-WAN vendors to choose from.