
Improving Real-Time Application Performance

adam

Real-time applications have redefined the world around us. We can now hold meetings with members distributed across the world, play games with long-distance friends we haven’t seen in a while, or binge our favorite shows over a streaming service without missing a frame. But while using these tools has made life easier for the end users, the same cannot be said for those responsible for managing the networks that support them. Whether it is a datagram-based media stream like RTP, or an interactive TCP-based session, network performance is key to ensuring these applications work. It is essential that teams have a good workflow for identification and resolution of issues that can degrade or interrupt service. This blog focuses on the metrics that are important when trying to improve real-time application performance.

RTP, or the Real-time Transport Protocol, was designed to deliver media to interactive applications, typically over UDP datagrams. Under this protocol, individual streams of data are established and identified by their synchronization source ID, or SSRC. Within these streams, media is transported between endpoint instances of a real-time application. Within the application itself, output from these streams is buffered and sequenced to allow for seamless playback on the other side of the conversation. Using a network probe like Plixer’s FlowPro APM, we can identify the key metrics related to the quality of these data streams, including packet loss and jitter.
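To make the SSRC concrete, here is a minimal Python sketch that reads the 12-byte fixed RTP header defined in RFC 3550. The `RtpHeader` and `parse_rtp_header` names and the sample packet are illustrative assumptions; this is not how FlowPro APM itself inspects traffic.

```python
import struct
from typing import NamedTuple


class RtpHeader(NamedTuple):
    version: int
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int


def parse_rtp_header(packet: bytes) -> RtpHeader:
    """Parse the 12-byte fixed RTP header (RFC 3550, section 5.1)."""
    if len(packet) < 12:
        raise ValueError("packet too short for an RTP header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit SSRC.
    first, second, sequence, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return RtpHeader(
        version=first >> 6,           # top two bits of the first byte
        payload_type=second & 0x7F,   # low seven bits of the second byte
        sequence=sequence,
        timestamp=timestamp,
        ssrc=ssrc,
    )


# Hypothetical packet: version 2, payload type 0, sequence 1000,
# timestamp 160, SSRC 0xDEADBEEF, followed by 160 bytes of dummy payload.
sample = struct.pack("!BBHII", 0x80, 0, 1000, 160, 0xDEADBEEF) + b"\x00" * 160
header = parse_rtp_header(sample)
```

The sequence number and timestamp fields parsed here are what make the loss and jitter measurements discussed in this post possible.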

How does packet loss affect RTP applications?

Packet loss degrades real-time application performance for the simple reason that data is lost. Since UDP is connectionless and does not retransmit, the application itself would need a method to request data that was lost. But within the time constraints of real-time playback, this is not always possible: by the time a retransmitted packet arrived, its moment in the stream may already have passed.
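As a rough illustration of how loss can be measured for one stream, the sketch below compares the number of packets expected (from the span of RTP sequence numbers seen) with the number actually received, in the spirit of the receiver statistics in RFC 3550. The function name is hypothetical, and the sketch assumes no duplicates and that fewer than 32,768 packets go missing in a row.

```python
def count_lost_packets(sequence_numbers):
    """Estimate packets lost in one stream from received 16-bit sequence numbers."""
    if not sequence_numbers:
        return 0
    base = sequence_numbers[0]
    highest_extended = base  # highest sequence seen, extended past 16 bits
    for seq in sequence_numbers[1:]:
        # Forward distance modulo 2**16, so wraparound from 65535 to 0 works.
        delta = (seq - highest_extended) & 0xFFFF
        if delta < 0x8000:
            highest_extended += delta
        # Otherwise the packet is late or reordered and does not move the high mark.
    expected = highest_extended - base + 1
    return expected - len(sequence_numbers)


# Sequence 102, 105 and 106 never arrive.
lost = count_lost_packets([100, 101, 103, 104, 107])
```

Because the comparison is done modulo 65536, a stream that wraps from 65535 back to 0 is still counted correctly.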

Instead, some applications will attempt to interpolate the data that was missed. For example, suppose the application receives 20 ms of audio at a specific frequency, the next packet is dropped, and then a third audio sample arrives successfully. The application may choose to smoothly ‘fill’ the missing data with a function of the last and next frequency values. This often leads to a robotic sound in VoIP apps, screen tearing in video apps, or running into walls in an online game.
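The ‘fill’ idea can be sketched with simple linear interpolation. This is only an illustration under the assumption that each frame is reduced to a single number (such as a frequency); real codecs use far more sophisticated concealment, and `conceal_gaps` is a hypothetical name.

```python
def conceal_gaps(frames):
    """Fill missing frames (None) by linear interpolation between known neighbours."""
    filled = list(frames)
    for i, value in enumerate(frames):
        if value is not None:
            continue
        # Nearest received frame before and after the gap, if any.
        prev_idx = next((j for j in range(i - 1, -1, -1) if frames[j] is not None), None)
        next_idx = next((j for j in range(i + 1, len(frames)) if frames[j] is not None), None)
        if prev_idx is None and next_idx is None:
            continue  # nothing received at all, nothing to interpolate from
        if prev_idx is None:
            filled[i] = frames[next_idx]   # gap at the start: repeat the next value
        elif next_idx is None:
            filled[i] = frames[prev_idx]   # gap at the end: repeat the last value
        else:
            weight = (i - prev_idx) / (next_idx - prev_idx)
            filled[i] = frames[prev_idx] + weight * (frames[next_idx] - frames[prev_idx])
    return filled


# Three 20 ms frames; the middle one was dropped on the wire.
smoothed = conceal_gaps([440.0, None, 460.0])
```

The filled value is a guess, not the original signal, which is why heavy loss produces the audible and visible artifacts described above.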

What about jitter?

Jitter is the variation in inter-packet arrival time for an RTP session, and monitoring it is a key step toward improving real-time application performance. Since the media we send over RTP is very sensitive to timing, it is important that jitter be reduced to a level that can be managed by the end application. Jitter buffers, or delay buffers, are used within the application to solve this issue. Just as the name implies, these buffers slightly delay the forwarding of application data from the wire to allow for consistent inter-packet arrival and correct ordering. The width of this delay buffer is measured in milliseconds and, when the buffer is static, is often configurable in your application. If you are using a dynamic jitter buffer, the width of the buffer will change automatically based on the jitter observed on the current connection.
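For reference, RFC 3550 defines a running interarrival jitter estimate that RTP receivers report. The sketch below follows that formula; the function name, the 8000 Hz clock rate default, and the sample packets are assumptions for illustration, and RTP timestamp wraparound is ignored for brevity.

```python
def interarrival_jitter(packets, clock_rate=8000):
    """Return RFC 3550 style interarrival jitter in milliseconds.

    packets: iterable of (arrival_time_seconds, rtp_timestamp) in arrival order.
    """
    jitter = 0.0
    previous_transit = None
    for arrival_seconds, rtp_timestamp in packets:
        # Relative transit time, expressed in RTP timestamp units.
        transit = arrival_seconds * clock_rate - rtp_timestamp
        if previous_transit is not None:
            d = abs(transit - previous_transit)
            # Smoothed estimate with the 1/16 gain specified in RFC 3550.
            jitter += (d - jitter) / 16.0
        previous_transit = transit
    return jitter / clock_rate * 1000.0


# 20 ms audio frames (160 timestamp units at 8 kHz).
steady = interarrival_jitter([(0.00, 0), (0.02, 160), (0.04, 320)])
late = interarrival_jitter([(0.00, 0), (0.03, 160)])  # second packet 10 ms late
```

A perfectly paced stream yields a value near zero, while the late packet in the second example raises the estimate. The 1/16 gain keeps a single delayed packet from dominating the reported value.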

[Figure: Tuning the jitter buffer to improve real-time application performance]
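To show what a static jitter buffer does, here is a simplified Python sketch. It schedules each packet for playout at a fixed delay after the first packet, offset by the packet's RTP timestamp, and discards packets that arrive after their slot. The class name, method names, and the 8000 Hz clock rate are assumptions for illustration; real applications implement this differently.

```python
import heapq


class StaticJitterBuffer:
    """Hold packets for a fixed delay and release them on an even schedule."""

    def __init__(self, delay_ms, clock_rate=8000):
        self.delay_ms = delay_ms
        self.clock_rate = clock_rate
        self._base = None   # (arrival_ms, rtp_timestamp) of the first packet
        self._heap = []     # entries are (playout_ms, payload)

    def push(self, arrival_ms, rtp_timestamp, payload):
        """Queue a packet; return False if it arrived too late to be played."""
        if self._base is None:
            self._base = (arrival_ms, rtp_timestamp)
        base_arrival, base_timestamp = self._base
        media_offset_ms = (rtp_timestamp - base_timestamp) * 1000.0 / self.clock_rate
        playout_ms = base_arrival + self.delay_ms + media_offset_ms
        if playout_ms < arrival_ms:
            return False  # its playout slot has already passed
        heapq.heappush(self._heap, (playout_ms, payload))
        return True

    def pop_due(self, now_ms):
        """Return payloads whose playout time has been reached, in playout order."""
        due = []
        while self._heap and self._heap[0][0] <= now_ms:
            due.append(heapq.heappop(self._heap)[1])
        return due


buf = StaticJitterBuffer(delay_ms=40)
buf.push(0, 0, "a")       # scheduled for 40 ms
buf.push(35, 320, "c")    # arrives before "b", scheduled for 80 ms
buf.push(38, 160, "b")    # reordered on the wire, scheduled for 60 ms
first = buf.pop_due(50)
rest = buf.pop_due(100)
too_late = buf.push(200, 480, "d")  # slot was 100 ms, so it is dropped
```

A wider buffer tolerates more jitter and reordering at the cost of added latency, which is the trade-off you are making when you tune this setting.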

How can we monitor to improve real-time application performance?

Now that we know what we are looking for when troubleshooting RTP performance issues, how can we find these metrics or review their historical behavior? With a probe like FlowPro APM and a collection solution like Scrutinizer, you can quickly identify RTP issues in your network. You can track jitter metrics per SSRC and see total packet loss from specific hosts over time, plus a suite of metrics for reviewing TCP application performance. Now if a user opens a ticket complaining about VoIP call quality or a conference room video session, you can go back to the time of the event in Scrutinizer, enumerate the performance issues, and either address link saturation or widen your jitter buffer to allow more time to clean up the signal.

While these metrics are currently used for application performance monitoring, I wouldn’t be surprised to see a future where timing metrics are used in security operations to maintain tamper-free connections. Log in to a real-time application in your network and look for jitter/delay buffer settings. Then, using tools like Scrutinizer and FlowPro APM, check out how your applications are performing!