Dissecting the XDN Live Video Streaming Architecture

In previous posts, we have touched on experience delivery networks (XDNs) and how they’re revolutionizing live streaming by supporting a new era of interactivity that remains beyond the reach of traditional HTTP-based content delivery networks (CDNs).

XDNs use a cloud-based server infrastructure to deliver multidirectional live streams with under 500 milliseconds of real-time latency at scale. This allows them to deliver fully interactive experiences to the widest possible audience for a variety of applications, from sports broadcasts with live betting and fan interactions to conference applications, drone streaming, and live shopping, to name a few.

But how does an XDN actually deliver these experiences? This post breaks down the Red5 Pro XDN live video streaming architecture.

Geographic Distribution at Scale

A typical XDN infrastructure consists of a software stack deployed in three-tiered clusters, often across multiple public or private cloud environments.

Each cluster consists of three types of nodes: origin, edge, and relay. Depending upon the use case or needs of an application, each cluster may have multiple instances of the same node type. Origin nodes ingest content and encode it for transport to relay nodes, each of which serves an array of edge nodes that deliver live unicast streams to their assigned service areas.
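
The origin, relay, and edge tiers described above can be pictured as a small tree. The following is a minimal sketch in Python, not Red5 Pro code: the Node class, the build_cluster helper, and the node names are all hypothetical and exist only to illustrate how one origin fans out through relays to edges.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a server node in one cluster."""
    name: str
    role: str  # "origin", "relay", or "edge"
    downstream: list[Node] = field(default_factory=list)


def build_cluster(relay_count: int, edges_per_relay: int) -> Node:
    """Build one origin that feeds relays, each of which feeds edges."""
    origin = Node("origin-1", "origin")
    for r in range(1, relay_count + 1):
        relay = Node(f"relay-{r}", "relay")
        for e in range(1, edges_per_relay + 1):
            relay.downstream.append(Node(f"edge-{r}-{e}", "edge"))
        origin.downstream.append(relay)
    return origin


def edges(node: Node) -> list[Node]:
    """Collect every edge node reachable from the given node."""
    if node.role == "edge":
        return [node]
    found: list[Node] = []
    for child in node.downstream:
        found.extend(edges(child))
    return found


cluster = build_cluster(relay_count=2, edges_per_relay=3)
print([n.name for n in edges(cluster)])
```

Adding a relay multiplies the number of edges a single origin can reach, which is the point of the middle tier.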

To achieve full geographic distribution, the XDN taps into public or private cloud facilities across the world that can deliver real-time video experiences simultaneously to any number of clients at any distance (fig. 1). It leverages virtual machine–based iterations of data center virtualization, and thus offers the high degree of flexibility and speedy resource utilization that is essential to seamlessly scale cross-cloud operations.

Figure 1. A simplified depiction of an XDN architecture.

One of the key components of this type of XDN architecture is a manager that organizes all the incoming and outgoing streams. In the Red5 Pro implementation of an XDN, a stream manager oversees the configuration of the streaming architecture and orchestrates all the connections among the various origin, relay, and edge nodes, both within an individual server cluster as well as among all the clusters that are spun up based on traffic. The stream manager processes live-stream information to respond to infrastructure needs in real time. It applies automated scaling mechanisms, adding or removing server nodes according to fluctuations in traffic demand in order to add new broadcasters and subscribers as needed (fig. 2).
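
To make the autoscaling idea concrete, here is a rough sketch of the kind of decision a manager could make when traffic changes. It is an illustration only: the function names, the per-edge capacity figure, and the add/remove/hold vocabulary are assumptions, not the Red5 Pro stream manager's actual logic or API.

```python
import math


def desired_edge_count(subscribers: int, capacity_per_edge: int,
                       min_edges: int = 1) -> int:
    """Edges needed to serve the current subscribers (assumed capacity model)."""
    needed = math.ceil(subscribers / capacity_per_edge)
    return max(min_edges, needed)


def scaling_action(current_edges: int, subscribers: int,
                   capacity_per_edge: int) -> tuple[str, int]:
    """Return ("add" | "remove" | "hold", number_of_nodes)."""
    target = desired_edge_count(subscribers, capacity_per_edge)
    if target > current_edges:
        return ("add", target - current_edges)
    if target < current_edges:
        return ("remove", current_edges - target)
    return ("hold", 0)


# Hypothetical numbers: 2 edges running, each assumed to serve 1,000 subscribers.
print(scaling_action(current_edges=2, subscribers=2500, capacity_per_edge=1000))
```

A production system would typically also smooth out short spikes before removing nodes; this sketch leaves that out for brevity.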

Figure 2. In a Red5 Pro XDN, the stream manager spins up and provisions a new instance to add an edge node to an existing cluster based on demand.

Within this topology, any given origin node ingests incoming streams and communicates with multiple edge nodes to support thousands of participants. For larger deployments, origin nodes can stream to relay nodes, which in turn stream to multiple edge nodes, extending the cluster to virtually unlimited scale.

XDNs built with Red5 Pro also support so-called mixers, which can be deployed between broadcasters and origin nodes to combine multiple streams into one stream that is then passed on to an origin node. This adds functionality while maintaining efficient streaming. Mixers can be used for ad insertion, remote production, and other custom features.

Red5 Pro–based XDNs support cross-cloud operations across a variety of platforms. Pre-integration with large providers such as AWS, Microsoft Azure, Google Cloud Platform, and DigitalOcean provides support for widely used platforms that many software engineers may already be familiar with. Further flexibility and accessibility are achieved through integration with Terraform, the open-source multicloud toolset provided by HashiCorp, which opens support for dozens of other infrastructure-as-a-service (IaaS) platforms.

Terraform facilitates cross-cloud instantiations by translating IaaS resources into a high-level configuration syntax that allows IaaS APIs to be abstracted for access through a Terraform Cloud API common to each cloud operator. Thus, by leveraging those APIs, a Red5 Pro XDN can mix and match to select the best platform or combination of platforms for each targeted region where streams need to be delivered. In addition, the Red5 Pro stream manager can be manually integrated to work with the APIs of any cloud provider that isn’t integrated with Terraform.  

The automated node configuration and routing capabilities of the XDN architecture, coupled with extreme low latency, enable multidirectional streaming where everyone can subscribe to everyone else’s stream or become a broadcaster in real time. In contrast, a traditional CDN infrastructure only works really well for higher-latency, unidirectional live streaming: publishing a one-to-many broadcast to a large number of subscribers. Unlike XDNs, CDNs can’t deliver the excitement of live events and activities that depend on participants’ low-latency connections.

Fail-Safe Performance

Streams can fail, even with fast internet connections. To create a high degree of stability, XDNs built with Red5 Pro use the stream manager’s autoscaling mechanism to create cluster-wide redundancy. When a node goes offline or malfunctions, all the processing can be moved to another node without any disruption to the flow or increase in latency.
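
As a rough picture of what moving processing to another node involves, the sketch below reassigns the streams of a failed node to the least-loaded healthy nodes. The fail_over function and the node and stream names are hypothetical; the real stream manager's failover behavior is not documented in this post.

```python
def fail_over(assignments: dict[str, list[str]], failed: str) -> dict[str, list[str]]:
    """Move every stream on the failed node to the least-loaded healthy node."""
    healthy = {node: list(streams)
               for node, streams in assignments.items() if node != failed}
    if not healthy:
        raise RuntimeError("no healthy nodes left to take over the streams")
    for stream in assignments.get(failed, []):
        # Pick the node currently carrying the fewest streams.
        target = min(healthy, key=lambda node: len(healthy[node]))
        healthy[target].append(stream)
    return healthy


# Hypothetical cluster state before "edge-1" goes offline.
before = {"edge-1": ["s1", "s2"], "edge-2": ["s3"], "edge-3": []}
after = fail_over(before, failed="edge-1")
print(after)
```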

These capabilities also apply to load balancing across stream managers. When heavy traffic is anticipated, more than one stream manager can be set up behind the cloud platform’s load balancer service. This ensures that requests to broadcast or subscribe are evenly distributed among multiple stream manager instances, so no single instance is flooded with requests.
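
One simple way a load balancer can spread requests evenly is round-robin rotation. The sketch below assumes that policy purely for illustration; cloud load balancers offer several policies, and the instance and request names here are made up.

```python
from collections import Counter
from itertools import cycle


def distribute(requests: list[str], managers: list[str]) -> dict[str, str]:
    """Assign each request to the next stream manager in rotation."""
    rotation = cycle(managers)
    return {request: next(rotation) for request in requests}


# Hypothetical stream manager instances behind one load balancer.
managers = ["stream-manager-a", "stream-manager-b", "stream-manager-c"]
requests = [f"subscribe-{i}" for i in range(9)]

counts = Counter(distribute(requests, managers).values())
print(counts)
```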

Real-Time Streaming with RTP

The Real-time Transport Protocol (RTP), which is used for IP-based voice communications, provides an ideal mechanism for building XDN infrastructures that support multidirectional, real-time video streaming at any scale and distance. RTP is the foundation for both WebRTC (Web Real-Time Communication), originally developed for peer-to-peer video communications and now used for plugin-free browser applications, and RTSP (Real-Time Streaming Protocol), a one-to-many video streaming alternative to HTTP that became an IETF standard in 1998 and is now commonly used in mobile native apps. Both protocols can be configured with Red5 Pro software to create an XDN featuring sub-500-millisecond video streaming latency.

In a typical configuration, RTP relies on the User Datagram Protocol (UDP), which is used primarily for low-latency, loss-tolerant connections where packets are sent directly to the recipient without delivery guarantees and regardless of the order in which they are received. In an out-of-the-box configuration of UDP, the sender does not wait for confirmation that packets have been received; it keeps transmitting without error recovery or retransmission of dropped packets.

In contrast, the Transmission Control Protocol (TCP), which is used with HTTP-based CDNs, will resend any dropped packets, resulting in a buffered queue of packets waiting to be resent. Since TCP creates a buffer and UDP does not, UDP delivers lower, real-time latency for any type of video and audio content.
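
The fire-and-forget behavior of UDP is easy to see with plain sockets. This sketch uses only Python's standard socket module on the loopback interface; the payloads are made up and no RTP framing is involved. Note that sendto returns as soon as the datagram is handed to the network stack, with no acknowledgment from the receiver.

```python
import socket

# Receiver: bind to any free port on the loopback interface.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(1.0)
address = receiver.getsockname()

# Sender: no connection setup and no waiting for confirmation.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
for seq in range(3):
    sender.sendto(f"packet-{seq}".encode(), address)

# Read whatever arrived; a lost datagram would simply never show up.
received = []
try:
    while len(received) < 3:
        data, _ = receiver.recvfrom(2048)
        received.append(data.decode())
except socket.timeout:
    pass

sender.close()
receiver.close()
print(received)
```

On a real network some of these datagrams could be lost or reordered, and nothing in UDP itself would notice. Any recovery has to be layered on top, which is where the mechanisms in the next paragraph come in.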

While maintaining low latency is important, you don’t want the entire stream to become garbled if too many packets are dropped. Missing one here or there is fine, but entire chunks dropping out will be an issue. In order to maintain a consistent and smooth user experience, the Red5 Pro implementation of WebRTC uses negative acknowledgment (NACK) messaging: If too many packets have been dropped, the client can alert the broadcaster; however, only if those packets are determined to be important will they be retransmitted. NACK works in tandem with forward error correction (FEC) to smooth out streaming while maintaining the lowest possible latency.
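
The NACK idea can be sketched in a few lines: the receiver notices gaps in the sequence numbers, reports them once the loss passes a threshold, and the sender retransmits only the packets it considers important. Everything here is an assumption made for illustration, including the threshold value, the notion of an "important" set (for example, packets carrying keyframe data), and the function names. It is not the Red5 Pro or WebRTC implementation, and it ignores the 16-bit wraparound of real RTP sequence numbers.

```python
def missing_sequences(received: list[int]) -> list[int]:
    """Return sequence numbers absent between the lowest and highest seen."""
    if not received:
        return []
    seen = set(received)
    return [s for s in range(min(seen), max(seen) + 1) if s not in seen]


def build_nack(received: list[int], threshold: int = 2) -> list[int]:
    """Receiver side: report losses only once they reach the threshold."""
    missing = missing_sequences(received)
    return missing if len(missing) >= threshold else []


def select_retransmits(nack: list[int], important: set[int]) -> list[int]:
    """Sender side: resend only the reported packets deemed important."""
    return [s for s in nack if s in important]


# Hypothetical session: packets 3, 5, and 6 never arrived.
nack = build_nack([1, 2, 4, 7, 8])
print(nack)
print(select_retransmits(nack, important={5, 9}))
```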

Along with ingesting any content delivered via WebRTC or RTSP, the XDN can ingest video from other leading protocols, including RTMP (Real-Time Messaging Protocol), SRT (Secure Reliable Transport), and MPEG-TS (MPEG Transport Stream). These are packaged for streaming on the RTP foundation with preservation of the original encapsulations for egress to clients that can’t be reached via WebRTC or RTSP.

ABR and Transcoding

Chances are not all stream participants will have a great internet connection. Slower network speeds or insufficient bandwidth can cause the stream to freeze, stutter, and ultimately disconnect. With live content, you need everything to stay in sync with all the other participants, so you can’t just add a buffer to allow everything to catch up like you would with VOD content. Downgrading the entire stream would work for the participant with the worst connection, but that would also downgrade the experience for all the other subscribers who can handle a better stream quality.

The best approach to ensure all subscribers enjoy the same live-streaming experience is to use adaptive bitrate streaming (ABR) and transcoding to create multiple stream variants (1080p, 720p, etc.). Red5 Pro has implemented ABR with WebRTC on the server side by creating multiple resolutions so that the server can automatically select the best quality for the client’s current network conditions. ABR allows for dynamic upgrading and downgrading of the stream quality based on network conditions. That way each user can enjoy the smoothest possible live-streaming experience without having to think about it.

The process involves publishing a single high-quality stream to a transcoder node, which then generates multiple variants at lower quality with configurable bitrate and resolution settings. Many applications will use three variants — high, medium, low — although any number can be generated. These variants are then streamed from an origin to an edge. The edge determines the highest-quality stream possible for each viewer’s device and connection speed and serves the correct variant to the subscriber. As with all Red5 Pro setups, the entire process is orchestrated by a stream manager.
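
The edge's choice of variant can be sketched as picking the highest bitrate that fits within a subscriber's estimated bandwidth. The bitrates, resolutions, and the 80 percent headroom factor below are illustrative assumptions, not Red5 Pro defaults, and pick_variant is a hypothetical helper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    name: str
    height: int        # vertical resolution in pixels
    bitrate_kbps: int  # target video bitrate


# Hypothetical three-variant ladder produced by a transcoder node.
VARIANTS = [
    Variant("high", 1080, 4500),
    Variant("medium", 720, 2000),
    Variant("low", 360, 750),
]


def pick_variant(available_kbps: float, variants: list[Variant],
                 headroom: float = 0.8) -> Variant:
    """Return the highest-bitrate variant that fits the bandwidth budget.

    Falls back to the lowest variant when nothing fits.
    """
    budget = available_kbps * headroom
    ordered = sorted(variants, key=lambda v: v.bitrate_kbps, reverse=True)
    for variant in ordered:
        if variant.bitrate_kbps <= budget:
            return variant
    return ordered[-1]


for kbps in (6000, 3000, 500):
    print(kbps, pick_variant(kbps, VARIANTS).name)
```

Re-running the selection whenever the bandwidth estimate changes gives the dynamic upgrading and downgrading described earlier.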

XDN technology represents a major achievement for live streaming. The interactive experiences unlocked with an XDN live video streaming architecture are paving the way for expanding current use cases and creating new ones. While CDNs served an important role in the initial establishment of streaming video on the internet, the underlying weakness of being built on top of HTTP-based protocols means that CDNs will never be able to support multidirectional streaming at scale with real-time latency. Thankfully, we can count on XDNs to move live streaming forward.

Interested in setting up your own XDN with Red5 Pro? Contact us at info@red5.net, or schedule a call. We’d love to show you how you can become part of the next generation of interactive live streaming.