CMAF vs. WebRTC: 5 Questions Answered


We’ve already covered the details of how CMAF and WebRTC deliver streams, but what about a direct comparison of CMAF vs. WebRTC performance? This post covers just that.

First, let’s cover what they are:


What is CMAF?

CMAF (Common Media Application Format) is a standardized container designed to package video, audio, or text data that is delivered using HTTP-based streaming protocols: HLS, LHLS, or MPEG-DASH. These protocols divide a video stream into small chunks that reside on an HTTP server, so a video player can download the individual chunks over TCP.

CMAF defines the segment format as well as codecs and media profiles that will be used to deliver the stream.

The advantage of CMAF is that the same media segments can be referenced simultaneously by HLS playlists and DASH manifests. This makes it possible to support a variety of devices without storing multiple copies of the same stream content packaged in different formats. In other words, CMAF eliminates redundant storage of the same video by using one file format that different devices can play.
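To make the "one set of segments, two manifests" idea concrete, here is a minimal sketch in Python. The file names (init.mp4, seg-1.m4s) and both helper functions are hypothetical, and the DASH output is only a SegmentList fragment rather than a complete MPD. The point is simply that both descriptions reference the same files.

```python
import math
from xml.sax.saxutils import quoteattr


def hls_media_playlist(init_uri, segment_uris, segment_seconds):
    """Build a minimal HLS media playlist that points at fMP4 (CMAF) segments."""
    lines = [
        "#EXTM3U",
        "#EXT-X-VERSION:7",
        # The target duration must be an integer at least as long as any segment.
        f"#EXT-X-TARGETDURATION:{math.ceil(segment_seconds)}",
        # The initialization segment is shared by every media segment below.
        f'#EXT-X-MAP:URI="{init_uri}"',
    ]
    for uri in segment_uris:
        lines.append(f"#EXTINF:{segment_seconds:.3f},")
        lines.append(uri)
    lines.append("#EXT-X-ENDLIST")
    return "\n".join(lines) + "\n"


def dash_segment_list(init_uri, segment_uris, segment_seconds):
    """Build a DASH SegmentList fragment (not a full MPD) for the same files."""
    timescale = 1000  # durations are expressed in milliseconds
    duration = int(round(segment_seconds * timescale))
    lines = [
        f'<SegmentList timescale="{timescale}" duration="{duration}">',
        f"  <Initialization sourceURL={quoteattr(init_uri)}/>",
    ]
    for uri in segment_uris:
        lines.append(f"  <SegmentURL media={quoteattr(uri)}/>")
    lines.append("</SegmentList>")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    segments = ["seg-1.m4s", "seg-2.m4s"]  # hypothetical segment names
    print(hls_media_playlist("init.mp4", segments, 4.0))
    print(dash_segment_list("init.mp4", segments, 4.0))
```

Only the two small text files differ per protocol; the media segments themselves are stored once.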

A big disadvantage is that (like other HTTP-based protocols) it cannot get latency low enough for real-time live streaming: CMAF latency is around 2-3 seconds. We will cover latency in more detail later on.


What is WebRTC?

WebRTC (Web Real-Time Communication) is a standards-based, open-source project supported by Google, Microsoft, Mozilla, Opera, and Apple. Essentially, it is a media engine with a JavaScript API on top of it.

It works directly in a web browser without requiring additional plugins or downloading native apps. It does not use HTTP to send any media. Instead, it establishes a connection using UDP and delivers encrypted video over SRTP (Secure RTP). As such, WebRTC produces the lowest possible latency: 500 milliseconds or less.


How Do They Establish a Connection?

CMAF (as an HTTP-based format) uses the standard HTTP protocol to send media requests between web servers and clients. HTTP is a stateless, text-based request/response protocol that runs over TCP: a TCP connection has to be opened before a request can be sent, and each request is serviced independently of the others, although HTTP/1.1 and later can keep the connection open and reuse it for subsequent requests.

WebRTC creates a connection through the RTCPeerConnection, which is a more complicated process than HTTP. First, the browsers are connected through signaling: they exchange Session Description Protocol (SDP) offers and answers along with ICE candidates, which is how the two users discover where each other is and how to connect. Signaling takes place over an HTTPS connection or a WebSocket (Red5 Pro uses a WebSocket) and is driven by JavaScript code. Then encryption keys are exchanged using DTLS (Datagram Transport Layer Security). As this process creates a direct connection between two web browsers, it is considered a peer-to-peer connection. This process has been covered in detail in a past article (and a really deep dive here).
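To give a feel for what is actually exchanged during signaling, here is a small Python sketch that parses a single ICE candidate line into its fields. The IceCandidate class and parse_ice_candidate function are names made up for this example, the address is a documentation address, and optional trailing attributes (such as raddr and rport) are ignored. A browser does all of this for you inside RTCPeerConnection.

```python
from dataclasses import dataclass


@dataclass
class IceCandidate:
    foundation: str
    component: int   # 1 = RTP in the classic layout
    transport: str   # "udp" or "tcp"
    priority: int
    address: str
    port: int
    kind: str        # "host", "srflx", "prflx" or "relay"


def parse_ice_candidate(line: str) -> IceCandidate:
    """Parse 'candidate:...' (or the SDP form 'a=candidate:...')."""
    text = line.strip()
    if text.startswith("a="):
        text = text[2:]
    if not text.startswith("candidate:"):
        raise ValueError("not an ICE candidate line")
    fields = text[len("candidate:"):].split()
    # Six positional fields, then the literal token "typ" and the candidate type.
    if len(fields) < 8 or fields[6] != "typ":
        raise ValueError("malformed ICE candidate line")
    return IceCandidate(
        foundation=fields[0],
        component=int(fields[1]),
        transport=fields[2].lower(),
        priority=int(fields[3]),
        address=fields[4],
        port=int(fields[5]),
        kind=fields[7],
    )


if __name__ == "__main__":
    sample = "a=candidate:1 1 udp 2130706431 192.0.2.10 54321 typ host"
    print(parse_ice_candidate(sample))
```

Each side sends candidates like this to the other over the signaling channel, and the two peers then try the candidate pairs until one works.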

One very important difference is that CMAF (through HTTP) uses a TCP connection while WebRTC uses a UDP-based connection. This has important consequences for the results they produce, particularly when it comes to latency. We will cover that in more detail later on.


How Do They Send Media?

In order to transmit each CMAF Segment, a POST request containing a CMAF Header is sent to the ingest origin server. Immediately after each CMAF Chunk completes encoding and packaging, it is sent using HTTP/1.1 chunked transfer encoding. This means that each segment can be delivered progressively, chunk by chunk, rather than waiting for the entire segment to be complete before it can be sent out.
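For readers who have not looked at chunked transfer encoding on the wire, the framing is simple enough to sketch in a few lines of Python. This is only an illustration of the HTTP/1.1 body format, not of any particular encoder or origin server; the function names and the sample payloads are invented for the example.

```python
def encode_chunked(chunks):
    """Frame byte strings as an HTTP/1.1 chunked message body."""
    body = bytearray()
    for chunk in chunks:
        if not chunk:
            continue  # a zero-length chunk would end the body early
        # Each chunk is: size in hex, CRLF, the data, CRLF.
        body += f"{len(chunk):X}\r\n".encode("ascii") + chunk + b"\r\n"
    body += b"0\r\n\r\n"  # terminating chunk, no trailers
    return bytes(body)


def decode_chunked(data):
    """Split a chunked body back into its chunks (trailers are ignored)."""
    chunks = []
    pos = 0
    while True:
        eol = data.index(b"\r\n", pos)
        # Drop any chunk extensions after ';' before reading the hex size.
        size = int(data[pos:eol].split(b";")[0].decode("ascii"), 16)
        pos = eol + 2
        if size == 0:
            break
        chunks.append(data[pos:pos + size])
        pos += size + 2  # skip the data and its trailing CRLF
    return chunks


if __name__ == "__main__":
    wire = encode_chunked([b"moof", b"mdat-payload"])
    print(wire)
    print(decode_chunked(wire))
```

Because every chunk carries its own length, the sender never needs to know the total segment size up front, which is what lets it start sending while the encoder is still working on the rest of the segment.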

Then, the chunks ingested into the origin are delivered over HTTP chunked transfer encoding to a CDN where Edge servers make them available to the players which will eventually display the media. In order to retrieve the segment, the player uses the manifest or playlist associated with a stream to establish a connection with the correct Edge and then it makes a GET request.

WebRTC sends its data through “media channels”. These use either SRTP (for encrypted voice and video) or SCTP (for the encrypted data channel). The actual video itself is sent using different codecs. In the case of Red5 Pro, the primary codecs used are H.264 for video and AAC for audio. However, other codecs (such as VP8) can be supported as well.
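SRTP encrypts the media payload but leaves the RTP header readable, so the header is a handy way to see what every media packet carries. The Python sketch below unpacks the 12-byte fixed RTP header. The function name is made up for this example, the sample values (payload type 96, sequence number 1000) are arbitrary, and header extensions and CSRC lists are not parsed.

```python
import struct


def parse_rtp_header(packet: bytes) -> dict:
    """Unpack the 12-byte fixed RTP header at the start of a packet."""
    if len(packet) < 12:
        raise ValueError("packet is shorter than the fixed RTP header")
    # Network byte order: two flag bytes, 16-bit sequence number,
    # 32-bit timestamp, 32-bit synchronization source (SSRC).
    b0, b1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:12])
    return {
        "version": b0 >> 6,
        "padding": bool(b0 & 0x20),
        "extension": bool(b0 & 0x10),
        "csrc_count": b0 & 0x0F,
        "marker": bool(b1 & 0x80),
        "payload_type": b1 & 0x7F,
        "sequence_number": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
    }


if __name__ == "__main__":
    # Hand-built sample packet: version 2, marker set, dynamic payload type 96.
    sample = struct.pack("!BBHII", 0x80, 0x80 | 96, 1000, 90000, 0x1234) + b"payload"
    print(parse_rtp_header(sample))
```

The sequence number and timestamp are what let a receiver reorder packets, spot losses, and schedule playback without any help from the transport layer.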


Is There a Difference in Latency?

Definitely.

Although both CMAF and WebRTC were created with the intention of reducing latency, WebRTC is a new method designed specifically to produce the lowest latency possible. CMAF, on the other hand, is a reworking of an older streaming method.

CMAF was designed to lower the incredibly high latency involved in HTTP delivery without having to rethink the overall strategy of using HTTP. This also made it easier on CDNs since they didn’t have to change their entire strategy and infrastructure. However, the resulting latency is still not low enough for real-time interactivity. No matter what is done to CMAF, it’s still based on the older (and slower) HTTP method. As such, CMAF is a stop-gap solution rather than an actual fix for the problem of latency in live streaming.

Unlike TCP-based HLS and MPEG-DASH, WebRTC is UDP-based. UDP is not concerned with the order of the data; it delivers each packet to the application the moment it arrives. Instead of holding packets back until earlier ones have arrived, as TCP-based protocols do, WebRTC deals with dropped packets directly. This is done either by using NACK (negative acknowledgement) to request retransmission of the most critical packets or by Packet Loss Concealment, which estimates to some extent what should have been in the missing packet.
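The NACK idea can be illustrated with a short Python sketch: the receiver watches RTP sequence numbers, and whenever a packet arrives that skips ahead, the numbers it skipped over become candidates for a retransmission request. NackTracker is a hypothetical class written for this post. Real WebRTC stacks add timers, retry limits, and a decision about which losses are worth recovering, none of which is modeled here.

```python
class NackTracker:
    """Track 16-bit RTP sequence numbers and report gaps as they appear."""

    def __init__(self):
        self.highest = None   # newest sequence number seen so far
        self.missing = set()  # sequence numbers still outstanding

    def on_packet(self, seq):
        """Record an arriving packet; return the sequence numbers to NACK."""
        if self.highest is None:
            self.highest = seq
            return []
        # Distance ahead of the newest packet, modulo the 16-bit wraparound.
        ahead = (seq - self.highest) % 65536
        if 0 < ahead < 32768:
            # A newer packet: everything it skipped over is now missing.
            newly_missing = [(self.highest + i) % 65536 for i in range(1, ahead)]
            self.missing.update(newly_missing)
            self.highest = seq
            return newly_missing
        # A late, duplicate, or retransmitted packet: it may fill a gap.
        self.missing.discard(seq)
        return []


if __name__ == "__main__":
    tracker = NackTracker()
    for seq in (10, 11, 14, 12):
        print(seq, "-> NACK", tracker.on_packet(seq))
    print("still missing:", sorted(tracker.missing))
```

Note that the packet numbered 14 is handed to the application as soon as it arrives. The receiver asks for 12 and 13 in the background instead of stalling the stream, which is the key difference from TCP.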

Those already familiar with WebRTC may correctly point out that WebRTC can also deliver over TCP. However, since this method is mostly used as a fallback when firewalls block UDP traffic, we don’t dwell on it in this post.

In real-world tests, CMAF produces 2-3 seconds of latency, while WebRTC is under 500 milliseconds. This makes WebRTC the faster streaming method of the two.

This is why Red5 Pro integrated our solution with WebRTC. It establishes secure, plugin-free live video streams accessible across the widest variety of browsers and devices, all fully scalable to millions of concurrent connections while keeping real-time latency under 500 milliseconds. WebRTC streaming provides the best possible experience.

Check out what Red5 Pro can do on our demo page. To contact us directly, please send an email to info@red5.net or schedule a call. We’d love to get you live streaming!