Debunking the Myth: 8 Reasons Why WebRTC is Capable of High Quality Audio and Video Today
For anyone pursuing new opportunities in the real-time interactive video arena, there’s a lot riding on understanding that there doesn’t need to be any constraint on audio or video quality when WebRTC serves as the underlying streaming protocol.
When talking with developers evaluating the Red5 Pro platform, we hear over and over again that WebRTC can’t handle high-quality video and audio. Much of our time is spent educating developers and debunking the myth that WebRTC isn’t capable. To assume that the use of WebRTC imposes limits on A/V quality is to risk losing out on the rewards that come with developing applications for this market. That is a big risk to take.
As reported in the white paper titled “The World Needs an Interactive Real-Time Streaming Infrastructure,” demand for services supporting video-rich interactivity with no perceptible latency across any number of endpoints at any distance is exploding across cyberspace. That demand reaches myriad applications in entertainment, social media, business, education, medicine and all levels of government, including the military. There’s no better way to meet this demand than through a platform that can flexibly put WebRTC to use as the best way to reach the lion’s share of endpoints, no matter the A/V quality requirements.
1. Video Conferencing Systems Are The Wrong Gauge for Judging WebRTC A/V Quality
This might come as a surprise to those who assume WebRTC-supported A/V quality limits are reflected in the generally poor quality typical of the leading video conference (VC) platforms. Google Meet, Cisco WebEx, Microsoft Teams and Facebook Messenger Rooms are among the many brands that heavily rely on WebRTC to connect end users in real time.
This makes sense, given the fact that WebRTC is supported by all the leading browsers, including Chrome, Edge, Safari, Firefox, and Opera. Even Zoom, which processes A/V locally using WebAssembly rather than browser-based processing, uses WebRTC to capture the A/V channels and transmit them at imperceptible latencies over the network.
More specific to developers, CPaaS offerings like the Twilio Video API, Agora, and Vonage API (formerly known as TokBox) are all optimized for video conferencing and thus suffer A/V quality issues, particularly when using the default settings of these platforms. This leaves developers with the perception that WebRTC itself is the cause.
The less-than-ideal, often funky quality of video conferencing experiences is a function of the bandwidth caps and other limitations these platforms impose, not of any quality restrictions intrinsic to WebRTC. As discussed in many previous blogs, more accurate demonstrations of what can be done with WebRTC can be found in use cases requiring much higher levels of quality. These include real-time streaming and interactive engagement with live sports, esports and concerts, live shopping platforms, surveillance camera outputs in military/aerospace and other types of applications, and much else.
These use cases, all supported by the Experience Delivery Network (XDN) platform developed by Red5 Pro, not only dispel any doubts about A/V quality over WebRTC; they also underscore the real-time one-to-many, many-to-one and many-to-many levels of video-rich interactivity enabled by XDN technology that go beyond the reach of conventional HTTP-based streaming.
2. “Expert” Opinions Suggesting WebRTC Impediments to Video Quality Are Bunk
Unfortunately, failure to take such examples into account isn’t limited to those who mistakenly use VC platforms as their A/V quality reference points. Inaccurate and misleading information from “experts” remains a significant cause for confusion as well.
One case in point appeared in a recent StreamingMedia post claiming that the browsers named above impose a 2 Mbps ceiling on their in-browser encodes of WebRTC streams. That’s simply not true. In fact, the platform referenced in the article is Millicast, and even they clearly say this isn’t so: “By default, there are no bitrate limitations imposed on a WebRTC stream, and the publishing client will attempt to push the highest quality stream possible based on the capabilities of the hardware and available bandwidth.”
Perhaps the confusion arises because Millicast actually recommends forcing the browser to limit the bandwidth because of their use of Simulcast. It’s likely that their examples and default settings are limiting the top bandwidth to 2 Mbps via the SDP manipulation described in the above article.
To further drive this home, Red5 Pro’s widely followed recommended setting for in-browser video encoding for 1080p HD is 3 to 6 Mbps. No one is complaining this doesn’t work. The only limit on the encoding output data rate involves making sure the streams don’t consume too much bandwidth on the targeted access links. We encourage you to try it yourself and see.
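For readers who have not seen this kind of SDP manipulation, here is a minimal sketch of how a bandwidth cap can be written into the video section of a session description. It assumes each media section carries a `c=` line, as browser-generated SDP does. `setVideoBandwidth` is a hypothetical helper written for this post, not part of any SDK named here, and which bandwidth line a given browser honors, and on which side of the negotiation, varies.

```typescript
// Insert or replace a "b=AS:<kbps>" line in the video section of an SDP blob.
// Hypothetical helper for illustration; the cap value is whatever the caller picks.
function setVideoBandwidth(sdp: string, kbps: number): string {
  const out: string[] = [];
  let inVideo = false;

  for (const line of sdp.split("\r\n")) {
    // Every "m=" line starts a new media section.
    if (line.startsWith("m=")) {
      inVideo = line.startsWith("m=video");
    }
    // Drop an existing cap in the video section so it is not duplicated.
    if (inVideo && line.startsWith("b=AS:")) {
      continue;
    }
    out.push(line);
    // SDP places bandwidth lines right after the connection ("c=") line.
    if (inVideo && line.startsWith("c=")) {
      out.push(`b=AS:${kbps}`);
    }
  }
  return out.join("\r\n");
}
```

Calling `setVideoBandwidth(sdp, 2000)` reproduces the kind of 2 Mbps ceiling discussed above, while `setVideoBandwidth(sdp, 6000)` matches the upper end of the 1080p recommendation. The cap is a configuration choice, not a property of the protocol.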
3. Browser-Imposed Limits on Video Quality Aren’t Decisive
While it’s essential to avoid misconceptions about A/V quality over WebRTC, it’s also important to recognize that the protocol is a tool of convenience, not of necessity, for implementing a platform that can meet any requirement critical to executing any interactive video streaming application in real time. WebRTC is the ideal streaming mode when it can be used to fully support client execution in accord with an application’s quality parameters through one of the WebRTC-compatible browsers.
But it’s important to recognize that the range of video codecs supported by those browsers is limited. Only two video codecs, AVC (H.264) and VP8, both mandated by WebRTC specifications, are supported by all leading browsers, which is all that’s needed if the highest encoded resolution is 1080p. In addition, Chrome, with its support for VP9, and recent versions of Safari supporting High Efficiency Video Coding (HEVC), are equipped to provide compression suited to streaming 4K.
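As a rough illustration of how an application can steer a browser toward one of these codecs, the sketch below reorders a codec list so that a preferred codec comes first. The `CodecInfo` interface is a local stand-in for the entries a browser returns from `RTCRtpReceiver.getCapabilities("video").codecs`, and `preferCodec` is a hypothetical helper name.

```typescript
// Local stand-in for the codec entries a browser reports, declared here so
// the sketch compiles outside a browser.
interface CodecInfo {
  mimeType: string; // e.g. "video/VP8", "video/H264", "video/VP9"
  clockRate: number;
  sdpFmtpLine?: string;
}

// Move every codec whose mimeType matches `preferred` to the front,
// keeping the relative order of everything else.
function preferCodec(codecs: CodecInfo[], preferred: string): CodecInfo[] {
  const wanted = preferred.toLowerCase();
  const first = codecs.filter((c) => c.mimeType.toLowerCase() === wanted);
  const rest = codecs.filter((c) => c.mimeType.toLowerCase() !== wanted);
  return [...first, ...rest];
}
```

In a browser, the reordered list would typically be handed to `RTCRtpTransceiver.setCodecPreferences()` before the offer is created. If the preferred codec is not in the list, the order is left unchanged and negotiation falls back to whatever the browser supports.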
Use of in-browser encoding obviates the need for distributors to pay for in-house encoding. But Red5 Pro customers can use their own encoders from third-party vendors, or they can feed video directly to the transcoding software instantiated with Red5 Pro’s origin servers. The XDN can ingest video formatted to all the leading protocols used with video playout. This includes Real-Time Streaming Protocol (RTSP), Real-Time Messaging Protocol (RTMP), MPEG-TS, Secure Reliable Transport (SRT) and HTTP Live Streaming (HLS).
It’s worth noting that the use of a cloud-based real-time transcoder for WebRTC-published streams is a major improvement over the Simulcast approach mentioned earlier. With Simulcast, the client has to generate every variant itself, which limits the bandwidth available to each stream. With server-side transcoding, the browser can be configured to send a single stream at the highest available bitrate.
Thus, there is no limit on video resolutions, quality, and bandwidth efficiency, provided client devices are equipped to handle any decoding that can’t be executed in their browsers. The XDN platform can stream any video format at any distance, in any direction across virtually any combination of public and private clouds with end-to-end latency no greater than 400ms.
XDN architecture is designed to take full advantage of the Real-time Transport Protocol (RTP), which is the underlying transport protocol supporting both WebRTC and RTSP as well as IP voice communications. When a client is compatible with both of these protocols, the XDN selects which one to use on a session-by-session basis.
WebRTC, with its plug-in-free accessibility through the browser, is used in most implementations. But RTSP is the better choice for mobile devices, because it requires fewer resources on both client and server, and has a shorter and simpler signaling process than WebRTC while providing the same level of encryption and security.
The client-optimized flexibility of XDN architecture also extends to packaging ingested RTMP, MPEG-TS and SRT encapsulations for transport over RTP. This occurs when clients compatible with these protocols can’t be reached via WebRTC or RTSP.
4. The Quality Support Afforded by Multi-Profile Streaming Can Be Retained
Moreover, XDN architecture makes it possible to retain the benefits of streaming content in multiple bitrate profiles. While streaming on the XDN platform operates in the push mode used with WebRTC and RTSP, not the HTTP pull mode, the multi-profile requirements used in adaptive bitrate (ABR) streaming can be satisfied with ingestion of those profiles from an external transcoder, or by using transcoding positioned with XDN origin servers to play out content ingested as a single profile in the multiple profiles used with ABR.
In either case, the XDN Origin Nodes stream the ABR ladder profiles over the RTP-based transport system through the Relay Nodes to Edge Nodes. Then the content is streamed in profiles matched by node intelligence to each session in accord with client device characteristics and access bandwidth availability.
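To make the idea of per-session profile matching concrete, here is a minimal sketch of one way a node could pick a rung from an ABR ladder. This is not the XDN’s actual selection logic, which the text does not describe. The `Profile` shape, the `selectProfile` name, and the 80% headroom factor are all assumptions made for illustration.

```typescript
interface Profile {
  name: string;
  height: number; // vertical resolution in pixels
  bitrateKbps: number; // total stream bitrate
}

// Pick the highest-bitrate profile that fits both the bandwidth budget and
// the largest frame height the client can handle. Falls back to the lowest
// rung when nothing fits.
function selectProfile(
  ladder: Profile[],
  availableKbps: number,
  maxHeight: number,
  headroom = 0.8, // assumed safety margin, not a documented value
): Profile {
  const [lowest, ...higher] = [...ladder].sort(
    (a, b) => a.bitrateKbps - b.bitrateKbps,
  );
  if (lowest === undefined) {
    throw new Error("ladder must not be empty");
  }
  const budget = availableKbps * headroom;
  let choice = lowest;
  for (const p of higher) {
    if (p.bitrateKbps <= budget && p.height <= maxHeight) {
      choice = p;
    }
  }
  return choice;
}
```

With a ladder of 480p at 1,000 kbps, 720p at 2,500 kbps and 1080p at 4,500 kbps, a client measuring 4,000 kbps of access bandwidth would receive the 720p profile under these assumptions, because the 1080p rung exceeds the 3,200 kbps budget.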
When it comes to making use of WebRTC, the XDN platform ensures the protocol is exploited to the fullest extent in support of superior video quality. At the same time, the platform provides all the support needed for use cases that depend on achieving next-gen levels of quality.
5. The WebRTC-Mandated Opus Codec Is Ideal for Delivering High-Quality Audio
When it comes to audio quality over WebRTC, judging what can be accomplished by the performance of VC platforms is just as misleading as it is with video quality. Because these platforms have the luxury of exploiting the extremely low-bitrate settings for voice communications enabled by the WebRTC-mandated Opus audio codec, they can attain voice-caliber quality at bitrates as low as 8 kbps.
But that says nothing about the media-caliber audio quality that can be achieved with Opus, which is supported by all the WebRTC-compatible browsers. This is an open-source, royalty-free codec supporting bitrates ranging from 6 kbps to 510 kbps with sampling rates ranging from 8 kHz to 48 kHz.
When supporting voice at the lowest bitrates, Opus restricts the reproduced audio bandwidth to 4 kHz (narrowband) or 8 kHz (wideband), which is sufficient for conveying human speech. Reproduction is extended to the full 20 kHz range of sound audible to humans (fullband) when the codec is used to compress entertainment-caliber sound.
Thus, Opus is perfectly suited to delivering stereo sound with sampling set at 48 kHz and bitrates at 96 kbps or higher. This matches the sound quality achievable with the most widely used audio codec in online and broadcast entertainment, Advanced Audio Coding (AAC), which is defined in the MPEG-2 and MPEG-4 standards and is the audio codec most commonly paired with H.264 video.
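For a sense of how these settings are expressed in practice, the sketch below merges music-oriented Opus parameters into the `a=fmtp` line of an SDP blob. The parameter names (`stereo`, `sprop-stereo`, `maxaveragebitrate`, `maxplaybackrate`) come from the Opus RTP payload format, RFC 7587. The helper name `setOpusMusicParams` and the 96,000 bps default are illustrative assumptions, and a platform SDK may expose the same settings through its own configuration instead.

```typescript
// Merge stereo / high-bitrate Opus parameters into the fmtp line of an SDP.
// Existing parameters such as "useinbandfec" are preserved.
function setOpusMusicParams(sdp: string, maxAverageBitrate = 96000): string {
  // Find the dynamic payload type the SDP assigned to Opus.
  const rtpmap = /a=rtpmap:(\d+) opus\/48000\/2/i.exec(sdp);
  if (rtpmap === null) {
    return sdp; // no Opus offered; leave the SDP untouched
  }
  const pt = rtpmap[1];

  const fmtpRe = new RegExp(`a=fmtp:${pt} ([^\\r\\n]*)`);
  const existing = fmtpRe.exec(sdp);

  // Parse "key=value;key=value" into a map, then override what we need.
  const params: Record<string, string> = {};
  if (existing !== null) {
    for (const pair of (existing[1] ?? "").split(";")) {
      const [key, value = ""] = pair.trim().split("=");
      if (key) {
        params[key] = value;
      }
    }
  }
  params["stereo"] = "1";
  params["sprop-stereo"] = "1";
  params["maxaveragebitrate"] = String(maxAverageBitrate);
  params["maxplaybackrate"] = "48000";

  const fmtp =
    `a=fmtp:${pt} ` +
    Object.keys(params)
      .map((k) => `${k}=${params[k]}`)
      .join(";");

  // Replace the existing fmtp line, or add one right after the rtpmap line.
  return existing !== null
    ? sdp.replace(fmtpRe, fmtp)
    : sdp.replace(rtpmap[0], `${rtpmap[0]}\r\n${fmtp}`);
}
```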
6. Browser Defaults to Speech-Level Quality Can Be Disabled
There’s just one catch when it comes to compressing audio for streaming over WebRTC. Distributors, whether they rely on in-browser encoding, their own encoders, or transcoding on the XDN platform, must be sure to deactivate the echo and noise cancellation functions that eliminate unwanted interference when microphones are conveying the audio. By default, when browsers register that noise and echo cancellation are active, they apply the Opus parameters used with speech, which means the sound spectrum above 8 kHz is discarded.
Widespread lack of awareness of this is fueled by the fact that many WebRTC platform providers’ SDKs hard-wire echo and noise cancellation into their abstractions of the audio layer, making it impossible for users to experience anything but speech-caliber audio. In contrast, the XDN HTML5 SDK makes it easy to turn off echo and noise cancellation with a simple click of a command button.
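At the browser level, these switches are ordinary media capture constraints. The sketch below builds a constraints object with echo and noise cancellation turned off. It does not show the XDN HTML5 SDK’s own API, which is not described here. The `AudioCaptureConstraints` interface is a local stand-in for the audio portion of the browser’s `MediaTrackConstraints`, and disabling `autoGainControl` is an added assumption that goes beyond what the text above requires.

```typescript
// Local stand-in for the audio part of the browser's MediaTrackConstraints,
// declared here so the sketch compiles outside a browser.
interface AudioCaptureConstraints {
  echoCancellation: boolean;
  noiseSuppression: boolean;
  autoGainControl: boolean;
  channelCount: number;
  sampleRate: number;
}

// Build capture constraints suited to music or other full-range audio.
function musicAudioConstraints(
  stereo = true,
): { audio: AudioCaptureConstraints; video: boolean } {
  return {
    audio: {
      echoCancellation: false, // speech processing off
      noiseSuppression: false, // speech processing off
      autoGainControl: false, // assumption: commonly disabled alongside the two above
      channelCount: stereo ? 2 : 1,
      sampleRate: 48000, // full-range sampling, as discussed for Opus
    },
    video: true,
  };
}
```

In a browser, the result would be passed to `navigator.mediaDevices.getUserMedia(musicAudioConstraints())`. Browsers treat plain values like these as preferences rather than hard requirements, so it is worth checking the settings the captured track actually reports.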
7. WebRTC Can Stream Uncompressed Pristine-Quality Audio
It’s important to note that WebRTC can be used to deliver the most pristine audio quality attainable by avoiding use of compression altogether. The XDN platform supports digital delivery of audio as directly sampled by pulse code modulation (PCM) from analog signals straight through to digitally connected speakers, which are designed to convert PCM back to analog for playback.
There are many real-time interactive video streaming use cases where there’s a need to avoid the compromises in audio quality that are inevitable with even the best compression systems. A truly live-caliber concert experience with real-time audience interactions via video is one example where use of raw PCM audio can be a huge differentiator. Podcasting is another. Remotely connected participants in real-time professional recording sessions also need this level of quality.
In fact, given the explosion in access to bandwidth worldwide, it’s not hard to imagine that uncompressed audio could eventually become the norm in any scenario where audio quality really matters. Uncompressed audio with PCM sampling rates at 48 kHz can be streamed at 2.5-3 Mbps, which is not the big drain on bandwidth it once was. One estimate of bandwidth usage in the U.S. reports 78% of residential wireline subscribers are connecting at 100 Mbps or higher speeds. Another tracker says average access bandwidth now tops 25 Mbps in the U.S. and 31 other countries, with eight exceeding an average of 40 Mbps.
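The arithmetic behind uncompressed audio bitrates is simple enough to check directly. The sketch below computes the raw PCM payload rate and, optionally, adds per-packet header overhead. The bit depth and channel count behind the 2.5-3 Mbps figure are not stated above, so the values used here are assumptions. The 40-byte default header is the sum of the RTP (12), UDP (8) and IPv4 (20) headers, without encryption overhead.

```typescript
// Raw PCM payload bitrate in bits per second.
function pcmBitrateBps(
  sampleRateHz: number,
  bitDepth: number,
  channels: number,
): number {
  return sampleRateHz * bitDepth * channels;
}

// Add fixed per-packet header overhead to a payload bitrate.
function withPacketOverheadBps(
  payloadBps: number,
  packetsPerSecond: number,
  headerBytes = 40, // RTP 12 + UDP 8 + IPv4 20
): number {
  return payloadBps + packetsPerSecond * headerBytes * 8;
}
```

Under these assumptions, 48 kHz stereo works out to 1,536,000 bps at 16 bits per sample and 2,304,000 bps at 24 bits per sample. Sending the 24-bit stream in 10 ms packets (100 packets per second) brings it to 2,336,000 bps, in the same range as the figure quoted above.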
8. Activation of Stereo and Surround-Sound with WebRTC
When there’s a need for uncompressed audio, it can be accommodated in real time over WebRTC on the XDN platform. It’s also important to note that another limitation on compressed audio streaming imposed by browsers can be overcome when XDN technology is used with WebRTC.
As mentioned, Opus supports stereo. In fact, it supports up to 255 separate audio channels. But RTP payload specifications don’t go beyond two audio channels, and only two browsers, Chrome and Firefox, support encoding for stereo. Only one, Firefox, supports stereo playback.
But for those who want to move beyond these limitations, Red5 Pro is prepared to implement a solution known as “Multiopus” that’s been developed but not promoted by Google. This workaround overcomes the two-channel limitations of RTP, enabling stereo on Chrome and even surround sound in the popular 5.1 configuration and the more advanced 7.1 configuration. Both configurations are supported by Dolby Digital and Atmos, DTS and other audio formats.
While this innovation is limited to implementation with Chrome, it does open a path for use of Opus compression on WebRTC to deliver home-theater quality sound. Developers can rest easy when it comes to concerns about achieving the highest levels of audio quality over WebRTC.
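For the curious, here is a rough sketch of the SDP lines a multichannel Opus session would need. The stream counts and channel mappings follow the Opus channel mapping tables for 6 and 8 channels in RFC 7845. The `multiopus` codec name and the `channel_mapping`, `num_streams` and `coupled_streams` parameter names reflect the non-standard form used with Chrome, so treat them as an assumption to verify against the Chrome version being targeted. `multiopusSdpLines` is a hypothetical helper, not an XDN or browser API.

```typescript
interface MultiopusLayout {
  channels: number;
  numStreams: number; // total Opus streams in the multistream packet
  coupledStreams: number; // streams that carry a stereo pair
  channelMapping: number[]; // output channel -> decoded channel index
}

// Layouts per the RFC 7845 mapping tables for 6 and 8 channels.
const MULTIOPUS_LAYOUTS: Record<"5.1" | "7.1", MultiopusLayout> = {
  "5.1": {
    channels: 6,
    numStreams: 4,
    coupledStreams: 2,
    channelMapping: [0, 4, 1, 2, 3, 5],
  },
  "7.1": {
    channels: 8,
    numStreams: 5,
    coupledStreams: 3,
    channelMapping: [0, 6, 1, 2, 3, 4, 5, 7],
  },
};

// Build the rtpmap and fmtp lines for a surround layout.
// The attribute syntax is assumed, not standardized.
function multiopusSdpLines(payloadType: number, layout: "5.1" | "7.1"): string[] {
  const l = MULTIOPUS_LAYOUTS[layout];
  return [
    `a=rtpmap:${payloadType} multiopus/48000/${l.channels}`,
    `a=fmtp:${payloadType} channel_mapping=${l.channelMapping.join(",")};` +
      `num_streams=${l.numStreams};coupled_streams=${l.coupledStreams}`,
  ];
}
```

In each layout the number of streams plus the number of coupled streams equals the channel count, which is a quick sanity check when defining additional layouts.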
For more information on the boundless A/V quality options available for real-time interactive streaming on the XDN platform, contact info@red5.net or schedule a call.