video encoding – Bitmovin https://bitmovin.com Bitmovin provides adaptive streaming infrastructure for video publishers and integrators. Fastest cloud encoding and HTML5 Player. Play Video Anywhere.

Multiview HEVC (MV-HEVC): Powering spatial video experiences and more https://bitmovin.com/blog/mv-hevc-encoding/ Mon, 02 Dec 2024 01:24:58 +0000

The post Multiview HEVC (MV-HEVC): Powering spatial video experiences and more appeared first on Bitmovin.

The world of video technology is constantly evolving, and one of the more interesting developments in recent years is the story of MV-HEVC (Multiview High Efficiency Video Coding). Even though it was added to the HEVC specification in 2014, MV-HEVC didn’t see much commercial use for almost a decade. 

That changed when Apple launched the Apple Vision Pro, announcing that unlike Meta Quest and other headsets, their new device would take advantage of MV-HEVC for immersive video experiences. In this blog post, we’ll explore what MV-HEVC is, its potential for enhancing streaming experiences and how to get started. 

What is MV-HEVC?

MV-HEVC stands for Multiview High Efficiency Video Coding, an extension of HEVC that was added to the second edition of the standard in 2014. It’s designed to support the efficient encoding of multiview video content captured from multiple viewpoints, often to create stereoscopic (3D) effects or spatial video experiences for virtual reality (VR) and augmented reality (AR). 

Doubling the encoding and bandwidth requirements for multiple viewpoints could potentially create buffering and playback issues, but MV-HEVC enables the efficient compression and storage of stereoscopic content, reducing the bandwidth required for streaming or the file size needed for storage without compromising the video’s quality.

In short, MV-HEVC allows the encoding of multiple views of the same scene in a way that preserves video quality while keeping the bitrates manageable. This makes it a good fit for 3D, AR and VR applications that require a lot of real-time data processing. 

How MV-HEVC works

Before getting into how MV-HEVC works, let’s take a quick step back to the basics of video encoding. Temporal compression is a technique for reducing file size that is common to all major video codecs. Unless there is a scene change, individual frames of video are usually not that different from one frame to the next. Temporal compression exploits that fact and reuses data where it can, saving some bits from being encoded and shrinking the file size. 

This is done by encoding different types of frames that require less data to reconstruct for playback. I-frames are fully encoded frames that serve as anchor points, while P-frames (Predictive frames) can reuse data from frames that came before them. B-frames (Bi-directional predictive frames) can reuse data from frames both before and after them. If you’re interested in learning more about some of the fundamentals of video encoding, check out this guide.

I touched on all of that because a key benefit of MV-HEVC is that it is also able to take advantage of the commonalities across multiple camera angles or views. In the cases of immersive and 3D videos that are created with different views for the right and left eye, the similar viewpoints usually mean there’s a lot of potential for compression, creating smaller, more manageable files for streaming and storage.
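To make the idea concrete, here is a toy sketch in Python. It is not how an HEVC encoder actually works (real encoders predict blocks with motion and disparity compensation), and the pixel values and the `residual_cost` helper are made up for illustration. It only shows why coding a view as a difference from a similar reference, whether that reference is an earlier frame or the other eye's view, leaves much less data to encode.

```python
# Toy illustration only: each "view" is a short row of made-up pixel values.

def residual_cost(target, reference=None):
    """Sum of absolute values that would need to be coded for `target`."""
    if reference is None:
        # Coded on its own, with no prediction from another frame or view.
        return sum(abs(p) for p in target)
    # Coded as a difference from the reference (previous frame or other view).
    return sum(abs(t - r) for t, r in zip(target, reference))

left_view = [100, 102, 105, 110, 120, 121, 119, 118]
# The right eye sees nearly the same scene, with small differences.
right_view = [101, 102, 106, 110, 119, 121, 120, 118]

independent = residual_cost(right_view)
inter_view = residual_cost(right_view, reference=left_view)

print(f"coded independently: {independent}, predicted from left view: {inter_view}")
```

The more similar the two views are, the smaller the residual becomes, which is the intuition behind the cross-view references in the prediction structure shown below.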

Example multiview prediction structure, with cross references between views – Image source: Fraunhofer HHI

Applications of MV-HEVC

Stereoscopic Video (3D Video)

MV-HEVC is particularly useful in the realm of 3D video or stereoscopic content, where two slightly different views (one for each eye) create the stereoscopic effect. By encoding both the left eye and right eye views efficiently in a single stream, MV-HEVC reduces the file size and bitrate compared to other methods. This is crucial for streaming applications like 3D movies or immersive VR experiences where quality and efficiency are key. Other codecs can be used for 3D stereoscopic video as we cover in this blog, but MV-HEVC is more efficient. 

Screenshot of a stereoscopic video frame where the left eye and right eye have distinct views, something supported by MV-HEVC
Top-Bottom Stereoscopic Format source: Blender Foundation

Spatial Video

Another application of MV-HEVC is in spatial video, which is typically used for virtual reality (VR) or augmented reality (AR) content. The Apple Vision Pro is built around the idea of capturing and presenting spatial video, allowing users to immerse themselves in a three-dimensional representation of a scene, combining video and depth information. MV-HEVC support is essential for these types of experiences, reducing massive bitrates of the raw files into something manageable for streaming and real-time immersive experiences. 

Side-by-side lenses on the iPhone 15 Pro and iPhone 16 allow for native capturing and recording of MV-HEVC spatial video

Multiview Video

MV-HEVC is also important for multiview video, where multiple views of the same scene are captured from different angles. This could be used in sports broadcasts, where different camera angles are encoded into a single video stream, or for applications that allow users to choose their viewing angle interactively. Depending on your exact use case, this may require multiple decoders or extra processing power that might not be available on all platforms. 

Example multiview player, now supported by Bitmovin on some platforms

Dolby Vision with MV-HEVC

MV-HEVC is now also compatible with Dolby Vision, a popular High Dynamic Range (HDR) video format that helps ensure content looks as realistic and as true to the creator’s vision as possible. Most of the top-tier premium streaming content these days is being made available in Dolby Vision format, so it makes sense that companies investing in MV-HEVC production pipelines would want to take advantage of Dolby Vision. Dolby Vision Profile 20 extends the potential quality enhancements of Dolby Vision to MV-HEVC and immersive content. 

Apple Vision Pro and beyond

The Apple Vision Pro is pushing the boundaries of immersive media and while they didn’t create the VR headset segment, Apple definitely put their stamp on it. There are several examples over the years of Apple’s influence on the media technology industry, from their decision to not support Flash video to their decision to (finally) support AV1. 

It seems likely there will be a halo effect for MV-HEVC around the Vision Pro. One early example is the Blackmagic URSA Cine Immersive camera. I expect in 2025 we’ll see more companies venturing into MV-HEVC support from capture to post-production to distribution. 


MV-HEVC video tools

Direct recording with Apple Vision Pro and iPhone

You can record spatial video using MV-HEVC directly on the Apple Vision Pro, iPhone 15 Pro and all iPhone 16 models. The distance between the 2 camera lenses on the Vision Pro seems to provide better results with more depth compared to spatial videos captured on iPhone.

Apple AVFoundation support

Apple also added support to their AVFoundation APIs for converting side-by-side 3D videos into MV-HEVC and spatial videos. You can find more information in their developer documentation here.

Bitmovin VOD encoding beta

Bitmovin’s VOD Encoding now supports MV-HEVC as part of a private beta. If you’re interested in adding MV-HEVC to your transcoding workflows, we’d love to discuss the details with you. You can reply in the Bitmovin Community, comment on this post or get in touch with your Bitmovin contact for more info. 

Conclusion

Thanks in large part to Apple, MV-HEVC is poised to become a key technology in the future of immersive and multiview content. Its ability to efficiently encode multiple views of the same scene, reduce the data required, and maintain high video quality makes it an essential tool for everything from stereoscopic 3D movies to virtual reality experiences on devices like the Apple Vision Pro.

On their other platforms, Apple seems to have signalled a shift toward using the AV1 codec, but AV1 does not currently have multiview support. It will be interesting to see how that situation evolves both within Apple’s products and the wider video ecosystem. While the only certainty is that things will change, unless Apple abandons the Vision Pro, MV-HEVC is likely to be part of the picture for the foreseeable future.

3-Pass encoding enhances video quality, making every bit count https://bitmovin.com/blog/3-pass-encoding/ Fri, 27 Sep 2024 03:33:05 +0000

The post 3-Pass encoding enhances video quality, making every bit count appeared first on Bitmovin.

Introduction

Bitmovin’s VOD Encoder is known for its quality, speed, and cloud-native ability to scale quickly and resiliently. Advanced features like split-and-stitch encoding with Smart Chunking, Per-Title and 3-Pass encoding set it apart from other encoders on the market, in terms of both visual quality and bitrate efficiency. For our customers, it means lower costs for storing and delivering video, along with a better experience for their viewers. 

In this post, we’ll explain how Bitmovin’s 3-Pass encoding works and show the benefits of using 3-Pass encoding with Bitmovin. 

How does 3-Pass encoding work?

As you might have guessed, with 3-Pass encoding the analysis and encoding optimization happen in 3 phases. After our split-and-stitch algorithm (now with Smart Chunking) divides the source file into separate chunks for parallel processing, the following steps are taken:

1. Analysis

The first step is to run a constant rate factor (CRF) encoding pass that varies the bitrate as needed to maintain constant quality. Using a carefully chosen quality target value based on the source file, we are able to capture valuable data about the motion, complexity and scene changes in the content that we can use in the next steps.

2. Encoding 

The information gathered from the CRF pass is scaled so that the overall average bitrate of the output file will respect the target bitrate set by the user. Using new target information, the chunks are encoded. 

3. Optimization

Using data from the previous passes, the encoder now re-allocates bits from less complex segments to more complex ones, wherever possible. This ensures there is no degradation during complex, high-motion scenes and helps clean up any lower quality frames. In the process of redistributing bits, any drastic jumps in bitrate between adjacent chunks are smoothed to avoid causing player buffer issues. 
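The three phases above can be sketched in a few lines of Python. This is a simplified illustration and not Bitmovin's actual algorithm: the function names, the equal-duration chunks, the sample bitrates and the `max_ratio` smoothing limit are all assumptions made for the example.

```python
def scale_to_target(chunk_kbps, target_avg_kbps):
    """Scale per-chunk bitrates so their average equals the target.

    Assumes all chunks have the same duration, so a plain mean is enough.
    """
    current_avg = sum(chunk_kbps) / len(chunk_kbps)
    factor = target_avg_kbps / current_avg
    return [b * factor for b in chunk_kbps]


def smooth_jumps(chunk_kbps, max_ratio=1.5):
    """Limit how much the bitrate may change between adjacent chunks."""
    smoothed = [chunk_kbps[0]]
    for b in chunk_kbps[1:]:
        prev = smoothed[-1]
        # Clamp the chunk into the range [prev / max_ratio, prev * max_ratio].
        smoothed.append(min(max(b, prev / max_ratio), prev * max_ratio))
    return smoothed


# Hypothetical bitrates observed per chunk during the CRF analysis pass.
crf_pass_kbps = [1000, 4000, 1500, 500]
target = 3000  # average bitrate requested by the user, in kbps

scaled = scale_to_target(crf_pass_kbps, target)
# Smoothing changes the average, so rescale afterwards. Rescaling multiplies
# every chunk by the same factor, which keeps the adjacent ratios bounded.
allocation = scale_to_target(smooth_jumps(scaled), target)

print([round(b) for b in allocation])
```

A production encoder would weigh quality gains against smoothness rather than clamping this bluntly, but the sketch shows the two constraints working together: the average respects the target while neighboring chunks never differ by more than the chosen ratio.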

What are the benefits of Bitmovin’s 3-Pass encoding?

Now that you know how it works, let’s talk about how using 3-Pass encoding benefits your customers, your operations and your bottom line.

Better visual quality

Because of its multi-stage process, 3-Pass encoding makes sure your viewers are getting the best possible quality on the first try. No need for time-consuming experimentation and analysis to find the ideal encoding settings. Your content looks the best it can at any bandwidth.

Bitrate and cost control

Other approaches to improving encoding quality usually involve increasing bitrate, which in turn, increases the eventual storage and delivery costs. 3-Pass encoding makes sure the bits are used exactly and only where they are needed, giving the ideal balance of quality and efficiency for lower costs and less buffering.

Scalability and speed 

When using traditional encoding approaches, multi-pass encoding can take a long time for a single file, not to mention large batches of files. With Bitmovin’s 3-Pass, neither is an issue as our split-and-stitch process and cloud-native scalability keep turnaround times to a minimum, even for long form content and unpredictable spikes in demand. 

Quality comparisons

In the graph below, the same file was encoded using 2-Pass and 3-Pass encoding and the VMAF score was measured and plotted for every frame (the vertical axis represents how many frames received that VMAF score). With 3-Pass, represented in blue, you can see that the overall average VMAF score improved a bit compared to 2-pass, shown by the red dots on the lower plots. But there’s another more important and impressive difference between the two, which is the reduction of lower quality frames that would be noticeable by viewers. The quality of the worst frame was improved by ~20 points and the number of frames scoring below 80 was cut in half.

You can also see where the 2-pass rendition had several frames scoring in the upper 90s, meaning bits were being allocated to those frames that weren’t detectably improving quality for viewers. 3-pass encoding was able to intelligently redistribute those “extra” bits to frames where they could make a noticeable difference.

3-pass encoding vs 2-pass encoding graph showing far fewer low quality frames with 3-pass encoding
Plot showing the quality improvements and bitrate redistribution of 3-Pass Encoding
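If you want to run the same kind of comparison on your own encodes, the statistics discussed above (average score, worst frame, and number of frames below 80) are simple to compute once you have per-frame VMAF scores. The sketch below assumes you already have those scores as a list of numbers. The sample values are invented for illustration and are not the data behind the plot.

```python
from statistics import mean


def summarize_vmaf(scores, threshold=80.0):
    """Return the mean, the worst frame score and the count of frames below threshold."""
    return {
        "mean": mean(scores),
        "worst": min(scores),
        "below_threshold": sum(1 for s in scores if s < threshold),
    }


# Made-up per-frame scores for two hypothetical renditions of the same clip.
two_pass = [96.0, 97.5, 72.0, 55.0, 91.0, 78.0]
three_pass = [93.0, 94.0, 84.0, 76.0, 92.0, 88.0]

for name, scores in (("2-pass", two_pass), ("3-pass", three_pass)):
    print(name, summarize_vmaf(scores))
```

Looking at the worst frame and the count below a threshold, rather than only the average, is what surfaces the difference described above.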

3-pass works especially well for content with a mix of different complexity and motion, letting you use lower bitrates to produce equivalent quality compared to other methods. This means your streams are less susceptible to buffering and look better for viewers with limited bandwidth, not to mention the cost savings on storage and CDN that can really add up.

3-pass encoding vs 2-pass encoding for sports content. Side-by-side comparison shows equivalent quality with lower bitrate using 3-pass.
3-Pass Encoding produces equivalent quality to other methods, with lower bitrates

What is the difference between 3-Pass and Per-Title encoding?

If you’re familiar with Bitmovin, you may also know about our Per-Title encoding. You might be wondering, “Isn’t Per-Title also about creating the ideal encoding settings for each video?” If you were, great question!

Per-Title encoding analyzes the source file settings and complexity and determines the ideal adaptive bitrate ladder for that piece of content. The Per-Title algorithm feeds the encoder with the resolution and bitrate pairs that will provide the best quality of experience across the entire range of devices and bandwidth. 

3-Pass is about getting the absolute best quality encoding for a given bitrate and resolution pair. So Per-Title determines the ideal target bitrates and 3-Pass makes sure they look as good as possible. For adaptive bitrate streaming, we highly recommend using 3-Pass and Per-Title together for the best results. 

The graph below is an extreme example, but for this video from a Bitmovin customer, using 3-pass with Per-Title meant encoding an ABR ladder with 4K video at 2 Mbps and lower, compared to their incumbent ABR ladder that was targeting 15 Mbps for 4K content which was mostly wasted for this particular video.

Using 3-Pass and Per-Title encoding together produces the best visual quality and streaming experiences.

Try 3-Pass encoding for free

3-Pass and Per-Title encoding are both available to use, for free, with a Bitmovin trial. If you’re obsessed with quality, but are spending too much time and effort finding the best encoding settings, you really need to sign up today and see the results for yourself. 

“Better Together” at IBC 2024: Elevating Streaming Experiences with Bitmovin Innovators Network https://bitmovin.com/blog/better-together-at-ibc-2024/ Thu, 05 Sep 2024 12:14:37 +0000

The post “Better Together” at IBC 2024: Elevating Streaming Experiences with Bitmovin Innovators Network appeared first on Bitmovin.


In a rapidly evolving media landscape, the importance of collaboration has never been clearer. At Bitmovin, we have long championed the belief that the best solutions emerge when industry leaders join forces. Our recent NAB 2024 showcase underscored this belief, and as we approach IBC 2024 in Amsterdam, we are excited to highlight how our partners are leveraging the “Better Together” philosophy to create innovative, impactful solutions.

Driving Success Through Partnership: The Core of Bitmovin Innovators Network

Our “Better Together” approach is rooted in a simple yet powerful idea: collaboration drives innovation. At NAB 2024, this shone through in the way our partner network delivered solutions that not only met but exceeded the needs of our customers. Together, we are tackling key challenges—reducing streaming costs, generating new revenue streams, retaining and growing subscribers.

As we gear up for IBC 2024, these themes remain at the forefront of our collective efforts. Our partners are prepared to showcase how they are pushing the boundaries of what is possible in streaming, ensuring that our customers can deliver exceptional experiences while optimizing their operations.

The Power of Partnership at IBC 2024

Joint Customer Success Stories: A Testament to Collaboration

At IBC 2024, we will highlight the tangible outcomes of our partnerships, showcasing how “Better Together” translates into real-world successes of solving customer challenges. On Thursday, September 12 from 3:30-6:00 PM, we are once again hosting our exclusive Bitmovin Innovators Network Partner Executive Networking Event. We have an exciting lineup of customer success stories planned for this year’s event, including Alpha Networks presenting a “better together” customer success story featuring Ligue Nationale de Volley, Insys Video Technologies highlighting their “better together” success with ORF, and a “Voice of the Customer” session led by BBC. We will wrap up the afternoon with the inaugural Bitmovin Social Hero Awards, followed by the executive happy hour.

On the Bitmovin Stand: Discovering the Future of Media Search

At IBC 2024, the Bitmovin stand (5.H48) will be a hub of innovation, featuring a dedicated demo station from our partner, Nomad Media. Nomad will be unveiling their new advanced Generative AI search capability that enables business users to find and discover their media that otherwise could never have been found – with the ability to identify locations, people, activities and more. Visit Nomad’s demo station to see this innovative solution in action and learn how it can transform your media workflows.

“Better Together” Solutions on the IBC Floor

The IBC floor will be buzzing with activity as our partners present a range of “Better Together” solutions designed to address some of the most pressing challenges in the streaming industry. Here is a preview of what you can expect:

Accedo, in partnership with Humans Not Robots, is co-leading the ECOFLOW project under the IBC Accelerator Program to measure and reduce the environmental impact of streaming. The initiative, featuring Bitmovin Player’s ECO Mode, collaborates with industry leaders – including BBC, ITV, Bitmovin, RTL Nederland, Quanteec, Cognizant, the Institute of Engineering and Technology (IET), Fraunhofer Fokus, Greening of Streaming, DIMPACT, and the European Broadcasting Union (EBU) – to assess energy consumption across the streaming supply chain, starting with CDNs, encoding, and end-user devices.

Join the presentation with Accedo & ITV on Friday at 11:15 in the Accelerator Zone, and meet with our Product Managers for Playback, James Varndell on Friday 10:30-12:00, and Jacob Arends on Sunday 3:30-5:00, for in-depth discussions.

2Coders [5.H96] will be showcasing Velvet, an SDK-based front-end app, integrating Bitmovin Player and Bitmovin Analytics for optimized, multi-device streaming, delivering high-quality content with cost efficiency and fast time to market.

At Elicium Meeting Room [13.D301], learn about Akamai Connected Cloud, a massively distributed edge computing cloud platform and how Bitmovin Live and VOD Encoding SaaS on Akamai Connected Cloud helps Media & Entertainment customers reduce streaming expense by up to 90% by reducing compute and data transfer costs.

Alpha Networks [1.A59] will showcase a joint solution for live streaming that optimizes costs without compromising video quality, featuring Alpha Networks’ PaaS and SaaS products, Gecko and Bee, and modular video software Tucano, integrated with Bitmovin Live Encoding on Akamai Connected Cloud.

Visit the Amazon Web Services (AWS) stand [5.C90] to learn about Bitmovin’s SaaS products running on the AWS cloud, including Bitmovin Live Encoding, Bitmovin VOD Encoding, and Bitmovin Analytics. Customers can learn how Bitmovin integrates with AWS services, like AWS Elemental MediaPackage, AWS Elemental MediaTailor, Amazon CloudFront, and others, as well as integrations with partner SaaS products, including anti-piracy solutions, content and asset management systems, ad monetization platforms, data visualization products, and more – to solve for every live and on-demand use case. Bitmovin Live Encoding, Bitmovin VOD Encoding, and Bitmovin Analytics are available in the AWS Marketplace.

At Broadpeak’s stand [1.F83], learn about transitioning to an ad-supported HVOD (Hybrid VOD) model and enhance your monetization strategy with mid-roll ad management via Broadpeak.io Ad Proxy, premium UX, integrating Bitmovin VOD Encoder with clean transitions for mid-roll ad breaks, and revenue protection with anti-ad-skipping through Bitmovin Player and Broadpeak Smartlib SDK integration.

Edgio [5.A68] will demonstrate their Smartplay technology, a component of Edgio’s Uplynk Streaming Media Platform, which integrates seamlessly into Bitmovin VOD Encoding SaaS workflows generating new revenue through personalized sessions.

EZDRM [5.A50] is showcasing a cost-effective live video streaming solution where content captured by a Videon Edgecaster appliance is routed to Bitmovin Live Encoder, converted to DASH and secured by EZDRM DRMaaS in an Akamai Connected Cloud instance.

Learn how PallyCon’s [5.G56] DRM License Cipher and Key Rotation prevent software-level vulnerabilities, like DRM license hijacking. Seamlessly integrated with Bitmovin Player, it ensures robust content protection and secure streaming experiences for global audiences.

MainStreaming [5.H30] will demonstrate its new implementation of Common Media Client Data (CMCD), working with Bitmovin Player, that provides advanced Stream Delivery Routing Decisioning to further enhance Playback Quality of Experience (QoE) and help streamers retain and grow subscribers.

MediaKind [1.D71] demonstrates how to stream flawless video and build iconic sports apps with Bitmovin Player and Bitmovin Analytics as part of an end-to-end solution for D2C streaming and monetization.


NAGRAVISION [1.C81] will showcase its streaming security and consumer engagement solutions including OpenTV Video Platform, integrated with Bitmovin Player for secure, high-quality streaming across multiple devices, ensuring seamless delivery and protection of premium content.

Synamedia [1.B33] is partnering with Bitmovin to showcase several cutting-edge solutions. At the Innovation pod, you will see how content steering, integrated with Bitmovin’s Playback capabilities, works seamlessly with Synamedia’s CDN solutions to optimize content delivery. At the Ad Insertion and Monetization pod, Bitmovin and Synamedia team up for ad insertion with precise HLS interstitials, driving more effective monetization strategies. Finally, at the D2C Streaming pod, you can discover how Bitmovin Player ensures low latency streaming, delivering an unmatched viewer experience for Synamedia’s D2C solutions for sports.

Hosted at the EZDRM stand [5.A50], Videon is demonstrating a live end-to-end secured stream, encrypted from the video source with a Docker container running on the Videon LiveEdge® platform, sent to the Bitmovin Live Encoder running on Akamai Connected Cloud, and distributed over the Akamai Content Delivery Network (CDN), then played back on Bitmovin Player in a cost-effective and scalable fashion.

Discover how Yospace’s [5.C77] dynamic ad insertion solution recently delivered four billion one-to-one addressable ads during Paris 2024. Yospace, with Bitmovin Live Encoding, delivers maximum ad revenues for media owners at scale for the streaming age.

Zixi [5.A85] is showing how customers use the native integration of Zixi with Bitmovin Live Encoding for secure, reliable, and cost-effective ultra-low latency live IP video streaming of sports, news, and events.

Engage, Connect, and Celebrate: Social Activities at IBC


IBC isn’t just about showcasing technology; it’s also about connecting with peers and partners in the industry. We are excited to invite you to a range of social activities designed to foster collaboration and innovation.

Lunch and Learn, Hosted by Akamai: On Saturday, 14 September, from 11:00 AM – 1:00 PM in room G109 at the RAI, Akamai is hosting a lunch and learn session “How distributed cloud is driving innovation in digital media.” Bitmovin’s EVP of Product, Reinhard Grandl, and other industry leaders will discuss Akamai’s vision for media and share real-world customer success stories. Please register in advance to attend this event.

Bitmovin and Akamai IBC Reception: We at Bitmovin are proud to be partnering with Akamai for their exclusive reception on Saturday, 14 September, from 5:00 PM – 9:00 PM at The Beach at Strandzuid. This invite-only event offers a chance to connect with industry peers, discuss the latest innovations, and enjoy a relaxed evening in a vibrant setting. To secure your invite, please speak to your Bitmovin representative.


Bitmovin & Nomad Media Happy Hour: Nomad Media and Bitmovin are co-hosting a networking reception and happy hour on Friday, 13 September, from 5:00 – 6:00 PM at the Bitmovin Stand, 5.H48. This happy hour is the perfect chance to unwind and connect with fellow attendees. Visit Nomad or Bitmovin for an exclusive invite!

Breakfast with Synamedia: Synamedia and Bitmovin co-host a breakfast with coffee and pastries at the Synamedia Stand, 1.B33, at 10:00 AM. We invite you to meet our teams for casual conversations about the latest industry trends.

Irdeto Happy Hour: On Sunday, 15 September, from 4:00 – 7:00 PM, join us at the Irdeto Stand, 1.D51, for a joint happy hour. It is a great way to wrap up the weekend, reflect on the insights gained at IBC, and win some awesome giveaways.

MainStreaming & Bitmovin Presentation and Happy Hour: Join MainStreaming and Bitmovin for a presentation followed by Happy Hour at MainStreaming’s booth [5.H30] on Saturday, 14 September, from 4:30 – 6:00 PM. Discover how MainStreaming’s CMCD+ and Bitmovin Player improve performance, enhance QoS, and maximize ROI. Hear from Sergio Carulli, CPO at MainStreaming, and Reinhard Grandl, Executive VP of Product at Bitmovin.

As we prepare for IBC 2024, we are reminded of the incredible power of “Better Together”. By working collaboratively, Bitmovin and our partners are driving the streaming industry forward, creating solutions that not only meet the needs of today’s market but also anticipate the challenges of tomorrow. Schedule a meeting with our team to learn more about these and other solutions. 

We can’t wait to see you in Amsterdam and continue building on our joint success stories!

Providing a Premium Audio Experience in HLS with the Bitmovin Encoder https://bitmovin.com/blog/premium-hls-audio/ Mon, 01 Jul 2024 14:53:51 +0000

The post Providing a Premium Audio Experience in HLS with the Bitmovin Encoder appeared first on Bitmovin.

Introduction

Many streaming providers are looking for ways to offer a more premium and high quality experience to their users. One often overlooked component in streaming quality is audio – and more specifically which audio bitrates, channel layouts, and even audio languages are available and how these options can be delivered to the viewers on a range of devices. While there are many ways of improving the video streaming quality & experience such as Per-Title Encoding, Multi-Bitrate Video, High Dynamic Range (HDR), and high resolutions, there are also some great ways of enhancing a user’s experience with premium HLS audio. Some of the most important considerations for audio streaming are:

  • Adaptive Streaming: serving multiple audio bitrates for various streaming conditions
  • Reduced Bandwidth & Device Compatibility: multi-codec audio for better compression at reduced bitrates
  • Improved User Experience: 5.1 (or greater) surround sound or even lossless audio
  • Accessibility and Localization: such as multi-language or descriptive audio

You can learn even more about how audio encoding affects the streaming experience in this blog.

In Bitmovin’s 2023-24 Video Developer Report, we saw that immersive audio ranked in the top 15 areas for innovation, while audio transcription was the #1 ranked use case for AI and ML. Furthermore, though AAC remains the most widely used audio codec, mostly due to its wide device support, we see that Dolby Digital/+ and Dolby Atmos are the #2 and #3 ranked audio codecs that streaming companies are either currently supporting or planning to support in the near future.

Audio codec usage – source: Bitmovin Video Developer Report

With HLS and its multivariant approach, this is all possible; but understanding just how to construct and organize your HLS multivariant playlist can be tricky at first. In this tutorial we will take a look at some best practices in HLS for serving alternate audio renditions as well as an example at the end of this article showcasing how to simply do this using the Bitmovin Encoder.

Basic audio stream packaging

The most basic way to package audio for HLS is to mux the audio track with each video track. This works for very simple configurations where you only output a single AAC stereo audio track at a single bitrate. While the benefit of this approach is simplicity, it has many limitations, such as not being able to support multi-channel surround sound, advanced codecs, or multiple languages. Keeping audio and video muxed together is also inefficient for storage and delivery, since the audio is duplicated in every video variant. Demuxing audio and video, by contrast, allows the use of containers like fragmented MP4 or CMAF, which are more performant for client devices because they don’t have to demux or transmux the segments in real time.
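To put a rough number on the audio duplication, here is a back-of-the-envelope calculation in Python. The 128 kbps audio bitrate, one-hour duration and five video variants are assumptions picked for the example, and container overhead is ignored.

```python
def audio_storage_bytes(audio_kbps, duration_s, copies):
    """Approximate bytes used by audio that is stored `copies` times."""
    bytes_per_second = audio_kbps * 1000 // 8
    return bytes_per_second * duration_s * copies


# Muxed: the same audio is embedded in each of the 5 video variants.
muxed = audio_storage_bytes(128, 3600, copies=5)
# Demuxed: the audio is stored once and shared by all video variants.
demuxed = audio_storage_bytes(128, 3600, copies=1)

print(f"muxed: {muxed / 1e6:.1f} MB, demuxed: {demuxed / 1e6:.1f} MB")
```

The gap grows with every extra video variant, and it multiplies again as soon as more than one audio codec or language is added.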

A multivariant playlist output for this would look something like:

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-INDEPENDENT-SEGMENTS
#EXT-X-STREAM-INF:BANDWIDTH=4255267,AVERAGE-BANDWIDTH=4255267,CODECS="avc1.4d4032,mp4a.40.2",RESOLUTION=2560x1440
manifest_1.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=3062896,AVERAGE-BANDWIDTH=3062896,CODECS="avc1.4d4028,mp4a.40.2",RESOLUTION=1920x1080
manifest_2.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=1591232,AVERAGE-BANDWIDTH=1591232,CODECS="avc1.4d4028,mp4a.40.2",RESOLUTION=1600x900
manifest_3.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=1365632,AVERAGE-BANDWIDTH=1365632,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=1280x720
manifest_4.m3u8

#EXT-X-STREAM-INF:BANDWIDTH=862995,AVERAGE-BANDWIDTH=862995,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=960x540
manifest_5.m3u8
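To make the duplication cost of muxed audio concrete, here is a rough back-of-the-envelope sketch in Python. Only the number of video variants (five) comes from the playlist above; the 128 kbps audio bitrate and the one-hour duration are illustrative assumptions, and container overhead is ignored.

```python
# Rough estimate of audio bytes stored when audio is muxed into every video
# variant versus stored once as a separate (demuxed) rendition.
# Assumed values, not taken from the article: 128 kbps audio, 60-minute asset.

def audio_bytes(bitrate_bps: int, duration_s: int) -> int:
    """Approximate size of one audio rendition, ignoring container overhead."""
    return bitrate_bps * duration_s // 8

VIDEO_VARIANTS = 5            # manifest_1 .. manifest_5 in the playlist above
AUDIO_BITRATE_BPS = 128_000   # assumption
DURATION_S = 60 * 60          # assumption

one_copy = audio_bytes(AUDIO_BITRATE_BPS, DURATION_S)
muxed_total = one_copy * VIDEO_VARIANTS   # the same audio repeated in every variant
demuxed_total = one_copy                  # the audio stored exactly once

print(f"muxed:   {muxed_total / 1e6:.1f} MB of audio")    # muxed:   288.0 MB of audio
print(f"demuxed: {demuxed_total / 1e6:.1f} MB of audio")  # demuxed: 57.6 MB of audio
```

Under these assumptions the muxed layout stores five times as much audio, and the gap grows with every video variant added to the ladder.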

Audio/Video demuxing

A better approach is to demux the audio and video tracks. Luckily, HLS makes this simple with the EXT-X-MEDIA tag, which is the standard way of declaring alternate content renditions for audio, subtitles, closed captions, or video (mostly used for alternative viewing angles, such as in live sports). Using EXT-X-MEDIA to decouple audio from video lets us add many great audio features, such as alternate/dubbed language tracks, surround sound tracks, multiple audio qualities, and multi-codec audio.

By supplying audio tracks with EXT-X-MEDIA tags, we can explicitly add each audio track that we want to output and group them together. Then we can associate each Video Variant (EXT-X-STREAM-INF) with one of the Audio Media groups.

Using the previous example of a single AAC Stereo Audio track, a demuxed audio/video output would look like:

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-INDEPENDENT-SEGMENTS

#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC_Stereo",LANGUAGE="en",NAME="English - Stereo",AUTOSELECT=YES,DEFAULT=YES,URI="audio_aac.m3u8"

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4032,mp4a.40.2",RESOLUTION=2560x1440,AUDIO="AAC_Stereo"
manifest_1.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4028,mp4a.40.2",RESOLUTION=1920x1080,AUDIO="AAC_Stereo"
manifest_2.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4028,mp4a.40.2",RESOLUTION=1600x900,AUDIO="AAC_Stereo"
manifest_3.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=1280x720,AUDIO="AAC_Stereo"
manifest_4.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=960x540,AUDIO="AAC_Stereo"
manifest_5.m3u8

Here, we first declare a single Audio Media (EXT-X-MEDIA) playlist for our audio track and give it a GROUP-ID attribute value of “AAC_Stereo”. Then each Video Variant’s EXT-X-STREAM-INF tag uses the AUDIO attribute to associate its video track with the Audio Media group “AAC_Stereo”.
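The relationship between the two tags can also be sketched in code. The following Python snippet is a minimal illustration, not how the Bitmovin Manifest Generator works: the helper names `media_tag` and `stream_inf` are hypothetical, and the BANDWIDTH numbers are reused from the muxed example purely as placeholders.

```python
# Sketch: assemble a demuxed-audio multivariant playlist as plain text.
# Helper names and BANDWIDTH values are illustrative assumptions.

def media_tag(group_id: str, name: str, language: str, uri: str, default: bool) -> str:
    """Format one EXT-X-MEDIA line describing an audio rendition."""
    flag = "YES" if default else "NO"
    return (
        f'#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="{group_id}",LANGUAGE="{language}",'
        f'NAME="{name}",AUTOSELECT=YES,DEFAULT={flag},URI="{uri}"'
    )

def stream_inf(bandwidth: int, codecs: str, resolution: str, audio_group: str, uri: str) -> str:
    """Format one EXT-X-STREAM-INF line plus the URI line that follows it."""
    return (
        f'#EXT-X-STREAM-INF:BANDWIDTH={bandwidth},CODECS="{codecs}",'
        f'RESOLUTION={resolution},AUDIO="{audio_group}"\n{uri}'
    )

lines = [
    "#EXTM3U",
    "#EXT-X-VERSION:3",               # header tags mirror the example above
    "#EXT-X-INDEPENDENT-SEGMENTS",
    media_tag("AAC_Stereo", "English - Stereo", "en", "audio_aac.m3u8", default=True),
    stream_inf(4255267, "avc1.4d4032,mp4a.40.2", "2560x1440", "AAC_Stereo", "manifest_1.m3u8"),
    stream_inf(3062896, "avc1.4d4028,mp4a.40.2", "1920x1080", "AAC_Stereo", "manifest_2.m3u8"),
]
playlist = "\n".join(lines)
print(playlist)
```

The key point is that the string passed as `audio_group` must match a GROUP-ID declared in an EXT-X-MEDIA line, otherwise the variant refers to a group that does not exist.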

Multiple audio bitrates

But now let’s imagine we want to better optimize our adaptive streaming by delivering our AAC stereo audio in multiple bitrates, such as a high (196 kbps) and a low (64 kbps) rendition, so that the higher-resolution Video Variants can take advantage of higher-quality audio given the extra bandwidth available when streaming those variants. We can accomplish this by encoding our audio at both a low and a high bitrate, placing the two outputs in separate groups, and then deciding which Video Variant gets which audio bitrate. For example, our 720p-and-below variants get the lower-quality audio by default, and our variants above 720p get the higher-quality audio by default. Think of these only as defaults, though, because most modern players that stream HLS can pick the audio quality independently based on adaptive bitrate streaming conditions.

An example using a low and a high AAC stereo track would look like:

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-INDEPENDENT-SEGMENTS

#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac-stereo-64",LANGUAGE="en",NAME="English - Stereo",AUTOSELECT=YES,DEFAULT=YES,URI="audio_aac_64k.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac-stereo-196",LANGUAGE="en",NAME="English - Stereo",AUTOSELECT=YES,DEFAULT=NO,URI="audio_aac_196k.m3u8"

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4032,mp4a.40.2",RESOLUTION=2560x1440,AUDIO="aac-stereo-196"
manifest_1.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4028,mp4a.40.2",RESOLUTION=1920x1080,AUDIO="aac-stereo-196"
manifest_2.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4028,mp4a.40.2",RESOLUTION=1600x900,AUDIO="aac-stereo-196"
manifest_3.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=1280x720,AUDIO="aac-stereo-64"
manifest_4.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d401f,mp4a.40.2",RESOLUTION=960x540,AUDIO="aac-stereo-64"
manifest_5.m3u8

In this example, we now have two audio tracks, one for each bitrate, and therefore two Audio Media (EXT-X-MEDIA) playlists defined, each with a unique GROUP-ID attribute but the same NAME attribute. This is a good way of declaring that the audio tracks share the same language, channel configuration, and codec, but differ in quality. Now we can declare that each Video Variant (EXT-X-STREAM-INF) that is 720p or lower sets its AUDIO group to the low-bitrate audio track (GROUP-ID="aac-stereo-64"), while the variants above 720p get the higher-bitrate AUDIO group (GROUP-ID="aac-stereo-196") by default (but again, most players can manage the audio tracks independently for optimal adaptive streaming).

This is at least an improvement on the previous single-bitrate audio packaging, but there are still plenty of enhancements we can make!
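The 720p cut-off described here is simple enough to express as a small rule. This Python sketch is only an illustration of that default mapping; the function name `audio_group_for` is hypothetical, while the group IDs and video heights are taken from the playlist above.

```python
# Sketch: pick the default audio group for a video variant from its height.
# The 720p threshold and group IDs follow the example playlist above.

LOW_GROUP = "aac-stereo-64"
HIGH_GROUP = "aac-stereo-196"

def audio_group_for(height: int) -> str:
    """Variants at 720p or below default to the low-bitrate audio group."""
    return LOW_GROUP if height <= 720 else HIGH_GROUP

heights = [1440, 1080, 900, 720, 540]
assignment = {h: audio_group_for(h) for h in heights}

for h, group in assignment.items():
    print(f"{h}p -> {group}")
```

This only sets the default pairing written into the playlist; as noted, many players still adapt audio independently of the video variant.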

More efficient AAC

The previous examples all rely on Low Complexity AAC (AAC-LC) because this basic audio codec is supported by every playback device. It is necessary to always have at least one AAC-LC track to be able to support older devices. However, most devices these days can support more efficient versions of AAC such as High Efficiency AAC (AAC-HE), which comes in two main versions: v2, which is used for bitrates up to 48 kbps, and v1, which is used for bitrates up to 96 kbps.

So let’s adapt our previous example to not rely on 2 (or more) different AAC-LC audio tracks, and instead output one AAC-HE v1, one AAC-HE v2, and one AAC-LC rendition. The tricky part here is that we will want to group each of the above into a different GROUP-ID so that the Player client can decide which to use based on which codecs it supports – but we also will want each Video Variant to be able to use any of those audio tracks. To accomplish this, all we need to do is duplicate each Video Variant for each of the 3 unique Audio Media GROUP-IDs.

A note on grouping audio renditions

The Apple authoring spec recommends creating one audio group for each pair of codec and channel count.

We now have 3 different versions of the AAC codec, so we will have 3 different audio groups.

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-INDEPENDENT-SEGMENTS

#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac_lc-stereo-128k",LANGUAGE="en",NAME="English - Stereo",AUTOSELECT=YES,DEFAULT=YES,URI="audio_aaclc_128k.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac_he1-stereo-64k",LANGUAGE="en",NAME="English - Stereo",AUTOSELECT=YES,DEFAULT=NO,URI="audio_aache1_64k.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac_he2-stereo-32k",LANGUAGE="en",NAME="English - Stereo",AUTOSELECT=YES,DEFAULT=NO,URI="audio_aache2_32k.m3u8"

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4032,mp4a.40.2",RESOLUTION=2560x1440,AUDIO="aac_lc-stereo-128k"
manifest_1.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4032,mp4a.40.5",RESOLUTION=2560x1440,AUDIO="aac_he1-stereo-64k"
manifest_1.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4032,mp4a.40.29",RESOLUTION=2560x1440,AUDIO="aac_he2-stereo-32k"
manifest_1.m3u8

## Repeat above approach for each additional Video Variant

In this example, you can see that we replicated the 1440p variant 3 times, once for each Audio Media GROUP-ID, and the same would then be repeated for each additional Video Variant. This allows the client player to decide, for a given Video Variant, which audio track group to use based on codec support and streaming conditions. Also note how each Video Variant’s CODECS attribute is updated to include the matching audio codec identifier.
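Writing out every video and audio group combination by hand gets tedious, so the duplication can be generated. The Python sketch below is one possible way to do it, under the assumption that each video variant is described by a (codec, resolution, URI) tuple and each audio group by a (GROUP-ID, codec) pair; those data shapes and the `expand` helper are hypothetical. The `...` placeholder is kept exactly as in the playlist above, since the elided attributes (such as BANDWIDTH) would differ per combination in a real playlist.

```python
from itertools import product

# Two of the video variants and the three AAC groups from the example above.
VIDEOS = [
    ("avc1.4d4032", "2560x1440", "manifest_1.m3u8"),
    ("avc1.4d4028", "1920x1080", "manifest_2.m3u8"),
]
AUDIO_GROUPS = [
    ("aac_lc-stereo-128k", "mp4a.40.2"),   # AAC-LC
    ("aac_he1-stereo-64k", "mp4a.40.5"),   # AAC-HE v1
    ("aac_he2-stereo-32k", "mp4a.40.29"),  # AAC-HE v2
]

def expand(videos, audio_groups):
    """Emit one EXT-X-STREAM-INF entry per (video variant, audio group) pair."""
    entries = []
    for (vcodec, resolution, uri), (group, acodec) in product(videos, audio_groups):
        entries.append(
            f'#EXT-X-STREAM-INF:...,CODECS="{vcodec},{acodec}",'
            f'RESOLUTION={resolution},AUDIO="{group}"\n{uri}'
        )
    return entries

entries = expand(VIDEOS, AUDIO_GROUPS)
print("\n\n".join(entries))
print(len(entries))  # 6 entries: 2 video variants x 3 audio groups
```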

Surround sound audio

Now, let’s say we also want to support 5.1 surround sound for those clients that can benefit from it. First we need to decide which surround sound codec to support; let’s use Dolby Digital AC-3 for this example. Since we are now relying on a more advanced audio codec for the optimal surround experience, it is also important to consider devices that have 5.1 or greater speaker setups but cannot support Dolby Digital. For these we will include a secondary 5.1 track using the basic AAC-LC codec. We will therefore create 2 new Audio Media playlists with unique GROUP-ID and NAME attributes.

A note on downmixing from 5.1 audio sources

In this example, we will assume the source has a Dolby Digital surround audio track. From that single audio source, we will create our AC-3 surround track, implicitly convert to our AAC surround track, and automatically downmix the source 5.1 to our various AAC 2.0 stereo outputs using the Bitmovin Encoder, as shown in the sample code linked at the bottom of this article. Alternatively, you can do all sorts of mixing and channel-swapping, or work with distinct audio input files, such as a separate file for each channel. You can learn more about that here.

Don’t forget about grouping audio renditions

As previously mentioned, the Apple authoring spec recommends creating one audio group for each pair of codec and channel count.

We now have 5 unique combinations of codecs and channel counts, so we will have 5 different audio groups.

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-INDEPENDENT-SEGMENTS

#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac_lc-stereo-128k",LANGUAGE="en",NAME="English - Stereo",AUTOSELECT=YES,DEFAULT=YES,URI="audio_aac_128k.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac_he1-stereo-64k",LANGUAGE="en",NAME="English - Stereo",AUTOSELECT=YES,DEFAULT=NO,URI="audio_aache1_64k.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac_he2-stereo-32k",LANGUAGE="en",NAME="English - Stereo",AUTOSELECT=YES,DEFAULT=NO,URI="audio_aache2_32k.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="aac_lc-5_1-320k",LANGUAGE="en",NAME="English - 5.1",AUTOSELECT=YES,DEFAULT=NO,URI="audio_aac_lc_5_1_320k.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="dolby",LANGUAGE="en",NAME="English - Dolby",CHANNELS="6",URI="audio_dolbydigital.m3u8"

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4032,mp4a.40.2",RESOLUTION=2560x1440,AUDIO="aac_lc-stereo-128k"
manifest_1.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4032,mp4a.40.5",RESOLUTION=2560x1440,AUDIO="aac_he1-stereo-64k"
manifest_1.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4032,mp4a.40.29",RESOLUTION=2560x1440,AUDIO="aac_he2-stereo-32k"
manifest_1.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4032,mp4a.40.2",RESOLUTION=2560x1440,AUDIO="aac_lc-5_1-320k"
manifest_1.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4d4032,ac-3",RESOLUTION=2560x1440,AUDIO="dolby"
manifest_1.m3u8


## Repeat above approach for each additional Video Variant

Here you can see that now we have the 1440p variant replicated a total of 5 times, once for each Audio Media GROUP-ID which allows the client Player to select the most appropriate audio and video track combination.

Again, note how each duplicated Video Variant has an updated CODECS attribute to represent the audio codec associated with it. One major reason we duplicate each Video Variant for each Audio Media GROUP-ID is that most devices cannot handle switching between audio codecs during playback, so as the adaptive bitrate logic on the player switches between different Video Variants, it will pick the variant that has the same audio codec it has been using. Additionally, in HLS we cannot simply list the Video Variant once and add all of the various audio codecs to its CODECS attribute. This is because, per HLS, the client device must be able to support all of the CODECS listed on a given Video Variant (EXT-X-STREAM-INF) to avoid possible playback failures. So instead, we separate out the Video Variants per codec and channel count combination.
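The "must support every listed codec" rule can be illustrated with a small filter. This Python sketch is a deliberate simplification: real players query the platform's decoder capabilities rather than comparing codec strings for exact equality, and the `playable` helper and the example device are hypothetical.

```python
# Sketch: a client keeps only the variants whose CODECS it can fully decode.
# Exact string matching is a simplification of real capability checks.

VARIANTS = [
    {"codecs": "avc1.4d4032,mp4a.40.2", "audio": "aac_lc-stereo-128k"},
    {"codecs": "avc1.4d4032,mp4a.40.5", "audio": "aac_he1-stereo-64k"},
    {"codecs": "avc1.4d4032,mp4a.40.29", "audio": "aac_he2-stereo-32k"},
    {"codecs": "avc1.4d4032,ac-3", "audio": "dolby"},
]

def playable(variants, supported_codecs):
    """Return the variants for which every listed codec is supported."""
    return [
        v for v in variants
        if all(codec in supported_codecs for codec in v["codecs"].split(","))
    ]

# Hypothetical device: decodes H.264, AAC-LC and AAC-HE v1, but not AC-3.
device = {"avc1.4d4032", "mp4a.40.2", "mp4a.40.5"}

usable = playable(VARIANTS, device)
print([v["audio"] for v in usable])  # ['aac_lc-stereo-128k', 'aac_he1-stereo-64k']

# Listing every audio codec on a single variant would exclude this device,
# because it cannot decode one of the listed codecs (ac-3).
combined = [{"codecs": "avc1.4d4032,mp4a.40.2,ac-3", "audio": "all-in-one"}]
print(playable(combined, device))  # []
```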

Multi-language audio

This is all great, but what if we want to support additional dubbed audio language tracks or even descriptive audio tracks? Luckily, that is rather simple to do. The existing GROUP-IDs are already logically grouped by codec and channel count pairing per the Apple authoring spec, so we can just create an additional Audio Media playlist for each language and add it to the existing group for each codec and format we want to support.

#EXTM3U
#EXT-X-INDEPENDENT-SEGMENTS
#EXT-X-VERSION:6
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC-HE-V1-Stereo",NAME="English-Stereo",LANGUAGE="en",DEFAULT=NO,URI="audio_aache1_stereo.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC-HE-V1-Stereo",NAME="Spanish-Stereo",LANGUAGE="es",DEFAULT=NO,URI="audio_aache1_stereo_es.m3u8"

#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC-HE-V2-Stereo",NAME="English-Stereo",LANGUAGE="en",DEFAULT=NO,URI="audio_aache2_stereo.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC-HE-V2-Stereo",NAME="Spanish-Stereo",LANGUAGE="es",DEFAULT=NO,URI="audio_aache2_stereo_es.m3u8"

#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC-LC-5.1",NAME="English-5.1",LANGUAGE="en",DEFAULT=NO,URI="audio_aaclc-5_1.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC-LC-5.1",NAME="Spanish-5.1",LANGUAGE="es",DEFAULT=NO,URI="audio_aaclc-5_1_es.m3u8"

#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC-LC-Stereo",NAME="English-Stereo",LANGUAGE="en",DEFAULT=NO,URI="audio_aaclc_stereo.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC-LC-Stereo",NAME="Spanish-Stereo",LANGUAGE="es",DEFAULT=NO,URI="audio_aaclc_stereo_es.m3u8"

#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AC-3-5.1",NAME="English-Dolby",LANGUAGE="en",CHANNELS="6",DEFAULT=NO,URI="dolby-ac3-5_1.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AC-3-5.1",NAME="Spanish-Dolby",LANGUAGE="es",CHANNELS="6",DEFAULT=NO,URI="dolby-ac3-5_1_es.m3u8"

#EXT-X-STREAM-INF:...,CODECS="avc1.4D401F,ac-3",RESOLUTION=1280x720,AUDIO="AC-3-5.1"
video_720_3000000.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4D401F,mp4a.40.29",RESOLUTION=1280x720,AUDIO="AAC-HE-V2-Stereo"
video_720_3000000.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4D401F,mp4a.40.2",RESOLUTION=1280x720,AUDIO="AAC-LC-Stereo"
video_720_3000000.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4D401F,mp4a.40.2",RESOLUTION=1280x720,AUDIO="AAC-LC-5.1"
video_720_3000000.m3u8

#EXT-X-STREAM-INF:...,CODECS="avc1.4D401F,mp4a.40.5",RESOLUTION=1280x720,AUDIO="AAC-HE-V1-Stereo"
video_720_3000000.m3u8
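Once several languages are spread across several groups, it is easy to forget one rendition. As a sanity check one might run on a generated playlist, the Python sketch below collects the languages offered by each audio group so they can be compared. The attribute parser is a simplified, assumed implementation that handles the attribute lists shown in this article, not a complete HLS parser.

```python
import re
from collections import defaultdict

# Matches KEY=value or KEY="quoted value" pairs in an HLS attribute list.
ATTR = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')

def parse_attrs(line: str) -> dict:
    """Split a tag line such as '#EXT-X-MEDIA:KEY=VALUE,...' into a dict."""
    _, _, attr_text = line.partition(":")
    return {key: value.strip('"') for key, value in ATTR.findall(attr_text)}

def languages_by_group(playlist: str) -> dict:
    """Map each audio GROUP-ID to the set of LANGUAGE values it contains."""
    groups = defaultdict(set)
    for line in playlist.splitlines():
        if line.startswith("#EXT-X-MEDIA:"):
            attrs = parse_attrs(line)
            if attrs.get("TYPE") == "AUDIO":
                groups[attrs["GROUP-ID"]].add(attrs["LANGUAGE"])
    return dict(groups)

# Four of the EXT-X-MEDIA lines from the playlist above.
SAMPLE = """#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC-LC-Stereo",NAME="English-Stereo",LANGUAGE="en",DEFAULT=NO,URI="audio_aaclc_stereo.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AAC-LC-Stereo",NAME="Spanish-Stereo",LANGUAGE="es",DEFAULT=NO,URI="audio_aaclc_stereo_es.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AC-3-5.1",NAME="English-Dolby",LANGUAGE="en",CHANNELS="6",DEFAULT=NO,URI="dolby-ac3-5_1.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="AC-3-5.1",NAME="Spanish-Dolby",LANGUAGE="es",CHANNELS="6",DEFAULT=NO,URI="dolby-ac3-5_1_es.m3u8"
"""

groups = languages_by_group(SAMPLE)
all_languages = set().union(*groups.values())
missing = {g: all_languages - langs for g, langs in groups.items() if langs != all_languages}
print(groups)
print(missing)  # {} when every group offers every language
```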

How does this differ from DASH?

In DASH, demuxed audio and video tracks are grouped into separate AdaptationSets for a given Period. This means a given video AdaptationSet is not directly linked to one specific audio track; instead, the client player independently picks a video Representation from the video AdaptationSet and an audio Representation from the audio AdaptationSet. So with DASH, we don’t have to worry about restating video tracks for each group of audio tracks, as they are managed independently of each other.
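A quick count shows what this difference means for manifest size. The Python sketch below uses the five video variants and five audio groups from the surround sound example; the function names are hypothetical and the count ignores languages, which would add audio entries in both formats.

```python
# Sketch: number of variant/representation entries needed in each format.
# HLS pairs every video variant with every audio group; DASH lists video
# and audio representations independently.

def hls_stream_inf_count(video_variants: int, audio_groups: int) -> int:
    """Each video variant is restated once per audio group."""
    return video_variants * audio_groups

def dash_representation_count(video_variants: int, audio_tracks: int) -> int:
    """Video and audio representations are listed once each."""
    return video_variants + audio_tracks

VIDEO_VARIANTS = 5   # the five resolutions used earlier in the article
AUDIO_GROUPS = 5     # the five codec/channel-count groups

hls_entries = hls_stream_inf_count(VIDEO_VARIANTS, AUDIO_GROUPS)
dash_entries = dash_representation_count(VIDEO_VARIANTS, AUDIO_GROUPS)
print(hls_entries)   # 25
print(dash_entries)  # 10
```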

Additional notes

The video codecs you choose to support may also determine which audio codecs and container formats you use. For example, if you encode video to VP9, you may want to consider using the Vorbis or Opus audio codecs.

In this example, we used AC-3 for Dolby Digital 5.1, but you may consider using Enhanced AC-3, more commonly referred to as E-AC-3, for additional channel support (such as 7.1 or more) or spatial audio support like Dolby Atmos. Other premium surround sound codec options are DTS-HD and DTS:X.

Premium HLS audio example with the Bitmovin Encoder & Manifest Generator

The GitHub sample linked below is a pseudo-code example using the Bitmovin JavaScript/TypeScript SDK that demonstrates outputting multi-bitrate, multi-codec, multi-channel, and multi-language audio tracks. This can greatly enhance the user’s experience, as it allows for streaming the best-quality and most appropriate audio for each device’s codec support and speaker channel configuration.

With the Bitmovin Encoder, we can use one master audio file/stream (Dolby Digital surround in this example) for each language and easily downmix it to 2.0 stereo or implicitly convert it to AAC 5.1. Then, once we have created each desired audio track, we use the Bitmovin Manifest Generator to create our HLS multivariant playlists.

Encoding Example For HLS With Multiple Audio Layers

The post Providing a Premium Audio Experience in HLS with the Bitmovin Encoder appeared first on Bitmovin.

]]>
https://bitmovin.com/blog/premium-hls-audio/feed/ 0
New Firefox AV1 support for Encrypted Media Extensions https://bitmovin.com/blog/firefox-av1-support/ https://bitmovin.com/blog/firefox-av1-support/#respond Thu, 30 May 2024 01:12:17 +0000 https://bitmovin.com/?p=281752 This post covers some recent updates, focusing on the new Firefox AV1 support in Encrypted Media Extensions. Bitmovin has been supporting and advocating for use of the AV1 codec for several years, even though there have been gaps in playback support preventing adoption for some workflows. Slowly but surely, those gaps are being filled and the...

The post New Firefox AV1 support for Encrypted Media Extensions appeared first on Bitmovin.

]]>


This post covers some recent updates, focusing on the new Firefox AV1 support in Encrypted Media Extensions. Bitmovin has been supporting and advocating for use of the AV1 codec for several years, even though there have been gaps in playback support preventing adoption for some workflows. Slowly but surely, those gaps are being filled and the reasons not to use AV1 are going away. Keep reading to learn more.

Firefox 125 adds support for encrypted AV1

A couple of years ago, Bitmovin began testing several different combinations of AV1 encoding, muxing and DRM support across browsers and playback devices. We were somewhat surprised to learn that even though Firefox was the first major browser to support AV1 playback, they had not implemented support for encrypted AV1 as they had for other codecs. We found there was actually an open bug/request filed 5 years ago. 

Shortly after we began watching closely, there was an update…

Screenshot of update to bug report about lack of AV1 Widevine support in Firefox. Since then, Firefox AV1 support has improved with support for encrypted media extensions in version 125.

Ouch. Once the ticket got reassigned, Bitmovin got involved and gave our feedback that for premium/studio content, this support would be needed soon. We also provided a Widevine-protected sample for them to use in testing. Fast-forward to this spring, we saw some action on the ticket and support for AV1 with Encrypted Media Extensions was officially added to Firefox 125!

This means premium content workflows can now use AV1 on all of the major desktop browsers. Apple added support to Safari last fall, including with FairPlay Streaming, but for now it’s limited to devices with AV1 hardware decoders (iPhone 15 Pro, iPad Pro, new Macs with M3 processors).

Previous Bitmovin and Firefox AV1 collaboration

Way back in 2017, before the AV1 spec was finalized, Bitmovin and Firefox collaborated on the first HTML5 AV1 playback. Because the bitstream was still under development and subject to change, Bitmovin and Mozilla agreed on a common codec string to ensure compatibility between the version in the Bitmovin encoder and the decoder in Mozilla Firefox. It was made available in Mozilla’s experimental development version, Firefox Nightly, for users to manually enable. 

Even earlier in 2017, Bitmovin demonstrated the first broadcast quality AV1 live stream at NAB, winning a Best of Show award from Streaming Media Magazine. 

Other recent AV1 playback updates

Android adds dav1d decoder

In March 2024, VideoLAN’s “dav1d” became available to all Android devices running Android 12 or higher. Apps need to opt-in to using AV1 for now, but according to Google, most devices can at least keep up with software decoding of 720p 30fps video. YouTube initially opted to begin using dav1d on devices without a hardware decoder, but may have reverted that decision, likely due to battery concerns on phones. For plug-in Android devices, dav1d is still a great option and a welcome addition to the ecosystem.

iPad Pro gets AV1 playback support with M4 processor

In early May 2024, Apple continued their march toward full AV1 support with the announcement of their new M4 chip, which will power the new iPad Pro. The Media Engine of M4 is the most advanced to come to iPad, supporting several popular video codecs, like H.264, HEVC, and ProRes, in addition to AV1.

Ready to get started with AV1?

Bitmovin has added AV1 codec support to our Per-Title and 3-pass encoding optimizations and made AV1 encoding available in our dashboard UI, so now you can perform your first AV1 encode without any code, API calls, or configuration necessary! Bitmovin’s AV1 encoding has supported DASH streaming together with Widevine content protection for a long time, but we’ve now also added support for fMP4 in HLS playlists together with FairPlay content protection to take advantage of Apple AV1 support for premium content. It’s also available in our free trial, so there’s never been a better time to check it out and begin taking advantage of the bandwidth savings and quality improvements that AV1 can provide.


Website: Bitmovin’s AV1 hub   

Blog: State of AV1 Playback Support

Blog: Everything you need to know about Apple’s AV1 Support

Blog: 4K video at SD bitrates with AV1


]]>
https://bitmovin.com/blog/firefox-av1-support/feed/ 0
The Bitmovin Innovators Network “Better Together” Award Winners! https://bitmovin.com/blog/bitmovin-innovators-network-winners/ https://bitmovin.com/blog/bitmovin-innovators-network-winners/#respond Tue, 14 May 2024 11:48:00 +0000 https://bitmovin.com/?p=281068 The dust has now settled from NAB, and I am still looking back in awe at the success of the Bitmovin Innovators Network and the community that we’ve built, together. A personal highlight for me was our exclusive semi-annual Bitmovin Innovators Network Partner Executive Networking Event which had over 100 attendees who joined to learn...

The post The Bitmovin Innovators Network “Better Together” Award Winners! appeared first on Bitmovin.

]]>
The dust has now settled from NAB, and I am still looking back in awe at the success of the Bitmovin Innovators Network and the community that we’ve built, together. A personal highlight for me was our exclusive semi-annual Bitmovin Innovators Network Partner Executive Networking Event which had over 100 attendees who joined to learn and network. The event included several customer success stories, including Quickplay presenting a “Better Together” customer success story regarding a large Regional Sports Network (RSN); and a fireside chat with OneFootball and Akamai.

We concluded the event with our first annual Bitmovin Innovators Network partner awards to recognize and celebrate the amazing work of our partners who embrace the fact that the industry is “Better Together”, by creating solutions with partners that are designed to simplify customers’ video workload needs and advance the viewing experience for audiences.

I am incredibly proud to share the winners of the Bitmovin Innovators Network partner awards below, and the contributions they’ve made: 

Accenture – Global Systems Integrator of the Year:

Accenture and Bitmovin exemplify the “better together” approach through their close strategic partnership, including an ongoing collaboration with the world’s largest motorsports content owners that led to joint engagements with several of the largest sports and media brands in the world.

Broadpeak – Global ISV Partner of the Year:

Broadpeak embodies the “Better Together” spirit through its unwavering strategic collaboration with Bitmovin. This powerful partnership has yielded several key benefits. Together, they have developed solutions that integrate with Bitmovin’s encoder, player, and analytics, resulting in improved workflows for customers; established consistent two-way communication between sales teams, which has resulted in successful deals with European media brands; and run joint marketing and PR initiatives at local events to strengthen their joint brand presence.

MediaKind – Global Service Provider Partner of the Year:

MediaKind and Bitmovin have developed and maintained a robust strategic partnership that has launched sports applications for world-renowned sports leagues. These applications, including an app launched with a sports league on Apple Vision Pro that garnered rave reviews at the Apple launch event, have significantly boosted market visibility for both brands.

Microsoft Azure Marketplace – Cloud Marketplace of the Year:

Bitmovin has had unprecedented success with the Microsoft Azure Marketplace, including more than 200 new customer wins since June 2023. Azure Marketplace has quickly become Bitmovin’s largest and most successful sales channel.

Nomad Media – Americas Regional Channel Partner of the Year:

Nomad Media has deployed over 30 customers on the Bitmovin Play platform in 2023 alone as part of its Nomad Media platform. Nomad Media has also innovated on the player capabilities with dynamic multi-view capabilities. These advancements were showcased to major US clients, propelling both companies forward. This collaboration not only built a strong pipeline but also significantly boosted brand recognition in the US market.

G&L Geißendörfer & Leschinsky – EMEA Regional Channel Partner of the Year:

G&L is a proactive and committed industry partner, who has worked with Bitmovin on both successful sales and marketing initiatives. The collaboration between the two companies resulted in joint revenue, a new logo, and G&L also exhibited on the Bitmovin stand at IBC 2023 where it highlighted how the two companies’ solutions work together. Bitmovin and G&L also hosted a joint CMCD webinar together, which attracted attendees from key German broadcasters and various telecoms and content providers, and it recently published an e-commerce case study with Home Shopping Europe.

Viet Communications – APAC Regional Channel Partner of the Year:

Vietcoms was the first licensee for the Bitmovin Player in the Asia Pacific region. Vietcoms was selected for its hard work and efforts in securing our impressive player business in Vietnam and developing agile operational models to meet the specific customer and TelCo business needs and technical requirements.

Once again, I’d like to give huge congratulations to all the winners. A huge thank you to everyone who attended the Bitmovin Innovators Network Partner Executive Networking Event, and to every single one of our partners who continue to embrace the spirit of “Better Together.” IBC is just around the corner, and we will have some exciting initiatives and announcements coming soon to share with you ahead of the show.


]]>
https://bitmovin.com/blog/bitmovin-innovators-network-winners/feed/ 0
AI-powered Video Super Resolution and Remastering https://bitmovin.com/blog/ai-video-super-resolution/ https://bitmovin.com/blog/ai-video-super-resolution/#respond Fri, 12 Apr 2024 15:18:37 +0000 https://bitmovin.com/?p=279444 AI has been the hot buzz word in tech the past couple of years and we’re starting to see more and more practical applications for video emerging from the hype, like automatic closed-captioning and language translation, automated descriptions and summaries, and AI video Super Resolution upscaling. Bitmovin has especially focused on how AI can provide...

The post AI-powered Video Super Resolution and Remastering appeared first on Bitmovin.

]]>
AI has been the hot buzzword in tech for the past couple of years, and we’re starting to see more and more practical applications for video emerging from the hype, like automatic closed-captioning and language translation, automated descriptions and summaries, and AI video Super Resolution upscaling. Bitmovin has especially focused on how AI can provide value for our customers, releasing our AI Analytics Session Interpreter earlier this year, and we’re looking closer at several other areas of the end-to-end video workflow.

We’re very proud of how our encoder maintains the visual quality of the source files, while significantly reducing the amount of data used, but now we’re exploring how we can actually improve on the quality of the source file for older and standard definition content. Super Resolution implementations have come a long way in the past few years and have the potential to give older content new life and make it look amazing on Ultra-High Definition screens. Keep reading to learn about Bitmovin’s progress and results. 

What is video Super Resolution and how does it work? 

Super Resolution refers to the process of enhancing the quality or increasing the resolution of an image or video beyond its original resolution. The original methods of upscaling images and video involved upsampling by using mathematical functions like bilinear and bicubic interpolation to predict new data points in between sampled data points. Some techniques used multiple lower-resolution images or video frames to create a composite higher resolution image or frame. Now AI and machine learning (ML) based methods involve training deep neural networks (DNNs) with large libraries of low and high-resolution image pairs. The networks learn to map the differences between the pairs, and after enough training they are able to accurately generate a high-resolution image from a lower-resolution one. 
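As a point of reference for the classical approach mentioned above, here is a minimal pure-Python sketch of bilinear interpolation on a tiny grayscale image. It illustrates only the traditional upsampling technique, not any DNN-based method or Bitmovin's implementation; the function name `bilinear_upscale` and the corner-aligned coordinate mapping are choices made for this example.

```python
# Sketch: classical bilinear upscaling of a 2-D grayscale image (list of rows).
# Each output pixel is a weighted average of its four nearest source pixels.

def bilinear_upscale(image, factor):
    """Upscale by an integer factor, keeping the corner pixels aligned."""
    src_h, src_w = len(image), len(image[0])
    dst_h, dst_w = src_h * factor, src_w * factor
    out = []
    for y in range(dst_h):
        # Map the output row back to a fractional source row.
        sy = y * (src_h - 1) / (dst_h - 1) if dst_h > 1 else 0.0
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(dst_w):
            # Map the output column back to a fractional source column.
            sx = x * (src_w - 1) / (dst_w - 1) if dst_w > 1 else 0.0
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Interpolate horizontally on both rows, then vertically.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out

low_res = [[0, 100],
           [100, 200]]
high_res = bilinear_upscale(low_res, 2)  # 2x2 -> 4x4

for row in high_res:
    print([round(value, 1) for value in row])
```

Interpolation like this can only blend existing samples, which is why it tends to look soft; learned Super Resolution models instead try to predict plausible high-frequency detail from training data.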

Bitmovin’s AI video Super Resolution exploration and testing

Super Resolution upscaling is something that Bitmovin has been investigating and testing with customers for several years now. We published a 3-part deep dive back in 2020 that goes into detail about the principles behind Super Resolution, how it can be incorporated into video workflows, and the practical applications and results. We won’t fully rehash those posts here, so check them out if you’re interested in the details. One of the conclusions we came to back then was that Super Resolution was an especially well-suited application for machine learning techniques. This is even more true now, as GPUs have gotten exponentially more powerful over the past 4 years, while becoming more affordable and accessible as cloud resources.

Nvidia’s GPU computation capabilities over the last 8 years, showing a 1000x improvement in AI compute – source: Nvidia GTC 2024 keynote

ATHENA Super Resolution research

Bitmovin’s ATHENA research lab partner has also been looking into various AI video Super Resolution approaches. In a proposed method called DeepStream, they demonstrated how a DNN enhancement-layer could be included with a stream to perform Super Resolution upscaling on playback devices with capable GPUs. The results showed this method could save ~35% bitrate while delivering equivalent quality. See this link for more detail. 


Other Super Resolution techniques the ATHENA team has looked at involve upscaling on mobile devices that typically can’t take advantage of DNNs due to lack of processing power and power consumption/battery concerns. Lightweight Super Resolution networks specifically tailored for mobile devices like LiDeR and SR-ABR Net have shown positive early outcomes and performance. 

AI-powered video enhancement with Bitmovin partner Pixop

Bitmovin partner Pixop specializes in AI and ML video enhancement and upscaling. They’re also cloud native and fellow members of NVIDIA’s Inception Startup Program. They offer several AI-powered services and filters including restoration, Super Resolution upscaling, denoising, deinterlacing, film grain and frame rate conversion that automate tedious processes that used to be painstaking and time consuming. We’ve found them to be very complementary to Bitmovin’s VOD Encoding and have begun trials with Bitmovin customers. 

One application we’re exploring is digital remastering of historic content. We’ve been able to take lower resolution, grainy and generally lower quality content (by today’s standards) through Pixop’s upscaling and restoration, with promising results. The encoded output was not only a higher resolution, but also the application of cropping, graining and color correction resulted in a visually more appealing result, allowing our customer to re-monetize their aged content. The image below shows a side-by-side comparison of remastered content with finer details.

Side-by-side comparison of AI remastered content

Interested in giving your older content new life with the power of AI video Super Resolution? Get in touch here.

Related Links

Blog: Super Resolution Tech Deep Dive Part 1

Blog: Super Resolution Tech Deep Dive Part 2

Blog: Super Resolution Tech Deep Dive Part 3

Blog: AI Video Research

ATHENA research lab – Super Resolution projects and publications

pixop.com

The post AI-powered Video Super Resolution and Remastering appeared first on Bitmovin.

]]>
https://bitmovin.com/blog/ai-video-super-resolution/feed/ 0
Globo, Google Cloud and Bitmovin: Taking Quality to New Heights https://bitmovin.com/blog/globo-google-cloud/ https://bitmovin.com/blog/globo-google-cloud/#respond Wed, 10 Apr 2024 17:28:53 +0000 https://bitmovin.com/?p=279364 Globo’s content and reach When it comes to content scale and audience reach, Globo is on par with Hollywood and the big US broadcasters with over 3,000 hours of entertainment content being produced each year. The viewership numbers are equally impressive with forty-nine million Brazilians watching the daily, one-hour newscast and Globo’s Digital Hub attracting...

The post Globo, Google Cloud and Bitmovin: Taking Quality to New Heights appeared first on Bitmovin.

]]>
Globo’s content and reach

When it comes to content scale and audience reach, Globo is on par with Hollywood and the big US broadcasters with over 3,000 hours of entertainment content being produced each year. The viewership numbers are equally impressive with forty-nine million Brazilians watching the daily, one-hour newscast and Globo’s Digital Hub attracting eight out of ten Brazilians with internet access. The Digital Hub hosts a variety of content categories, from news, sports, and entertainment to live events such as the Olympics, Carnival, and the FIFA World Cup. Globo also runs a subscription video on demand (SVOD) service called Globoplay that streams live sports, licensed content, as well as movies and television series produced by Estúdios Globo, the largest content production studio in Latin America.

Globo standard of quality

Globo has worked hard to build and become known for the “Globo Standard of Quality”. This included creating the optimal viewing experience together with award-winning content, delivered in stunning visual quality. To develop that reputation, Globo became one of the first mainstream broadcasters outside of the US to offer content in 4K, adopting it as a new standard across its platforms and devices. It has already produced hundreds of hours of 4K content (including HDR) with over a thousand hours of encoding output with its telenovelas and original series. The early adoption of 4K is even more impressive given that Brazil ranks 79th on the list of countries by Internet connection speed. To deliver high-quality video under those conditions, operators cannot simply use higher bitrates; they have to find an encoder that delivers quality, speed, and cost-efficiency at the same time. In the past, 4K encoding was accomplished with on-premises hardware encoders. As the next update cycle of the appliances was fast approaching, Igor Macaubas, Head of Online Video Platform, and Lucas Stephanou, Video Platform Product Owner at Globo, decided to conduct a thorough evaluation of vendors, and ultimately chose Bitmovin.

“We are not willing to compromise the visual integrity of our content and we hold ourselves to strict perception-quality standards. Bitmovin’s renowned 3-Pass Encoding exceeded our expectations and ensures that high perceptual quality can still be delivered while streaming at optimal bandwidth levels.”

– Lucas Stephanou (Video Platform Product Owner, Globo)

Globoplay, powered by Bitmovin VOD Encoding on Google Cloud

Globo handles a massive VOD library of over a million titles, and with 12 variants in their HEVC bitrate stack, encoding demands are high. Bitmovin’s VOD encoding service running on Google Cloud gave Globo the capability to encode a 90-minute video asset in 14 minutes across the entire HEVC ladder. That is roughly 6.4 times faster than real time (90 ÷ 14 ≈ 6.4), which had a quantifiable impact on time-to-market. Globo saw the business need for fast turnaround time in encodes and chose Bitmovin as the clear front runner in this regard.
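The real-time factor quoted above is simply content duration divided by encoding time. A minimal sketch of that arithmetic follows; the function name is our own for illustration and is not part of any Bitmovin API.

```python
def realtime_factor(content_minutes: float, encode_minutes: float) -> float:
    """Return how many times faster than real time an encode ran."""
    if encode_minutes <= 0:
        raise ValueError("encode_minutes must be positive")
    return content_minutes / encode_minutes


# Figures from the Globo example: a 90-minute asset encoded in 14 minutes.
factor = realtime_factor(90, 14)
print(f"{factor:.1f}x real time")
```

A factor above 1.0 means the encode finished faster than the content would take to play.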

Bitmovin VOD Encoding on Google Cloud is an easy-to-use, fully-managed video transcoding software-as-a-service (SaaS). Bitmovin VOD Encoding allows customers to efficiently stream any type of on-demand content to any viewing device. Customers use Bitmovin VOD Encoding for a wide range of on-demand streaming use cases, including Subscription Video on Demand (SVOD), Transactional VOD (TVOD), and Ad-supported VOD (AVOD) services, online training, and other use cases. Bitmovin’s Emmy Award® winning multi-codec outputs and per-scene and per-title content-aware transcoding produce higher visual quality video outputs at lower bit rates than other file-based transcoding SaaS to optimize content delivery and reduce streaming cost. Bitmovin VOD Encoding is available for purchase on Google Cloud Marketplace.

Bitmovin’s 3-Pass Encoding algorithm uses machine learning and AI to examine the video on a scene-by-scene basis. It analyzes the content’s complexity multiple times to optimize intra-frame and inter-frame compression. This helps determine the ideal resolution and bitrate combinations that maximize the quality and efficiency. All together, this ensures the visual elements of the video are not degraded in the encoding process and prevents unnecessary overhead data that might impact the viewing experience. 

Processing HD and 4K video with Globo’s volume requires computing resources that would exceed the CapEx budgets of most companies. This is where Google Cloud’s flexibility and on-demand compute power really shine. Together with Bitmovin’s split-and-stitch technology, single encoding jobs run significantly faster with parallel processing, and spikes in demand are handled with an ease and throughput that is just not possible with on-premises encoding. Customers also have the option to deploy Bitmovin VOD Encoding as a managed service running in the Bitmovin account or as a single tenancy running in the customer’s Google Cloud account. This allows encoding costs to be applied toward any annual spending commitments.

“Globo is known to set quality standards. We want our viewers to experience our great content in stunning video quality. Our 4K workflows have been relying on hardware encoders, but we wanted to test the power of the cloud and conducted a thorough vendor evaluation based on video quality. Bitmovin’s encoding quality and speed convinced us across the board. And, since using Bitmovin’s encoding service running on Google Cloud, we are spending a fraction of the cost by bringing our capital cost down without spending more on operational cost.”

– Igor Macaubas (Head of Online Video Platform, Globo)

Olympics in 8K

One prime example of this collaboration innovating and pushing the boundaries of video quality is from the Tokyo Olympics in 2021, where 8K VOD content from the Olympics was delivered to viewers at home via Globoplay. This marked the first time that the Olympics were viewable in 8K resolution outside of Japan. 8K video has 16x the pixel count of 1080p HD and 4x that of 4K, so it requires an enormous amount of processing power and advanced compression to lower the data rates for delivery to end users. 4K and 8K content is also referred to as Ultra High Definition (UHD) and is usually mastered in a High Dynamic Range (HDR) format that allows for brighter highlights, more contrast and a wider color palette. Hybrid Log-Gamma (HLG) is an HDR format that was developed for broadcast applications and backward compatibility with Standard Dynamic Range (SDR) television sets.
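The 16x and 4x figures come from comparing pixel counts per frame. The short sketch below assumes that "HD" means 1920x1080 and that 4K and 8K refer to the consumer UHD frame sizes (3840x2160 and 7680x4320); those dimensions are our assumption, as the post does not state them.

```python
# Frame sizes assumed for this illustration (consumer UHD, not DCI cinema sizes).
RESOLUTIONS = {
    "HD": (1920, 1080),
    "4K": (3840, 2160),
    "8K": (7680, 4320),
}


def pixel_count(name: str) -> int:
    """Number of pixels in a single frame of the named format."""
    width, height = RESOLUTIONS[name]
    return width * height


def pixel_ratio(larger: str, smaller: str) -> float:
    """How many times more pixels per frame `larger` has than `smaller`."""
    return pixel_count(larger) / pixel_count(smaller)


print(pixel_ratio("8K", "HD"))
print(pixel_ratio("8K", "4K"))
```

Doubling both width and height quadruples the pixel count, which is why each step from HD to 4K to 8K multiplies the raw data per frame by four.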

After receiving the HLG mastered content from Intel in Japan, Globo utilized Bitmovin VOD Encoding on Google Cloud’s compute instances for efficient parallel processing with Bitmovin’s VOD Encoding API. 8K/60p transcoding was performed using the High Efficiency Video Coding (HEVC) codec, creating an optimized adaptive bitrate ladder. At this stage, Bitmovin’s 3-pass encoding was key for transforming the content into a compatible size for transport over broadband internet connections, without sacrificing the stunning 8K visual quality. The 8K content was then delivered via Globo’s own Content Delivery Network (CDN) infrastructure to subscribers of Globoplay with 8K Samsung TVs.


“Our 3-Pass Encoding proved to be the right encoding mode. It ensured high perceptual quality could still be delivered while streaming at optimal bandwidth levels. With our split-and-stitch technology running on Google Cloud’s scalable infrastructure, we were able to deliver both speed and quality for this time-sensitive content.”

– Stefan Lederer (CEO, Bitmovin)

Learn more about Bitmovin’s VOD Encoding SaaS here.

Related Links

Google Cloud Media & Entertainment Blog

Bitmovin on Google Cloud Marketplace

Globo – Bitmovin Customer Showcase and Case Study


]]>
https://bitmovin.com/blog/globo-google-cloud/feed/ 0
Split-and-Stitch Encoding with incredible speed, quality and scale https://bitmovin.com/blog/split-and-stitch-encoding/ https://bitmovin.com/blog/split-and-stitch-encoding/#respond Wed, 13 Mar 2024 17:09:44 +0000 https://bitmovin.com/?p=277993 Introduction In the early days of digital video, encoding a full-length movie could take several hours or even days to complete, depending on the settings and techniques that were used. Over time, as processor speeds increased and specialized hardware was introduced, encoding turnaround times decreased, but it was usually an incremental, linear response to the...

The post Split-and-Stitch Encoding with incredible speed, quality and scale appeared first on Bitmovin.

]]>
Introduction

In the early days of digital video, encoding a full-length movie could take several hours or even days to complete, depending on the settings and techniques that were used. Over time, as processor speeds increased and specialized hardware was introduced, encoding turnaround times decreased, but it was usually an incremental, linear response to the advancements in technology. Once cloud computing resources became readily available and opened new possibilities, cloud-native encoding services like Bitmovin disrupted the status quo with massive gains for encoding speed and turnaround times. This potential was unlocked by developing an innovative new technique known as split-and-stitch encoding. 

What is split-and-stitch encoding? 

As the name suggests, split-and-stitch encoding is a method of encoding that involves splitting a file into smaller chunks, encoding those chunks separately, and then stitching them back together. These smaller chunks being encoded in parallel with separate cloud computing resources led to huge leaps in shortening turnaround times. Prior to that, digital videos were processed linearly, which was an unnecessary limitation carried over from film and tape processing workflows, where the physical medium was actually a limiting factor.  

Bitmovin's split-and-stitch encoding process
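To make the idea concrete, here is a minimal sketch of the split, parallel encode and stitch steps. It is not Bitmovin's implementation: `encode_chunk` is a hypothetical stand-in for a real encoder call, chunks are plain frame ranges, and local threads stand in for separate cloud instances.

```python
from concurrent.futures import ThreadPoolExecutor


def split(total_frames: int, chunk_frames: int) -> list[tuple[int, int]]:
    """Split [0, total_frames) into consecutive (start, end) frame ranges."""
    return [
        (start, min(start + chunk_frames, total_frames))
        for start in range(0, total_frames, chunk_frames)
    ]


def encode_chunk(frame_range: tuple[int, int]) -> str:
    """Hypothetical stand-in for running a real encoder on one chunk."""
    start, end = frame_range
    return f"encoded[{start}:{end}]"


def split_and_stitch(total_frames: int, chunk_frames: int, workers: int = 4) -> str:
    """Encode all chunks in parallel, then join the results in source order."""
    chunks = split(total_frames, chunk_frames)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in input order, so stitching is a simple join.
        encoded = list(pool.map(encode_chunk, chunks))
    return "".join(encoded)


print(split_and_stitch(total_frames=10, chunk_frames=4))
```

The important property is that chunks are independent while encoding but are reassembled in their original order, regardless of which worker finishes first.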

How fast is split-and-stitch encoding?

Back in 2015 when Bitmovin first implemented our encoder on the Google Compute Engine (now Google Cloud Platform) we were able to achieve encoding speeds of 66x real-time running in their cloud, as mentioned here. With some further optimization, we became the first to reach 100x real-time encoding speeds. 

The actual turnaround times for your encoding jobs will depend on a lot of factors including source format, codec(s), resolution, duration and advanced features like Dolby Vision, but even with very complex 4K HDR workflows, your encodes will run faster than real-time using split-and-stitch. Below is a real-world example of an H.264/AAC encoding that ran more than 92x faster than real-time.

screenshot of bitmovin dashboard showing a Split-and-stitch encoding job ran 92.38 times faster than real-time

Running split-and-stitch encoding in the cloud means your individual encoding jobs run faster than real-time, but it also means that you can scale to run many jobs in parallel which allows large backlogs to be cleared in hours instead of weeks. You also have the capacity to handle spikes of content with no impact on queue time.

What are the advantages of Bitmovin’s split-and-stitch encoding?

Bitmovin has over a decade of experience developing and refining our split-and-stitch implementation. We built our system to take advantage of spot and preemptible instances to keep costs down, while surpassing the quality of single instance encodes with innovations like 3-pass encoding and Smart Chunking.  Our intelligent workload orchestration allows you to manage priority and resource scheduling with capacity for thousands of jobs per hour.

Bitmovin also supports using multiple codecs and packaging formats together with split-and-stitch, including H.264 (AVC), H.265 (HEVC), VP9 and AV1 with both HLS and DASH, where other platforms may be limited to H.264 and HLS. We’ve also implemented fast decode enhancements for large J2K and ProRes mezzanine source files that reduce the overall turnaround time even further.

What is Smart Chunking?

In 2023, Bitmovin made some key changes and updates to our VOD Encoder with a new feature called Smart Chunking. This further improved the visual quality and turnaround times that were possible with split-and-stitch by decoupling the split-and-stitch chunk duration from the user-defined segment duration. This allows for variable chunk sizes depending on the codec and the complexity of the encoding, enabling many immediate improvements and future optimizations. Using Smart Chunking means we can split chunks at the optimal points with better bitrate distribution, providing more consistent quality without any noticeable dips.


In the graph below, you can see a comparison of an encoding job run with and without Smart Chunking. While the overall quality is similar, in the blue version (without Smart Chunking) there are several lower quality outlier frames. By using Smart Chunking (orange version) the lowest 1% of frames in terms of quality were improved by an average of 6 VMAF points, which is a noticeable difference. The lowest 0.1% improved by 22 VMAF points and the single worst frame gained a massive 60 VMAF points.

graph comparing per-frame VMAF of an encoding with (orange) and without (blue) Smart Chunking
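Statistics like "the lowest 1% of frames" can be derived from a list of per-frame VMAF scores. The sketch below shows one way to compute them; the function name and the sample scores are invented for illustration and are not the data behind the comparison above.

```python
def mean_of_lowest(scores: list[float], fraction: float) -> float:
    """Average of the lowest `fraction` of per-frame scores (at least one frame)."""
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    count = max(1, round(len(scores) * fraction))
    lowest = sorted(scores)[:count]
    return sum(lowest) / count


# Made-up example: 100 frames, two of which are low-quality outliers.
frame_scores = [95.0] * 98 + [60.0, 35.0]

print(mean_of_lowest(frame_scores, 0.01))  # worst 1% of frames
print(mean_of_lowest(frame_scores, 0.02))  # worst 2% of frames
print(min(frame_scores))                   # single worst frame
```

Looking at the tail of the distribution rather than the overall mean is what exposes short quality dips that an average score would hide.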

Is split-and-stitch always the best approach? 

The steps of analyzing, splitting and reassembling chunks of video do add some overhead processing time to the encoding process. For longer episodic content or movies, the added time is negligible compared to the time saved by using split-and-stitch. But, for shorter videos like ads and news clips that are time-sensitive, the pre-processing can make using split-and-stitch less advantageous. 
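A rough way to reason about this trade-off is a simple timing model: split-and-stitch pays a fixed overhead but divides the encoding work across instances. The model and every number in it are hypothetical assumptions for illustration, not measurements of Bitmovin's encoder.

```python
def single_instance_seconds(duration_s: float, speed_factor: float) -> float:
    """Encode time on one instance running at `speed_factor` x real time."""
    return duration_s / speed_factor


def split_and_stitch_seconds(
    duration_s: float, speed_factor: float, instances: int, overhead_s: float
) -> float:
    """Fixed analysis/split/stitch overhead plus perfectly parallel encoding."""
    return overhead_s + duration_s / (speed_factor * instances)


# Hypothetical parameters: 1x real-time per instance, 20 instances, 60 s overhead.
for label, duration in (("30-second ad", 30.0), ("2-hour movie", 7200.0)):
    single = single_instance_seconds(duration, 1.0)
    parallel = split_and_stitch_seconds(duration, 1.0, 20, 60.0)
    print(f"{label}: single={single:.1f}s, split-and-stitch={parallel:.1f}s")
```

Under these assumed numbers the short ad finishes sooner on a single instance, while the movie is many times faster with split-and-stitch, which matches the qualitative point made above.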

For these cases, Bitmovin has 2 solutions. First, we’ve added support for hardware encoding with Nvidia T4 GPUs. They can deliver the same quality of video encoding, up to four times faster than CPUs, with H.264 (AVC) and H.265 (HEVC) codec support. We also have a new “accelerated mode” that uses pre-warmed cloud compute resources, so you no longer have to wait for new instances to be started. This has made a huge impact on overall encoding job turnaround time, lowering queuing times from minutes to <10 seconds.

Ready to get started with split-and-stitch encoding?

Bitmovin’s split-and-stitch encoding with Smart Chunking is enabled by default and doesn’t require any special configuration. You can get started quickly with our dashboard encoding wizard without any coding required. Get going today with our free trial and see the results for yourself by clicking here!


]]>
https://bitmovin.com/blog/split-and-stitch-encoding/feed/ 0
Bitmovin Improves Support AV1 Video Encoding for VoD https://bitmovin.com/blog/bitmovin-improves-av1-video-encoding/ Mon, 19 Feb 2024 01:31:14 +0000 https://bitmovin.com/?p=19474 **Updated in Feb 2024** Since 2017, Bitmovin has actively worked in video and streaming standardization and has consistently driven standards from inception to implementation. Our founders co-created the MPEG-DASH streaming standard used by Netflix, YouTube, and many others, which is responsible for over 50% of peak U.S. internet traffic. Given our encoding, virtualization, and codec...

The post Bitmovin Improves Support AV1 Video Encoding for VoD appeared first on Bitmovin.

]]>
**Updated in Feb 2024**
Since 2017, Bitmovin has actively worked in video and streaming standardization and has consistently driven standards from inception to implementation. Our founders co-created the MPEG-DASH streaming standard used by Netflix, YouTube, and many others, which is responsible for over 50% of peak U.S. internet traffic. Given our encoding, virtualization, and codec expertise, we are excited to work with and contribute to the AV1 codec. As of today, we have doubled down on bringing AV1 to the market and enabling our customers. We have continued to improve our AV1 video encoding technology, and the performance has drastically improved in the last 5 years. In the following, we provide a high-level summary of the features.

The AV1 Video Codec

First things first, what is AV1 and where does it come from? In September 2015 the Alliance for Open Media (AOMedia) was founded by leading companies from various industries with an association with media technology. Among them are browser vendors like Google, Mozilla, and Microsoft, hardware vendors like AMD, ARM, Intel, and NVIDIA, and content providers like Amazon and Netflix. The goal of the AOMedia is to develop an open, royalty-free, next-generation video coding format that is:

  • Interoperable and open
  • Optimized for the Internet
  • Scalable to any modern device at any bandwidth
  • Designed with a low computational footprint and optimized for hardware
  • Capable of consistent, highest-quality, real-time video delivery, and
  • Flexible for both commercial and non-commercial content, including user-generated content.

The new video coding format AOMedia Video 1 (AV1) is meant to replace Google’s VP9 and compete with HEVC/H265 from MPEG. The Alliance is targeting an improvement of about 50% over VP9/HEVC with only reasonable increases in encoding and playback complexity.
When comparing AV1 with HEVC, probably the biggest competitive advantage of AV1 will be that it is royalty-free, especially if we look at the still very uncertain royalty situation with HEVC. Currently, there are two patent pools, MPEG LA and HEVC Advance, plus some unknown HEVC IP owners who have not joined a pool yet. In the end, nobody knows how much you will need to pay in royalties for HEVC. This situation is obviously not satisfactory for the industry, especially for encoding, distribution, content, and hardware companies. (Download the AV1 Datasheet)

Bitmovin and AV1 Video Encoding as of 2024

We have made improvements to the core AV1 encoder itself and have extensively benchmarked it against multiple practical use cases. The turnaround time and speed of encoding have improved by several orders of magnitude. As for quality, with encoder version v2.110.0 we found that AV1 can offer the same visual quality at 50% less bitrate than H.264/AVC and 30% less bitrate than H.265/HEVC. That is a pretty significant gain!
In addition to the improvements to the core encoder itself, we have integrated AV1 with all the popular features that our customers have come to love. Here is a quick rundown:

  • Since encoder version 2.104.0, 3-pass encoding with AV1 is generally available. We have found that three-pass AV1 video encoding provides significantly better bitrate distribution compared to the regular 2-pass encoding.
  • Since encoder version 2.109.0, Per-Title encoding with AV1 is available. Per-Title is one of our biggest competitive advantages, and we are proud to offer it for AV1 as well.
  • Since encoder version 2.110.0, AV1 video encoding offers three smart presets. This allows customers to choose an optimal tradeoff between the quality and speed of the AV1 encodings. 
  • Since encoder version 2.187.0, AV1 video encoding can be used in HLS playlists, together with FairPlay content protection. This enables support for AV1 playback on compatible Apple devices like the iPhone 15 Pro and new laptops with Apple’s M3 processor.

Also at Bitmovin, we like to keep our promises 😉. We promised seven years ago that we would not stop innovating around AV1 and that we would enable our customers in the best possible way with our AV1 solutions. We are excited to announce that we have kept our end of the bargain. We have developed two patent-pending technologies around AV1. We cannot delve into the details now, but as a teaser, they significantly improve the turnaround times for Per-Title and 3-pass encodings. Keep watching this space for more details soon!
And here is the cherry on top of all this. It’s easy to get all this awesome Per-Title ABR encoding together with the AV1 codec and DASH packaging in a SINGLE API call! Yes, it’s not a typo. We said SINGLE. Can you believe that 🤯🤯!? What are you waiting for? It’s easier than ever to get started with AV1. Try it and reach out to us if you have any questions! We are happy and excited to get you onboard with AV1.

How AV1 Video Encoding Development Works

The AV1 codec has its roots in the codebase of Google’s VP9/VP10 codec, with an additional 77 experimental coding tools that have been added and are under consideration. Of those 77 experimental coding tools, only 8 are currently enabled by default (adapt_scan, ref_mv, filter_7bit, reference_buffer, delta_q, tile_groups, rect_tx, cdef), but the performance of the codec is already appealing. The final goal is to get as many promising coding tools as possible into the final version of the codec and afterward freeze the bitstream specification.
The following procedure explains the high-level process by which experiments are added to the AV1 codec:

  1. Coding tools are added as experiments into the AV1 codebase. They are controlled at build-time by flags (e.g., --enable-experimental --enable-<experiment-name>).
  2. The hardware team (a group of hardware members inside AOMedia) reviews each experiment to ensure it can be implemented in hardware.
  3. Each experiment needs to pass an IP review to ensure no intellectual property is infringed.
  4. Once reviews are passed the experiment can be enabled by default.
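As an illustration of step 1, the small helper below assembles build-time flags following the pattern quoted in the list. It is only a sketch: the helper is our own, it follows the flag pattern exactly as written in this post, and the experiment names used in the example are taken from the list of default-enabled tools rather than from any build documentation.

```python
def configure_flags(experiments: list[str]) -> list[str]:
    """Build the flag list for enabling the given experiments at build time.

    Follows the pattern from the text:
    --enable-experimental --enable-<experiment-name>
    """
    flags = ["--enable-experimental"]
    flags += [f"--enable-{name}" for name in experiments]
    return flags


# Example with two experiment names mentioned in this post.
print(" ".join(configure_flags(["cdef", "rect_tx"])))
```

Keeping the experiment list as data makes it easy to record exactly which tools were enabled for a given test encode.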

As of today, it is not certain which experiments will make it into the final codec. However, we want to highlight a few that look promising:

Directional Deringing

It is an effective algorithm for removing ringing artifacts from a coded frame. It plugs in right at the end of the decoding process, so it is easy to integrate. Blocks are searched for an overall direction that is taken into account when applying a conditional replacement filter (CRF) to reduce the risk of blurring and only take obvious ringing patterns into account. It is currently enabled by default.

PVQ (Perceptual Vector Quantization)

This experiment was originally developed for the Daala codec and has the potential to bring a lot of gains, however, it is also quite difficult to integrate into AV1 because PVQ interacts with many other parts of a codec. Compared to the usual scalar quantization, PVQ offers a lot more flexibility to control quantization. It makes techniques like Chroma from Luma or Activity Masking easier. Activity Masking is trying to provide better resolution in low contrast areas. This can be achieved by varying the codebook which is possible with PVQ.

Chroma from Luma (CfL)

CfL is based on a rather simple idea: Take advantage of the fact that edges in the chroma plane are usually well correlated with those in the luma plane. As CfL works entirely in the frequency domain, it can be easily implemented using PVQ. Using PVQ, the chroma coefficients can be predicted from injected luma coefficients. It is a very promising tool as it is quite simple to compute and provides nice benefits with much cleaner colors.

Bitmovin AV1 VoD and Live Encoding

The Bitmovin encoding service now supports AV1 video encoding for VoD and Live. Currently, AV1 video encoding with common encoding tools is a very time-consuming process, as can be seen in the screenshot below, taken from a Lenovo T540p notebook with an i7-4800MQ and 8GB RAM running Ubuntu 14.04. It would take 8 hours and 42 minutes to encode a 40-second 1080p@24fps sequence (Tears of Steel Teaser) with a target bitrate of 1.5Mbps.

Bitmovin encoding AV1

The encoding runs with about 1.93 fpm (frames per minute) which would translate to 0.032 fps (frames per second). If you want to achieve real-time with 24 fps you would need at least 746 times the computing power on a single machine, which is not very practical in a real-world scenario. Clearly, we need another approach to encode with reasonable speeds, especially when it comes to live streaming.
Thanks to our chunk-based encoding approach that allows us to scale a single encoding among multiple instances we can encode AV1 with reasonable turnaround times and it’s also possible to use AV1 for live streams. Our chunked encoding allows us to speed up the encoding almost linearly with the number of instances that are added to the encoding cluster and this approach works with our cloud encoding the same way it works with our on-premise setups that are based on Kubernetes and Docker. Consequently, we can reach the same encoding speeds for AV1 that our customers have come to expect for H264, VP9, and HEVC encoding, which makes the codec effectively usable for media companies and content providers throughout the industry.
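The speed figures above follow from a unit conversion and a ratio. A minimal sketch of that arithmetic, using function names of our own choosing:

```python
def frames_per_second(frames_per_minute: float) -> float:
    """Convert an encoding speed from frames per minute to frames per second."""
    return frames_per_minute / 60


def required_speedup(current_fps: float, target_fps: float) -> float:
    """How many times faster encoding must run to reach the target frame rate."""
    return target_fps / current_fps


notebook_fps = frames_per_second(1.93)
print(f"{notebook_fps:.3f} fps")
print(f"{required_speedup(notebook_fps, 24):.0f}x needed for real time at 24 fps")
```

If chunked encoding scales almost linearly with the number of instances, this ratio also gives a rough lower bound on how much parallel compute is needed to reach real time.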

How AV1 Video Encoding Works_Workflow_Image
How AV1 Video Encoding Works

We also encoded the ToS teaser with our AV1 encoder in the cloud with the default configuration, where we achieved 7 fps, which is about 219 times faster than what was achieved in the test with the Lenovo notebook. This is already pretty impressive; however, we were not satisfied with the speed, as it was still below real-time. So we tried an enterprise setup, simply adding more instances to the encoding process. The resulting encoding speed was 36 fps, which is about 1125 times faster than with the single Lenovo notebook.
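These comparisons can be reproduced with two ratios: speed relative to the notebook baseline, and speed relative to the 24 fps source. The sketch below uses the rounded 0.032 fps baseline quoted earlier in this post; the function names are ours.

```python
NOTEBOOK_FPS = 0.032  # rounded single-notebook speed quoted in this post
SOURCE_FPS = 24.0     # frame rate of the Tears of Steel teaser


def speedup_over_baseline(fps: float, baseline_fps: float = NOTEBOOK_FPS) -> float:
    """How many times faster than the baseline machine an encode ran."""
    return fps / baseline_fps


def realtime_ratio(fps: float, source_fps: float = SOURCE_FPS) -> float:
    """Values >= 1.0 mean the encode keeps up with real-time playback."""
    return fps / source_fps


for label, fps in (("default cloud configuration", 7.0), ("enterprise setup", 36.0)):
    print(
        f"{label}: {speedup_over_baseline(fps):.0f}x the notebook, "
        f"{realtime_ratio(fps):.2f}x real time"
    )
```

The second ratio is the one that matters for live streaming: only an encode that sustains at least 1.0x real time can keep up with an incoming stream.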

AV1 Video Encoding of Tears of Steel_Workflow_Image
Encoding Tears of Steel with AV1 video encoding

In addition, we don’t have to compromise on quality for speed because our encoder does not need to sacrifice quality to reach a certain speed on a single instance as other encoding vendors typically do. With our approach we are not bound to the hardware restrictions of a single instance, we can add more instances to an encoding cluster to generate the quality that our customers have configured in a reasonable time or in real-time for live streams. With our chunk-based implementation of the AV1 video codec, we can encode videos with AV1 even faster than in real-time without compromising quality.

How to implement an AV1 Livestream

In most cases, a live stream of comparable quality would need around 4 to 15 Mbps with traditional codecs like H264, compared with the 1.5 Mbps AV1 stream used in the workflow below. So AV1 could reduce your CDN and storage cost by up to 10x.
The setup of our AV1 live workflow that we will showcase consists of the following components:

  • OBS RTMP mezzanine stream, 12Mbps 1080p@30fps
  • Bitmovin Distributed AV1 Cloud Encoder running in Google Cloud receives an RTMP ingest and transcodes to 1.5Mbps 1080p@30fps segmented WebM. Segments will be directly transferred to a Google Cloud Storage bucket.
  • The Bitmovin Distributed AV1 Cloud Encoder also generates HLS and MPEG-DASH manifests that will be transferred to the Google Cloud Storage bucket. Enabled experiments of the AV1 codec are: adapt_scan, ref_mv, filter_7bit, reference_buffer, delta_q, tile_groups, rect_tx, cdef
  • Native playback on a desktop with a Bitmovin Player based on aomdec and ffplay

AV1 live stream screen shots
Our AV1 encoder generates WebM segmented output that could be used with HLS or MPEG-DASH for VoD and Live. However, as AV1 is currently not supported by any browser, we had to write our own player that is able to play back our AV1 live stream. We updated the aomdec application to be able to download and decode the AV1 chunks, which can be seen in the left console window. Fortunately, decoding is not as resource intensive as encoding, which allows you to decode the AV1 stream on normal hardware without special requirements. For example, the same Lenovo notebook (i7-4800MQ, 8GB RAM running Ubuntu 14.04) that could not encode this video anywhere near real time could easily play back AV1 in software. After the decoding step, we pipe the decoded YUV frames to ffplay to display the stream in a window, as you can see in the screenshot above. We plan to contribute this functionality back to aomdec after a technical cleanup of the current implementation.

A Practical Quality Comparison

Although the bitstream from AV1 is not finalized yet and much work needs to be done to further improve the quality of the codec, we wanted to get a snapshot of the current state and compare its quality with AVC/H264, HEVC/H265, and VP9. For that purpose, we made two different quality comparisons, the first one with two objective metrics, PSNR and SSIM. PSNR does not always correlate well with perceived quality but is the de-facto standard for video quality comparisons. SSIM is a perception-based quality metric that should give better results in regard to perceived quality.
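For readers unfamiliar with PSNR, it is derived from the mean squared error (MSE) between reference and encoded samples: PSNR = 10 · log10(MAX² / MSE), where MAX is the largest possible sample value. The pure-Python sketch below illustrates the formula on flat lists of 8-bit samples; it is not the tooling we used for the measurements in this post.

```python
import math


def psnr(reference: list[int], distorted: list[int], max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if not reference or len(reference) != len(distorted):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(max_value ** 2 / mse)


# Tiny illustrative example: every sample is off by one.
print(f"{psnr([10, 20, 30, 40], [11, 19, 31, 39]):.2f} dB")
```

Because PSNR is logarithmic, a difference of a few tenths of a dB already reflects a measurable change in average error.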
For the second comparison, we chose to make a side-by-side quality comparison between AV1 and the other codecs. This quality comparison targets a practical use case where the resulting content can be used for Adaptive Bitrate Streaming (ABR). Therefore we have used a fixed Group of Pictures (GOP) size for our experiments and also used Variable Bitrate (VBR) encodings with a target bitrate. This approach is established in the industry but results can vary from scientific evaluations that purely target abstract use cases and theoretical encoder performance through the HM (HEVC reference software) and JM (AVC reference software) reference software that has no practical relevance in the industry.
Let’s first start with the objective quality comparison with PSNR. We encoded the open-source movie Sintel from the Blender Foundation with VBR to the following target bitrates: 100Kbps, 250Kbps, 500Kbps, 1Mbps, 2Mbps, 4Mbps and calculated PSNR and SSIM for the bitrate that has actually been achieved by the individual codec (typical codecs in VBR mode do not hit the target bitrate exactly).
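Since VBR encodes rarely hit their target exactly, the achieved bitrate has to be derived from the output. One simple way, sketched below, is to divide the output size by the duration; the helper names and the example file size are hypothetical, and a real measurement would also need to decide whether container overhead and audio are included.

```python
def achieved_bitrate_kbps(file_size_bytes: int, duration_seconds: float) -> float:
    """Average bitrate of an encoded file in kilobits per second (1 kbit = 1000 bits)."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return file_size_bytes * 8 / duration_seconds / 1000


def overshoot_percent(achieved_kbps: float, target_kbps: float) -> float:
    """Positive values mean the encoder exceeded the target bitrate."""
    return (achieved_kbps - target_kbps) / target_kbps * 100


# Hypothetical example: a 40-second clip that came out at 2,750,000 bytes.
achieved = achieved_bitrate_kbps(2_750_000, 40)
print(f"{achieved:.0f} kbps, {overshoot_percent(achieved, 500):.1f}% over a 500 kbps target")
```

Plotting quality against the achieved bitrate rather than the target keeps the comparison fair for codecs that overshoot or undershoot.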
The following encoding settings for the different codecs were used in the Bitmovin Encoding Service:

  • AVC/H264:
    GOP Size: 96 frames (4 seconds), Me_range: 16, Cabac: true, B-Adapt: 2, Me: UMH, Rc-Lookahead: 50, Subme: 8, Trellis: 1, Partitions: All, BFrames: 3, ReferenceFrames: 5, Profile: High, Direct-Pred: Auto
  • HEVC/H265:
    GOP Size: 96 frames (4 seconds), Sao: 1, B-Adapt: 2, CTU: 64, Profile: Main, BFrames: 4, Rc-Lookahead: 25, WeightP: 1, MeRange: 57, Ref: 4, Subme: 3, Tu-Inter-Depth: 1, Me: 3, No-WeightB: 1, Tu-Intra-Depth: 1
  • VP9:
    GOP Size: 96 frames (4 seconds), Cpu-used: 1, Tile-columns: 4, Arnr-Type: Centered, Threads: 4, Arnr-maxframes: 0, Quality: Good, Frame-Parallel: 0, AQ-Mode: none, Arnr-Strength: 3, Tile-Rows: 0
  • AV1:
    Build f3477635d3d44a2448b5298255ee054fa71d7ad9, Enabled experiments by default: adapt_scan, ref_mv, filter_7bit, reference_buffer, delta_q, tile_groups, rect_tx, cdef
    Passes: 1, Quality: Good, Threads: 1, Cpu-used: 1, KeyFrame-Mode: Auto, Lag-In-Frames: 25, End-Usage: VBR
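
All four configurations above share a GOP size of 96 frames for 4 seconds, which implies a 24 fps source. As a small sketch of that arithmetic, assuming a constant frame rate, the helper names below (`gop_size_frames`, `keyframe_indices`) are made up for illustration and are not encoder options.

```python
from fractions import Fraction

def gop_size_frames(fps, segment_seconds):
    """Number of frames per GOP for a fixed segment duration."""
    frames = Fraction(fps) * Fraction(segment_seconds)
    if frames.denominator != 1:
        raise ValueError("segment duration is not a whole number of frames")
    return int(frames)

def keyframe_indices(total_frames, gop_size):
    """Frame indices at which a fixed-GOP encode places keyframes."""
    return list(range(0, total_frames, gop_size))

print(gop_size_frames(24, 4))     # 96
print(keyframe_indices(300, 96))  # [0, 96, 192, 288]
```

Keeping the same GOP size across codecs and renditions means keyframes line up at segment boundaries, which is what ABR switching relies on.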

PSNR comparison graph - AV1, VP9, HEVC, H264
The diagram above shows that AV1 already outperforms all the other codecs at every bitrate setting. For bitrates of 1 Mbps and higher, the quality difference is already substantial (> 0.5 dB, which is usually clearly visible). VP9 and HEVC/H265 are very similar from a PSNR perspective; however, VP9 overshot the target bitrate by far the most.
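Because each codec lands on a slightly different achieved bitrate, reading a dB gap off the curves means comparing them at a common bitrate. One simple way to do that, shown here as a sketch and not as the method used for the graph, is to interpolate PSNR linearly over the logarithm of the bitrate. The data points below are placeholders, not our measurements.

```python
import math

def psnr_at(points, bitrate_kbps):
    """Interpolate PSNR at a bitrate from (bitrate_kbps, psnr_db) points,
    linearly over log10(bitrate)."""
    pts = sorted(points)
    for (b0, q0), (b1, q1) in zip(pts, pts[1:]):
        if b0 <= bitrate_kbps <= b1:
            t = (math.log10(bitrate_kbps) - math.log10(b0)) / (
                math.log10(b1) - math.log10(b0)
            )
            return q0 + t * (q1 - q0)
    raise ValueError("bitrate outside the measured range")

# Placeholder rate-quality points for two hypothetical codecs
codec_a = [(110, 31.0), (1050, 39.0), (4100, 43.0)]
codec_b = [(120, 30.5), (1100, 38.2), (4300, 42.4)]

gap_db = psnr_at(codec_a, 1000) - psnr_at(codec_b, 1000)
print(f"gap at 1000 Kbps: {gap_db:.2f} dB")
```

Summary metrics such as the Bjøntegaard delta follow the same idea but integrate the gap over the whole bitrate range instead of sampling a single point.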
SSIM comparison graph - AV1, VP9, HEVC, H264
We also compared the four codecs with SSIM. The results can be seen in the diagram above and are quite similar to the PSNR results, with some slight differences. AV1 is still the best-performing codec across all bitrates, and AVC/H264 lags behind. Interestingly, however, AVC/H264 catches up as the bitrate increases. One explanation could be that at higher bitrates all codecs come close to the quality of the source material, which leaves only minor differences between them.
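For readers who want to see what SSIM measures, here is a simplified sketch that evaluates the SSIM formula once over a whole frame. Real SSIM implementations compute it over small local windows and average the results, so this single-window version is an assumption made for brevity. The name `global_ssim` is hypothetical.

```python
def global_ssim(x, y, max_value=255):
    """Single-window SSIM between two equally sized sequences of samples."""
    if len(x) != len(y):
        raise ValueError("frames must have the same number of samples")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) * (a - mu_x) for a in x) / n
    var_y = sum((b - mu_y) * (b - mu_y) for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Stabilizing constants from the standard SSIM definition
    c1 = (0.01 * max_value) ** 2
    c2 = (0.03 * max_value) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x * mu_x + mu_y * mu_y + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Hypothetical flattened luma samples
ref = [52, 55, 61, 59, 79, 61, 76, 61]
dist = [54, 55, 60, 57, 75, 63, 78, 60]
print(round(global_ssim(ref, ref), 4))  # 1.0 for identical frames
print(global_ssim(ref, dist) < 1.0)     # True
```

Because the score saturates near 1.0 as the encode approaches the source, differences between codecs naturally shrink at higher bitrates.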
Additionally, we created several side-by-side quality comparisons in which we experimentally adjusted the target bitrate for each codec until the encodes averaged 500 Kbps. Below you can see how Bitmovin AV1 video encoding compares with AVC/H264, HEVC/H265, and VP9. We used the well-known Tears of Steel teaser, which is 40 seconds long at 1080p resolution, and selected a complex scene that is hard to encode.
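Tuning the target bitrate until the output averages 500 Kbps can be automated with a simple feedback loop. The sketch below is one possible approach, not a description of how these encodes were produced: `measure` stands for any function that runs an encode at a given target and returns the achieved bitrate, and `fake_encoder` is a stand-in that always overshoots by 20%.

```python
def find_target_bitrate(measure, desired_kbps, tolerance_kbps=5, max_rounds=10):
    """Adjust the target until the achieved bitrate is within tolerance.

    Returns (target_kbps, achieved_kbps).
    """
    target = desired_kbps
    for _ in range(max_rounds):
        achieved = measure(target)
        if abs(achieved - desired_kbps) <= tolerance_kbps:
            return target, achieved
        # Scale the target by how far off the last result was
        target *= desired_kbps / achieved
    raise RuntimeError("did not converge within max_rounds")

def fake_encoder(target_kbps):
    """Stand-in for a real encode: always overshoots the target by 20%."""
    return target_kbps * 1.2

target, achieved = find_target_bitrate(fake_encoder, 500)
print(f"target {target:.1f} Kbps -> achieved {achieved:.1f} Kbps")
```

A real encoder does not respond this linearly, so a few more rounds are usually needed, and each round costs a full encode of the clip.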
AV1 vs H264 side to side comparison
When comparing AV1 with AVC/H264, the quality difference is obvious, as expected. We can clearly see multiple encoding artifacts and blocking in the right part of the image, which was encoded with AVC/H264. In contrast, the left part, encoded with AV1, looks much cleaner and has no obvious encoding artifacts.
AV1 vs VP9 side to side comparison
The quality difference between AV1 and VP9 is not as obvious as with AVC/H264, but it is still quite visible. The borders of the tiles of the sphere in particular show encoding artifacts, and the overall VP9 picture appears noticeably noisy. We can also identify some blocking artifacts that are not visible in AV1.
AV1 vs HEVC side to side comparison
HEVC/H265 looks a bit better than VP9 visually, but it still has visible encoding artifacts, especially in the lower part of the image and around the arm of the guy in the red coat. Looking closely at the arm, we can see that the color is not encoded as cleanly as with AV1 and shows some noise.

Conclusion

Bitmovin’s culture and vision have always been to be a technology leader and our passion for video means we consistently tackle the most complex video problems. Why? Because it’s fun and challenging and our team loves a challenge!
Besides that, there are already use cases for AV1 video encoding. For example, you could use it as your mezzanine format to preserve a high-quality version of your video at a low bitrate, which can then be used to create your adaptive bitrate renditions or other formats. Using AV1 this way would decrease your storage footprint and speed up transfers within your data center or uploads to the cloud.
Furthermore, with the companies behind AOMedia, like AMD, ARM, Intel, NVIDIA, Google, Microsoft, Mozilla, Netflix, and Amazon, it should not take too long to get broad support for AV1. AMD, Intel, and NVIDIA cover the desktop market quite nicely, and ARM and Intel the mobile market. Additionally, the major browser vendors, Google, Microsoft, and Mozilla will make sure that the codec finds its way into the browsers soon after the bitstream freeze. Google, Netflix, and Amazon will make sure that AV1 content will be available quickly and that will further drive adoption and hardware support.
AV1 is the next-generation video codec, and it is on track to deliver a 30% improvement over VP9 and HEVC.

The post Bitmovin Improves Support AV1 Video Encoding for VoD appeared first on Bitmovin.

]]>