Multiview HEVC (MV-HEVC): Powering spatial video experiences and more
https://bitmovin.com/blog/mv-hevc-encoding/
The world of video technology is constantly evolving, and one of the more interesting developments in recent years is the story of MV-HEVC (Multiview High Efficiency Video Coding). Even though it was added to the HEVC specification in 2014, MV-HEVC didn’t see much commercial use for almost a decade. 

That changed when Apple launched the Apple Vision Pro, announcing that unlike Meta Quest and other headsets, their new device would take advantage of MV-HEVC for immersive video experiences. In this blog post, we’ll explore what MV-HEVC is, its potential for enhancing streaming experiences and how to get started. 

What is MV-HEVC?

MV-HEVC stands for Multiview High Efficiency Video Coding, an extension of HEVC that was added to the second edition of the standard in 2014. It’s designed to support the efficient encoding of multiview video content captured from multiple viewpoints, often to create stereoscopic (3D) effects or spatial video experiences for virtual reality (VR) and augmented reality (AR). 

Doubling the encoding and bandwidth requirements for multiple viewpoints could potentially create buffering and playback issues, but MV-HEVC enables the efficient compression and storage of stereoscopic content, reducing the bandwidth required for streaming or the file size needed for storage without compromising the video’s quality.

In short, MV-HEVC allows the encoding of multiple views of the same scene in a way that preserves video quality while keeping the bitrates manageable. This makes it a good fit for 3D, AR and VR applications that require a lot of real-time data processing. 

How MV-HEVC works

Before getting into how MV-HEVC works, let’s take a quick step back to the basics of video encoding. Temporal compression is a technique for reducing file size that is common to all major video codecs. Unless there is a scene change, consecutive frames of video are usually not that different from one another. Temporal compression exploits that fact and reuses data where it can, saving bits and shrinking the file size.

This is done by encoding different types of frames that require less data to reconstruct for playback. I-frames are fully encoded frames that serve as anchor points, while P-frames (Predictive frames) can reuse data from frames that came before them and B-frames (Bi-directional predictive frames) can reuse data from frames both before and after them. If you’re interested in learning more about the fundamentals of video encoding, check out this guide.
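
To make this concrete, here is a simplified sketch of one possible frame arrangement in display order. Real encoders choose reference structures adaptively, so treat this as illustrative only:

    Display order:  I0  B1  B2  P3  B4  B5  P6
    P3 reuses data from I0
    B1 and B2 reuse data from both I0 and P3
    B4 and B5 reuse data from both P3 and P6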

I touched on all of that because a key benefit of MV-HEVC is that it can also take advantage of the commonalities across multiple camera angles or views. In the case of immersive and 3D videos that are created with different views for the right and left eye, the similar viewpoints usually mean there’s a lot of potential for compression, creating smaller, more manageable files for streaming and storage.

Example multiview prediction structure, with cross references between views – Image source: Fraunhofer HHI

Applications of MV-HEVC

Stereoscopic Video (3D Video)

MV-HEVC is particularly useful in the realm of 3D video or stereoscopic content, where two slightly different views (one for each eye) create the stereoscopic effect. By encoding both the left eye and right eye views efficiently in a single stream, MV-HEVC reduces the file size and bitrate compared to other methods. This is crucial for streaming applications like 3D movies or immersive VR experiences where quality and efficiency are key. Other codecs can be used for 3D stereoscopic video as we cover in this blog, but MV-HEVC is more efficient. 

Top-Bottom stereoscopic video frame with distinct left-eye and right-eye views, as supported by MV-HEVC – source: Blender Foundation

Spatial Video

Another application of MV-HEVC is in spatial video, which is typically used for virtual reality (VR) or augmented reality (AR) content. The Apple Vision Pro is built around the idea of capturing and presenting spatial video, allowing users to immerse themselves in a three-dimensional representation of a scene, combining video and depth information. MV-HEVC support is essential for these types of experiences, reducing massive bitrates of the raw files into something manageable for streaming and real-time immersive experiences. 

Side-by-side lenses on the iPhone 15 Pro and iPhone 16 allow for native capturing and recording of MV-HEVC spatial video

Multiview Video

MV-HEVC is also important for multiview video, where multiple views of the same scene are captured from different angles. This could be used in sports broadcasts, where different camera angles are encoded into a single video stream, or for applications that allow users to choose their viewing angle interactively. Depending on your exact use case, this may require multiple decoders or extra processing power that might not be available on all platforms. 

Example multiview player, now supported by Bitmovin on some platforms

Dolby Vision with MV-HEVC

MV-HEVC is now also compatible with Dolby Vision, a popular High Dynamic Range (HDR) video format that helps ensure content looks as realistic and as true to the creator’s vision as possible. Most of the top-tier premium streaming content these days is being made available in Dolby Vision format, so it makes sense that companies investing in MV-HEVC production pipelines would want to take advantage of Dolby Vision. Dolby Vision Profile 20 extends the potential quality enhancements of Dolby Vision to MV-HEVC and immersive content. 

Apple Vision Pro and beyond

The Apple Vision Pro is pushing the boundaries of immersive media, and while Apple didn’t create the VR headset segment, they have definitely put their stamp on it. There are several examples over the years of Apple’s influence on the media technology industry, from their decision not to support Flash video to their decision to (finally) support AV1.

It seems likely there will be a halo effect for MV-HEVC around the Vision Pro. One early example is the Blackmagic URSA Cine Immersive camera. I expect in 2025 we’ll see more companies venturing into MV-HEVC support, from capture to post-production to distribution.


MV-HEVC video tools

Direct recording with Apple Vision Pro and iPhone

You can record spatial video using MV-HEVC directly on the Apple Vision Pro, iPhone 15 Pro and all iPhone 16 models. The distance between the 2 camera lenses on the Vision Pro seems to provide better results with more depth compared to spatial videos captured on iPhone.

Apple AVFoundation support

Apple also added support to their AVFoundation APIs for converting side-by-side 3D videos into MV-HEVC and spatial videos. You can find more information in their developer documentation here.

Bitmovin VOD encoding beta

Bitmovin’s VOD Encoding now supports MV-HEVC as part of a private beta. If you’re interested in adding MV-HEVC to your transcoding workflows, we’d love to discuss the details with you. You can reply in the Bitmovin Community, comment on this post or get in touch with your Bitmovin contact for more info. 

Conclusion

Thanks in large part to Apple, MV-HEVC is poised to become a key technology in the future of immersive and multiview content. Its ability to efficiently encode multiple views of the same scene, reduce the data required, and maintain high video quality makes it an essential tool for everything from stereoscopic 3D movies to virtual reality experiences on devices like the Apple Vision Pro.

On their other platforms, Apple seems to have signalled a shift toward using the AV1 codec, but AV1 does not currently have multiview support. It will be interesting to see how that situation evolves both within Apple’s products and the wider video ecosystem. While the only certainty is that things will change, unless Apple abandons the Vision Pro, MV-HEVC is likely to be part of the picture for the foreseeable future.

3-Pass encoding enhances video quality, making every bit count
https://bitmovin.com/blog/3-pass-encoding/
Introduction

Bitmovin’s VOD Encoder is known for its quality, speed, and cloud-native ability to scale quickly and resiliently. Advanced features like split-and-stitch encoding with Smart Chunking, Per-Title and 3-Pass encoding set it apart from other encoders on the market, in terms of both visual quality and bitrate efficiency. For our customers, it means lower costs for storing and delivering video, along with a better experience for their viewers.

In this post, we’ll explain how Bitmovin’s 3-Pass encoding works and show the benefits of using 3-Pass encoding with Bitmovin. 

How does 3-Pass encoding work?

As you might have guessed, with 3-Pass encoding the analysis and encoding optimization happen in 3 phases. After our split-and-stitch algorithm (now with Smart Chunking) divides the source file into separate chunks for parallel processing, the following steps are taken:

1. Analysis

The first step is to run a constant rate factor (CRF) encoding pass that varies the bitrate as needed to maintain constant quality. Using a carefully chosen quality target value based on the source file, we are able to capture valuable data about the motion, complexity and scene changes in the content that we can use in the next steps.

2. Encoding 

The information gathered from the CRF pass is scaled so that the overall average bitrate of the output file will respect the target bitrate set by the user. As a rough example, if the CRF pass averaged 6 Mbps for the file but the user’s target is 4 Mbps, each chunk’s bit budget is scaled down accordingly, so complex chunks still receive proportionally more bits than simple ones. Using the new target information, the chunks are encoded.

3. Optimization

Using data from the previous passes, the encoder now re-allocates bits from less complex segments to more complex ones, wherever possible. This ensures there is no degradation during complex, high-motion scenes and helps clean up any lower quality frames. In the process of redistributing bits, any drastic jumps in bitrate between adjacent chunks are smoothed to avoid causing player buffer issues. 

What are the benefits of Bitmovin’s 3-Pass encoding?

Now that you know how it works, let’s talk about how using 3-Pass encoding benefits your customers, your operations and your bottom line.

Better visual quality

Because of its multi-stage process, 3-Pass encoding makes sure your viewers are getting the best possible quality on the first try, with no need for time-consuming experimentation and analysis to find the ideal encoding settings. Your content looks the best it can at any bandwidth.

Bitrate and cost control

Other approaches to improving encoding quality usually involve increasing bitrate, which in turn, increases the eventual storage and delivery costs. 3-Pass encoding makes sure the bits are used exactly and only where they are needed, giving the ideal balance of quality and efficiency for lower costs and less buffering.

Scalability and speed 

When using traditional encoding approaches, multi-pass encoding can take a long time for a single file, not to mention large batches of files. With Bitmovin’s 3-Pass, neither is an issue as our split-and-stitch process and cloud-native scalability keep turnaround times to a minimum, even for long form content and unpredictable spikes in demand. 

Quality comparisons

In the graph below, the same file was encoded using 2-Pass and 3-Pass encoding and the VMAF score was measured and plotted for every frame (the vertical axis represents how many frames received that VMAF score). With 3-Pass, represented in blue, you can see that the overall average VMAF score improved a bit compared to 2-Pass, shown by the red dots on the lower plots. But there’s another, more important and impressive difference between the two: the reduction of lower-quality frames that would be noticeable to viewers. The quality of the worst frame improved by ~20 points and the number of frames scoring below 80 was cut in half.

You can also see where the 2-Pass rendition had several frames scoring in the upper 90s, meaning bits were being allocated to frames where they weren’t detectably improving quality for viewers. 3-Pass encoding was able to intelligently redistribute those “extra” bits to frames where they could make a noticeable difference.

Plot showing the quality improvements and bitrate redistribution of 3-Pass encoding, with far fewer low-quality frames than 2-Pass

3-Pass works especially well for content with a mix of complexity and motion, letting you use lower bitrates to produce quality equivalent to other methods. This means your streams are less susceptible to buffering and look better for viewers with limited bandwidth, not to mention the savings on storage and CDN costs that can really add up.

Side-by-side comparison of sports content: 3-Pass encoding produces quality equivalent to other methods at lower bitrates

What is the difference between 3-Pass and Per-Title encoding?

If you’re familiar with Bitmovin, you may also know about our Per-Title encoding. You might be wondering, “Isn’t Per-Title also about creating the ideal encoding settings for each video?” If you were, great question!

Per-Title encoding analyzes the source file settings and complexity and determines the ideal adaptive bitrate ladder for that piece of content. The Per-Title algorithm feeds the encoder with the resolution and bitrate pairs that will provide the best quality of experience across the entire range of devices and bandwidth. 

3-Pass is about getting the absolute best quality encoding for a given bitrate and resolution pair. So Per-Title determines the ideal target bitrates and 3-Pass makes sure they look as good as possible. For adaptive bitrate streaming, we highly recommend using 3-Pass and Per-Title together for the best results. 

The graph below is an extreme example, but for this video from a Bitmovin customer, using 3-Pass with Per-Title meant encoding an ABR ladder with 4K video at 2 Mbps and lower, compared to their incumbent ABR ladder that targeted 15 Mbps for 4K content, bandwidth that was mostly wasted for this particular video.

Using 3-Pass and Per-Title encoding together produces the best visual quality and streaming experiences.

Try 3-Pass encoding for free

3-Pass and Per-Title encoding are both available to use, for free, with a Bitmovin trial. If you’re obsessed with quality but are spending too much time and effort finding the best encoding settings, you really need to sign up today and see the results for yourself.

ATHENA’s first 5 years of research and innovation
https://bitmovin.com/blog/5-years-of-research-and-innovation/
Since forming in October 2019, the Christian Doppler Laboratory ATHENA at Universität Klagenfurt, run by Bitmovin co-founder Dr. Christian Timmerer, has been advancing research and innovation for adaptive bitrate (ABR) streaming technologies. Over the past five years, the lab has addressed critical challenges in video streaming from encoding and delivery to playback and end-to-end quality of experience. They are breaking new ground using edge computing, machine learning, neural networks and generative AI for video applications, contributing significantly to both academic knowledge and industry applications as Bitmovin’s research partner. 

In this blog, we’ll take a look at the highlights of the ATHENA lab’s work over the past five years and its impact on the future of the streaming industry.

Publications

ATHENA has made its mark with high-impact publications on the topics of multimedia, signal processing, and computer networks. Their research has been featured in prestigious journals such as IEEE Communications Surveys & Tutorials and IEEE Transactions on Multimedia. With 94 papers published or accepted by the time of the 5-year evaluation, the lab has established itself as a leader in video streaming research.

ATHENA also contributed to reproducibility in research. Their open source tools Video Complexity Analyzer and LLL-CAdViSE have already been used by Bitmovin and others in the industry. Their open, multi-codec UHD dataset enables research and development of multi-codec playback solutions for 8K video.  

ATHENA has also looked at applications of AI in video coding and streaming, something that will become more of a focus over the next two years. You can read more about ATHENA’s AI video research in this blog post.

Patents

But it’s not all just theoretical research. The ATHENA lab has successfully translated its findings into practical solutions, filing 16 invention disclosures and 13 patent applications, 6 of which had been granted as of publication.

Workflow diagram for Fast Multi-Rate Encoding using convolutional neural networks. More detail available here.

PhDs

ATHENA has also made an educational impact, guiding its inaugural cohort of seven PhD students to successful dissertation defenses, with research topics ranging from edge computing in video streaming to machine learning applications in video coding.

There are also two postdoctoral scholars in the lab who have made significant contributions and progress.

Practical applications with Bitmovin

As Bitmovin’s academic partner, ATHENA plays a critical role in developing and enhancing technologies that can differentiate our streaming solutions. As ATHENA’s company partner, Bitmovin helps guide and test practical applications of the research, with regular check-ins for in-depth discussions about new innovations and potential technology transfers. The collaboration has resulted in several advancements over the years, including recent projects like CAdViSE and WISH ABR. 

CAdViSE

CAdViSE (Cloud based Adaptive Video Streaming Evaluation) is a framework for automated testing of media players. It allows you to test how different players and ABR configurations perform and react to fluctuations in different network parameters. Bitmovin is using CAdViSE to evaluate the performance of different custom ABR algorithms. The code is available in this GitHub repo.

WISH ABR

WISH stands for Weighted Sum model for HTTP Adaptive Streaming, and it allows for customization of ABR logic for different devices and applications. WISH’s logic is based on a model that weighs the bandwidth, buffer and quality costs of playing back a segment. By setting weights for the importance of those metrics, you create a custom ABR algorithm optimized for your content and use case. You can learn more about WISH ABR in this blog post.
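
Schematically, the per-segment decision looks something like the sketch below. The weight and cost names are illustrative placeholders, not the exact notation from the ATHENA publication:

    cost(r) = wD * dataCost(r) + wB * bufferCost(r) + wQ * qualityCost(r)
    next rendition = the rendition r (bitrate/resolution pair) with the lowest cost(r)

Tuning the weights shifts the algorithm’s priorities: increasing wD favors lower data usage on metered connections, while increasing wQ favors visual quality on living-room devices.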

Decision process for WISH ABR, weighing data/bandwidth cost, buffer cost, and quality cost of each segment.

Project spinoffs

The success of ATHENA has led to three spinoff projects:

APOLLO

APOLLO is funded by the Austrian Research Promotion Agency FFG and is a cooperative project between Bitmovin and Alpen-Adria-Universität Klagenfurt. The main objective of APOLLO is to research and develop an intelligent video platform for HTTP adaptive streaming which provides distribution of video transcoding across large and small-scale computing environments, using AI and ML techniques for the actual video distribution.

GAIA

GAIA is also funded by the Austrian Research Promotion Agency FFG and is a cooperative project between Bitmovin and Alpen-Adria-Universität Klagenfurt. The GAIA project researches and develops a climate-friendly adaptive video streaming platform that provides complete energy awareness and accountability along the entire delivery chain. It also aims to reduce energy consumption and GHG emissions through advanced analytics and optimizations on all phases of the video delivery chain.

SPIRIT

SPIRIT (Scalable Platform for Innovations on Real-time Immersive Telepresence) is an EU Horizon Europe-funded innovation action. It brings together cutting-edge companies and universities in the field of telepresence applications with advanced and complementary expertise in extended reality (XR) and multimedia communications. SPIRIT’s mission is to create Europe’s first multisite and interconnected framework capable of supporting a wide range of application features in collaborative telepresence.

What’s next

Over the next two years, the ATHENA project will focus on advancing deep neural network and AI-driven techniques for image and video coding. This work will include making video coding more energy- and cost-efficient, exploring immersive formats like volumetric video and holography, and enhancing QoE while being mindful of energy use. Other focus areas include AI-powered, energy-efficient live video streaming and generative AI applications for adaptive streaming. 

Get in touch or let us know in the comments if you’d like to learn more about Bitmovin and ATHENA’s research and innovation, AI or sustainability related projects. 

WWDC 2024 HLS Updates for Video Developers
https://bitmovin.com/blog/hls-updates-wwdc-2024/

Apple’s Worldwide Developer Conference is an annual event used to showcase new software and technologies in the Apple ecosystem. It was created with developers in mind, but sometimes new hardware and devices are announced and its keynote presentations have become must-see events for a much wider audience. There is also usually news about changes and additions to the HTTP Live Streaming (HLS) spec and associated video playback APIs. These HLS updates are often necessary to support new features and capabilities of the announced OS and hardware updates. This post will expand on Apple’s “What’s new in HTTP Live Streaming” document, with additional context for the latest developments that content creators, developers, and streaming services should be aware of.

The latest HLS updates for 2024

The first draft of the HLS spec (draft-pantos-http-live-streaming) was posted in 2009 and superseded by RFC 8216 in 2017. Draft updates with significant enhancements are usually published once or twice per year. A draft proposal shared on June 7 details changes to the spec that will be added later this year. Let’s look at some of the highlights below.

Updated Interstitial attributes

In May 2021, Apple introduced HLS Interstitials to make it easier to create and deliver interstitial content like branding bumpers and mid-roll ads. Now, new attributes have been introduced for Interstitial EXT-X-DATERANGE tags, aimed at enhancing viewer experience and operational flexibility. 

  1. X-CONTENT-MAY-VARY: This attribute provides a hint regarding coordinated playback across multiple players. It can be set to "YES" or "NO", indicating whether all players receive the same interstitial content or not. If X-CONTENT-MAY-VARY is missing, it is considered to have a value of "YES".
  2. X-TIMELINE-OCCUPIES: Determines whether the interstitial should appear as a single point ("POINT") or a range ("RANGE") on the playback timeline. If X-TIMELINE-OCCUPIES is missing, it is considered to have a value of "POINT". "RANGE" is expected to be used for ads in live content.
  3. X-TIMELINE-STYLE: Specifies the presentation style of the interstitial, either as a "HIGHLIGHT" separate from the content or as "PRIMARY", integrated with the main media. If X-TIMELINE-STYLE is missing, it is considered to have a value of "HIGHLIGHT". The "PRIMARY" value is expected to be used for content like ratings bumpers and post-roll dub cards.
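
Putting those together, a hypothetical interstitial tag for a live mid-roll ad break might look something like this (the ID, date and asset-list URI are placeholders, not values from Apple's documentation):

    #EXT-X-DATERANGE:ID="break-1",CLASS="com.apple.hls.interstitial",START-DATE="2024-06-10T09:00:00.000Z",DURATION=30.0,X-ASSET-LIST="https://example.com/break-1.json",X-CONTENT-MAY-VARY="YES",X-TIMELINE-OCCUPIES="RANGE",X-TIMELINE-STYLE="HIGHLIGHT"

Here the ad break occupies a range on the timeline, is presented as a highlight separate from the primary content, and may vary from viewer to viewer.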

More detail is available in the WWDC Session “Enhance ad experiences with HLS interstitials“.

Example timeline for using HLS Interstitials with new RANGE attribute – source: WWDC 2024

Signal enhancements for High Dynamic Range (HDR) and timed metadata

HDR10+

Previously, the specification had not defined how to signal HDR10+ content in a multi-variant HLS playlist. Now you can use the SUPPLEMENTAL-CODECS attribute with the appropriate codec string, followed by a slash and then the brand ('cdm4' for HDR10+). The example Apple provided shows the expected syntax: SUPPLEMENTAL-CODECS="hvc1.2.20000000.L123.B0/cdm4". For a long time, HDR10+ was only supported on Samsung and some Panasonic TVs, but in recent years it has been added by other TV brands and dedicated streaming devices like Apple TV 4K and a few Roku models.
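
For example, a multi-variant playlist entry advertising an HDR10+ rendition could look like the following sketch, where the bandwidth, resolution and URI are hypothetical and the SUPPLEMENTAL-CODECS value is taken from Apple's example:

    #EXT-X-STREAM-INF:BANDWIDTH=11600000,RESOLUTION=3840x2160,CODECS="hvc1.2.20000000.L123.B0",SUPPLEMENTAL-CODECS="hvc1.2.20000000.L123.B0/cdm4",VIDEO-RANGE=PQ
    hdr10plus/2160p.m3u8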

Dolby Vision with AV1

Dolby Vision has been the more popular and widespread dynamic HDR format (compared to HDR10+) and now, with Apple adding AV1 decoders in their latest generation of processors, they’ve defined how to signal that content within HLS playlists. They are using Dolby Vision Profile 10, which is Dolby’s 10-bit AV1-aware profile. HLS will now support 3 different Dolby Vision profiles: 10, 10.1 and 10.4. Profile 10 is “true” Dolby Vision, 10.1 is their backward-compatible version of HDR10 and 10.4 is their backward-compatible version of Hybrid Log Gamma (HLG). For profiles 10.1 and 10.4, you need to use a SUPPLEMENTAL-CODECS brand attribute and the correct VIDEO-RANGE: 10.1 should use 'db1p' and PQ, and 10.4 should use 'db4h' and HLG. The full example codec string they provided is: CODECS="av01.0.13M.10.0.112",SUPPLEMENTAL-CODECS="dav1.10.09/db4h",VIDEO-RANGE=HLG.
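
In a playlist, a Profile 10.4 variant using Apple's example codec string might be declared like this (bandwidth, resolution and URI are placeholders):

    #EXT-X-STREAM-INF:BANDWIDTH=9000000,RESOLUTION=3840x2160,CODECS="av01.0.13M.10.0.112",SUPPLEMENTAL-CODECS="dav1.10.09/db4h",VIDEO-RANGE=HLG
    dolbyvision-av1/2160p.m3u8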

If you’re interested in Apple’s overall AV1 Support, you can find more details in this blog post.

Enhanced timed metadata support

HLS now supports multiple concurrent metadata tracks within Fragmented MP4 files, enabling richer media experiences with timed metadata (‘mebx’) tracks and opening new opportunities for integrating interactive elements and dynamic content within HLS streams.

Metrics and logging advancements

The introduction of the AVMetrics API to AVFoundation will allow developers to monitor performance and playback events. This opt-in interface lets you select which subsets of events to monitor and provides detailed insights into media playback, allowing you to optimize streaming experiences further.

More details are available in the AVFoundation documentation and the WWDC 2024 session “Discover media performance metrics in AVFoundation”.

Common Media Client Data (CMCD) standard integration

HLS now supports the CMCD standard, enhancing Quality of Service (QoS) monitoring and delivery optimization through player and CDN interactions. AVPlayer implements only the preferred mode of transmitting data via HTTP request headers. It does not include support for all of the defined keys, and for now CMCD is only supported in iOS and tvOS 18 and above. There was no mention of support in Safari.
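
For illustration, a segment request from a CMCD-enabled player using the header mode would carry data along these lines. The key names come from the CMCD (CTA-5004) specification; the values below are hypothetical:

    CMCD-Object: br=3200,d=4004,ot=v,tb=6000
    CMCD-Request: bl=21300,dl=18500,mtp=48100
    CMCD-Session: cid="demo-asset",sf=h,sid="6e2fb550-c457-11e9-bb97-0800200c9a66",st=v
    CMCD-Status: rtp=12000

Among other things, this tells the CDN the encoded bitrate of the requested object (br, in kbps), the player’s current buffer length (bl, in ms) and its measured throughput (mtp, in kbps), which the CDN can use for prioritization and troubleshooting.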

Bitmovin and Akamai debuted our joint CMCD solution at NAB 2023. You can learn more in our blog post or check out our demo.

FairPlay content decryption key management

As part of ongoing improvements, HLS is deprecating AVAssetResourceLoader for key loading in favor of AVContentKeySession. AVContentKeySession was first introduced at WWDC 2018 and until now, Apple had been supporting both methods of key loading for content protection in parallel. Using AVContentKeySession promises more flexibility and reliability in content key management, aligning with evolving security and operational requirements. This move means any existing use of AVAssetResourceLoader must be transitioned to AVContentKeySession. 

Conclusion

The recent HLS updates show Apple’s commitment to enhancing media streaming capabilities across diverse platforms and scenarios. For developers and content providers, staying updated with these advancements not only ensures compliance with the latest standards but also unlocks new opportunities to deliver compelling streaming experiences to audiences worldwide. 

If you’re interested in being notified about all of the latest HLS updates or you want to request features or provide feedback, you can subscribe to the IETF hls-interest group.

Everything you need to know about Apple AV1 Support
https://bitmovin.com/blog/apple-av1-support/
This post was originally published in Sept 2023. It has been updated several times with the latest news and developments, most recently on June 13, 2024 with information about Apple’s AV1 Dolby Vision support.

Apple made waves across the video encoding and streaming communities when they announced the iPhone 15 Pro and 15 Pro Max would have a dedicated AV1 hardware decoder, making them the first Apple devices with official AV1 codec support. We’ve compiled all the details from their announcement, the HLS interest group, and product release notes to bring you everything you need to know about Apple AV1 codec support. If you’re looking for more information about AV1 playback on Android, Smart TVs and set-top boxes, you can find more information at https://bitmovin.com/av1-playback-support/. Otherwise, keep reading to learn more!

Hints that Apple AV1 support was coming

Prior to the iPhone 15 announcement in September 2023, there were several indications that Apple would eventually support AV1. Back in 2018, Apple joined the Alliance for Open Media, the organization responsible for creating and promoting AV1 encoding, and many took it as a sign that Apple would eventually support AV1. More recently, updates to Apple’s AVFoundation core media framework showed the addition of a new global variable, kCMVideoCodecType_AV1, and earlier in 2023 the Safari 16.4 Beta release notes actually showed AV1 support was coming, but it was removed without comment shortly after and never added to Safari 16. AV1 WebCodecs support did eventually become available as an experimental option in the Safari Technology Preview, but enabling it didn’t seem to have any effect.

Still with all of these hints being dropped, the announcements of Apple’s M series of processors and the most recent update to the HLS draft specification in May 2023 all came and went with no mention of AV1. Everyone who was paying close attention and anticipating Apple AV1 support was left disappointed, especially knowing how much weight their decision carried for the rest of the streaming ecosystem. Overall AV1 adoption has been slower than many had hoped and expected, and Apple’s lack of support was often cited as a reason to wait and avoid updating video encoding stacks. 

iPhone 15 Pro announcement

This all changed on September 12, 2023, when Apple announced their new A17 Pro mobile processor would include support for AV1 hardware decoding. You can watch the full replay here, with the section about the 15 Pro’s new processor beginning at 1:01:20. VP of the Apple Silicon Engineering Group, Sribalan Santhanam, presented the new A-series processor and shared details about the industry’s first 3 nm chip, including a 6-core CPU and a new Pro-class 6-core GPU. It has a 16-core neural engine that can process up to 35 trillion operations per second and run machine learning models on the device, without sending personal data to the cloud. It also includes a dedicated engine for Apple’s own ProRes codec, in addition to the big one for video streaming services: the AV1 hardware decoder.

Block diagram of Apple’s A17 Pro chip, highlighting dedicated AV1 decoder – Image source: Apple iPhone 15 Pro announcement

“We also included a dedicated AV1 decoder, enabling more efficient and high-quality video experiences for streaming services.”

Sribalan Santhanam – VP, Apple Silicon Engineering Group

More details about HDR, DRM, HLS and Safari support for AV1

After the presentation, co-author of the HLS specification Roger Pantos shared more details via the hls-interest mailing list. He confirmed that both the iPhone 15 Pro and 15 Pro Max would indeed be the first Apple devices with hardware decoding support for AV1 video content. The dedicated hardware meant that in addition to Standard Dynamic Range (SDR) content, it would also support High Dynamic Range (HDR10) as well as content protected by FairPlay Streaming DRM, things that software decoders typically cannot handle well or securely. Playback would be supported in Apple’s native AVPlayer or AVSampleBufferDisplayLayer, including using Media Source Extensions (MSE), or Managed Media Source (MMS) as Apple calls their new version, under an experimental setting in iOS Safari.

HLS playback of AV1 will work without any new signaling requirements, just the regular CODECS and VIDEO-RANGE attributes. The SCORE attribute can also be used to make the playback client prefer AV1 over other encodings, but renditions encoded with AVC and/or HEVC should still be included for older devices and AirPlay support. The WebKit blog provided more information about Safari 17.0, confirming support for the AV1 video codec was added on devices with hardware decoding support. They also shared this html code snippet for presenting single-file progressive video that has been encoded with AV1, HEVC and VP9, which allows the browser to choose the best option for playback. It should be noted that outside of very short clips, adaptive streaming with HLS is preferred over progressive streaming in order to provide the best quality of experience and bandwidth efficiency.

html snippet for multi-codec progressive video with AV1, HEVC and VP9 – Image source: webkit.org blog

The ‘type’ attribute signals the type of container being used, and the ‘codecs’ parameter string lets the browser know which codec was used as well as other characteristics like profile, level, color space, bit depth and dynamic range. This informs the browser and lets it decide whether it supports those attributes or needs to fall back to an older codec. It’s also possible to use a simpler codecs="av01", but it’s best to provide as much detail as possible if you can. More information on the AV1 codecs parameter string from the Alliance for Open Media can be found here, and details about codec and profile parameters are available in this IETF doc.
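
As a rough reconstruction of that approach (not a verbatim copy of the WebKit snippet, with file names and codec strings as placeholders), a multi-codec progressive video element looks like this, and the browser picks the first source it can decode:

    <video controls>
      <source src="video-av1.mp4" type='video/mp4; codecs="av01.0.08M.10.0.110.01.01.01.0"'>
      <source src="video-hevc.mp4" type='video/mp4; codecs="hvc1.2.4.L123.B0"'>
      <source src="video-vp9.webm" type='video/webm; codecs="vp09.00.10.08"'>
    </video>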

While not directly related to the Apple AV1 news, Safari 17.0 also added a new media player stats overlay similar to YouTube’s “stats for nerds”. This is a nice addition for video developers doing any troubleshooting and will be very helpful as people begin experimenting with adding AV1 encoding. It’s available to anyone who checks the “Show features for web developers” box in the advanced settings of Safari.  

New Media stats overlay feature available in Safari 17.0 – Image source: webkit.org blog

Apple M3 processor announcement

In late October 2023, Apple announced their newest generation of desktop processors would include AV1 hardware decoders. This includes the M3, M3 Pro and M3 Max chips, meaning all new models of Macbooks, iMacs and desktop computers with an M3 processor will support AV1 video playback. Some were disappointed that the M3 did not also include support for AV1 encoding, but for video playback, the decoding is all that really matters, so this will be another nice wave of new devices that streaming services can target with AV1 encoded video. 

Apple’s new M3 family of processors with AV1 decoding support (Source: Apple)

Apple M4 processor iPad announcement

Announced in May 2024, the new iPad Pro is powered by Apple’s latest system on a chip, the M4. The media engine of the M4 supports multiple codecs, including H.264, HEVC, ProRes and now AV1, making it the most advanced media processor ever in an iPad. With this, Apple continues their march toward full AV1 support. Will the Vision Pro 2 be next?

Apple AV1 Dolby Vision Support

Usually around the time of Apple’s Worldwide Developers Conference there are new updates or features around HLS and AVPlayer. During WWDC24, Apple shared a “What’s new in HTTP Live Streaming 2024” doc with several interesting additions. For AV1 specifically, they called out support for Dolby Vision Profile 10, which is Dolby’s 10-bit AV1-aware profile. Apple now supports 3 different Dolby Vision profiles: 10, 10.1 and 10.4. Profile 10 is “true” Dolby Vision, 10.1 is their backward-compatible version of HDR10 and 10.4 is their backward-compatible version of Hybrid Log Gamma (HLG). For profiles 10.1 and 10.4, you need to use a SUPPLEMENTAL-CODECS attribute and the correct VIDEO-RANGE: 10.1 should use 'db1p' and PQ, and 10.4 should use 'db4h' and HLG. The full example codec string they provided is: CODECS="av01.0.13M.10.0.112",SUPPLEMENTAL-CODECS="dav1.10.09/db4h",VIDEO-RANGE=HLG.


AV1 Software Decoding Support?

When Apple released the iPhone 6s with the A9 chip, it became the first iOS device to support HEVC (H.265) hardware decoding, which included support for FairPlay Streaming with HEVC. At the same time, Apple included an HEVC software decoder as part of the next iOS and macOS updates for older devices without hardware support. While the software decoding didn’t support FairPlay Streaming, it was still a big boost for HEVC support and was one of the first things we wondered about after seeing the AV1 decoder announcement.

Unfortunately when asked, Roger Pantos shared that Apple would not be shipping an AV1 video software decoder at this time. He did confirm that iOS 17 does include some AV1 codec support, but only for still images using the Alliance for Open Media’s AVIF format. For now, we can only hope that AV1 video software decoding (like Meta is already using in their iOS apps) will be coming soon.

Screenshot comparing H.264, VP9 and AV1 video codec quality for low bandwidth streams. Source: Meta Engineering Blog

Ready to take advantage of AV1 Encoding?

Bitmovin has been ready for AV1 adoption to spread for some time now, dating back to 2017 when we partnered with Mozilla to enable AV1 playback in the Firefox browser using the Bitmovin Player. We’ve added AV1 codec support to our Per-Title and 3-pass encoding optimizations and just recently made AV1 encoding available in our dashboard UI, so now you can perform your first AV1 encode without any code, API calls, or configuration necessary! Bitmovin’s AV1 encoding has supported DASH streaming together with Widevine content protection for a long time, but we’ve now also added support for fMP4 in HLS playlists together with FairPlay content protection to take advantage of Apple AV1 support for premium content. It’s also available in our free trial, so there’s never been a better time to check it out and begin taking advantage of the bandwidth savings and quality improvements that AV1 can provide. 

Bitmovin Dashboard Encoding Configuration with new AV1 video codec support

Click here to start your free trial today!

  • Read the latest info about our AV1 playback support and device testing here.
  • Learn how using Bitmovin’s Per-Title Encoding together with AV1 can let you stream 4K video at bitrates that had been limited to Standard Definition with older codecs. 
  • Check out our AV1 hub and download our datasheet to learn all about the codec’s development, performance and how it can lower your CDN costs.

New Firefox AV1 support for Encrypted Media Extensions
https://bitmovin.com/blog/firefox-av1-support/

This post covers some recent updates, focusing on the new Firefox AV1 support in Encrypted Media Extensions. Bitmovin has been supporting and advocating for use of the AV1 codec for several years, even though there have been gaps in playback support preventing adoption for some workflows. Slowly but surely, those gaps are being filled and the reasons not to use AV1 are going away. Keep reading to learn more.

Firefox 125 adds support for encrypted AV1

A couple of years ago, Bitmovin began testing several different combinations of AV1 encoding, muxing and DRM support across browsers and playback devices. We were somewhat surprised to learn that even though Firefox was the first major browser to support AV1 playback, they had not implemented support for encrypted AV1 as they had for other codecs. We found there was actually an open bug/request filed 5 years ago. 

Shortly after we began watching closely, there was an update…

Screenshot: a status update on the long-open Mozilla bug report about the lack of AV1 Widevine support in Firefox

Ouch. Once the ticket got reassigned, Bitmovin got involved and gave our feedback that for premium/studio content, this support would be needed soon. We also provided a Widevine-protected sample for them to use in testing. Fast-forward to this spring, we saw some action on the ticket and support for AV1 with Encrypted Media Extensions was officially added to Firefox 125!

This means premium content workflows can now use AV1 on all of the major desktop browsers. Apple added support to Safari last fall, including with FairPlay Streaming, but for now it’s limited to devices with AV1 hardware decoders (iPhone 15 Pro, iPad Pro, new Macs with M3 processors).

Previous Bitmovin and Firefox AV1 collaboration

Way back in 2017, before the AV1 spec was finalized, Bitmovin and Firefox collaborated on the first HTML5 AV1 playback. Because the bitstream was still under development and subject to change, Bitmovin and Mozilla agreed on a common codec string to ensure compatibility between the version in the Bitmovin encoder and the decoder in Mozilla Firefox. It was made available in Mozilla’s experimental development version, Firefox Nightly, for users to manually enable. 

Even earlier in 2017, Bitmovin demonstrated the first broadcast quality AV1 live stream at NAB, winning a Best of Show award from Streaming Media Magazine. 

Other recent AV1 playback updates

Android adds dav1d decoder

In March 2024, VideoLAN’s dav1d decoder became available to all Android devices running Android 12 or higher. Apps need to opt in to using AV1 for now, but according to Google, most devices can at least keep up with software decoding of 720p 30fps video. YouTube initially opted to begin using dav1d on devices without a hardware decoder, but may have reverted that decision, likely due to battery concerns on phones. For plugged-in Android devices, dav1d is still a great option and a welcome addition to the ecosystem.

iPad Pro gets AV1 playback support with M4 processor

In early May 2024, Apple continued their march toward full AV1 support with the announcement of their new M4 chip, which will power the new iPad Pro. The Media Engine of M4 is the most advanced to come to iPad, supporting several popular video codecs, like H.264, HEVC, and ProRes, in addition to AV1.

Ready to get started with AV1?

Bitmovin has added AV1 codec support to our Per-Title and 3-pass encoding optimizations and made AV1 encoding available in our dashboard UI, so now you can perform your first AV1 encode without any code, API calls, or configuration necessary! Bitmovin’s AV1 encoding has supported DASH streaming together with Widevine content protection for a long time, but we’ve now also added support for fMP4 in HLS playlists together with FairPlay content protection to take advantage of Apple AV1 support for premium content. It’s also available in our free trial, so there’s never been a better time to check it out and begin taking advantage of the bandwidth savings and quality improvements that AV1 can provide.


Website: Bitmovin’s AV1 hub   

Blog: State of AV1 Playback Support

Blog: Everything you need to know about Apple’s AV1 Support

Blog: 4K video at SD bitrates with AV1

The State of AV1 Playback Support: 2024
https://bitmovin.com/blog/av1-playback-support/
This post was originally published in October 2022. It has been updated with new developments, most recently on May 16, 2024 with news about Apple’s iPad AV1 decoder and Firefox encrypted media extensions support.

In this post, I’ll be taking a look at the current state of AV1 playback support, covering which browsers, mobile devices, smart TVs, consoles and streaming sticks are compatible with the AV1 codec right now.  I’ll also touch on some of the incredible bandwidth savings companies like Netflix are seeing with AV1 and detail the latest announcements, rumors and speculation around future AV1 playback support.

AV1: The Story So Far (2017-2023)

Back in 2017, Bitmovin debuted the world’s first AV1 live encoding at the NAB Show in Las Vegas, earning a Best of NAB award. While it was an exciting proof of concept at the time, AV1 playback support was extremely limited and large-scale production usage wouldn’t come until years later. In 2020, YouTube and Netflix began delivering AV1 to the first compatible Android devices, and last year Netflix shared details about their expanded use of AV1 for 4K streams.

Netflix also published a report that showed over the course of one month in early 2022, 21% of their streamed content benefited from the most recent improvements in codec efficiency, like Per-Title optimized AV1 and HEVC. They estimated that without those improvements, total Netflix traffic globally would have been around 24% higher, proving that you can see massive bandwidth and overall cost savings by encoding just a portion of your most popular content with AV1.

Apple adds AV1 hardware decoding support to iPhone 15 Pro and new Macbooks

Many of us who have been tracking the adoption and progress of AV1 were disappointed when the announcements for Apple’s M-series processors over the past couple years did not include AV1 hardware decoding support. But on September 12, 2023, the big moment we’ve been waiting for finally arrived when Apple announced that the A17 Pro chip in their new iPhone 15 Pro would include a dedicated AV1 decoder. This is a big line in the sand for Apple and for the wider industry and will hopefully prove to be the day that revitalized interest and momentum for AV1 adoption across the industry.

Apple A17 Pro chip in iPhone 15 Pro with dedicated AV1 decoder

“We also included a dedicated AV1 decoder, enabling more efficient and high-quality video experiences for streaming services.”

Sribalan Santhanam – VP, Apple Silicon Engineering Group

After the presentation, co-author of the HLS spec Roger Pantos shared more details via the hls-interest mailing list: 

The iPhone 15 Pro (both screen sizes) will be the first Apple product to support hardware decode of AV1 content. This includes SDR, HDR10, and content protected by FairPlay Streaming, played back through either AVPlayer or AVSampleBufferDisplayLayer (including MSE on Safari).

There is no new signaling necessary for HLS, just the regular content-specific values for the CODECS and VIDEO-RANGE attributes in the MVP. If you wish, you can use the SCORE attribute to make the client prefer AV1 over other encodings (but please continue to provide renditions encoded with AVC and/or HEVC for compatibility with earlier devices and AirPlay).
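
As a sketch of that signaling, a multivariant playlist could offer the same rendition in multiple codecs and use SCORE to nudge capable clients toward AV1 (bitrates, codec strings and URIs here are illustrative):

    #EXT-X-STREAM-INF:BANDWIDTH=2800000,RESOLUTION=1920x1080,CODECS="av01.0.08M.08",SCORE=2.0
    1080p-av1.m3u8
    #EXT-X-STREAM-INF:BANDWIDTH=3500000,RESOLUTION=1920x1080,CODECS="hvc1.1.6.L120.B0",SCORE=1.5
    1080p-hevc.m3u8
    #EXT-X-STREAM-INF:BANDWIDTH=4500000,RESOLUTION=1920x1080,CODECS="avc1.640028",SCORE=1.0
    1080p-avc.m3u8

Devices that can’t decode AV1 simply ignore that variant, while AirPlay receivers can fall back to the HEVC or AVC renditions.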

A month later in October 2023, Apple announced their newest generation of desktop processors would include AV1 hardware decoders. This includes the M3, M3 Pro and M3 Max chips, meaning all new models of Macbooks, iMacs and desktop computers with an M3 processor will also support AV1 video playback.

Earlier in 2023, while everyone was waiting for Apple to officially support AV1, Meta took matters into their own hands, sharing how they brought AV1 to their Reels videos for Facebook and Instagram, including on iOS devices. This became possible through ongoing open source software decoding efficiency improvements, in particular with the dav1d decoder, developed by VideoLAN. Meta also said they believe for their video products, AV1 is the most viable codec for the coming years. The image below shows how they significantly improved visual quality with AV1 over VP9 and H.264, while keeping the bitrate constant.

Screenshot comparing video codec quality for low bandwidth streams. Source: Meta Engineering Blog

At Bitmovin we also believe in the potential of AV1 and have explored the possibilities of software decoding on mobile devices. At a recent internal hackathon, one of our senior software engineers, Roland Kákonyi, built a custom iOS player using the dav1d decoder that was able to decode and smoothly play 1080p AV1 content. We’ll continue exploring this further as a way to fill gaps in playback coverage for devices lacking hardware support.

AV1 Playback Support News in 2024

Following 2023’s big announcements from Apple, 2024 got off to a strong start with Android, Firefox and (again) Apple adding new AV1 playback support. The barriers and arguments against adopting AV1 continue falling, slowly, but surely.

Android adds dav1d decoder

In March 2024, VideoLAN’s dav1d decoder became available to all Android devices running Android 12 or higher. Apps need to opt in to using AV1 for now, but according to Google, most devices can at least keep up with software decoding of 720p 30fps video. YouTube initially opted to begin using dav1d on devices without a hardware decoder, but may have reverted that decision, likely due to battery concerns on phones. For plugged-in Android devices, dav1d is still a great option and a welcome addition to the ecosystem.

Firefox adds AV1 support in Encrypted Media Extensions

While Firefox was the first major browser to support AV1 playback, a long-standing bug (or lack of implementation) prevented DRM-protected AV1 from playing. When Apple added support to Safari for HLS + FairPlay Streaming, it meant Firefox was the only major browser that still did not support playback of premium, protected AV1 content. That changed in April 2024, when Firefox 125 added AV1 support in Encrypted Media Extensions, meaning Widevine-protected AV1 is now supported.

iPad Pro gets AV1 playback support with M4 processor

In early May 2024, Apple continued their march toward full AV1 support with the announcement of their new M4 chip, which will power the new iPad Pro. The Media Engine of M4 is the most advanced to come to iPad, supporting several popular video codecs, like H.264, HEVC, and ProRes, in addition to AV1.

Current State of AV1 Playback support

To answer the question of current playback support as thoroughly as possible, we created several sample streams with different combinations of containers, muxings and DRM. While there will be some exceptions and omissions, especially when you go back to the 2021 and 2020 models, I’ll use the emojis below to show the general level of support you can expect from these platforms and brands right now and give the full results of our direct testing in the table at the end.

  • ✅💯 Fully Supported – Successful AV1 playback with all test streams, including DRM
  • ✅ Partial or Documented Support – Successfully played at least one, but not all of our test streams OR the product documentation claims AV1 playback support, but has not yet been verified by Bitmovin
  • ❌ Not Supported – AV1 playback not supported here currently

Browsers and Operating Systems

✅💯 Chrome

✅💯 Edge

✅ Firefox

✅ Safari*

✅💯 Android 

✅ Windows

✅ iOS / macOS **

*Safari 17 or later, when a hardware decoder is present

**AV1 is also supported in Chrome and Firefox on macOS

Generally speaking, the Chrome browser and Android ecosystem handle AV1 well across phones, tablets, smart TVs and set-top boxes/streaming sticks. Unfortunately, the same cannot be said for Safari and iOS where support had been lacking until the iPhone 15 Pro announcement.

Firefox was the first major browser to support AV1, and recently Firefox 125 added support for AV1 in Encrypted Media Extensions, meaning Widevine-protected content is now playable.

The Edge browser on Windows 10 and later supports AV1, but you may need to install the free AV1 Video Extension from the Microsoft Store. 

For more details about the specific versions and less common browsers that support AV1, check out the table from CanIUse.com here
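You can also probe support at runtime rather than relying on tables. Here’s a minimal sketch, assuming MSE playback of a hypothetical 1080p AV1 stream; the codec string and stream parameters are illustrative. The powerEfficient flag is a useful hint that a hardware decoder is likely in use:

```typescript
// Sketch of a runtime AV1 capability probe using MSE and the
// MediaCapabilities API. All stream parameters here are assumptions.
async function probeAV1(): Promise<void> {
  const contentType = 'video/mp4; codecs="av01.0.08M.08"';

  // Quick boolean check for Media Source Extensions playback:
  const mseOk = MediaSource.isTypeSupported(contentType);

  // Richer answer: whether decoding is expected to be smooth and
  // power efficient (a hint that hardware decoding is available).
  const info = await navigator.mediaCapabilities.decodingInfo({
    type: 'media-source',
    video: { contentType, width: 1920, height: 1080, bitrate: 2_500_000, framerate: 30 },
  });

  console.log({ mseOk, ...info }); // { mseOk, supported, smooth, powerEfficient }
}
```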

Smart TVs

✅  Android TV

✅  Google TV

✅  Samsung

✅  Sony

✅  LG

✅  Amazon Fire TV

As mentioned, Android handles AV1 quite nicely, which also applies to Smart TVs running the Android TV and Google TV operating systems. These include Sony Google TV models from 2021 on, and many Amazon Fire TV models as far back as 2020 (Fire OS is based on Android).

Samsung TVs (and phones) from late 2020 onward have AV1 hardware decoders and were mentioned by Netflix as some of the first outlets for their 4K AV1 content. 

LG has developer documentation stating AV1 is supported for their UHD TVs and projectors running WebOS 5.0 and above, although our testing on some 2020 models was unsuccessful.

Consoles and Streaming Sticks

✅💯 Amazon Fire TV Stick 4K Max

✅ PlayStation 4 Pro

✅ Xbox One

✅ Roku Streaming Stick 4K

The PlayStation 4 Pro was also called out by Netflix as one of the targets for their 4K AV1 streams, and it takes advantage of GPU-accelerated decoding. Netflix didn’t publicly mention delivering AV1 to Xbox One, but the same decode libraries the PS4 Pro uses were first made available for Xbox One, so it should be possible.

The Amazon Fire TV Stick 4K Max has AV1 + DRM support, making it one of the cheapest and best options for giving older 4K TVs an AV1 upgrade. 

Roku is a bit of a gray area at the moment. Officially, they still do not support AV1 as an adaptive streaming video codec, but newer models like the Roku Ultra that have a USB port do support AV1 playback via USB media. There does appear to be some level of support for AV1 adaptive streaming, as the YouTube “stats for nerds” overlay reveals a combination of AV1 video and Opus audio playing on many of the popular recommended videos. Hopefully wider support is coming, but in the meantime, we did confirm successful playback of our single file “progressive” AV1 MP4 files on the Streaming Stick 4K.

YouTube “Stats for nerds” overlay showing AV1 video playing on Roku Streaming Stick 4K

Looking Ahead: Future AV1 Playback Support

Even with gaps in support on some platforms, there is plenty of opportunity to see tangible bandwidth savings and quality improvements from AV1 right now, and thankfully the future looks even brighter. Intel, AMD, Samsung and Qualcomm have all announced additional AV1 support coming at the chip level.

Will Apple add AV1 software decoding support for older devices? 

There have been several indications that Apple would eventually support AV1. Apple joined the Alliance for Open Media, the organization responsible for creating and promoting AV1, back in 2018, which many took as an early sign. We’re hopeful that with the addition of AV1 hardware decoding to the iPhone 15 Pro, iPad Pro and MacBooks, Apple will also add official HLS support and fallback software decoding for older devices that are capable.

Conclusion

While AV1 support and adoption have been on the rise and we’ve seen some encouraging announcements, universal support like we have with H.264 just isn’t there yet. That means AV1 will need to be part of a multi-codec approach for the foreseeable future, but that’s ok! Not that long ago, it took millions of views to offset the higher encoding costs of AV1, but with recent improvements, we’ve seen the break-even point drop to as low as 4,000 views. So for a whole lot of content, encoding with AV1 can already save you money today, and those savings will only increase as more supporting devices become available.
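To make the break-even math concrete, here’s a back-of-the-envelope sketch. Every number in it is a hypothetical placeholder, not Bitmovin pricing or a measured figure; substitute your own encoding premium, per-view delivery volume, bitrate savings and CDN rate:

```typescript
// Hypothetical break-even calculation: AV1 pays off once cumulative CDN
// savings outgrow the one-time extra encoding cost. All inputs are
// illustrative assumptions.
const extraEncodingCostUsd = 40;   // hypothetical AV1 encoding premium per title
const gbPerViewH264 = 1.0;         // hypothetical average delivery per view (GB)
const av1BitrateSavings = 0.4;     // assume AV1 ships ~40% fewer bits than H.264
const cdnCostPerGbUsd = 0.025;     // hypothetical CDN rate

const savingsPerView = gbPerViewH264 * av1BitrateSavings * cdnCostPerGbUsd; // $0.01
const breakEvenViews = Math.ceil(extraEncodingCostUsd / savingsPerView);    // 4,000

console.log(`Break-even after ~${breakEvenViews} views`);
```

With these placeholder inputs the sketch happens to land right at 4,000 views, but the point is the shape of the calculation, not the specific numbers.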

Ready to get started with AV1 encoding? You can try it for free with a Bitmovin Trial, sign up here!

[Table: full results of our direct AV1 playback testing. Columns: Chrome, Edge, Firefox, Safari, Android Native, Android Web, iOS, Fire TV Max, Fire TV Max Web (Silk Browser), Roku Streaming Stick 4K, Samsung Tizen (2020-2021). Rows: fMP4 (DASH); fMP4 with Widevine and PlayReady (DASH); single file “progressive” MP4 (.mp4); single file “progressive” MP4 + Widevine (DASH); WebM (DASH); WebM + Widevine (DASH); single file “progressive” WebM (DASH); single file “progressive” WebM + Widevine (DASH); fMP4 (HLS); fMP4 + FairPlay (HLS). Per-platform pass/fail marks were presented as icons in the original table.]

The AI Video Research Powering a Higher Quality Future https://bitmovin.com/blog/ai-video-research/ Sun, 05 May 2024 22:06:17 +0000

*This post was originally published in June 2023. It was updated in May 2024 with more recent research publications and updates.*

This post will summarize the current state of Artificial Intelligence (AI) applications for video in 2024, including recent progress and announcements. We’ll also take a closer look at AI video research and collaboration between Bitmovin and the ATHENA laboratory that has the potential to deliver huge leaps in quality improvements and bring an end to playback stalls and buffering. This includes ATHENA’s FaRes-ML, which was recently granted a US Patent. Keep reading to learn more!

AI for video at NAB 2024

At NAB 2024, the AI hype train continued gaining momentum and we saw more practical applications of AI for video than ever before. We saw various uses of AI-powered encoding optimization, Super Resolution upscaling, automatic subtitling and translations, and generative AI video descriptions and summarizations. Bitmovin also presented some new AI-powered solutions, including our Analytics Session Interpreter, which won a Best of Show award from TV Technology. It uses machine learning and large language models to generate a summary, analysis and recommendations for every viewer session. The early feedback has been positive and we’ll continue to refine and add more capabilities that will help companies better understand and improve their viewers’ experience.

L to R: Product Manager Jacob Arends, CEO Stefan Lederer and Engineer Peter Eder accepting the award for Bitmovin’s AI-powered Analytics Session Interpreter

Other AI highlights from NAB included Jan Ozer’s “Beyond the Hype: A Critical look at AI in Video Streaming” presentation, NETINT and Ampere’s live subtitling demo using OpenAI Whisper, and Microsoft and Mediakind sharing AI applications for media and entertainment workflows. You can find more detail about these sessions and other notable AI solutions from the exhibition floor in this post.

FaRes-ML granted US Patent

For a few years before this recent wave of interest, Bitmovin and our ATHENA project colleagues have been researching the practical applications of AI for video streaming services. It’s something we’re exploring from several angles, from boosting visual quality and upscaling older content to more intelligent video processing for adaptive bitrate (ABR) switching. One of the projects that was first published in 2021 (and covered below in this post) is Fast Multi-Resolution and Multi-Rate Encoding for HTTP Adaptive Streaming Using Machine Learning (FaRes-ML). We’re happy to share that FaRes-ML was recently granted a US Patent! Congrats to the authors, Christian Timmerer, Hadi Amirpour, Ekrem Çetinkaya and the late Prof. Mohammad Ghanbari, who sadly passed away earlier this year.

Recent Bitmovin and ATHENA AI Research

In this section, I’ll give a short summary of projects that were shared and published since the original publication of this blog, and link to details for anyone interested in learning more. 

Generative AI for Adaptive Video Streaming

Presented at the 2024 ACM Multimedia Systems Conference, this research proposal outlines the opportunities at the intersection of advanced AI algorithms and digital entertainment for elevating quality, increasing user interactivity and improving the overall streaming experience. Research topics that will be investigated include AI generated recommendations for user engagement and AI techniques for reducing video data transmission. You can learn more here.

DeepVCA: Deep Video Complexity Analyzer

The ATHENA lab developed and released the open-source Video Complexity Analyzer (VCA) to extract and predict video complexity faster than existing methods like ITU-T’s Spatial Information (SI) and Temporal Information (TI). DeepVCA extends VCA with deep neural networks to accurately predict video encoding parameters, like bitrate, and the encoding time of video sequences. The spatial complexity of the current and previous frames is used to rapidly predict the temporal complexity of a sequence, and the results show significant improvements over unsupervised methods. You can learn more and access the source code and dataset here.

DeepVCA’s spatial and temporal complexity prediction process
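For context on the baseline these tools are measured against, here’s a sketch of the classic Temporal Information (TI) measure from ITU-T P.910: the maximum over time of the per-frame standard deviation of pixel differences (SI is computed analogously on Sobel-filtered frames). Representing frames as raw luma planes is an assumption for illustration:

```typescript
// Sketch of ITU-T P.910's Temporal Information (TI) measure.
// Frames are assumed to be grayscale luma planes of equal size.
type Frame = Uint8Array; // width * height luma samples

function stddev(values: Float64Array): number {
  const mean = values.reduce((a, b) => a + b, 0) / values.length;
  const variance = values.reduce((a, b) => a + (b - mean) ** 2, 0) / values.length;
  return Math.sqrt(variance);
}

function temporalInformation(frames: Frame[]): number {
  let ti = 0;
  for (let n = 1; n < frames.length; n++) {
    // M_n: pixel-wise difference between consecutive frames
    const diff = new Float64Array(frames[n].length);
    for (let i = 0; i < diff.length; i++) {
      diff[i] = frames[n][i] - frames[n - 1][i];
    }
    // TI = maximum over time of the spatial standard deviation of M_n
    ti = Math.max(ti, stddev(diff));
  }
  return ti;
}
```

Computing this exhaustively over every pixel of every frame is exactly the kind of cost VCA and DeepVCA aim to avoid.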

DIGITWISE: Digital Twin-based Modeling of Adaptive Video Streaming Engagement

DIGITWISE leverages the concept of a digital twin, a digital replica of an actual viewer, to model user engagement based on past viewing sessions. The digital twin receives input about streaming events and uses supervised machine learning to predict user engagement for a given session. The system consists of a data processing pipeline, machine learning models acting as digital twins, and a unified model to predict engagement (XGBoost). DIGITWISE demonstrates the importance of personal user sensitivities, reducing user engagement prediction error by up to 5.8% compared to non-user-aware models. It can also be used to optimize content provisioning and delivery by identifying the features that maximize engagement, providing an average engagement increase of up to 8.6%. You can learn more here.

System overview of DIGITWISE user engagement prediction

Previous Bitmovin and ATHENA AI Research

Better quality with neural network-driven Super Resolution upscaling

The first group of ATHENA publications we’re looking at all involve the use of neural networks to drive visual quality improvements using Super Resolution upscaling techniques. 

DeepStream: Video streaming enhancements using compressed deep neural networks

Deep learning-based approaches keep getting better at enhancing and compressing video, but the quality of experience (QoE) improvements they offer are usually only available to devices with GPUs. This paper introduces DeepStream, a scalable, content-aware per-title encoding approach that supports both CPU-only and GPU-available end-users. To support backward compatibility, DeepStream constructs a bitrate ladder based on any existing per-title encoding approach, with an enhancement layer for GPU-available devices. The added layer contains lightweight video super-resolution deep neural networks (DNNs) for each bitrate-resolution pair of the bitrate ladder. For GPU-available end-users, this means ~35% bitrate savings at equivalent PSNR and VMAF quality scores, while CPU-only users receive the video as usual. You can learn more here.

DeepStream system architecture
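As a rough illustration of the backward-compatible idea (these types are ours, not DeepStream’s actual data structures), you can picture each rung of an ordinary per-title ladder optionally carrying a lightweight SR model that only GPU-capable clients fetch:

```typescript
// Hypothetical sketch of a ladder with an optional SR enhancement layer.
interface Rung {
  width: number;
  height: number;
  bitrateKbps: number;
  srModelUrl?: string; // enhancement layer; CPU-only clients simply ignore it
}

function selectPlayback(ladder: Rung[], throughputKbps: number, hasGpu: boolean) {
  // Plain ABR: pick the highest rung the measured throughput can sustain
  const rung =
    [...ladder]
      .sort((a, b) => b.bitrateKbps - a.bitrateKbps)
      .find(r => r.bitrateKbps <= throughputKbps) ?? ladder[0];

  // GPU-capable clients additionally run the rung's SR network after decode
  return { rung, upscaleWith: hasGpu ? rung.srModelUrl : undefined };
}
```

The key design point is that the base ladder stays a normal, spec-compliant stream, so legacy players lose nothing.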

LiDeR: Lightweight video Super Resolution for mobile devices

Although DNN-based Super Resolution methods like DeepStream show huge improvements over traditional methods, their computational complexity makes it hard to use them on devices with limited power, like smartphones. Recent improvements in mobile hardware, especially GPUs, have made it possible to use DNN-based techniques, but existing DNN-based Super Resolution solutions are still too complex. This paper proposes LiDeR, a lightweight video Super Resolution network specifically tailored toward mobile devices. Experimental results show that LiDeR can achieve Super Resolution performance competitive with state-of-the-art networks while significantly improving execution speed. You can learn more here or watch the video presentation from an IEEE workshop.

Quantitative results comparing Super Resolution methods. LiDeR achieves near equivalent PSNR and SSIM quality scores while running ~3 times faster than its closest competition.

Super Resolution-based ABR for mobile devices

This paper introduces another new lightweight Super Resolution network, SR-ABR Net, that can be deployed on mobile devices to upscale low-resolution/low-quality videos in real time. It also introduces a novel ABR algorithm, WISH-SR, that leverages Super Resolution networks at the client to improve the video quality depending on the client’s context. By taking into account device properties, video characteristics, and user preferences, it can significantly boost the visual quality of the delivered content while reducing both bandwidth consumption and the number of stalling events. You can learn more here or watch the video presentation from Mile High Video.

System architecture for proposed Super Resolution based adaptive bitrate algorithm

Less buffering and higher QoE with applied machine learning

The next group of research papers involve applying machine learning at different stages of the video workflow to improve QoE for the end user.

FaRes-ML: Fast multi-resolution, multi-rate encoding

Fast multi-rate encoding approaches aim to address the challenge of encoding multiple representations from a single video by re-using information from already encoded representations. In this paper, a convolutional neural network is used to speed up both multi-rate and multi-resolution encoding for ABR streaming. Experimental results show that the proposed method for multi-rate encoding can reduce the overall encoding time by 15.08% and parallel encoding time by 41.26%. Simultaneously, the proposed method for multi-resolution encoding can reduce the encoding time by 46.27% for the overall encoding and 27.71% for the parallel encoding on average. You can learn more here.

FaRes-ML flowchart

ECAS-ML: Edge assisted adaptive bitrate switching

As video streaming traffic in mobile networks increases, utilizing edge computing support is a key way to improve the content delivery process. At an edge node, we can deploy ABR algorithms with a better understanding of network behavior and access to radio and player metrics. This project introduces ECAS-ML, Edge Assisted Adaptation Scheme for HTTP Adaptive Streaming with Machine Learning. It uses machine learning techniques to analyze radio throughput traces and balance the tradeoffs between bitrate, segment switches and stalls to deliver a higher QoE, outperforming other client-based and edge-based ABR algorithms. You can learn more here.

ECAS-ML system architecture
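ABR research like this typically optimizes an explicit QoE objective. As a rough, hypothetical illustration (not ECAS-ML’s actual model), a session score might reward bitrate while penalizing stalls and quality switches, the three factors the paper balances:

```typescript
// Hedged sketch of a standard HAS QoE objective; the penalty weights
// are illustrative assumptions.
interface SegmentChoice { bitrateKbps: number; stallSeconds: number }

function qoeScore(
  session: SegmentChoice[],
  stallPenalty = 4.0,   // hypothetical weight per stalled second
  switchPenalty = 1.0,  // hypothetical weight per unit of quality change
): number {
  let score = 0;
  for (let i = 0; i < session.length; i++) {
    const quality = Math.log(session[i].bitrateKbps); // diminishing returns on bitrate
    score += quality - stallPenalty * session[i].stallSeconds;
    if (i > 0) {
      const prev = Math.log(session[i - 1].bitrateKbps);
      score -= switchPenalty * Math.abs(quality - prev); // smoothness term
    }
  }
  return score;
}
```

Running this kind of optimization at the edge, with radio-level visibility, is what lets ECAS-ML outperform purely client-side schemes.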

Challenges ahead

The road from research to practical implementation is not always quick or direct, and sometimes not possible at all, but fortunately that’s an area where Bitmovin and ATHENA have been working together closely for several years now. Going back to our initial implementation of HEVC encoding in the cloud, we’ve had success using small trials and experiments with Bitmovin’s clients and partners to provide real-world feedback for the ATHENA team, informing the next round of research and experimentation toward creating viable, game-changing solutions. This innovation-to-product cycle is already in progress for the research mentioned above, with promising early quality and efficiency improvements.

Many of the advancements we’re seeing in AI are the result of aggregating lots and lots of processing power, which in turn means lots of energy use. Even with processors becoming more energy efficient, the sheer volume involved in large-scale AI applications means energy consumption can be a concern, especially with increasing focus on sustainability and energy efficiency.  From that perspective, for some use cases (like Super Resolution) it will be worth considering the tradeoffs between doing server-side upscaling during the encoding process and client-side upscaling, where every viewing device will consume more power.  

Learn more

Want to learn more about Bitmovin’s AI video research and development? Check out the links below. 

Analytics Session Interpreter webinar

AI-powered video Super Resolution and Remastering

Super Resolution blog series

Super Resolution with Machine Learning webinar

Athena research

MPEG Meeting Updates 

GAIA project blogs

AI Video Glossary

Machine Learning – Machine learning is a subfield of artificial intelligence that deals with developing algorithms and models capable of learning and making predictions or decisions based on data. It involves training these algorithms on large datasets to recognize patterns and extract valuable insights. Machine learning has diverse applications, such as image and speech recognition, natural language processing, and predictive analytics.

Neural Networks – Neural networks are algorithms loosely inspired by the structure of the human brain. They are composed of layers of artificial neurons that analyze and process data. In the context of video streaming, neural networks can be leveraged to optimize video quality, enhance compression techniques, and improve video annotation and content recommendation systems, resulting in a more immersive and personalized streaming experience for users.

Super Resolution – Super Resolution upscaling is an advanced technique used to enhance the quality and resolution of images or videos. It involves using complex algorithms and computations to analyze the available data and generate additional details. By doing this, the image or video appears sharper, clearer, and more detailed, creating a better viewing experience, especially on 4K and larger displays. 

Graphics Processing Unit (GPU) – A GPU is a specialized hardware component that focuses on handling and accelerating graphics-related computations. Unlike the central processing unit (CPU), which handles general-purpose tasks, the GPU is specifically designed for parallel processing and rendering complex graphics, such as images and videos. GPUs are widely used in various industries, including gaming, visual effects, scientific research, and artificial intelligence, due to their immense computational power.

Video Understanding – Video understanding is the ability to analyze and comprehend the information present in a video. It involves breaking down the visual content, movements, and actions within the video to make sense of what is happening.

NAB Video AI Highlights https://bitmovin.com/blog/nab-video-ai/ Fri, 26 Apr 2024 18:47:09 +0000

For the past few years, AI has been one of the top buzzwords at the NAB Show. While other hot topics like “web3” seem to have peaked and faded, interest in video AI has continued to grow, and this year there were more practical solutions being showcased than ever before. A personal highlight for Bitmovin was winning a TV Technology Best of Show award for our AI-powered Analytics Session Interpreter. Keep reading to learn more about other interesting and useful applications of AI that we saw at NAB 2024.

NAB Video AI Highlights: 2024

While there was some variation in implementation and features, the majority of the AI solutions I encountered at NAB fell into one of these categories:

  • Generative AI (genAI) for video creation, post-production, or summaries and descriptions
  • Automatic subtitling and captioning with multi-language translations
  • Object or event detection and indexing
  • Video quality enhancement

This summary is definitely not exhaustive, but highlights some of the things that stood out to me on the show floor and in the conference sessions. Please let us know in the comments if you saw anything else noteworthy.

Booths and Exhibits

Adobe

Adobe has been showing AI-powered editing and post-production tools as part of their creative suite for a couple of years now, and they seem to be continuously improving. They teased a new Firefly video model, coming to Premiere Pro later this year, that will enable a few new Photoshop-like tools for video. Generative Extend will allow you to extend clips with AI-generated frames for perfectly timed edits, and the new Firefly model will also enable object removal, addition, and replacement. They’ve also implemented content credentials into the platform that will signal when generative AI was used in the creation process and which models were used, as they prepare to support 3rd-party genAI models like OpenAI’s Sora.

Amazon Web Services (AWS)

AWS had one of the busiest booths in the West Hall, showcasing several AI-powered solutions, including genAI for creating personalized ads and Intel’s Video Super Resolution upscaling. But they also had the most eye-catching and fun application of AI over in the South Hall: a genAI golf simulator where you could design and play your own course.

AWS GenAI-powered golf simulator

axle.ai

Axle.ai was sharing their face, object, and logo recognition technology that can index recognized objects and search for matching objects in other videos or clips. Their software also has automatic voice transcription and translation capabilities. It can run either on-premises or in the cloud and integrates with Adobe Premiere, Final Cut Pro and other editing suites. While other companies offer similar capabilities, they stood out as being particularly focused on these use cases.

BLUEDOT

BLUEDOT was showcasing a few different solutions for improving QoE in the encoding and processing stage. Their DeepField-SR video super resolution product uses a proprietary deep neural network to upscale video up to 4K resolution, leveraging FPGAs. They were also showing AI-driven perceptual quality optimized video encoding.

BLUEDOT’s AI-driven perceptual quality optimization - image source: blue-dot.io

Twelve Labs

Twelve Labs was featuring their multimodal AI for Media & Entertainment workflows, aiming to bring human-like understanding to video content. They use both video and audio information to inform object and event detection and indexing.  This enables you to easily find moments in a video, like when a certain player scores or when a product is mentioned. They also power generative text descriptions of videos and clips. Their solution seemed more flexible than others I saw and can be integrated into media asset management systems, editing software or OTT streaming workflows.

Conference Sessions and Presentations

Beyond the Hype: A Critical look at AI in Video Streaming

In this session, as the title suggests, Jan Ozer took a close look at the current state of AI applications for video streaming workflows. He conducted several interviews with executives and product leaders ahead of NAB and shared his notes and links to the full interviews. He also called out a few times that many of the companies featured, including Bitmovin, have been researching and working on AI-powered video solutions for several years now, even before the current wave of hype. He shared Bitmovin’s new Analytics session interpreter and our Super Resolution capabilities, which you can hear more about in his interview with our VP of Product, Reinhard Grandl.

Jan Ozer’s interview with Bitmovin’s Reinhard Grandl for his Beyond the Hype NAB presentation

Some other things that stood out for me included Interra Systems’ BATON Captions, which uses natural language processing to break caption text into lines in a more natural, human-readable way. This is a small, subtle feature that can make a big difference in improving accessibility and the viewer experience, and one I haven’t heard anyone else focus on. DeepRender also caught my attention with their claims of an AI-based video codec that will have 45% better compression than VVC by the end of 2024. That’s a really bold claim and I’ll be watching to see if they live up to the hype. Video of the session is available here, thanks to Dan Rayburn and the Streaming Summit.

Running OpenAI’s Whisper Automatic Speech Recognition on a Live Video Transcoding Server

This was a joint presentation led by NETINT’s COO Alex Liu and Ampere’s Chief Evangelist Sean Varley. They presented a practical demo of real-time live transcoding and subtitling using NETINT’s T1U Video Processing Unit (VPU) together with Ampere’s Altra Max CPU running OpenAI Whisper. The NETINT VPU is capable of creating dozens of simultaneous adaptive bitrate outputs with H.264, H.265 and AV1 codecs. The Ampere processor was being positioned as a more environmentally-friendly option for AI inference workflows, consuming less power than similarly capable GPUs. While there were some hiccups with the in-room A/V system, the live captioning demo was impressive and worked very well. Video of the session is available here, again thanks to Dan Rayburn and the Streaming Summit.

Sean Varley and Alex Liu presenting NETINT and Ampere’s Live transcoding and subtitling workflow at NAB 2024

Leveraging Azure AI for Media Production and Content Monetization Workflows

Microsoft’s Andy Beach and MediaKind’s Amit Tank led this discussion and showcase of using genAI in media and entertainment workflows. They discussed how AI can help with each part of the production and delivery workflow to boost monetization. This included things like brand detection, contextual ad placements, metadata automation, translations, captioning and personalization. One area they discussed that I hadn’t heard anyone else talk about was using AI for content localization, not just for language translation via captions and dubbing, but for compliance with local and regional norms and in some cases regulations. For example, some areas and countries may prefer or even require removal or censorship of things like alcohol and drug use or guns and excessive violence, so AI can help automate content preparation in different ways for a global audience. They also shared their own personal “most-used” AI applications, which included Microsoft’s Copilot and related AI add-ons to Teams and other Microsoft products.

Video AI use cases across the media supply chain, presented by Microsoft and MediaKind at NAB 2024

Did you see an interesting or innovative use of AI at NAB that wasn’t mentioned here? Please let us know in the comments!

AI-powered Video Super Resolution and Remastering https://bitmovin.com/blog/ai-video-super-resolution/ Fri, 12 Apr 2024 15:18:37 +0000

AI has been the hot buzzword in tech for the past couple of years, and we’re starting to see more and more practical applications for video emerging from the hype, like automatic closed-captioning and language translation, automated descriptions and summaries, and AI video Super Resolution upscaling. Bitmovin has especially focused on how AI can provide value for our customers, releasing our AI Analytics Session Interpreter earlier this year, and we’re looking closely at several other areas of the end-to-end video workflow.

We’re very proud of how our encoder maintains the visual quality of the source files, while significantly reducing the amount of data used, but now we’re exploring how we can actually improve on the quality of the source file for older and standard definition content. Super Resolution implementations have come a long way in the past few years and have the potential to give older content new life and make it look amazing on Ultra-High Definition screens. Keep reading to learn about Bitmovin’s progress and results. 

What is video Super Resolution and how does it work? 

Super Resolution refers to the process of enhancing the quality or increasing the resolution of an image or video beyond its original resolution. The original methods of upscaling images and video involved upsampling by using mathematical functions like bilinear and bicubic interpolation to predict new data points in between sampled data points. Some techniques used multiple lower-resolution images or video frames to create a composite higher resolution image or frame. Now AI and machine learning (ML) based methods involve training deep neural networks (DNNs) with large libraries of low and high-resolution image pairs. The networks learn to map the differences between the pairs, and after enough training they are able to accurately generate a high-resolution image from a lower-resolution one. 
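To make that baseline concrete, here’s a minimal bilinear upscaler for a single grayscale plane; each output pixel is a weighted average of the four nearest source samples. Super Resolution DNNs replace exactly this hand-crafted averaging with learned reconstruction:

```typescript
// Minimal bilinear upscaler for a grayscale image plane, shown only to
// illustrate the classic interpolation baseline.
function bilinearUpscale(
  src: Uint8Array, w: number, h: number, scale: number,
): Uint8Array {
  const W = Math.round(w * scale), H = Math.round(h * scale);
  const dst = new Uint8Array(W * H);
  for (let y = 0; y < H; y++) {
    for (let x = 0; x < W; x++) {
      // Map the destination pixel back into source coordinates
      const sx = x / scale, sy = y / scale;
      const x0 = Math.min(Math.floor(sx), w - 1), x1 = Math.min(x0 + 1, w - 1);
      const y0 = Math.min(Math.floor(sy), h - 1), y1 = Math.min(y0 + 1, h - 1);
      const fx = sx - x0, fy = sy - y0;
      // Weighted average of the four nearest source samples
      const top = src[y0 * w + x0] * (1 - fx) + src[y0 * w + x1] * fx;
      const bot = src[y1 * w + x0] * (1 - fx) + src[y1 * w + x1] * fx;
      dst[y * W + x] = Math.round(top * (1 - fy) + bot * fy);
    }
  }
  return dst;
}
```

Interpolation like this can only smooth between existing samples, which is why it tends to look soft; learned methods can plausibly synthesize detail that was never sampled.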

Bitmovin’s AI video Super Resolution exploration and testing

Super Resolution upscaling is something Bitmovin has been investigating and testing with customers for several years now. We published a 3-part deep dive back in 2020 that goes into detail about the principles behind Super Resolution, how it can be incorporated into video workflows, and the practical applications and results. We won’t fully rehash those posts here, so check them out if you’re interested in the details. But one of the conclusions we came to back then was that Super Resolution is an especially well-suited application for machine learning techniques. This is even more true now, as GPUs have gotten exponentially more powerful over the past 4 years, while becoming more affordable and accessible as cloud resources.

Nvidia’s GPU computation capabilities over the last 8 years – source: Nvidia GTC 2024 keynote 

ATHENA Super Resolution research

Bitmovin’s ATHENA research lab partner has also been looking into various AI video Super Resolution approaches. In a proposed method called DeepStream, they demonstrated how a DNN enhancement-layer could be included with a stream to perform Super Resolution upscaling on playback devices with capable GPUs. The results showed this method could save ~35% bitrate while delivering equivalent quality. See this link for more detail. 


Other Super Resolution techniques the ATHENA team has looked at involve upscaling on mobile devices that typically can’t take advantage of DNNs due to lack of processing power and power consumption/battery concerns. Lightweight Super Resolution networks specifically tailored for mobile devices like LiDeR and SR-ABR Net have shown positive early outcomes and performance. 

AI-powered video enhancement with Bitmovin partner Pixop

Bitmovin partner Pixop specializes in AI and ML video enhancement and upscaling. They’re also cloud native and fellow members of NVIDIA’s Inception Startup Program. They offer several AI-powered services and filters, including restoration, Super Resolution upscaling, denoising, deinterlacing, film grain and frame rate conversion, automating processes that used to be painstaking and time-consuming. We’ve found them to be very complementary to Bitmovin’s VOD Encoding and have begun trials with Bitmovin customers.

One application we’re exploring is digital remastering of historic content. We’ve been able to take lower-resolution, grainy and generally lower-quality content (by today’s standards) through Pixop’s upscaling and restoration, with promising results. The encoded output was not only higher resolution; the application of cropping, graining and color correction also produced a more visually appealing result, allowing our customer to re-monetize their aged content. The image below shows a side-by-side comparison of remastered content with finer details.

Side-by-side comparison of AI remastered content

Interested in giving your older content new life with the power of AI video Super Resolution? Get in touch here.

Related Links

Blog: Super Resolution Tech Deep Dive Part 1

Blog: Super Resolution Tech Deep Dive Part 2

Blog: Super Resolution Tech Deep Dive Part 3

Blog: AI Video Research

ATHENA research lab – Super Resolution projects and publications

pixop.com
