video research – Bitmovin

ATHENA’s first 5 years of research and innovation

Andy Francis — Mon, 19 Aug 2024 02:17:57 +0000

Since forming in October 2019, the Christian Doppler Laboratory ATHENA at Universität Klagenfurt, run by Bitmovin co-founder Dr. Christian Timmerer, has been advancing research and innovation for adaptive bitrate (ABR) streaming technologies. Over the past five years, the lab has addressed critical challenges in video streaming from encoding and delivery to playback and end-to-end quality of experience. They are breaking new ground using edge computing, machine learning, neural networks and generative AI for video applications, contributing significantly to both academic knowledge and industry applications as Bitmovin’s research partner.

In this blog, we’ll take a look at the highlights of the ATHENA lab’s work over the past five years and its impact on the future of the streaming industry.

Publications

ATHENA has made its mark with high-impact publications on the topics of multimedia, signal processing, and computer networks. Their research has been featured in prestigious journals such as IEEE Communications Surveys & Tutorials and IEEE Transactions on Multimedia. With 94 papers published or accepted by the time of the 5-year evaluation, the lab has established itself as a leader in video streaming research.

ATHENA also contributed to reproducibility in research. Their open source tools Video Complexity Analyzer and LLL-CAdViSE have already been used by Bitmovin and others in the industry. Their open, multi-codec UHD dataset enables research and development of multi-codec playback solutions for 8K video.

ATHENA has also looked at applications of AI in video coding and streaming, something that will become more of a focus over the next two years. You can read more about ATHENA’s AI video research in this blog post.

Patents

But it’s not all just theoretical research. The ATHENA lab has successfully translated its findings into practical solutions, filing 16 invention disclosures and 13 patent applications. As of publication, 6 patents have been granted:

Workflow diagram for Fast Multi-Rate Encoding using convolutional neural networks. More detail available here.

PhDs

ATHENA has also made an educational impact, guiding its inaugural cohort of seven PhD students to successful dissertation defenses, with research topics ranging from edge computing in video streaming to machine learning applications in video coding.

Dr. Alireza Erfanian: “Optimizing QoE and Latency of Video Streaming using Edge Computing and In-Network Intelligence”, May 25, 2023
Dr. Ekrem Çetinkaya: “Video Coding Enhancements for HTTP Adaptive Streaming using Machine Learning”, June 7, 2023
Dr. Minh Nguyen: “Policy-driven Dynamic HTTP Adaptive Streaming Player Environment”, June 30, 2023
Dr. Jesús Aguilar Armijo: “Multi-access Edge Computing for Adaptive Video Streaming”, July 10, 2023
Dr. Reza Farahani: “Network-Assisted Delivery of Adaptive Video Streaming Services through CDN, SDN, and MEC”, August 22, 2023
Dr. Vignesh V Menon: “Content-adaptive Video Coding for HTTP Adaptive Streaming”, January 15, 2024
Dr. Babak Taraghi: “End-to-end Quality of Experience Evaluation for HTTP Adaptive Streaming”, July 10, 2024

There are also two postdoctoral scholars in the lab who have made significant contributions and progress.

Dr. Hadi Amirpour, “Video Coding for Efficient HTTP Adaptive Streaming”, February 8, 2024
Dr. Farzad Tashtarian, “How to Optimize Dynamic Adaptive Video Streaming? Challenges and Solutions”, February 27, 2023 & “End-to-End Adaptive Video Streaming Optimization”, June 26, 2024

Practical applications with Bitmovin

As Bitmovin’s academic partner, ATHENA plays a critical role in developing and enhancing technologies that can differentiate our streaming solutions. As ATHENA’s company partner, Bitmovin helps guide and test practical applications of the research, with regular check-ins for in-depth discussions about new innovations and potential technology transfers. The collaboration has resulted in several advancements over the years, including recent projects like CAdViSE and WISH ABR.

CAdViSE

CAdViSE (Cloud-based Adaptive Video Streaming Evaluation) is a framework for automated testing of media players. It lets you test how different players and ABR configurations perform and react to fluctuations in network parameters. Bitmovin is using CAdViSE to evaluate the performance of different custom ABR algorithms. The code is available in this GitHub repo.

WISH ABR

WISH stands for Weighted Sum model for HTTP Adaptive Streaming and it allows for customization of ABR logic for different devices and applications. WISH’s logic is based on a model that weighs bandwidth, buffer and quality costs for playing back a segment. By setting weights for the importance of those metrics, you create a custom ABR algorithm, optimized for your content and use case. You can learn more about WISH ABR in this blog post.

Decision process for WISH ABR, weighing data/bandwidth cost, buffer cost, and quality cost of each segment.
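
To make the weighted-sum idea concrete, here is a minimal sketch of how such a selection could work. It is not the published WISH model or the Bitmovin Player implementation: the three cost formulas, the normalized quality score, and all names (`Representation`, `segment_cost`, `choose_representation`) are simplifying assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    bitrate_kbps: float  # encoded bitrate of this rendition
    quality: float       # assumed normalized quality score in [0, 1]


def segment_cost(rep, throughput_kbps, buffer_s, segment_s, weights):
    """Weighted sum of three illustrative (assumed) penalties for one rendition."""
    w_bandwidth, w_buffer, w_quality = weights
    # Bandwidth cost: share of the measured throughput this rendition would use.
    bandwidth_cost = rep.bitrate_kbps / throughput_kbps
    # Buffer cost: estimated download time relative to the media already buffered.
    download_s = rep.bitrate_kbps * segment_s / throughput_kbps
    buffer_cost = download_s / max(buffer_s, 1e-6)
    # Quality cost: how far the rendition falls short of the best score.
    quality_cost = 1.0 - rep.quality
    return (w_bandwidth * bandwidth_cost
            + w_buffer * buffer_cost
            + w_quality * quality_cost)


def choose_representation(ladder, throughput_kbps, buffer_s, segment_s, weights):
    """Pick the rendition with the lowest weighted cost for the next segment."""
    return min(
        ladder,
        key=lambda rep: segment_cost(rep, throughput_kbps, buffer_s, segment_s, weights),
    )


# Hypothetical three-step bitrate ladder.
ladder = [
    Representation(500, 0.3),
    Representation(1500, 0.6),
    Representation(4000, 1.0),
]

# Same network conditions, two different weightings (bandwidth, buffer, quality).
quality_first = choose_representation(ladder, 3000, 10, 4, (0.1, 0.1, 0.8))
data_saver = choose_representation(ladder, 3000, 10, 4, (0.8, 0.1, 0.1))
```

With identical throughput and buffer levels, the quality-heavy weighting selects the top rendition while the bandwidth-heavy weighting selects the lowest one, which is the kind of per-use-case tuning the weights are meant to enable.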

Project spinoffs

The success of ATHENA has led to three spinoff projects.

APOLLO

APOLLO is funded by the Austrian Research Promotion Agency FFG and is a cooperative project between Bitmovin and Alpen-Adria-Universität Klagenfurt. The main objective of APOLLO is to research and develop an intelligent video platform for HTTP adaptive streaming which provides distribution of video transcoding across large and small-scale computing environments, using AI and ML techniques for the actual video distribution.

GAIA

GAIA is also funded by the Austrian Research Promotion Agency FFG and is a cooperative project between Bitmovin and Alpen-Adria-Universität Klagenfurt. The GAIA project researches and develops a climate-friendly adaptive video streaming platform that provides complete energy awareness and accountability along the entire delivery chain. It also aims to reduce energy consumption and GHG emissions through advanced analytics and optimizations on all phases of the video delivery chain.

SPIRIT

SPIRIT (Scalable Platform for Innovations on Real-time Immersive Telepresence) is an EU Horizon Europe-funded innovation action. It brings together cutting-edge companies and universities in the field of telepresence applications with advanced and complementary expertise in extended reality (XR) and multimedia communications. SPIRIT’s mission is to create Europe’s first multisite and interconnected framework capable of supporting a wide range of application features in collaborative telepresence.

What’s next

Over the next two years, the ATHENA project will focus on advancing deep neural network and AI-driven techniques for image and video coding. This work will include making video coding more energy- and cost-efficient, exploring immersive formats like volumetric video and holography, and enhancing QoE while being mindful of energy use. Other focus areas include AI-powered, energy-efficient live video streaming and generative AI applications for adaptive streaming.

Get in touch or let us know in the comments if you’d like to learn more about Bitmovin and ATHENA’s research and innovation, AI or sustainability related projects.

The post ATHENA’s first 5 years of research and innovation appeared first on Bitmovin.

PhD video research: From the ATHENA lab to Bitmovin products

Andy Francis — Fri, 10 Nov 2023 18:16:47 +0000

Introduction

The story of Bitmovin began with video research and innovation back in 2012, when our co-founders Stefan Lederer and Christopher Mueller were students at Alpen-Adria-Universität (AAU) Klagenfurt. Together with their professor Dr. Christian Timmerer, the three co-founded Bitmovin in 2013, with their research providing the foundation for Bitmovin’s groundbreaking MPEG-DASH player and Per-Title Encoding. Five years later in 2018, a joint project between Bitmovin and AAU called ATHENA was formed, with a new laboratory and research program that would be led by Dr. Timmerer. The aim of ATHENA was to research and develop new approaches, tools and evaluations for all areas of HTTP adaptive streaming, including encoding, delivery, playback and end-to-end quality of experience (QoE). Bitmovin could then take advantage of the knowledge gained to further innovate and enhance its products and services. In the late spring and summer of 2023, the first cohort of ATHENA PhD students completed their projects and successfully defended their dissertations. This post will highlight their work and its potential applications.

Bitmovin co-founders Stefan Lederer, Christopher Mueller, and Christian Timmerer celebrating the opening of the Christian Doppler ATHENA Laboratory with Martin Gerzabek and Ulrike Unterer from the Christian Doppler Research Association. (Photo: Daniel Waschnig)

Video Research Projects

Optimizing QoE and Latency of Live Video Streaming Using Edge Computing and In-Network Intelligence

Dr. Alireza Erfanian

The work of Dr. Erfanian focused on leveraging edge computing and in-network intelligence to enhance the QoE and reduce end-to-end latency in live ABR streaming. The research also addresses improving transcoding performance and optimizing costs associated with running live streaming services and network backhaul utilization.

Optimizing resource utilization – Two new methods, ORAVA and OSCAR, utilize edge computing, network function virtualization, and software-defined networking (SDN). At the network’s edge, virtual reverse proxies collect clients’ requests and send them to an SDN controller, which creates a multicast tree to deliver the highest requested bitrate efficiently. This approach minimizes streaming cost and resource utilization while considering delay constraints. ORAVA, a cost-aware approach, and OSCAR, an SDN-based live video streaming method, collectively save up to 65% bandwidth compared to state-of-the-art approaches, reducing OpenFlow commands by up to 78% and 82%, respectively.
Light-Weight Transcoding – These three new approaches utilize edge computing and network function virtualization to significantly improve transcoding efficiency. LwTE is a novel light-weight transcoding approach at the edge that saves time and computational resources by storing optimal results as metadata during the encoding process. It employs store and transcode policies based on popularity, caching popular segments at the edge. CD-LwTE extends LwTE by proposing Cost- and Delay-aware Light-weight Transcoding at the Edge, considering resource constraints, introducing a fetch policy, and minimizing total cost and serving delay for each segment/bitrate. LwTE-Live investigates the cost efficiency of LwTE in live streaming, leveraging the approach to save bandwidth in the backhaul network. Evaluation results demonstrate LwTE processes transcoding at least 80% faster, while CD-LwTE reduces transcoding time by up to 97%, decreases streaming costs by up to 75%, and reduces delay by up to 48% compared to state-of-the-art approaches.
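
The popularity-based store and transcode policies can be pictured with a small sketch. This is an illustration under stated assumptions, not the LwTE algorithm itself: the fixed popularity threshold, the choice to keep only the highest bitrate for unpopular segments, and the function names are all hypothetical.

```python
def plan_edge_storage(segment_popularity, bitrates_kbps, popularity_threshold):
    """Decide, per segment, which bitrates are kept as full encodes at the edge.

    Assumption: popular segments keep every bitrate ("store" policy), while
    unpopular ones keep only the highest bitrate plus encoding metadata and
    are transcoded on request ("transcode" policy).
    """
    top_bitrate = max(bitrates_kbps)
    plan = {}
    for segment_id, popularity in segment_popularity.items():
        if popularity >= popularity_threshold:
            plan[segment_id] = {"policy": "store", "stored_kbps": sorted(bitrates_kbps)}
        else:
            plan[segment_id] = {"policy": "transcode", "stored_kbps": [top_bitrate]}
    return plan


def serve(plan, segment_id, requested_kbps):
    """Describe how the edge would answer a request under the plan."""
    if requested_kbps in plan[segment_id]["stored_kbps"]:
        return "serve from edge cache"
    return "transcode at edge using stored metadata"


# Hypothetical popularity scores in [0, 1] for two segments.
plan = plan_edge_storage({"seg-1": 0.9, "seg-2": 0.1}, [500, 1500, 4000], 0.5)
```

In this toy setup, a request for the 500 kbps version of `seg-2` triggers a light-weight transcode, while any bitrate of `seg-1` is served straight from the edge cache.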

Slides and more detail

Video Coding Enhancements for HTTP Adaptive Streaming using Machine Learning

Dr. Ekrem Çetinkaya

The research of Dr. Çetinkaya involved several applications of machine learning techniques for improving the video coding process across 4 categories:

Fast Multi-Rate Encoding with Machine Learning – These two techniques address the challenge of encoding multiple representations of a video for ABR streaming. FaME-ML utilizes convolutional neural networks to guide encoding decisions, reducing parallel encoding time by 41%. FaRes-ML extends this approach to multi-resolution scenarios, achieving a 46% reduction in overall encoding time while preserving visual quality.
Enhancing Visual Quality on Mobile Devices – These three methods focused on improving visual quality on mobile devices with limited hardware. SR-ABR integrates super-resolution into adaptive bitrate selection, saving up to 43% bandwidth. LiDeR addresses computational complexity, achieving a 428% increase in execution speed while maintaining visual quality. MoViDNN facilitates the evaluation of machine learning solutions for enhanced visual quality on mobile devices.
Light-Field Image Coding with Super-Resolution – This new approach addresses the data size challenge of light field images in emerging media formats. LFC-SASR utilizes super-resolution to reduce data size by 54%, ensuring a more immersive experience while preserving visual quality.
Blind Visual Quality Assessment Using Vision Transformers – A new technique, BQ-ViT, tackles the blind visual quality assessment problem for videos. Leveraging the vision transformer architecture, BQ-ViT achieves a high correlation (0.895 PCC) in predicting video visual quality using only the encoded frames.
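
For readers less familiar with the metric, PCC is the Pearson correlation coefficient between predicted and reference quality scores, where 1.0 means a perfect linear relationship. The sketch below shows how such a value is computed in general; the function name and the sample scores are made up for illustration and have nothing to do with the BQ-ViT evaluation data.

```python
import math


def pearson_correlation(predicted, reference):
    """Pearson correlation coefficient between two equally long score lists."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    covariance = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for constant scores")
    return covariance / math.sqrt(var_p * var_r)


# Made-up example: model predictions vs. reference quality scores.
predicted_scores = [62.0, 71.5, 80.0, 90.5]
reference_scores = [60.0, 75.0, 78.0, 93.0]
pcc = pearson_correlation(predicted_scores, reference_scores)
```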

Slides and more detail

Policy-driven Dynamic HTTP Adaptive Streaming Player Environment

Dr. Minh Nguyen

The work of Dr. Nguyen addressed critical issues impacting QoE in adaptive bitrate (ABR) streaming, with four main contributions:

Days of Future Past Plus (DoFP+) – This approach uses HTTP/3 features to enhance QoE by upgrading low-quality segments during streaming sessions, resulting in a 33% QoE improvement and a 16% reduction in downloaded data.
WISH ABR – This is a weighted sum model that allows users to customize their ABR switching algorithm by specifying preferences for parameters like data usage, stall events, and video quality. WISH considers throughput, buffer, and quality costs, enhancing QoE by up to 17.6% and reducing data usage by 36.4%.
WISH-SR – This is an ABR scheme that extends WISH by incorporating a lightweight Convolutional Neural Network (CNN) to improve video quality on high-end mobile devices. It can reduce downloaded data by up to 43% and enhance visual quality with client-side Super Resolution upscaling.
New CMCD Approach – This new method for determining Common Media Client Data (CMCD) parameters enables the server to generate suitable bitrate ladders based on clients’ device types and network conditions. This approach reduces downloaded data while improving QoE by up to 2.6 times.
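
As background on what a player reports with CMCD, the sketch below assembles a query argument from a few standard CTA-5004 keys (`br`, `bl`, `mtp`, `ot`, `sid`). It only illustrates the client-side payload under my reading of the spec; it is not the parameter-selection method from the thesis, and the function names and sample values are hypothetical.

```python
from urllib.parse import quote


def round_to_100(value):
    """Round a non-negative number to the nearest 100 (used for bl and mtp)."""
    return int(value / 100 + 0.5) * 100


def build_cmcd_query(bitrate_kbps, buffer_ms, throughput_kbps, session_id):
    """Build a URL-encoded CMCD query argument from a handful of keys."""
    pairs = {
        "br": str(int(bitrate_kbps)),              # encoded bitrate, kbps
        "bl": str(round_to_100(buffer_ms)),        # buffer length, ms
        "mtp": str(round_to_100(throughput_kbps)), # measured throughput, kbps
        "ot": "v",                                 # object type: video
        "sid": '"' + session_id + '"',             # session id, quoted string
    }
    # Keys are emitted in alphabetical order and joined with commas.
    payload = ",".join(f"{key}={pairs[key]}" for key in sorted(pairs))
    return "CMCD=" + quote(payload, safe="")


# Hypothetical player state for one segment request.
query = build_cmcd_query(3200, 21340, 25430, "abc-123")
```

A server or CDN that parses these values sees the client’s buffer level and measured throughput on every request, which is the kind of signal a server-side ladder decision can build on.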

Slides and more detail

Multi-access Edge Computing for Adaptive Video Streaming

Dr. Jesús Aguilar Armijo

The network plays a crucial role in video streaming QoE, and one of the key technologies available on the network side is Multi-access Edge Computing (MEC). Its key characteristics (computing power, storage, proximity to the clients, and access to network and player metrics) make it possible to deploy mechanisms at the MEC node to assist video streaming.

The thesis of Dr. Aguilar Armijo investigates how MEC capabilities can be leveraged to support video streaming delivery, specifically to improve QoE, reduce latency, or increase savings on storage and bandwidth.

ANGELA Simulator – A new simulator is designed to test mechanisms supporting video streaming at the edge node. ANGELA addresses issues in state-of-the-art simulators by providing access to radio and player metrics, various multimedia content configurations, Adaptive Bitrate (ABR) algorithms at different network locations, and a range of evaluation metrics. Real 4G/5G network traces are used for radio layer simulation, offering realistic results. ANGELA demonstrates a significant simulation time reduction of 99.76% compared to the ns-3 simulator in a simple MEC mechanism scenario.
Dynamic Segment Repackaging at the Edge – The proposal suggests using the Common Media Application Format (CMAF) in the network’s backhaul, performing dynamic repackaging of content at the MEC node to match clients’ requested delivery formats. This approach aims to achieve bandwidth savings in the network’s backhaul and reduce storage costs at the server and edge side. Measurements indicate potential reductions in delivery latency under certain expected conditions.
Edge-Assisted Adaptation Schemes – Leveraging radio network and player metrics at the MEC node, two edge-assisted adaptation schemes are proposed. EADAS improves ABR decisions on-the-fly to enhance clients’ Quality of Experience (QoE) and fairness. ECAS-ML shifts the entire ABR algorithm logic to the edge, managing the tradeoff among bitrate, segment switches, and stalls through machine learning techniques. Evaluations show significant improvements in QoE and fairness for both schemes compared to various ABR algorithms.
Segment Prefetching and Caching at the Edge – Segment prefetching, a technique transmitting future video segments closer to the client before being requested, is explored at the MEC node. Different prefetching policies, utilizing resources and techniques such as Markov prediction, machine learning, transrating, and super-resolution, are proposed and evaluated. Results indicate that machine learning-based prefetching increases average bitrate while reducing stalls and extra bandwidth consumption, offering a promising approach to enhance overall performance.
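
One of the simpler policies mentioned above, Markov prediction, can be sketched as a first-order model that learns which bitrate a client tends to request after the current one. This is a generic illustration, not the policy evaluated in the thesis: the class name, the fallback rule, and the request history are assumptions.

```python
from collections import Counter, defaultdict


class MarkovPrefetcher:
    """First-order Markov model over the bitrates a client requests."""

    def __init__(self):
        # transitions[a][b] counts how often bitrate b followed bitrate a.
        self.transitions = defaultdict(Counter)

    def observe(self, previous_kbps, current_kbps):
        self.transitions[previous_kbps][current_kbps] += 1

    def predict_next(self, current_kbps):
        """Return the most frequently observed successor of the current bitrate."""
        counts = self.transitions.get(current_kbps)
        if not counts:
            # Assumed fallback: with no history, expect the same bitrate again.
            return current_kbps
        return counts.most_common(1)[0][0]


def train(history):
    """Build a model from an ordered list of requested bitrates."""
    model = MarkovPrefetcher()
    for previous_kbps, current_kbps in zip(history, history[1:]):
        model.observe(previous_kbps, current_kbps)
    return model


# Hypothetical request history (kbps) observed at the edge node.
model = train([500, 1500, 1500, 1500, 4000, 4000, 4000, 4000, 1500])
next_to_prefetch = model.predict_next(4000)
```

The edge node would then fetch the next segment at the predicted bitrate before the client asks for it; a wrong guess costs extra bandwidth, which is why the thesis compares several policies against each other.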

Slides and more detail

Potential applications for Bitmovin products

The WISH ABR algorithm presented by Dr. Nguyen is already available in the Bitmovin Web Player SDK as of version 8.136.0, which was released in early October 2023. It can be enabled via AdaptationConfig.logic. Use of CMCD metadata is still gaining momentum throughout the industry, but Bitmovin and Akamai have already demonstrated a joint solution and the research above will help improve our implementation.

Bitmovin has experimented with server-side Super Resolution upscaling with some customers, mainly focusing on upscaling SD content to HD for viewing on TVs and larger monitors, but the techniques investigated by Dr. Çetinkaya take advantage of newer models that can extend Super Resolution to the client side on mobile devices. These have the potential to reduce data usage which is especially important to users with limited data plans and bandwidth. They can also improve QoE and visual quality while saving service providers on delivery costs.

Controlling costs has been at or near the top of the list of challenges video developers and streaming service providers have faced over the past couple of years according to Bitmovin’s annual Video Developer Report. This trend will likely continue into 2024 and the resource management and transcoding efficiency improvements developed by Dr. Erfanian will help optimize and reduce operational costs for Bitmovin and its services.

Edge computing is becoming more mainstream, with companies like Bitmovin partners Videon and Edgio delivering new applications that take advantage of available compute resources closer to the end user. The contributions developed by Dr. Aguilar Armijo address different facets of content delivery and provide a comprehensive approach to optimizing video streaming in edge computing environments. This has the potential to provide more actionable analytics data and enable more intelligent and robust adaptation during challenging network conditions.

Conclusion

Bitmovin was born from research and innovation and 10 years later is still breaking new ground. We were honored to receive a Technology & Engineering Emmy Award for our efforts and remain committed to improving every part of the streaming experience. Whether it’s taking advantage of the latest machine learning capabilities or developing novel approaches for controlling costs, we’re excited for what the future holds. We’re also grateful for all of the researchers, engineers, technology partners and customers who have contributed along the way and look forward to the next 10 years of progress and innovation.

The post PhD video research: From the ATHENA lab to Bitmovin products appeared first on Bitmovin.
