144th MPEG Meeting Takeaways: Understanding Quality Impacts of Learning-based Codecs and Enhancing Green Metadata
https://bitmovin.com/blog/144th-mpeg-meeting-takeaways/
Published: Sun, 07 Jan 2024


Preface

Bitmovin has been “Shaping the Future of Video” for over 10 years now, and in addition to our own innovations, we’ve been actively taking part in standardization activities to improve the quality of video technologies for the wider industry. I have been a member and attendee of the Moving Picture Experts Group (MPEG) for 15+ years and have been documenting its progress since early 2010. Recently, we’ve been working on several new initiatives, including the use of learning-based codecs and enhanced support for more energy-efficient media consumption.

The 144th MPEG meeting highlights

The 144th MPEG meeting was held in Hannover, Germany! For those interested, the press release with all the details is available. It’s always great to see and hear about progress being made in person.

Attendees of the 144th MPEG meeting in Hannover, Germany.

The main outcome of this meeting is as follows:

  • MPEG issues Call for Learning-Based Video Codecs for Study of Quality Assessment
  • MPEG evaluates Call for Proposals on Feature Compression for Video Coding for Machines
  • MPEG progresses ISOBMFF-related Standards for the Carriage of Network Abstraction Layer Video Data
  • MPEG enhances the Support of Energy-Efficient Media Consumption
  • MPEG ratifies the Support of Temporal Scalability for Geometry-based Point Cloud Compression
  • MPEG reaches the First Milestone for the Interchange of 3D Graphics Formats
  • MPEG announces Completion of Coding of Genomic Annotations

This post will focus on MPEG Systems-related standards and visual quality assessment. As usual, the column will end with an update on MPEG-DASH.

Visual Quality Assessment

MPEG does not create standards in the visual quality assessment domain. However, it conducts visual quality assessments for its standards at various stages of the standardization process, for instance, when evaluating responses to calls for proposals and when conducting verification tests of its final standards.

MPEG Visual Quality Assessment (AG 5) issued an open call to study quality assessment for learning-based video codecs. AG 5 has been conducting subjective quality evaluations for coded video content and studying their correlation with objective quality metrics. Most of these studies have focused on the High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC) standards. To facilitate the study of visual quality, MPEG maintains the Compressed Video for the study of Quality Metrics (CVQM) dataset.
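To make the correlation study concrete, here is a minimal sketch (with made-up scores) of the kind of computation used to relate subjective mean opinion scores (MOS) to an objective metric such as PSNR; a higher correlation means the objective metric predicts perceived quality better:

```python
def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Hypothetical per-sequence scores: subjective MOS (1-5 scale) and an
# objective metric (e.g., PSNR in dB) for the same five coded sequences.
mos  = [1.8, 2.5, 3.1, 3.9, 4.4]
psnr = [28.0, 31.5, 33.2, 36.8, 39.1]

r = pearson(mos, psnr)
print(f"Pearson correlation MOS vs. PSNR: {r:.3f}")
```

Real studies such as those behind the CVQM dataset use controlled subjective tests and report correlations (Pearson, Spearman) per metric to judge how well each metric tracks human judgments for a given codec family.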

With the recent advancements in learning-based video compression algorithms, MPEG is now studying compression using these codecs. It is expected that reconstructed videos compressed using learning-based codecs will have different types of distortion compared to those induced by traditional block-based motion-compensated video coding designs. To gain a deeper understanding of these distortions and their impact on visual quality, MPEG has issued a public call related to learning-based video codecs. MPEG is open to inputs in response to the call and will invite responses that meet the call’s requirements to submit compressed bitstreams for further study of their subjective quality and potential inclusion into the CVQM dataset.

Considering the rapid advancements in the development of learning-based video compression algorithms, MPEG will keep this call open and anticipates future updates to the call.

Interested parties are kindly requested to contact the MPEG AG 5 Convenor Mathias Wien (wien@lfb.rwth-aachen.de) and submit responses for review at the 145th MPEG meeting in January 2024. Further details are given in the call, issued as AG 5 document N 104 and available from the mpeg.org website.

Learning-based data compression (e.g., for image, audio, and video content) is a hot research topic. Research on this topic relies on datasets offering a set of common test sequences, and sometimes also common test conditions, that are publicly available and allow for comparison across different schemes. MPEG’s Compressed Video for the study of Quality Metrics (CVQM) dataset is such a dataset, available here, and is ready to be used by researchers and scientists outside of MPEG as well. The call mentioned above is open to everyone inside and outside of MPEG and allows researchers to participate in international standards efforts (note: to attend meetings, one must become a delegate of a national body).

Bitmovin and the ATHENA research lab have been working together on ML-based enhancements to boost visual quality and improve QoE. You can read more about our published research in this blog post.


At the 144th MPEG meeting, MPEG Systems (WG 3) produced three newsworthy items:

  • Progression of ISOBMFF-related standards for the carriage of Network Abstraction Layer (NAL) video data.
  • Enhancement of the support of energy-efficient media consumption.
  • Support of temporal scalability for geometry-based Point Cloud Compression (G-PCC).

ISO/IEC 14496-15, a part of the family of ISOBMFF-related standards, defines the carriage of Network Abstraction Layer (NAL) unit structured video data such as Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), Versatile Video Coding (VVC), Essential Video Coding (EVC), and Low Complexity Enhancement Video Coding (LCEVC). This standard has been further improved with the approval of the Final Draft Amendment (FDAM), which adds support for enhanced features such as Picture-in-Picture (PiP) use cases enabled by VVC.

In addition to the improvements made to ISO/IEC 14496-15, separately developed amendments have been consolidated in the 7th edition of the standard. This edition has been promoted to Final Draft International Standard (FDIS), marking the final milestone of the formal standard development.

Another important standard in development is the 2nd edition of ISO/IEC 14496-32 (file format reference software and conformance). This standard, currently at the Committee Draft (CD) stage of development, is planned to be completed and reach the status of Final Draft International Standard (FDIS) by the beginning of 2025. This standard will be essential for industry professionals who require a reliable and standardized method of verifying the conformance of their implementations.

MPEG Systems (WG 3) also promoted ISO/IEC 23001-11 (energy-efficient media consumption (green metadata)) Amendment 1 to Final Draft Amendment (FDAM). This amendment introduces energy-efficient media consumption (green metadata) for Essential Video Coding (EVC) and defines metadata that enables a reduction in decoder power consumption. At the same time, ISO/IEC 23001-11 Amendment 2 has been promoted to the Committee Draft Amendment (CDAM) stage of development. This amendment introduces a novel way to carry metadata about display power reduction encoded as a video elementary stream interleaved with the video it describes. The amendment is expected to be completed and reach the status of Final Draft Amendment (FDAM) by the beginning of 2025.

Finally, MPEG Systems (WG 3) promoted ISO/IEC 23090-18 (carriage of geometry-based point cloud compression data) Amendment 1 to Final Draft Amendment (FDAM). This amendment enables the compression of a single elementary stream of point cloud data using ISO/IEC 23090-9 (geometry-based point cloud compression) and storing it in more than one track of ISO Base Media File Format (ISOBMFF)-based files. This enables support for applications that require multiple frame rates within a single file and introduces a track grouping mechanism to indicate multiple tracks carrying a specific temporal layer of a single elementary stream separately.

MPEG Systems usually provides standards on top of existing compression standards, enabling efficient storage and delivery of media data (among others). Researchers may use these standards (including reference software and conformance bitstreams) to conduct research in the general area of multimedia systems (cf. ACM MMSys) or, specifically on green multimedia systems (cf. ACM GMSys).

Enhancements to green metadata are welcome and necessary additions to the toolkit for everyone working on reducing the carbon footprint of video streaming workflows. Bitmovin and the GAIA project have been conducting focused research in this area for over a year now and, through testing, benchmarking, and developing new methods, hope to significantly improve our industry’s environmental sustainability. You can read more about our progress in this report.

MPEG-DASH Updates

The current status of MPEG-DASH is shown in the figure below with only minor updates compared to the last meeting.

MPEG-DASH Status, October 2023.

In particular, the 6th edition of MPEG-DASH is scheduled for 2024 but may not include all amendments under development. An overview of existing amendments can be found in the blog post from the last meeting. The current amendments have been slightly updated and will progress toward completion over the upcoming meetings. The signaling of haptics in DASH has been discussed and accepted for inclusion in the Technologies under Consideration (TuC) document, which comprises candidate technologies for possible future amendments to the MPEG-DASH standard and is publicly available here.

MPEG-DASH has been heavily researched in the multimedia systems, quality, and communications research communities. Adding haptics to MPEG-DASH would provide another dimension worth considering within research, including, but not limited to, performance aspects and Quality of Experience (QoE).


The 145th MPEG meeting will be online from January 22-26, 2024. Click here for more information about MPEG meetings and their developments.


Want to learn more about the latest research from the ATHENA lab and its potential applications? Check out this post summarizing the projects from the first cohort of finishing PhD candidates.


Notes and highlights from previous MPEG meetings can be found here.

143rd MPEG Meeting Takeaways: Green metadata support added to VVC for improved energy efficiency
https://bitmovin.com/blog/143nd-mpeg-meeting-takeaways/
Published: Tue, 22 Aug 2023


Preface

Bitmovin is a proud member of and contributor to several organizations working to shape the future of video, including the Moving Picture Experts Group (MPEG), where I and a few senior developers at Bitmovin are active members. Personally, I have been a member and attendee of MPEG for 20+ years and have been documenting its progress since early 2010. Today, we’re working hard to further improve the capabilities and energy efficiency of the industry’s newest standards, such as VVC, while maintaining and modernizing older codecs like HEVC and AVC to take advantage of advancements in neural network post-processing.

The 143rd MPEG Meeting Highlights

The official press release of the 143rd MPEG meeting can be found here and comprises the following items:

  • MPEG finalizes the Carriage of Uncompressed Video and Images in ISOBMFF
  • MPEG reaches the First Milestone for two ISOBMFF Enhancements
  • MPEG ratifies Third Editions of VVC and VSEI
  • MPEG reaches the First Milestone of AVC (11th Edition) and HEVC Amendment
  • MPEG Genomic Coding extended to support Joint Structured Storage and Transport of Sequencing Data, Annotation Data, and Metadata
  • MPEG completes Reference Software and Conformance for Geometry-based Point Cloud Compression

In this report, I’d like to focus on ISOBMFF and video codecs and, as always, I will conclude with an update on MPEG-DASH.

ISOBMFF Enhancements

The ISO Base Media File Format (ISOBMFF) supports the carriage of a wide range of media data such as video, audio, point clouds, and haptics, and has now been further extended to cover uncompressed video and images.

ISO/IEC 23001-17 – Carriage of uncompressed video and images in ISOBMFF – specifies how uncompressed 2D image and video data is carried in files that comply with the ISOBMFF family of standards. This encompasses a range of data types, including monochromatic and colour data, transparency (alpha) information, and depth information. The standard enables the industry to effectively exchange uncompressed video and image data while utilizing all additional information provided by the ISOBMFF, such as timing, color space, and sample aspect ratio for interoperable interpretation and/or display of uncompressed video and image data.

ISO/IEC 14496-15, which is based on the ISOBMFF, provides the basis for “network abstraction layer (NAL) unit structured video coding formats” such as AVC, HEVC, and VVC. The current version is the 6th edition, which has been amended to support neural-network post-filter supplemental enhancement information (SEI) messages. This amendment defines the carriage of the neural-network post-filter characteristics (NNPFC) SEI messages and the neural-network post-filter activation (NNPFA) SEI messages to enable the delivery of (i) a base post-processing filter and (ii) a series of neural network updates synchronized with the input video pictures/frames.

Bitmovin has supported ISOBMFF in our encoding pipeline and API from day 1 and will continue to do so. For more details and information about container file formats, check out this blog.

Video Codec Enhancements

MPEG finalized the specifications of the third editions of the Versatile Video Coding (VVC, ISO/IEC 23090-3) and the Versatile Supplemental Enhancement Information (VSEI, ISO/IEC 23002-7) standards. Additionally, MPEG issued the Committee Draft (CD) text of the eleventh edition of the Advanced Video Coding (AVC, ISO/IEC 14496-10) standard and the Committee Draft Amendment (CDAM) text on top of the High Efficiency Video Coding standard (HEVC, ISO/IEC 23008-2).

The third editions of VVC and VSEI include two new systems-related SEI messages: (a) one for signaling of green metadata as specified in ISO/IEC 23001-11 and (b) one for signaling of an alternative video decoding interface for immersive media as specified in ISO/IEC 23090-13. Furthermore, the neural-network post-filter characteristics SEI message and the neural-network post-filter activation SEI message have been added to AVC, HEVC, and VVC.

The two SEI messages for describing and activating post-filters using neural network technology in video bitstreams could, for example, be used for reducing coding noise, spatial and temporal upsampling (i.e., super-resolution and frame interpolation), color improvement, or general denoising of the decoder output. The description of the neural network architecture itself is based on MPEG’s neural network representation standard (ISO/IEC 15938-17). As results from an exploration experiment have shown, neural network-based post-filters can deliver better results than conventional filtering methods. Processes for invoking these new post-filters have already been tested in a software framework and will be made available in an upcoming version of the VVC reference software (ISO/IEC 23090-16).
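The SEI messages only describe and activate a filter; the filtering itself runs on the decoder's output pictures. As a rough, hypothetical stand-in for what a post-filter does (real NNPFC/NNPFA filters are signaled neural networks, not a fixed kernel), the sketch below smooths a decoded luma block to attenuate coding noise:

```python
# Illustration only: a simple 3x3 box filter stands in for a
# neural-network post-filter applied to decoded pictures.

def post_filter(frame):
    """Apply a 3x3 box filter to a 2D list of luma samples (interior only)."""
    h, w = len(frame), len(frame[0])
    out = [row[:] for row in frame]  # copy; border samples left untouched
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = sum(frame[y + dy][x + dx]
                            for dy in (-1, 0, 1)
                            for dx in (-1, 0, 1)) // 9
    return out

# A flat area with one noisy "ringing" sample; the filter attenuates it.
decoded = [[100] * 5 for _ in range(5)]
decoded[2][2] = 190
filtered = post_filter(decoded)
print(decoded[2][2], "->", filtered[2][2])  # 190 -> 110
```

A learned filter would replace the fixed kernel with trained convolutional weights, which is precisely what the NNPFC message carries and the NNPFA message activates per picture.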

Bitmovin and our partner ATHENA research lab have been exploring several applications of neural networks to improve the quality of experience for video streaming services. You can read the summaries with links to full publications in this blog post.

The latest MPEG-DASH Update

The current status of MPEG-DASH is depicted in the figure below:

MPEG-DASH Status Updates from 143rd MPEG Meeting (07/23)

The latest edition of MPEG-DASH is the 5th edition (ISO/IEC 23009-1:2022) which is publicly/freely available here. There are currently three amendments under development:

  • ISO/IEC 23009-1:2022 Amendment 1: Preroll, nonlinear playback, and other extensions. This amendment has been ratified already and is currently being integrated into the 5th edition of part 1 of the MPEG-DASH specification.
  • ISO/IEC 23009-1:2022 Amendment 2: EDRAP streaming and other extensions. EDRAP stands for Extended Dependent Random Access Point and at this meeting the Draft Amendment (DAM) has been approved. EDRAP increases the coding efficiency for random access and has been adopted within VVC.
  • ISO/IEC 23009-1:2022 Amendment 3: Segment sequences for random access and switching. This amendment is at Committee Draft Amendment (CDAM) stage, the first milestone of the formal standardization process. This amendment aims at improving tune-in time for low latency streaming.

Additionally, MPEG Technologies under Consideration (TuC) comprises a few new work items, such as content selection and adaptation logic based on device orientation and signaling of haptics data within DASH.

Finally, part 9 of MPEG-DASH — redundant encoding and packaging for segmented live media (REAP) — has been promoted to Draft International Standard (DIS). It is expected to be finalized in the upcoming MPEG meetings.

Bitmovin recently announced its new Player Web X, which was reimagined and built from the ground up with structured concurrency. You can read more about it and why structured concurrency matters in this recent blog series.

The next meeting will be held in Hannover, Germany, from October 16-20, 2023. Further details can be found here.

Click here for more information about MPEG meetings and their developments.

Are you currently using the ISOBMFF or CMAF as a container format for fragmented MP4 files? Do you prefer hard-parted fMP4 or single-file MP4 with byte-range addressing? Vote in our poll and check out the Bitmovin Community to learn more. 


142nd MPEG Meeting Takeaways: MPEG issues Call for Proposals for Feature Coding for Machines
https://bitmovin.com/blog/142nd-mpeg-meeting-takeaways/
Published: Wed, 24 May 2023

Preface

Bitmovin is a proud member of and contributor to several organizations working to shape the future of video, none for longer than the Moving Picture Experts Group (MPEG), where I and a few senior developers at Bitmovin are active members. Personally, I have been a member and attendee of MPEG for 20+ years and have been documenting its progress since early 2010. Today, we’re working hard to further improve the capabilities and efficiency of the industry’s newest standards, while exploring the potential applications of machine learning and neural networks.

The 142nd MPEG Meeting – MPEG issues Call for Proposals for Feature Coding for Machines

The official press release of the 142nd MPEG meeting can be found here and comprises the following items:

  • MPEG issues Call for Proposals for Feature Coding for Machines
  • MPEG finalizes the 9th Edition of MPEG-2 Systems
  • MPEG reaches the First Milestone for Storage and Delivery of Haptics Data
  • MPEG completes 2nd Edition of Neural Network Coding (NNC)
  • MPEG completes Verification Test Report and Conformance and Reference Software for MPEG Immersive Video
  • MPEG finalizes work on metadata-based MPEG-D DRC Loudness Leveling

In this report, I’d like to focus on Feature Coding for Machines, MPEG-2 Systems, Haptics, Neural Network Coding (NNC), MPEG Immersive Video, and a brief update about DASH (as usual).

Feature Coding for Machines

At the 142nd MPEG meeting, MPEG Technical Requirements (WG 2) issued a Call for Proposals (CfP) for technologies and solutions enabling efficient feature compression for video coding for machine vision tasks. This work on “Feature Coding for Video Coding for Machines (FCVCM)” aims at compressing intermediate features within neural networks for machine tasks. As applications for neural networks become more prevalent and the networks increase in complexity, use cases such as computational offload become more relevant to facilitate the widespread deployment of applications utilizing such networks. Over the last four years, initially as part of the “Video Coding for Machines” activity, MPEG has investigated potential technologies for the efficient compression of feature data encountered within neural networks. This activity has resulted in a set of “feature anchors” that demonstrate the achievable performance for compressing feature data using state-of-the-art standardized technology. These feature anchors include tasks performed on four datasets.
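To illustrate the core idea behind feature compression (not the actual FCVCM technology under evaluation), the following sketch uniformly quantizes a small list of 32-bit float activations to 8-bit codes, a common baseline that already shrinks the payload 4x at a bounded reconstruction error; the feature values are invented:

```python
# Sketch: intermediate network activations (32-bit floats) quantized to
# 8-bit integer codes before transport. Scheme and values are illustrative.
import struct

def quantize(features, bits=8):
    """Uniformly quantize a list of floats to `bits`-bit integer levels."""
    lo, hi = min(features), max(features)
    levels = (1 << bits) - 1
    scale = (hi - lo) / levels if hi > lo else 1.0
    q = [round((f - lo) / scale) for f in features]
    return q, lo, scale

def dequantize(q, lo, scale):
    return [lo + v * scale for v in q]

features = [0.0, 0.37, 1.52, 2.9, 4.0, 3.1, 0.8, 2.2]
q, lo, scale = quantize(features, bits=8)
recon = dequantize(q, lo, scale)

raw_bytes = len(features) * struct.calcsize("f")  # 32-bit floats
q_bytes = len(q) * 1                              # 8-bit codes
max_err = max(abs(a - b) for a, b in zip(features, recon))
print(f"{raw_bytes} B -> {q_bytes} B, max reconstruction error {max_err:.4f}")
```

The CfP targets far stronger compression than this baseline, judged by the task accuracy (detection, segmentation, tracking) achievable at a given bitrate rather than by reconstruction error alone.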

9th Edition of MPEG-2 Systems

MPEG-2 Systems was first standardized in 1994, defining two container formats: program stream (e.g., used for DVDs) and transport stream. The latter, also known as MPEG-2 Transport Stream (M2TS), is used for broadcast and internet TV applications and services. MPEG-2 Systems was awarded a Technology and Engineering Emmy® in 2013, and at the 142nd MPEG meeting, MPEG Systems (WG 3) ratified the 9th edition of ISO/IEC 13818-1 MPEG-2 Systems. The new edition adds support for Low Complexity Enhancement Video Coding (LCEVC), the youngest member of the MPEG family of video coding standards, on top of the more than 50 media stream types already supported, including, but not limited to, 3D Audio and Versatile Video Coding (VVC). The new edition also supports new options for signaling different kinds of media, which can aid the selection of the best audio or other media tracks for specific purposes or user preferences. As an example, it can indicate that a media track provides information about a current emergency.

Storage and Delivery of Haptics Data

At the 142nd MPEG meeting, MPEG Systems (WG 3) reached the first milestone for ISO/IEC 23090-32, entitled “Carriage of haptics data”, by promoting the text to Committee Draft (CD) status. This specification enables the storage and delivery of haptics data (defined by ISO/IEC 23090-31) in the ISO Base Media File Format (ISOBMFF; ISO/IEC 14496-12). Considering the nature of haptics data, which is composed of spatial and temporal components, a data unit containing various spatial or temporal data packets is used as a basic entity, like an access unit of audio-visual media. Additionally, considering the sparse nature of haptics data, an explicit indication of silent periods has been introduced in this draft. The standard is planned to be completed, i.e., to reach the status of Final Draft International Standard (FDIS), by the end of 2024.

Neural Network Coding (NNC)

Many applications of artificial neural networks for multimedia analysis and processing (e.g., visual and acoustic classification, extraction of multimedia descriptors, or image and video coding) utilize edge-based content processing or federated training. The trained neural networks for these applications contain many parameters (weights), resulting in a considerable size. Therefore, the MPEG standard for the compressed representation of neural networks for multimedia content description and analysis (NNC, ISO/IEC 15938-17, published in 2022) was developed, which provides a broad set of technologies for parameter reduction and quantization to compress entire neural networks efficiently.

Recently, an increasing number of artificial intelligence applications, such as edge-based content processing, content-adaptive video post-processing filters, or federated training, need to exchange updates of neural networks (e.g., after training on additional data or fine-tuning to specific content). Such updates include changes of the neural network parameters but may also involve structural changes in the neural network (e.g., when extending a classification method with a new class). In scenarios like federated training, these updates must be exchanged frequently, such that much more bandwidth over time is required, e.g., in contrast to the initial deployment of trained neural networks.

The second edition of NNC addresses these applications through the efficient representation and coding of incremental updates and by extending the set of compression tools that can be applied to both entire neural networks and updates. Trained models can typically be compressed to 10-20% of their original size and, for several architectures, even below 3%, without performance loss. Higher compression rates are possible at moderate performance degradation. In a distributed training scenario, a model update after a training iteration can be represented at 1% or less of the base model size on average without sacrificing the classification performance of the neural network. NNC also provides synchronization mechanisms, particularly for distributed artificial intelligence scenarios, e.g., if clients in a federated learning environment drop out and later rejoin.
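Why incremental updates compress so well can be shown with a toy model: after a fine-tuning round most weights are effectively unchanged, so the weight delta is near-sparse and only the changed entries need to be coded. The sketch below is purely illustrative with synthetic numbers; NNC's actual coding tools are far more sophisticated:

```python
# Toy illustration of incremental-update sparsity after fine-tuning.
import random

random.seed(7)
base = [random.gauss(0, 1) for _ in range(10_000)]
# Assume fine-tuning noticeably perturbs only ~2% of the weights.
update = [w + (random.gauss(0, 0.05) if random.random() < 0.02 else 0.0)
          for w in base]

delta = [u - b for u, b in zip(update, base)]
nonzero = [(i, d) for i, d in enumerate(delta) if abs(d) > 1e-9]

# Coding only (index, value) pairs for changed weights is enough to
# reconstruct the update from the base model.
sparsity = len(nonzero) / len(base)
print(f"changed weights: {len(nonzero)} of {len(base)} ({sparsity:.1%})")
```

Entropy coding of such sparse, small-magnitude deltas is what makes per-iteration updates in federated training cost a small fraction of the base model size.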

Verification Test Report and Conformance and Reference Software for MPEG Immersive Video

At the 142nd MPEG meeting, MPEG Video Coding (WG 4) issued the verification test report of ISO/IEC 23090-12 MPEG immersive video (MIV) and completed the development of the conformance and reference software for MIV (ISO/IEC 23090-23), promoting it to the Final Draft International Standard (FDIS) stage.

MIV was developed to support the compression of immersive video content, in which multiple real or virtual cameras capture a real or virtual 3D scene. The standard enables the storage and distribution of immersive video content over existing and future networks for playback with 6 degrees of freedom (6DoF) of view position and orientation. MIV is a flexible standard for multi-view video plus depth (MVD) and multi-plane images (MPI) that leverages strong hardware support for commonly used video formats to compress volumetric video.

ISO/IEC 23090-23 specifies how to conduct conformance tests and provides reference encoder and decoder software for MIV. This draft includes 23 verified and validated conformance bitstreams spanning all profiles and encoding and decoding reference software based on version 15.1.1 of the test model for MPEG immersive video (TMIV). The test model, objective metrics, and other tools are publicly available at https://gitlab.com/mpeg-i-visual.

The latest MPEG-DASH Update

Finally, I’d like to provide a quick update regarding MPEG-DASH, which has a new part, namely redundant encoding and packaging for segmented live media (REAP; ISO/IEC 23009-9). The following figure provides the reference workflow for redundant encoding and packaging of live segmented media.


The reference workflow comprises (i) Ingest Media Presentation Description (I-MPD), (ii) Distribution Media Presentation Description (D-MPD), and (iii) Storage Media Presentation Description (S-MPD), among others; each defining constraints on the MPD and tracks of ISO base media file format (ISOBMFF).

Additionally, the MPEG-DASH Breakout Group discussed various technologies under consideration, such as (a) combining HTTP GET requests, (b) signaling common media client data (CMCD) and common media server data (CMSD) in an MPEG-DASH MPD, (c) image and video overlays in DASH, and (d) updates on lower latency.

An updated overview of DASH standards/features can be found in the Figure below.

MPEG-DASH Status – April 2023

The next meeting will be held in Geneva, Switzerland, from July 17-21, 2023. Further details can be found here.

Click here for more information about MPEG meetings and their developments.

Have any thoughts or questions about neural networks or the other updates described above? Check out Bitmovin’s Video Developer Community and join the conversation!


139th MPEG Meeting Takeaways: MPEG issues Call for Evidence for Video Coding for Machines
https://bitmovin.com/blog/139th-mpeg-meeting-takeaways/
Published: Wed, 24 Aug 2022

Preface

Bitmovin is a proud member of and contributor to several organizations working to shape the future of video, none for longer than the Moving Picture Experts Group (MPEG), where I and a few senior developers at Bitmovin are active members. Personally, I have been a member and attendee of MPEG for 15+ years and have been documenting its progress since early 2010. Today, we’re working hard to further improve the capabilities and efficiency of the industry’s newest standards, such as VVC, LCEVC, and MIV.

The 139th MPEG Meeting – MPEG issues Call for Evidence to drive the future of computer vision and smart transportation

The past few months of research and progress in the world of video standards setting at MPEG (and at Bitmovin alike) have been quite busy, and though we didn’t publish a quarterly blog for the 138th MPEG meeting, it’s worth sharing again that MPEG was awarded two Technology & Engineering Emmy® Awards for its MPEG-DASH and Open Font Format standards. The latest developments in the standards space have, as expected, focused on improvements to VVC and LCEVC; however, there have also been recent updates to CMAF as well as progress on energy efficiency standards and immersive media codecs. I’ve addressed most of these updates below. The official press release of the 139th MPEG meeting can be found here and comprises the following items:

  • MPEG Issues Call for Evidence for Video Coding for Machines (VCM)
  • MPEG Ratifies the Third Edition of Green Metadata, a Standard for Energy-Efficient Media Consumption
  • MPEG Completes the Third Edition of the Common Media Application Format (CMAF) by adding Support for 8K and High Frame Rate for High Efficiency Video Coding
  • MPEG Scene Descriptions adds Support for Immersive Media Codecs
  • MPEG Starts New Amendment of VSEI containing Technology for Neural Network-based Post Filtering
  • MPEG Starts New Edition of Video Coding-Independent Code Points Standard
  • MPEG White Paper on the Third Edition of the Common Media Application Format

In this report, I’d like to focus on VCM, Green Metadata, CMAF, VSEI, and a brief update about DASH (as usual).

Video Coding for Machines (VCM)

MPEG’s exploration work on Video Coding for Machines (VCM) aims at compressing features for machine-performed tasks such as video object detection and event analysis. As neural networks increase in complexity, architectures such as collaborative intelligence, whereby a network is distributed across an edge device and the cloud, become advantageous. With newer network architectures being deployed amongst a heterogeneous population of edge devices, such distributed designs bring flexibility to systems implementers, but they also create a need to efficiently compress intermediate feature information for transport over wide area networks (WANs). As feature information differs substantially from conventional image or video data, coding technologies and solutions for machine usage could differ from conventional human-viewing-oriented applications to achieve optimized performance. With the rise of machine learning technologies and machine vision applications, the amount of video and images consumed by machines has grown rapidly.

Typical use cases include intelligent transportation, smart city technology, intelligent content management, etc., which incorporate machine vision tasks such as object detection, instance segmentation, and object tracking. Due to the large volume of video data, extracting and compressing features from video is essential for efficient transmission and storage. Feature compression technology solicited in this Call for Evidence (CfE) can also be helpful in other regards, such as computational offloading and privacy protection.

Over the last three years, MPEG has investigated potential technologies for efficiently compressing feature data for machine vision tasks and established an evaluation mechanism that includes feature anchors, rate-distortion-based metrics, and evaluation pipelines. The evaluation framework of VCM depicted below comprises neural network tasks (typically informative) at both ends as well as VCM encoder and VCM decoder, respectively. The normative part of VCM typically includes the bitstream syntax which implicitly defines the decoder whereas other parts are usually left open for industry competition and research.

VCM evaluation framework
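As a toy illustration of why feature compression matters in such edge-cloud splits, the following Python sketch uniformly quantizes a (flattened) intermediate feature tensor to 8 bits before transport. Everything here is invented for illustration; it is not part of the VCM evaluation framework or of any response to the CfE.

```python
# Illustrative sketch only: coarse 8-bit quantization of an intermediate
# feature tensor before sending it from an edge device to the cloud.
# The real VCM work solicits far more sophisticated coding tools.

def quantize_features(features, bits=8):
    """Uniformly quantize a flat list of floats to `bits`-bit integer codes."""
    lo, hi = min(features), max(features)
    scale = (hi - lo) / (2**bits - 1) or 1.0  # guard against a constant tensor
    codes = [round((f - lo) / scale) for f in features]
    return codes, lo, scale

def dequantize_features(codes, lo, scale):
    """Reconstruct approximate feature values from the integer codes."""
    return [lo + c * scale for c in codes]

features = [0.0, 0.5, 1.37, -2.2, 3.14, 0.9]   # toy feature map
codes, lo, scale = quantize_features(features)
recon = dequantize_features(codes, lo, scale)
max_err = max(abs(a - b) for a, b in zip(features, recon))
# Each 8-bit code costs 1 byte vs. 4 bytes for a float32 value: a 4x raw
# reduction, before any entropy coding is applied on top.
```

The reconstruction error of this naive scheme is bounded by half a quantization step, which is the kind of rate-distortion trade-off the VCM evaluation framework measures with far more rigour.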

Further details about the CfE and how interested parties can respond can be found in the official press release here.

Green Metadata

MPEG Systems has been working on Green Metadata for the last ten years to enable the adaptation of the client’s power consumption according to the complexity of the bitstream. Many modern implementations of video decoders can adjust their operating voltage or clock speed to adjust the power consumption level according to the required computational power. Thus, if the decoder implementation knows the variation in the complexity of the incoming bitstream, then the decoder can adjust its power consumption level to the complexity of the bitstream. This allows lower energy use in general and extended video playback on battery-powered devices.
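The adaptation loop described above can be caricatured in a few lines of Python. The operating points, complexity scale, and power figures below are invented for illustration; the actual metadata syntax is defined in the Green Metadata standard (ISO/IEC 23001-11).

```python
# Hypothetical sketch of the idea behind Green Metadata: a decoder scales
# its operating point to the signalled complexity of each segment.
# All numbers here are made up for illustration.

OPERATING_POINTS = [          # (max complexity handled, relative power draw)
    (0.33, 0.4),              # low clock:  40% power
    (0.66, 0.7),              # mid clock:  70% power
    (1.00, 1.0),              # full clock: 100% power
]

def pick_power_level(complexity):
    """Return the cheapest operating point able to decode the segment."""
    for max_complexity, power in OPERATING_POINTS:
        if complexity <= max_complexity:
            return power
    return OPERATING_POINTS[-1][1]

# Per-segment complexity hints as they might arrive with the bitstream:
segments = [0.2, 0.9, 0.5, 0.1]
naive = 1.0 * len(segments)                       # always run at full clock
adaptive = sum(pick_power_level(c) for c in segments)
savings = 1 - adaptive / naive                    # fraction of power saved
```

Even this crude three-level scheme saves power on every segment that does not need the full clock, which is exactly the effect the metadata is meant to enable.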

The third edition enables support for Versatile Video Coding (VVC, ISO/IEC 23090-3, a.k.a. ITU-T H.266) encoded bitstreams and enhances the capability of this standard for real-time communication applications and services. While finalizing the support of VVC, MPEG Systems has also started the development of a new amendment to the Green Metadata standard, adding the support of Essential Video Coding (EVC, ISO/IEC 23094-1) encoded bitstreams.

Making video coding and systems sustainable and environmentally friendly will become a major issue in the years to come, especially as more and more video services become available. However, a holistic approach is needed, considering all entities from production to consumption, and Bitmovin is committed to contributing its share to these efforts.

Third Edition of Common Media Application Format (CMAF)

The third edition of CMAF adds two new media profiles for High Efficiency Video Coding (HEVC, ISO/IEC 23008-2, a.k.a. ITU-T H.265), namely for (i) 8K and (ii) High Frame Rate (HFR). Regarding the former, a media profile supporting 8K resolution video encoded with HEVC (Main 10 profile, Main Tier with 10 bits per colour component) has been added to the list of CMAF media profiles for HEVC. The profile will be branded as ‘c8k0’ and will support videos with up to 7680×4320 pixels (8K) and up to 60 frames per second. Regarding the latter, another media profile has been added to the list of CMAF media profiles, branded as ‘c8k1’, which supports HEVC-encoded video with up to 8K resolution and up to 120 frames per second. Finally, chroma location indication support has been added to the third edition of CMAF.
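For illustration, here is a toy parser that reads an ISOBMFF ‘ftyp’ box and checks whether the new 8K brands appear among the compatible brands. The box layout follows ISO/IEC 14496-12, but the choice of ‘cmf2’ as major brand in the example is only an assumption; real packagers and players perform far more validation.

```python
# Toy ISOBMFF 'ftyp' parser checking for the CMAF media-profile brands
# mentioned above: 'c8k0' (8K) and 'c8k1' (8K HFR).
import struct

def parse_ftyp(box: bytes):
    """Parse a single 'ftyp' box: size, type, major brand, compatible brands."""
    size, kind = struct.unpack(">I4s", box[:8])
    assert kind == b"ftyp" and size == len(box)
    major = box[8:12].decode("ascii")
    # bytes 12..16 are the minor version; compatible brands follow in 4-byte chunks
    compatible = [box[i:i + 4].decode("ascii") for i in range(16, size, 4)]
    return major, compatible

# Build a minimal ftyp: major brand 'cmf2' (assumed), minor version 0, brand list.
brands = [b"cmf2", b"c8k0", b"c8k1"]
payload = b"cmf2" + b"\x00" * 4 + b"".join(brands)
box = struct.pack(">I4s", 8 + len(payload), b"ftyp") + payload

major, compatible = parse_ftyp(box)
supports_8k_hfr = "c8k1" in compatible
```

A player could use such a check to decide up front whether a track claims conformance to the 8K HFR media profile before committing decoder resources.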

CMAF is an integral part of the video streaming system and enabler for (live) low-latency streaming. Bitmovin and its co-funded research lab ATHENA significantly contributed to enable (live) low latency streaming use cases through our joint solution with Akamai for chunked CMAF low latency delivery as well as our research projects exploring the challenges of real-world deployments and the best methods to optimize those implementations.

New Amendment for Versatile Supplemental Enhancement Information (VSEI) containing Technology for Neural Network-based Post Filtering

At the 139th MPEG meeting, the MPEG Joint Video Experts Team with ITU-T SG 16 (WG 5; JVET) issued a Committee Draft Amendment (CDAM) text for the Versatile Supplemental Enhancement Information (VSEI) standard (ISO/IEC 23002-7, a.k.a. ITU-T H.274). Beyond the SEI message for shutter interval indication, which is already known from its specification in Advanced Video Coding (AVC, ISO/IEC 14496-10, a.k.a. ITU-T H.264) and High Efficiency Video Coding (HEVC, ISO/IEC 23008-2, a.k.a. ITU-T H.265), and a new indicator for subsampling phase indication which is relevant for variable-resolution video streaming, this new amendment contains two Supplemental Enhancement Information (SEI) messages for describing and activating post filters using neural network technology in video bitstreams. Such filters can be used for purposes like reducing coding noise, upsampling, colour improvement, or denoising. The description of the neural network architecture itself is based on MPEG’s neural network coding standard (ISO/IEC 15938-17). Results from an exploration experiment have shown that neural network-based post filters can deliver better performance than conventional filtering methods. Processes for invoking these new post-processing filters have already been tested in a software framework and will be made available in an upcoming version of the Versatile Video Coding (VVC, ISO/IEC 23090-3, a.k.a. ITU-T H.266) reference software (ISO/IEC 23090-16, a.k.a. ITU-T H.266.2).
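Conceptually, the activation mechanism works like the sketch below: the decoder inspects SEI messages and, if a post-filter is activated, runs it on the output pictures. The message fields are invented, and a trivial 3-tap smoother stands in for the neural network; the real syntax is defined by the VSEI amendment itself.

```python
# Hand-wavy sketch of SEI-driven post-filtering. The dictionary fields and
# the 3-tap smoothing "filter" are placeholders invented for illustration;
# they are not the actual VSEI message syntax or an NN post-filter.

def apply_post_filter(samples, sei_messages):
    """Run a toy denoising filter only if an SEI message activates it."""
    active = any(m.get("type") == "nn_post_filter_activation"
                 and m.get("purpose") == "denoise" for m in sei_messages)
    if not active:
        return samples
    # toy 1-D integer smoothing in place of a neural-network filter
    out = samples[:]
    for i in range(1, len(samples) - 1):
        out[i] = (samples[i - 1] + samples[i] + samples[i + 1]) // 3
    return out

decoded = [10, 50, 10, 50, 10]                 # toy decoded sample row
sei = [{"type": "nn_post_filter_activation", "purpose": "denoise"}]
filtered = apply_post_filter(decoded, sei)     # smoothed
unfiltered = apply_post_filter(decoded, [])    # untouched: no activation SEI
```

The key design point is that the bitstream merely *describes and activates* the filter; decoders that ignore the SEI still produce conforming output.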

Neural network-based video processing (incl. coding) is gaining momentum, and end-user devices are becoming more and more powerful for such complex operations. Bitmovin and its co-funded research lab ATHENA have investigated and researched such options; they recently proposed LiDeR, a lightweight dense residual network for video super-resolution on mobile devices that can compete with other state-of-the-art neural networks while executing ~300% faster.

The latest MPEG-DASH Update

Finally, I’d like to provide a brief update on MPEG-DASH! At the 139th MPEG meeting, MPEG Systems issued a new working draft related to Extended Dependent Random Access Point (EDRAP) streaming and other extensions, which will be further discussed during the Ad-hoc Group (AhG) period (please join the dash email list for further details/announcements). Furthermore, Defects under Investigation (DuI) and Technologies under Consideration (TuC) have been updated. Finally, a new part on encoder and packager synchronization has been added (ISO/IEC 23009-9), for which a working draft has also been produced. Publicly available documents (if any) can be found here.

An updated overview of DASH standards/features can be found in the Figure below.

MPEG-DASH standard status (July 2022)

The next meeting will be face-to-face in Mainz, Germany from October 24-28, 2022. Further details can be found here.

Click here for more information about MPEG meetings and their developments.

Have any questions about the formats and standards described above? Do you think MPEG is taking the first step toward enabling Skynet and Terminators by advancing video coding for machines? 🦾 Check out Bitmovin’s Video Developer Community and let us know your thoughts.


137th MPEG Meeting Takeaways: MPEG Wins Two More Emmy® Awards https://bitmovin.com/blog/137th-mpeg-meeting-takeaways/ Tue, 08 Feb 2022 10:48:47 +0000 https://bitmovin.com/?p=218576 Preface Bitmovin isn’t the only organization whose sole purpose is to shape the future of video – a few senior developers at Bitmovin along with me are active members of the Moving Pictures Expert Group (MPEG). Personally, I have been a member and attendant of MPEG for 15+ years and have been documenting the progress...

The post 137th MPEG Meeting Takeaways: MPEG Wins Two More Emmy® Awards appeared first on Bitmovin.

Preface

Bitmovin isn’t the only organization whose sole purpose is to shape the future of video – a few senior developers at Bitmovin along with me are active members of the Moving Pictures Expert Group (MPEG). Personally, I have been a member and attendant of MPEG for 15+ years and have been documenting the progress since early 2010. Today, we’re working hard to further improve the capabilities and efficiency of the industry’s newest standards, such as VVC, LCEVC, and MIV.

The 137th MPEG Meeting – Immersive Experiences Move Forward

It’s been a long six months of research and progress in the world of video standards-setting. As MPEG (and Bitmovin alike) have had their heads down running new efficiency experiments to improve codecs such as VVC/H.266 in collaboration with Fraunhofer HHI, I haven’t had the chance to publish one of my quarterly meeting reports for the 136th MPEG meeting. So, this month’s report will cover both the 136th and 137th MPEG Meetings. The latest developments in the standards space have expectedly been focused around improvements to VVC & LCEVC; however, what’s progressing faster than usual are technologies that focus on both audio and visual immersive experiences.
I’ve addressed most of the recent updates below. The official press release of the 137th MPEG meeting can be found here and comprises the following items:

    • MPEG Systems Wins Two More Technology & Engineering Emmy® Awards
    • MPEG Audio Coding selects 6DoF Technology for MPEG-I Immersive Audio
    • MPEG Requirements issues Call for Proposals for Encoder and Packager Synchronization
    • MPEG Systems promotes MPEG-I Scene Description to the Final Stage
    • MPEG Systems promotes Smart Contracts for Media to the Final Stage
    • MPEG Systems further enhanced the ISOBMFF Standard
    • MPEG Video Coding completes Conformance and Reference Software for LCEVC
    • MPEG Video Coding issues Committee Draft of Conformance and Reference Software for MPEG Immersive Video
    • JVET produces Second Editions of VVC & VSEI and finalizes VVC Reference Software
    • JVET promotes Tenth Edition of AVC to Final Draft International Standard
    • JVET extends HEVC for High-Capability Applications up to 16K and Beyond
    • MPEG Genomic Coding evaluated Responses on New Advanced Genomics Features and Technologies
  • MPEG White Papers
  • Neural Network Coding (NNC)
  • Low Complexity Enhancement Video Coding (LCEVC)
  • MPEG Immersive Video

In this report, I’d like to focus on the Emmy® Awards, video coding updates (AVC, HEVC, VVC, and beyond), and a brief update about DASH (as usual).

MPEG Systems Wins Two More Technology & Engineering Emmy® Awards

MPEG Systems is pleased to report that MPEG is being recognized this year by the National Academy for Television Arts and Sciences (NATAS) with two Technology & Engineering Emmy® Awards, for (i) “standardization of font technology for custom downloadable fonts and typography for Web and TV devices” and for (ii) “standardization of HTTP encapsulated protocols”, respectively.
The first of these Emmys is related to MPEG’s Open Font Format (ISO/IEC 14496-22) and the second of these Emmys is related to MPEG Dynamic Adaptive Streaming over HTTP (i.e., MPEG DASH, ISO/IEC 23009). The MPEG DASH standard is the only commercially deployed international standard technology for media streaming over HTTP and it is widely used in many products. MPEG developed the first edition of the DASH standard in 2012 in collaboration with 3GPP and since then has produced four more editions amending the core specification by adding new features and extended functionality. Furthermore, MPEG has developed six other standards as additional “parts” of ISO/IEC 23009 enabling the effective use of the MPEG DASH standards with reference software and conformance testing tools, guidelines, and enhancements for additional deployment scenarios. MPEG DASH has dramatically changed the streaming industry by providing a standard that is widely adopted by various consortia such as 3GPP, ATSC, DVB, and HbbTV, and across different sectors. The success of this standard is due to its technical excellence, large participation of the industry in its development, addressing the market needs, and working with all sectors of industry all under ISO/IEC JTC 1/SC 29 MPEG Systems’ standard development practices and leadership.
These are MPEG’s fifth and sixth Technology & Engineering Emmy® Awards (after MPEG-1 and MPEG-2 together with JPEG in 1996, Advanced Video Coding (AVC) in 2008, MPEG-2 Transport Stream in 2013, and ISO Base Media File Format in 2021) and MPEG’s seventh and eighth overall Emmy® Awards (including the Primetime Engineering Emmy® Awards for Advanced Video Coding (AVC) High Profile in 2008 and High-Efficiency Video Coding (HEVC) in 2017).
Bitmovin and its founders have been actively contributing to the MPEG DASH standard since its inception. My initial blog post dates back to 2010 and the first edition of MPEG DASH was published in 2012. A more detailed MPEG DASH timeline provides many pointers to the Institute of Information Technology (ITEC) at the Alpen-Adria-Universität Klagenfurt and its DASH activities, which are now continued within the Christian Doppler Laboratory ATHENA. In the end, the MPEG DASH community of contributors to and users of the standard can be very proud of this achievement just 10 years after the first edition was published. Thus, happy 10th birthday, MPEG DASH, and what a nice birthday gift.

Video Coding Updates

In terms of video coding, there have been many updates across various standards’ projects at the 137th MPEG Meeting.

Advanced Video Coding

Starting with Advanced Video Coding (AVC), the 10th edition of Advanced Video Coding (AVC, ISO/IEC 14496-10 | ITU-T H.264) has been promoted to Final Draft International Standard (FDIS), which is the final stage of the standardization process. Beyond various text improvements, this edition specifies a new SEI message for describing the shutter interval applied during video capture. This can be variable in video cameras, and conveying this information can be valuable for analysis and post-processing of the decoded video.

High-Efficiency Video Coding

The High-Efficiency Video Coding (HEVC, ISO/IEC 23008-2 | ITU-T H.265) standard has been extended to support high-capability applications. It defines new levels and tiers providing support for very high bit rates and video resolutions up to 16K, as well as defining an unconstrained level. This will enable the usage of HEVC in new application domains, including professional, scientific, and medical video sectors.

Versatile Video Coding

The second editions of Versatile Video Coding (VVC, ISO/IEC 23090-3 | ITU-T H.266) and Versatile supplemental enhancement information messages for coded video bitstreams (VSEI, ISO/IEC 23002-7 | ITU-T H.274) have reached FDIS status. The new VVC version defines profiles and levels supporting larger bit depths (up to 16 bits), including some low-level coding tool modifications to obtain improved compression efficiency with high bit-depth video at high bit rates. VSEI version 2 adds SEI messages giving additional support for scalability, multi-view, display adaptation, improved stream access, and other use cases. Furthermore, a Committee Draft Amendment (CDAM) for the next amendment of VVC was issued to begin the formal approval process to enable linking VVC with the Green Metadata (ISO/IEC 23001-11) and Video Decoding Interface (ISO/IEC 23090-13) standards and add a new unconstrained level for exceptionally high capability applications such as certain uses in professional, scientific, and medical application scenarios. Finally, the reference software package for VVC (ISO/IEC 23090-16) was also completed with its achievement of FDIS status. Reference software is extremely helpful for developers of VVC devices, helping them in testing their implementations for conformance to the video coding specification.

Beyond VVC

In terms of video coding activities beyond VVC capabilities, the Enhanced Compression Model (ECM 3.1) shows an improvement of close to 15% over VTM-11.0 + JVET-V0056 (i.e., the VVC reference software) for Random Access Main 10. This is indeed encouraging and, in general, these activities are currently managed within two exploration experiments (EEs). The first is on neural network-based (NN) video coding technology (EE1) and the second is on enhanced compression beyond VVC capability (EE2). EE1 currently plans to further investigate (i) enhancement filters (loop and post) and (ii) super-resolution (JVET-Y2023). It will further investigate selected NN technologies on top of ECM 4 and the implementation of selected NN technologies in the software library, for platform-independent cross-checking and integerization. Enhanced Compression Model 4 (ECM 4) comprises new elements on MRL for intra, various GPM/affine/MV-coding improvements including TM, adaptive intra MTS, coefficient sign prediction, CCSAO improvements, bug fixes, and encoder improvements (JVET-Y2025). EE2 will investigate intra prediction improvements, inter prediction improvements, improved screen content tools, and improved entropy coding (JVET-Y2024).
Bitmovin Encoding has supported AVC and HEVC for many years and is currently investigating the integration of VVC. Bitmovin Player is coding-format agnostic and utilizes the underlying hardware/software platform for decoding, while efficiently handling multi-codec use cases.

The latest MPEG-DASH Update

Finally, I’d like to provide a brief update on MPEG-DASH! At the 137th MPEG meeting, MPEG Systems issued a draft amendment to the core MPEG-DASH specification (i.e., ISO/IEC 23009-1) about Extended Dependent Random Access Point (EDRAP) streaming and other extensions which it will be further discussed during the Ad-hoc Group (AhG) period (please join the dash email list for further details/announcements). Furthermore, Defects under Investigation (DuI) and Technologies under Consideration (TuC) are available here.
An updated overview of DASH standards/features can be found in the Figure below.

MPEG-DASH standard status (January 2022)

The next meeting will be again an online meeting in April 2022.
Click here for more information about MPEG meetings and their developments.

135th MPEG Meeting Takeaways: MPEG Immersive Video is here https://bitmovin.com/blog/135th-mpeg-meeting-takeaways/ Thu, 05 Aug 2021 15:01:14 +0000 https://bitmovin.com/?p=184108 Preface Bitmovin isn’t the only organization whose sole purpose is to shape the future of video – a few senior developers at Bitmovin along with me are active members of the Moving Pictures Expert Group (MPEG). Personally, I have been a member and attendant of MPEG for 15+ years and have been documenting the progress...

The post 135th MPEG Meeting Takeaways: MPEG Immersive Video is here appeared first on Bitmovin.


Preface

Bitmovin isn’t the only organization whose sole purpose is to shape the future of video – a few senior developers at Bitmovin along with me are active members of the Moving Pictures Expert Group (MPEG). Personally, I have been a member and attendant of MPEG for 15+ years and have been documenting the progress since early 2010. Today, we’re working hard to further improve the capabilities and efficiency of the industry’s newest standards, such as VVC and MIV.

The 135th MPEG Meeting

The 135th MPEG Meeting was defined by the development of truly “future-like” video experiences. As we gathered for (hopefully) the second to last time in a virtual-only setting ahead of our next meeting in late October, this group of video experts came together to improve what was historically considered the future of virtual content: Immersive Video experiences. As the weekend came to a close we focused on progressing two primary initiatives:

  1. Helping bring VVC to market
  2. Creating a standardized definition for how immersive video experiences (such as multi-view) should be handled during transmission.

As such, the working group made significant testing progress for VVC to verify its effectiveness, and MPEG Immersive Video (MIV) was officially moved into the final stage before approval.
The official press release can be found here and comprises the following items:

  • MPEG Video Coding promotes MPEG Immersive Video (MIV) to the FDIS stage
  • Verification tests for more application cases of Versatile Video Coding (VVC)
  • MPEG Systems reaches the first milestone for Video Decoding Interface for Immersive Media
  • MPEG Systems further enhances the extensibility and flexibility of Network-based Media Processing
  • MPEG Systems completes support of Versatile Video Coding and Essential Video Coding in High-Efficiency Image File Format
  • Two MPEG White Papers:
    • Versatile Video Coding (VVC)
    • MPEG-G and its application of regulation and privacy

In this report, I’d like to focus on MIV and VVC including systems-related aspects as well as a brief update about DASH (as usual).

MPEG Immersive Video (MIV)

At the 135th MPEG meeting, MPEG Video Coding subgroup has promoted the MPEG Immersive Video (MIV) standard to the Final Draft International Standard (FDIS) stage. MIV was developed to support the compression of immersive video content in which multiple real or virtual cameras capture a real or virtual 3D scene. The standard enables storage and distribution of immersive video content over existing and future networks for playback with 6 Degrees of Freedom (6DoF) of view position and orientation.

3 Degrees of Freedom vs 6 Degrees of Freedom in Immersive Video

From a technical point of view, MIV is a flexible standard for multiview video with depth (MVD) that leverages the strong hardware support for commonly used video codecs to code volumetric video. The source views may use one of three projection formats: (i) equirectangular, (ii) perspective, or (iii) orthographic. By packing and pruning views, MIV can achieve bit rates around 25 Mb/s and a pixel rate equivalent to HEVC Level 5.2.
The MIV standard is designed as a set of extensions and profile restrictions for the Visual Volumetric Video-based Coding (V3C) standard (ISO/IEC 23090-5). The main body of this standard is shared between MIV and the Video-based Point Cloud Coding (V-PCC) standard (ISO/IEC 23090-5 Annex H). It may potentially be used by other MPEG-I volumetric codecs under development. The carriage of MIV is specified through the Carriage of V3C Data standard (ISO/IEC 23090-10).
The test model and objective metrics are publicly available at https://gitlab.com/mpeg-i-visual.
At the same time, MPEG Systems has begun developing the Video Decoding Interface for Immersive Media (VDI) standard (ISO/IEC 23090-13) for video decoders’ input and output interfaces to provide more flexible use of the video decoder resources for such applications. At the 135th MPEG meeting, MPEG Systems has reached the first formal milestone of developing the ISO/IEC 23090-13 standard by promoting the text to Committee Draft ballot status. The VDI standard allows for dynamic adaptation of video bitstreams to provide the decoded output pictures in such a way so that the number of actual video decoders can be smaller than the number of the elementary video streams to be decoded. In other cases, virtual instances of video decoders can be associated with the portions of elementary streams required to be decoded. With this standard, the resource requirements of a platform running multiple virtual video decoder instances can be further optimized by considering the specific decoded video regions that are to be actually presented to the users rather than considering only the number of video elementary streams in use.
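The resource argument behind VDI can be sketched as a bin-packing problem: several elementary streams share fewer physical decoder instances as long as their combined pixel rate fits. The capacity figure and the greedy packing below are invented for illustration; the actual interface is specified in ISO/IEC 23090-13.

```python
# Illustrative sketch only: how pooling elementary streams onto shared
# decoder instances can reduce the number of decoders needed.
# The capacity constant is an arbitrary assumption, not a VDI value.

DECODER_CAPACITY = 1_000_000_000   # luma samples/sec one decoder can handle (assumed)

def decoders_needed(stream_rates):
    """Greedy first-fit packing of per-stream pixel rates onto decoders."""
    loads = []                                  # current load per decoder
    for rate in sorted(stream_rates, reverse=True):
        for i, load in enumerate(loads):
            if load + rate <= DECODER_CAPACITY:
                loads[i] += rate
                break
        else:                                   # no decoder has room: add one
            loads.append(rate)
    return len(loads)

# Six 1080p60 views (~124M luma samples/s each) from an immersive capture rig.
# Naively that is six decoders (one per elementary stream); pooled, far fewer.
views = [1920 * 1080 * 60] * 6
pooled = decoders_needed(views)
```

On top of this, VDI goes further than simple packing: it also allows discarding regions that will never be presented, shrinking the effective load below the raw per-stream pixel rates.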
Immersive media applications and services offering various degrees of freedom are becoming more and more important. The Quality of Experience (QoE) for such applications is defined in a QUALINET white paper. Bitmovin actively supports application-oriented basic research in the context of the Christian Doppler Laboratory ATHENA, e.g., Objective and Subjective QoE Evaluation for Adaptive Point Cloud Streaming, From Capturing to Rendering: Volumetric Media Delivery With Six Degrees of Freedom, and SLFC: Scalable Light Field Coding.

Versatile Video Coding (VVC) updates

The third round of verification testing for Versatile Video Coding (VVC) has been completed. This includes the testing of High Dynamic Range (HDR) content of 4K ultra-high-definition (UHD) resolution using the Hybrid Log-Gamma (HLG) and Perceptual Quantization (PQ) video formats. The test was conducted using state-of-the-art high-quality consumer displays, emulating an internet streaming-type scenario.
On average, VVC showed approximately a 50% bit rate reduction compared to High-Efficiency Video Coding (HEVC).
Additionally, the ISO/IEC 23008-12 Image File Format has been amended to support images coded using Versatile Video Coding (VVC) and Essential Video Coding (EVC).
VVC verification tests confirm a 50% bit rate reduction compared to its predecessor and as such, it is certainly considered a promising candidate for future deployments. Within Bitmovin we have successfully integrated current implementations of the VVC standard and in terms of licenses, it seems there’s a light at the end of the tunnel due to the recent announcement of Access Advance.
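As a back-of-the-envelope example of what a ~50% bit-rate reduction means in practice, consider a single hypothetical 4K streaming rung; the 16 Mbit/s figure below is illustrative and is not taken from the verification tests.

```python
# Rough arithmetic: delivery savings from halving the bit rate of one
# 4K rung at equal subjective quality. All figures are illustrative.

hevc_kbps = 16_000                # assumed HEVC 4K rung (kbit/s)
vvc_kbps = hevc_kbps * 0.5        # ~50% reduction reported by the tests
hours_streamed = 2                # one feature-length movie

# kbit/s * seconds -> kbit; /8 -> kB; /1e6 -> GB
gb_saved = (hevc_kbps - vvc_kbps) * hours_streamed * 3600 / 8 / 1e6
```

At these assumed numbers a single two-hour 4K session saves roughly 7 GB of delivered data, which compounds quickly across a large audience and is why the 50% figure matters commercially, not just academically.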

The latest MPEG-DASH Update

Finally, I’d like to provide a brief update on MPEG-DASH! At the 135th MPEG meeting, MPEG Systems issued a draft amendment to the core MPEG-DASH specification (i.e., ISO/IEC 23009-1) that provides further improvements to Preroll, which has been renamed to Preperiod; it will be further discussed during the Ad-hoc Group (AhG) period (please join the dash email list for further details/announcements). Additionally, this amendment includes some minor improvements for nonlinear playback. The so-called Technologies under Consideration (TuC) document comprises new proposals that did not yet reach consensus for promotion to any official standards documents (e.g., amendments to existing DASH standards or new parts). Currently, proposals for minimizing initial delay are discussed among others. Finally, libdash has been updated to support the MPEG-DASH schema according to the 5th edition.
An updated overview of DASH standards/features can be found in the Figure below.
MPEG-DASH standard status (July 2021)
The next meeting will again be an online meeting in October 2021, but MPEG is aiming to meet in person again in January 2022 (if possible).
Click here for more information about MPEG meetings and their developments.

134th MPEG Meeting Takeaways: Standardizing Neural Network Compression for Multimedia Applications https://bitmovin.com/blog/134th-mpeg-meeting-takeaways/ Tue, 11 May 2021 00:00:55 +0000 https://bitmovin.com/?p=168775 Preface Bitmovin isn’t the only organization whose sole purpose is to shape the future of video – a few senior developers at Bitmovin along with me are active members of the Moving Pictures Expert Group (MPEG). Personally, I have been a member and attendant of MPEG for 15+ years and have been documenting the progress...

The post 134th MPEG Meeting Takeaways: Standardizing Neural Network Compression for Multimedia Applications appeared first on Bitmovin.


Preface

Bitmovin isn’t the only organization whose sole purpose is to shape the future of video – a few senior developers at Bitmovin along with me are active members of the Moving Pictures Expert Group (MPEG). Personally, I have been a member and attendant of MPEG for 15+ years and have been documenting the progress since early 2010. 

The 134th MPEG Meeting 

Although MPEG meetings are based on the premise of setting video standards, the 134th MPEG meeting took an extra step forward towards future-oriented technologies. As organizations are looking for new ways to innovatively deliver more content using the most efficient methods, MPEG completed multiple carriages of codecs such as VVC, EVC, and V3C. In addition, the working group made progress on the application of neural network-oriented compression.
When it comes to the newly finalized codecs, VVC and LCEVC, MPEG ran some new quality verification tests. Lastly, the group made a set of calls for new proposals for upcoming topics.
The official press release can be found here and comprises the following items:

  • First International Standard on Neural Network Compression for Multimedia Applications
  • Completion of the carriage of VVC and EVC
  • Completion of the carriage of V3C in ISOBMFF
  • Call for Proposals
    • New Advanced Genomics Features and Technologies
    • MPEG-I Immersive Audio
    • Coded Representation of Haptics
  • MPEG evaluated Responses on Incremental Compression of Neural Networks
  • Progression of MPEG 3D Audio Standards
  • The first milestone of development of Open Font Format (2nd amendment)
  • Verification tests:
    • Low Complexity Enhancement Video Coding (LCEVC) Verification Test
    • More application cases of Versatile Video Coding (VVC)
  • Standardization work on Version 2 of VVC and VSEI started

In this report, I’d like to focus on streaming-related aspects including a brief update about DASH (as usual).

First International Standard on Neural Network Compression for Multimedia Applications

Artificial neural networks have been adopted for a broad range of tasks in multimedia analysis and processing, such as visual and acoustic classification, extraction of multimedia descriptors, or image and video coding. The trained neural networks for these applications contain many parameters (i.e., weights), resulting in a considerable size. Thus, transferring them to several clients (e.g., mobile phones, smart cameras) benefits from a compressed representation of neural networks.
At the 134th MPEG meeting, MPEG Video ratified the first international standard on Neural Network Compression for Multimedia Applications (ISO/IEC 15938-17), designed as a toolbox of compression technologies. The specification contains different methods for

  • parameter reduction (e.g., pruning, sparsification, matrix decomposition),
  • parameter transformation (e.g., quantization), and
  • entropy coding 

These methods can be assembled into encoding pipelines that combine one or more methods from each group (multiple methods in the case of reduction).
The results show that trained neural networks for many common multimedia problems, such as image or audio classification or image compression, can be compressed by a factor of 10–20 with no performance loss, and by more than 30 with some performance trade-off. The specification is not limited to a particular neural network architecture and is independent of the choice of neural network exchange format. Interoperability with common neural network exchange formats is described in the annexes of the standard.
As neural networks become increasingly important, communicating them over heterogeneous networks to a plethora of devices raises various challenges, including the need for efficient compression, which this standard addresses. ISO/IEC 15938 is commonly referred to as MPEG-7 (or the “multimedia content description interface”), and this standard now becomes Part 17 of MPEG-7.
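To make the three method groups concrete, here is a toy Python sketch of such a pipeline: pruning as parameter reduction, uniform scalar quantization as parameter transformation, and DEFLATE standing in for the standard's entropy-coding tools. It is purely illustrative and not the actual MPEG-7 Part 17 toolchain.

```python
import random
import zlib

def compress_weights(weights, prune_ratio=0.5, bits=8):
    """Toy pipeline mirroring the standard's three method groups:
    parameter reduction -> parameter transformation -> entropy coding."""
    # 1) Parameter reduction: prune the smallest-magnitude weights (sparsification).
    cutoff = sorted(abs(w) for w in weights)[int(len(weights) * prune_ratio)]
    pruned = [0.0 if abs(w) < cutoff else w for w in weights]

    # 2) Parameter transformation: uniform scalar quantization to signed `bits`-bit ints,
    #    stored as two's-complement bytes.
    scale = max(abs(w) for w in pruned) / (2 ** (bits - 1) - 1)
    q = bytes(round(w / scale) & 0xFF for w in pruned)

    # 3) Entropy coding: DEFLATE as a stand-in for the standard's entropy-coding tools.
    payload = zlib.compress(q, 9)
    original_size = len(weights) * 4  # float32 storage of the uncompressed network
    return payload, original_size / len(payload)

random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(10_000)]
payload, factor = compress_weights(weights)
print(f"compression factor: {factor:.1f}x")
```

Even this crude sketch typically shrinks the parameters severalfold; the standard's dedicated tools reach the 10–20x factors cited above.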

Carriage of Media Assets

At the 134th MPEG meeting, MPEG Systems completed the carriage of various media assets in MPEG-2 Systems (Transport Stream) and the ISO Base Media File Format (ISOBMFF), respectively.
In particular, the standards for the carriage of Versatile Video Coding (VVC) and Essential Video Coding (EVC) over both MPEG-2 Transport Stream (M2TS) and the ISO Base Media File Format (ISOBMFF) reached their final stages of standardization:

  • For M2TS, the standard defines constraints on VVC and EVC elementary streams for their carriage in packetized elementary stream (PES) packets. Additionally, buffer management mechanisms and a transport system target decoder (T-STD) model extension are defined.
  • For ISOBMFF, the standard defines the carriage of codec initialization information for VVC and EVC. It also defines samples and sub-samples reflecting the high-level bitstream structure and independently decodable units of both video codecs. For VVC, signaling and extraction of specific operating points are also supported.

Finally, MPEG Systems completed the standard for the carriage of Visual Volumetric Video-based Coding (V3C) data using ISOBMFF. The standard supports media comprising multiple independent component bitstreams and accounts for the fact that only some portions of an immersive media asset need to be rendered according to the user’s position and viewport. To this end, it defines metadata indicating the relationship between regions in 3D space and their locations in the bitstream. In addition, the delivery of an ISOBMFF file containing V3C content over DASH and MMT is also specified in this standard.
To support the standardization efforts at MPEG, Bitmovin recently conducted various tests of VVC using the base VTM encoder library with very positive results, confirming a 45% bitrate improvement over HEVC.

Call for Proposals and Verification Tests

At the 134th MPEG meeting, MPEG issued three Call for Proposals (CfPs) that are briefly highlighted in the following:

  • Coded Representation of Haptics: Haptics provide an additional layer of entertainment and sensory immersion beyond audio and visual media. This CfP aims to specify a coded representation of haptics data, e.g., to be carried using ISO Base Media File Format (ISOBMFF) files in the context of MPEG-DASH or other MPEG-I standards.
  • MPEG-I Immersive Audio: Immersive Audio will complement other parts of MPEG-I (i.e., Part 3, “Immersive Video” and Part 2, “Systems Support”) in order to provide a suite of standards that will support a Virtual Reality (VR) or an Augmented Reality (AR) presentation in which the user can navigate and interact with the environment using 6 degrees of freedom (6 DoF), that being spatial navigation (x, y, z) and user head orientation (yaw, pitch, roll).
  • New Advanced Genomics Features and Technologies: This CfP aims to collect submissions of new technologies that can (i) provide improvements to the current compression, transport and indexing capabilities of the ISO/IEC 23092 standards suite, particularly applied to data consisting of very long reads generated by 3rd generation sequencing devices, (ii) provide the support for representation and usage of graph genome references, (iii) include coding modes relying on machine learning processes, satisfying data access modalities required by machine learning and providing higher compression, and (iv) support of interfaces with existing standards for the interchange of clinical data.

Detailed information, including instructions on how to respond to the call for proposals, the requirements that must be considered, the test data to be used, and the submission and evaluation procedures for proponents are available at www.mpeg.org.
Calls for proposals typically mark the beginning of the formal standardization work, whereas verification tests are conducted once a standard has been completed. At the 134th MPEG meeting, and despite the difficulties caused by the pandemic situation, MPEG completed verification tests for Versatile Video Coding (VVC) and Low Complexity Enhancement Video Coding (LCEVC).
For LCEVC, verification tests measured the benefits of enhancing four existing codecs of different generations (i.e., AVC, HEVC, EVC, VVC) using tools as defined in LCEVC within two sets of tests:

  • The first set of tests compared LCEVC-enhanced encoding with full-resolution single-layer anchors. The average bit rate savings produced by LCEVC when enhancing AVC were determined to be approximately 46% for UHD and 28% for HD; when enhancing HEVC, approximately 31% for UHD and 24% for HD. Test results tend to indicate an overall benefit also when using LCEVC to enhance EVC and VVC.
  • The second set of tests confirmed that LCEVC provided a more efficient means of resolution enhancement of half-resolution anchors than unguided up-sampling. Comparing LCEVC full-resolution encoding with the up-sampled half-resolution anchors, the average bit-rate savings when using LCEVC with AVC, HEVC, EVC and VVC were calculated to be approximately 28%, 34%, 38%, and 32% for UHD and 27%, 26%, 21%, and 21% for HD, respectively.

For VVC, this was already the second round of verification testing, covering the following aspects:

  • 360-degree video for equirectangular and cubemap formats, where VVC shows on average more than 50% bit rate reduction compared to the previous major generation of MPEG video coding standard known as High-Efficiency Video Coding (HEVC), developed in 2013.
  • Low-delay applications such as compression of conversational (teleconferencing) and gaming content, where the compression benefit is about 40% on average.
  • HD video streaming, with an average bitrate reduction of close to 50%.

A previous set of tests for 4K UHD content completed in October 2020 showed similar gains. These verification tests used formal subjective visual quality assessment testing with “naïve” human viewers. The tests were performed under a strict hygienic regime in two test laboratories to ensure safe conditions for the viewers and test managers.

The latest MPEG-DASH Update

Finally, I’d like to provide a brief update on MPEG-DASH! At the 134th MPEG meeting, MPEG Systems recommended the approval of ISO/IEC FDIS 23009-1 5th edition. That is, the 5th edition of the MPEG-DASH core specification will be available sometime this year. Additionally, MPEG has requested that this specification be made freely available, which marks an important milestone in the development of the MPEG-DASH standard. Most importantly, the 5th edition of this standard incorporates CMAF support as well as other enhancements defined in the amendment of the previous edition. The MPEG-DASH subgroup of MPEG Systems is already working on the first amendment to the 5th edition, entitled “Preroll, nonlinear playback, and other extensions”. The 5th edition is expected to impact related specifications not only within MPEG but also in other Standards Developing Organizations (SDOs) such as DASH-IF, which defines interoperability points (IOPs) for various codecs, and CTA WAVE (Web Application Video Ecosystem), which defines device playback capabilities such as the Common Media Client Data (CMCD). Both DASH-IF and CTA WAVE provide (conformance) test infrastructure for DASH and CMAF.
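As an illustration of the kind of playback data CTA WAVE’s CMCD standardizes, here is a hedged sketch of a client attaching a few CMCD keys to a segment request as a query argument. The key subset and the helper function are my own illustration based on CTA-5004; real players implement the full key set and all transmission modes (query argument, HTTP headers, or JSON).

```python
from urllib.parse import quote

def cmcd_query(url: str, bitrate_kbps: int, buffer_ms: int,
               throughput_kbps: int, session_id: str) -> str:
    """Attach a CMCD payload as a query argument, one of the transmission
    modes defined in CTA-5004. Illustrative subset of keys only."""
    pairs = {
        "br": bitrate_kbps,                   # encoded bitrate of the requested object (kbps)
        "bl": buffer_ms // 100 * 100,         # buffer length, rounded to 100 ms
        "mtp": throughput_kbps // 100 * 100,  # measured throughput, rounded to 100 kbps
        "sid": f'"{session_id}"',             # session id, a quoted string
    }
    # CMCD payloads are comma-separated key=value pairs in alphabetical key order.
    payload = ",".join(f"{k}={v}" for k, v in sorted(pairs.items()))
    sep = "&" if "?" in url else "?"
    return f"{url}{sep}CMCD={quote(payload)}"

out = cmcd_query("https://cdn.example.com/seg42.m4s", 3000, 2350, 25400, "abc123")
print(out)
```

The server or CDN can then log these values per request to correlate delivery performance with player state.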
An updated overview of DASH standards/features can be found in the Figure below.
Figure: MPEG-DASH standard status as of the 134th MPEG meeting
The next meeting will again be an online meeting in July 2021.
Click here for more information about MPEG meetings and their developments
A little lost about the formats and standards described above? Check out some other great educational content to learn more!

The post 134th MPEG Meeting Takeaways: Standardizing Neural Network Compression for Multimedia Applications appeared first on Bitmovin.

133rd MPEG Meeting Takeaways: 6th Emmy® Award for MPEG Technology https://bitmovin.com/blog/133rd-mpeg-meeting-takeaways/ Mon, 01 Feb 2021 16:19:37 +0000


Preface

Bitmovin isn’t the only organization whose sole purpose is to shape the future of video – a few senior developers at Bitmovin along with me are active members of the Moving Picture Experts Group (MPEG). Personally, I have been a member and attendant of MPEG for 15+ years and have been documenting the progress since early 2010. The 133rd MPEG meeting was the 4th consecutive virtual meeting, officially marking a full year of online-only developments in video standardization. Although in-person meetings are highly effective, I’ve found that our new segmented process is proving just as productive virtually as it was when we were all together. Amongst the many innovations that this year has yielded, the 133rd MPEG meeting has once again made great leaps in the field of video standardization with further development of the EVC codec, new VVC codec milestones, and even an Emmy® Award.

The 133rd MPEG Meeting Makes Significant Progress in the Face of Adversity

The 133rd MPEG meeting was once again held online and, this time, kicked off with great news: MPEG is one of the organizations honored as a recipient of the 72nd Annual Technology & Engineering Emmy® Awards, specifically the MPEG Systems File Format subgroup for its ISO Base Media File Format (ISOBMFF) and related standards.
The official press release can be found here and comprises the following items:

  • 6th Emmy® Award for MPEG Technology: MPEG Systems File Format Subgroup wins Technology & Engineering Emmy® Award
  • Essential Video Coding (EVC) verification test finalized
  • MPEG issues a Call for Evidence on Video Coding for Machines
  • Neural Network Compression for Multimedia Applications – MPEG calls for technologies for incremental coding of neural networks
  • MPEG Systems reaches the first milestone for supporting Versatile Video Coding (VVC) and Essential Video Coding (EVC) in the Common Media Application Format (CMAF)
  • MPEG Systems continuously enhances Dynamic Adaptive Streaming over HTTP (DASH)
  • MPEG Systems reached the first milestone to carry event messages in tracks of the ISO Base Media File Format

In this report, I’d like to focus on ISOBMFF, EVC, CMAF, and DASH.

MPEG Systems File Format Subgroup wins Technology & Engineering Emmy® Award

MPEG is pleased to report that the File Format subgroup of MPEG Systems is being recognized this year by the National Academy of Television Arts and Sciences (NATAS) with a Technology & Engineering Emmy® for its 20 years of work on the ISO Base Media File Format (ISOBMFF). This format was first standardized in 1999 as part of the MPEG-4 Systems specification and is now in its 6th edition as ISO/IEC 14496-12. It has been used and adopted by many other specifications, e.g.:

  • MP4 and 3GP file formats;
  • Carriage of NAL unit structured video in the ISO Base Media File Format which provides support for AVC, HEVC, VVC, EVC, and probably soon LCEVC;
  • MPEG-21 file format;
  • Dynamic Adaptive Streaming over HTTP (DASH) and Common Media Application Format (CMAF);
  • High-Efficiency Image Format (HEIF);
  • Timed text and other visual overlays in ISOBMFF;
  • Common encryption format;
  • Carriage of timed metadata metrics of media;
  • Derived visual tracks;
  • Event message track format;
  • Carriage of uncompressed video;
  • Omnidirectional Media Format (OMAF);
  • Carriage of visual volumetric video-based coding data;
  • Carriage of geometry-based point cloud compression data;
  • … to be continued!
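Since ISOBMFF underpins so many of these formats, its core structure is worth illustrating: a file is a sequence of “boxes”, each prefixed with a 4-byte big-endian size and a 4-byte type. The sketch below is my own minimal illustration; it ignores 64-bit `largesize` fields, size==0 “extends to end of file” boxes, and nested box traversal.

```python
import struct

def iter_boxes(data: bytes):
    """Yield (type, payload) for each top-level ISOBMFF box.
    Minimal sketch: 32-bit box sizes only."""
    offset = 0
    while offset + 8 <= len(data):
        size, = struct.unpack_from(">I", data, offset)   # 4-byte big-endian size
        box_type = data[offset + 4:offset + 8].decode("ascii")  # 4-char type code
        yield box_type, data[offset + 8:offset + size]
        offset += size

# A tiny hand-built file: an `ftyp` box (major brand, minor version,
# compatible brands) followed by an empty `moov` box.
ftyp_payload = b"isom" + b"\x00\x00\x02\x00" + b"isomiso2"
sample = struct.pack(">I", 8 + len(ftyp_payload)) + b"ftyp" + ftyp_payload
sample += struct.pack(">I", 8) + b"moov"

parsed = [t for t, _ in iter_boxes(sample)]
print(parsed)  # → ['ftyp', 'moov']
```

This simple length-prefixed design is precisely what makes ISOBMFF so extensible: readers skip boxes they do not understand by jumping `size` bytes ahead.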

This is MPEG’s fourth Technology & Engineering Emmy® Award (after MPEG-1 and MPEG-2 together with JPEG in 1996, Advanced Video Coding (AVC) in 2008, and MPEG-2 Transport Stream in 2013) and sixth overall Emmy® Award including the Primetime Engineering Emmy® Awards for Advanced Video Coding (AVC) High Profile in 2008 and High-Efficiency Video Coding (HEVC) in 2017, respectively.
In addition, Bitmovin, a highly involved contributor and tester of MPEG’s work was awarded an Emmy® for its Development of Massive Processing Optimized Compression Technologies.

Essential Video Coding (EVC) verification test finalized

At the 133rd MPEG meeting, a verification testing assessment of the Essential Video Coding (EVC) standard was completed. The first part of the EVC verification test using high dynamic range (HDR) and wide color gamut (WCG) was completed at the 132nd MPEG meeting. A subjective quality evaluation was conducted comparing the EVC Main profile to the HEVC Main 10 profile and the EVC Baseline profile to AVC High 10 profile, respectively:

  • Analysis of the subjective test results showed that the average bitrate savings for EVC Main profile are approximately 40% compared to HEVC Main 10 profile, using UHD and HD SDR content encoded in both random access and low delay configurations.
  • The average bitrate savings for the EVC Baseline profile compared to the AVC High 10 profile is approximately 40% using UHD SDR content encoded in the random-access configuration and approximately 35% using HD SDR content encoded in the low delay configuration.
  • Verification test results using HDR content had shown average bitrate savings for EVC Main profile of approximately 35% compared to HEVC Main 10 profile.

By providing significantly improved compression efficiency compared to HEVC and earlier video coding standards while encouraging the timely publication of licensing terms, the MPEG-5 EVC standard is expected to meet the market needs of emerging delivery protocols and networks, such as 5G, enabling the delivery of high-quality video services to an ever-growing audience. 
In addition to the verification tests, EVC, along with VVC and CMAF, was subject to further improvements to its systems support.

MPEG Systems reaches the first milestone for supporting Versatile Video Coding (VVC) and Essential Video Coding (EVC) in the Common Media Application Format (CMAF)

At the 133rd MPEG meeting, MPEG Systems promoted Amendment 2 of the Common Media Application Format (CMAF) to Committee Draft Amendment (CDAM) status, the first major milestone in the ISO/IEC approval process. This amendment defines:

  • constraints to (i) Versatile Video Coding (VVC) and (ii) Essential Video Coding (EVC) video elementary streams when carried in a CMAF video track;
  • codec parameters to be used for CMAF switching sets with VVC and EVC tracks; and
  • support of the newly introduced MPEG-H 3D Audio profile.

It is expected to reach its final milestone in early 2022.
Bitmovin was one of the first adopters of CMAF (see here and here) and recently evaluated the base VVC encoder, VTM, against the HEVC codec. Investigating and evaluating new formats or extensions of existing formats is very important for future applications and services aiming at improving the end user’s Quality of Experience (QoE).

Figure: HEVC vs VVC speed comparison findings by Bitmovin

MPEG Systems continuously enhances Dynamic Adaptive Streaming over HTTP (DASH)

At the 133rd MPEG meeting, MPEG Systems promoted Part 8 of Dynamic Adaptive Streaming over HTTP (DASH) also referred to as “Session-based DASH” to its final stage of standardization (i.e., Final Draft International Standard (FDIS)).
Historically, in DASH, every client uses the same Media Presentation Description (MPD), as it best serves the scalability of the service. However, there have been increasing requests from the industry to enable customized manifests for enabling personalized services. MPEG Systems has standardized a solution to this problem without sacrificing scalability. Session-based DASH adds a mechanism to the MPD to refer to another document, called Session-based Description (SBD), which allows per-session information. The DASH client can use this information (i.e., variables and their values) provided in the SBD to derive the URLs for HTTP GET requests.
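The SBD mechanism can be illustrated roughly as follows. The template syntax, variable names, and helper here are hypothetical stand-ins for illustration, not the normative syntax defined in ISO/IEC 23009-8.

```python
from string import Template

# Hypothetical per-session values as they might be delivered in an SBD document.
sbd_values = {"sessionid": "f81d4fae", "region": "eu-west"}

# A segment URL template referencing SBD-provided variables
# (illustrative syntax only).
url_template = Template(
    "https://cdn.example.com/${region}/video/seg-$Number.m4s?sid=$sessionid"
)

def resolve(template: Template, number: int, session: dict) -> str:
    """Derive the final HTTP GET URL by substituting session variables
    plus the usual per-segment number."""
    return template.safe_substitute(session, Number=str(number))

url = resolve(url_template, 42, sbd_values)
print(url)  # → https://cdn.example.com/eu-west/video/seg-42.m4s?sid=f81d4fae
```

The point is that the MPD itself stays identical for every client (and thus cacheable), while the small SBD document carries the per-session differences.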
An updated overview of DASH standards/features can be found in the Figure below.

Figure: MPEG-DASH standard status as of January 2021

The next meeting will again be an online meeting in April 2021.
Click here for more information about MPEG meetings and their developments
A little lost about the formats and standards described above? Check out some other great educational content to learn more!

The post 133rd MPEG Meeting Takeaways: 6th Emmy® Award for MPEG Technology appeared first on Bitmovin.

132nd MPEG Meeting Takeaways: MPEG Continues to Progress – First Meeting with the New Structure https://bitmovin.com/blog/132nd-mpeg-meeting-takeaways/ Wed, 02 Dec 2020 16:09:39 +0000


Preface

Bitmovin isn’t the only organization whose sole purpose is to shape the future of video – a few senior developers at Bitmovin along with me are active members of the Moving Picture Experts Group (MPEG). Personally, I have been a member and attendant of MPEG for 15+ years and have been documenting the progress since early 2010. This year yielded countless changes for consumers, developers, academics, and organizations alike, with most business operations shifting to a digital-first orientation. MPEG was no different, with three of our four annual working sessions held strictly virtually. In addition to this general industry-wide shift, MPEG made further rearrangements to its structure to help streamline the standardization processes, especially since hosting a 400-person Zoom session could end up being more distracting than productive.
The 131st meeting was our first sample of the new process and yielded the ratification of the H.266 Versatile Video Coding (VVC) codec. Our latest session was even more productive, with VVC quality verification tests, geometry-based point cloud compression, LCEVC, and Omnidirectional Media Format methods moving to the final stages of standardization.

MPEG Continues to Progress – The 132nd Meeting is the First Under the New Structure

The 132nd MPEG meeting was the first meeting with the new structure as introduced previously. That is, ISO/IEC JTC 1/SC 29/WG 11 — the official name of MPEG under the ISO structure — was disbanded after the 131st MPEG meeting and some of the subgroups of WG 11 (MPEG) have been elevated to independent MPEG Working Groups (WGs) and Advisory Groups (AGs) of SC 29 rather than subgroups of the former WG 11. Thus, the MPEG community is now an affiliated group of WGs and AGs that will continue meeting together according to previous MPEG meeting practices and will further advance the standardization activities of the MPEG work program.
The 132nd MPEG meeting was the first meeting with the new structure as follows (incl. Convenors and position within WG 11 structure):

  • AG 2 MPEG Technical Coordination (Convenor: Prof. Jörn Ostermann; for overall MPEG work coordination and prev. known as the MPEG chairs meeting; it’s expected that one can also provide inputs to this AG without being a member of this AG),
  • WG 2 MPEG Technical Requirements (Convenor Dr. Igor Curcio; former Requirements subgroup),
  • WG 3 MPEG Systems (Convenor: Dr. Youngkwon Lim; former Systems subgroup),
  • WG 4 MPEG Video Coding (Convenor: Prof. Lu Yu; former Video subgroup),
  • WG 5 MPEG Joint Video Coding Team(s) with ITU-T SG 16 (Convenor: Prof. Jens-Rainer Ohm; former JVET),
  • WG 6 MPEG Audio Coding (Convenor: Dr. Schuyler Quackenbush; former Audio subgroup)
  • WG 7 MPEG Coding of 3D Graphics (Convenor: Prof. Marius Preda, former 3DG subgroup)
  • WG 8 MPEG Genome Coding (Convenor: Prof. Marco Mattavelli; newly established WG),
  • AG 3 MPEG Liaison and Communication (Convenor: Prof. Kyuheon Kim; former Communications subgroup), and
  • AG 5 MPEG Visual Quality Assessment (Convenor: Prof. Mathias Wien; former Test subgroup).

Figure: The new MPEG structure
The 132nd MPEG meeting was an online meeting with more than 300 participants who continued to work efficiently on standards for the future needs of the industry. As a group, MPEG started to explore new application areas that will benefit from standardized compression technology in the future. A new web site has been created and can be found at http://mpeg.org/.
The official press release can be found here and comprises the following items:

  • Versatile Video Coding (VVC) Ultra-HD Verification Test Completed and Conformance and Reference Software Standards Reach their First Milestone
  • MPEG Completes Geometry-based Point Cloud Compression Standard (G-PCC)
  • MPEG Evaluates Extensions and Improvements to MPEG-G and Announces a Call for Evidence on New Advanced Genomics Features and Technologies
  • MPEG Issues Draft Call for Proposals on the Coded Representation of Haptics
  • MPEG Evaluates Responses to MPEG IPR Smart Contracts CfP
  • MPEG Completes Standard on Harmonization of DASH and CMAF
  • MPEG Completes 2nd Edition of the Omnidirectional Media Format (OMAF)
  • MPEG Completes the Low Complexity Enhancement Video Coding (LCEVC) Standard

In this report, I’d like to focus on VVC, G-PCC, DASH/CMAF, OMAF, and LCEVC.

Versatile Video Coding (VVC) Ultra-HD Verification Test Completed and Conformance & Reference Software Standards Reach their First Milestone

MPEG completed a verification testing assessment of the recently ratified Versatile Video Coding (VVC) standard for ultra-high definition (UHD) content with standard dynamic range, so that it can be used in newer streaming and broadcast television applications. The verification test was performed using rigorous subjective quality assessment methods and showed that VVC provides a compelling gain over its predecessor — the High-Efficiency Video Coding (HEVC) standard. In particular, the verification test was performed using the VVC reference software implementation (VTM) and the recently released open-source encoder implementation of VVC (VVenC):

  • Using its reference software implementation (VTM), VVC showed bit rate savings of roughly 45% over HEVC for comparable subjective video quality.
  • Using VVenC, additional bit rate savings of more than 10% relative to VTM were observed, which at the same time runs significantly faster than the reference software implementation.

Additionally, the standardization work for both conformance testing and reference software for the VVC standard reached its first major milestone, i.e., progressing to the Committee Draft ballot in the ISO/IEC approval process. The conformance testing standard (ISO/IEC 23090-15) will ensure interoperability among the diverse applications that use the VVC standard, and the reference software standard (ISO/IEC 23090-16) will provide an illustration of the capabilities of VVC and a valuable example showing how the standard can be implemented. The reference software will further facilitate the adoption of the standard by being available for use as the basis of product implementations.
Bitmovin has always been among the early adopters of new video coding standards such as for HEVC and AV1. Thus, it is expected that we will also come up with early implementations for VVC in the context of ABR encoding, most likely in collaboration with our research partners such as the Alpen-Adria-Universität Klagenfurt (https://athena.itec.aau.at/).

MPEG Completes Geometry-based Point Cloud Compression Standard

MPEG promoted its ISO/IEC 23090-9 Geometry-based Point Cloud Compression (G‐PCC) standard to the Final Draft International Standard (FDIS) stage. G‐PCC addresses lossless and lossy coding of time-varying 3D point clouds with associated attributes such as colour and material properties. This technology is particularly suitable for sparse point clouds. ISO/IEC 23090-5 Video-based Point Cloud Compression (V‐PCC), which reached the FDIS stage in July 2020, addresses the same problem but for dense point clouds, by projecting the (typically dense) 3D point clouds onto planes, and then processing the resulting sequences of 2D images using video compression techniques. The generalized approach of G-PCC, where the 3D geometry is directly coded to exploit any redundancy in the point cloud itself, is complementary to V-PCC and particularly useful for sparse point clouds representing large environments.
Point clouds are typically represented by extremely large amounts of data, which is a significant barrier to mass-market applications. However, the relative ease of capturing and rendering spatial information compared to other volumetric video representations makes point clouds increasingly popular for displaying immersive volumetric data. The current draft reference software implementation of a lossless, intra-frame G‐PCC encoder provides a compression ratio of up to 10:1 and lossy coding of acceptable quality for a variety of applications with a ratio of up to 35:1.
By providing high immersion at currently available bit rates, the G‐PCC standard will enable various applications such as 3D mapping, indoor navigation, autonomous driving, advanced augmented reality (AR) with environmental mapping, and cultural heritage.
Bitmovin is actively conducting research and development with respect to the adaptive streaming of point cloud data in collaboration with our research partners such as the Alpen-Adria-Universität Klagenfurt. In this context, we have recently co-authored a paper on “From Capturing to Rendering: Volumetric Media Delivery With Six Degrees of Freedom” that has been published in the IEEE Communications Magazine.
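To give a feel for why geometry coding pays off, the sketch below quantizes a toy point cloud to a voxel grid and deduplicates the points. This is only the first, simplest step of what G-PCC does; the octree representation and entropy coding of occupied voxels that follow in the real codec are omitted here.

```python
import random

def voxelize(points, voxel_size):
    """Quantize 3D points to an integer voxel grid and deduplicate.
    Real G-PCC then entropy-codes the occupied voxels with an octree,
    which this sketch omits."""
    voxels = {tuple(int(c // voxel_size) for c in p) for p in points}
    return sorted(voxels)

random.seed(1)
# A dense cluster of points, e.g. from a noisy scan of a flat surface.
cloud = [(random.uniform(0, 1), random.uniform(0, 1), random.uniform(0, 0.01))
         for _ in range(5000)]

coarse = voxelize(cloud, voxel_size=0.05)
print(len(cloud), "->", len(coarse), "occupied voxels")
```

Many raw points collapse into far fewer occupied voxels, and the voxel coordinates themselves are small integers, which is what makes the subsequent entropy coding so effective.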

MPEG Finalizes the Harmonization of DASH and CMAF

MPEG successfully completed the harmonization of Dynamic Adaptive Streaming over HTTP (DASH) with Common Media Application Format (CMAF) featuring a DASH profile for the use with CMAF (as part of the 1st Amendment of ISO/IEC 23009-1:2019 4th edition).
CMAF and DASH segments are both based on the ISO Base Media File Format (ISOBMFF), which helps enable a smooth integration of both technologies. Most importantly, this DASH profile defines (a) a normative mapping of CMAF structures to DASH structures and (b) how to use Media Presentation Description (MPD) as a manifest format.
Additional tools added to this amendment include

  • DASH events and timed metadata track timing and processing models with in-band event streams,
  • a method for specifying the resynchronization points of segments when the segments have internal structures that allow container-level resynchronization,
  • an MPD patch framework that allows the transmission of partial MPD information as opposed to the complete MPD using the XML patch framework as defined in IETF RFC 5261, and
  • content protection enhancements for efficient signaling.
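The MPD patch idea, sending a small diff instead of a full manifest, can be sketched as follows. This is a toy reduction of the RFC 5261 machinery: real patches are XML documents carrying XPath selectors and add/replace/remove operations, which this sketch collapses into a simple path plus attribute update.

```python
import xml.etree.ElementTree as ET

# A stripped-down MPD (namespaces omitted for brevity).
mpd = ET.fromstring(
    '<MPD minimumUpdatePeriod="PT10S">'
    '<Period><SegmentTemplate media="seg-$Number$.m4s"/></Period>'
    '</MPD>'
)

def apply_patch(tree, sel, attribute, value):
    """Toy version of an RFC 5261 <replace> operation on an attribute.
    `sel` is a simple element path; "." targets the root element."""
    target = tree if sel == "." else tree.find(sel)
    target.set(attribute, value)
    return tree

# A patch document would normally arrive as a small XML diff over HTTP;
# here the single operation it would carry is applied directly.
apply_patch(mpd, ".", "minimumUpdatePeriod", "PT30S")
print(mpd.get("minimumUpdatePeriod"))  # → PT30S
```

For live services with frequently refreshed manifests, transmitting such small diffs instead of the complete MPD saves both bandwidth and parsing work on the client.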

It is expected that the 5th edition of the MPEG DASH standard (ISO/IEC 23009-1) containing this change will be issued at the 133rd MPEG meeting in January 2021. An overview of DASH standards/features can be found in the Figure below.
Figure: MPEG-DASH standard status as of the 132nd MPEG meeting
CMAF enables low latency DASH thanks to the introduction of chunks (in CMAF) and HTTP/1.1 chunked transfer encoding (CTE). Bitmovin’s R&D efforts with respect to low latency streaming resulted in ACTE (ABR for Chunked Transfer-Encoding), a bandwidth prediction scheme for low-latency chunked streaming.
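The core idea behind chunk-based bandwidth estimation such as ACTE can be sketched as below. The sliding-window harmonic mean here is my own simplified illustration; ACTE itself uses a recursive-least-squares predictor over per-chunk throughput samples.

```python
from collections import deque

class ChunkBandwidthEstimator:
    """Simplified idea behind chunk-based bandwidth estimation for
    low-latency CMAF: sample throughput per delivered chunk and keep a
    sliding-window harmonic mean, which is robust to short bursts."""

    def __init__(self, window: int = 10):
        self.samples = deque(maxlen=window)

    def on_chunk(self, bytes_received: int, seconds: float) -> None:
        # One throughput sample per delivered CMAF chunk, in bits per second.
        self.samples.append(8 * bytes_received / seconds)

    def estimate_bps(self) -> float:
        if not self.samples:
            return 0.0
        # Harmonic mean weights slow chunks more heavily than fast outliers.
        return len(self.samples) / sum(1.0 / s for s in self.samples)

est = ChunkBandwidthEstimator()
for size, dt in [(100_000, 0.2), (120_000, 0.25), (90_000, 0.18)]:
    est.on_chunk(size, dt)
print(f"{est.estimate_bps() / 1e6:.2f} Mbit/s")
```

An ABR controller would feed such an estimate into its next quality decision instead of the segment-level measurements that break down under chunked transfer encoding.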

MPEG Completes 2nd Edition of the Omnidirectional Media Format

MPEG completed the standardization of the 2nd edition of the Omnidirectional Media Format (OMAF) by promoting ISO/IEC 23090-2 to Final Draft International Standard (FDIS) status, including the following features:

  • “Late binding” technologies to deliver and present only that part of the content that adapts to the dynamically changing users’ viewpoint. To enable an efficient implementation of such a feature, this edition of the specification introduces the concept of bitstream rewriting, in which a compliant bitstream is dynamically generated that, by combining the received portions of the bitstream, covers only the users’ viewport on the client.
  • Extension of OMAF beyond 360-degree video. This edition introduces the concept of viewpoints, which can be considered as user-switchable camera positions for viewing content or as temporally contiguous parts of a storyline to provide multiple choices for the storyline a user can follow.
  • Enhanced use of video, image, or timed text overlays on top of omnidirectional visual background video or images related to a sphere or a viewport.

Bitmovin’s research efforts related to 360-degree video streaming resulted in the first prototypes of viewport-adaptive streaming, which received the DASH-IF Excellence in DASH Award.

MPEG Completes the Low Complexity Enhancement Video Coding Standard

MPEG is pleased to announce the completion of the new ISO/IEC 23094-2 standard, i.e., Low Complexity Enhancement Video Coding (MPEG-5 Part 2 LCEVC), which has been promoted to Final Draft International Standard (FDIS) at the 132nd MPEG meeting.

  • LCEVC adds an enhancement data stream that can appreciably improve the resolution and visual quality of reconstructed video, with effective compression efficiency at limited complexity, by building on top of existing and future video codecs.
  • LCEVC can be used to complement devices originally designed only for decoding the base layer bitstream, by using firmware, operating system, or browser support. It is designed to be compatible with existing video workflows (e.g., CDNs, metadata management, DRM/CA) and network protocols (e.g., HLS, DASH, CMAF) to facilitate the rapid deployment of enhanced video services.
  • LCEVC can be used to deliver higher video quality in limited bandwidth scenarios, especially when the available bit rate is low for high-resolution video delivery and decoding complexity is a challenge. Typical use cases include mobile streaming and social media, and services that benefit from high-density/low-power transcoding.

The next meeting will again be held online, in January 2021.
Click here for more information about MPEG meetings and their developments
Check out the following links for other great reads! 
A little lost about the formats and standards described above? Check out some other great educational content to learn more!

The post 132nd MPEG Meeting Takeaways: MPEG Continues to Progress – First Meeting with the New Structure appeared first on Bitmovin.

131st MPEG Meeting Takeaways – Future Codecs are now: VVC is finalized
https://bitmovin.com/blog/131st-mpeg-meeting-takeaways/ (Wed, 29 Jul 2020)

Preface

“Shaping the Future of Video” is not just a catchy slogan that we like to throw around; rather, it’s Bitmovin’s company vision. Not only do we keep a close eye on the industry trends, but we’re actively taking part in standardization activities to improve the quality of video technologies. I have been a member and attendant of the Moving Pictures Experts Group for 15+ years and have been documenting the progress since early 2010.

Virtual Meetings Continue – Yet Productivity Prevails

As the global pandemic rages on, most in-person events remain in virtual form, and this held true for the 131st MPEG meeting. However, as most home-office employees will tell you nowadays, productivity is at an all-time high, with multiple video and audio compression technologies taking major steps towards widespread application, despite certain controversies in the working group. In fact, MPEG was still able to make progress on an unprecedented number of developments, including, but not limited to: finalizing the VVC codec, promoting Video-based Point Cloud Compression to the FDIS stage, and further supporting HDR formats.

Future Compression Technology Moves Forward – VVC, Point Cloud Compression, and MPEG-5 Progress

Just in the middle of the SC29 (i.e., MPEG’s parent body within ISO) restructuring process, MPEG successfully ratified — jointly with ITU-T’s VCEG within JVET — its next-generation video codec among other interesting results from the 131st MPEG meeting:
Standards progressing to final approval ballot (FDIS)

  • MPEG Announces VVC – the Versatile Video Coding Standard
  • Point Cloud Compression – MPEG promotes a Video-based Point Cloud Compression Technology to the FDIS stage
  • MPEG-H 3D Audio – MPEG promotes Baseline Profile for 3D Audio to the final stage

Call for Proposals

  • Call for Proposals on Technologies for MPEG-21 Contracts to Smart Contracts Conversion
  • MPEG issues a Call for Proposals on extension and improvements to ISO/IEC 23092 standard series

Standards progressing to the first milestone of the ISO standard development process

  • Widening support for storage and delivery of MPEG-5 EVC
  • Multi-Image Application Format adds support of HDR
  • Carriage of Geometry-based Point Cloud Data progresses to Committee Draft
  • MPEG Immersive Video (MIV) progresses to Committee Draft
  • Neural Network Compression for Multimedia Applications – MPEG progresses to Committee Draft
  • MPEG issues Committee Draft of Conformance and Reference Software for Essential Video Coding (EVC)

The corresponding press release of the 131st MPEG meeting can be found here: https://mpeg-standards.com/meetings/mpeg-131/. This report focuses on video coding (featuring VVC) as well as PCC and systems aspects (file format, DASH).

MPEG Announces VVC – the Versatile Video Coding Standard

MPEG is pleased to announce the completion of the new Versatile Video Coding (VVC) standard at its 131st MPEG meeting. The document has been progressed to its final approval ballot as ISO/IEC 23090-3 and will also be known as H.266 in the ITU-T.
VVC is the latest in a series of very successful standards for video coding that have been jointly developed with ITU-T, and it is the direct successor to the well-known and widely used High-Efficiency Video Coding (HEVC) and Advanced Video Coding (AVC) standards. VVC provides a major benefit in compression over HEVC. Plans are underway to conduct a verification test with formal subjective testing to confirm that VVC achieves an estimated 50% bit rate reduction versus HEVC for equal subjective video quality. Test results have already demonstrated that VVC typically provides about a 40% bit rate reduction for 4K/UHD video sequences in tests using objective metrics (i.e., PSNR, VMAF, MS-SSIM). Application areas especially targeted for the use of VVC include

  • ultra-high-definition 4K and 8K video,
  • video with a high dynamic range and wide colour gamut, and
  • video for immersive media applications such as 360° omnidirectional video.

Furthermore, VVC is designed for a wide variety of types of video such as camera captured, computer-generated, and mixed content for screen sharing, adaptive streaming, game streaming, video with scrolling text, etc. Conventional standard-definition and high-definition video content are also supported with similar gains in compression. In addition to improving coding efficiency, VVC also provides highly flexible syntax supporting such use cases as (i) subpicture bitstream extraction, (ii) bitstream merging, (iii) temporal sub-layering, and (iv) layered coding scalability.
The current performance of VVC compared to HEVC-HM is shown in the figure below (taken from https://bit.ly/mpeg131), which confirms the statement above but also highlights the increased complexity. Please note that VTM9 is not optimized for speed but for functionality (i.e., compression efficiency).
[Figure: VVC quality chart (PSNR)]
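Bit rate savings such as the ~40% figure above are commonly reported as Bjøntegaard-delta (BD) rates computed over matched rate-distortion curves. A minimal BD-rate implementation, fitting a cubic polynomial to log-rate as a function of PSNR and comparing the average over the overlapping quality range, might look like this (a sketch of the standard method, not MPEG's reference tooling):

```python
import numpy as np

def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test):
    """Average bit rate difference (%) of the test codec vs. the reference.

    Each input is a list of rate-distortion points (bitrate, PSNR in dB).
    Negative result = the test codec needs less bitrate at equal quality.
    """
    log_ref, log_test = np.log(rates_ref), np.log(rates_test)
    # Fit log-rate as a cubic polynomial of PSNR for each codec.
    p_ref = np.polyfit(psnr_ref, log_ref, 3)
    p_test = np.polyfit(psnr_test, log_test, 3)
    # Integrate over the overlapping PSNR interval only.
    lo = max(min(psnr_ref), min(psnr_test))
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref, int_test = np.polyint(p_ref), np.polyint(p_test)
    avg_ref = (np.polyval(int_ref, hi) - np.polyval(int_ref, lo)) / (hi - lo)
    avg_test = (np.polyval(int_test, hi) - np.polyval(int_test, lo)) / (hi - lo)
    # Back from log domain to a percentage difference.
    return (np.exp(avg_test - avg_ref) - 1) * 100
```

For example, a codec that consistently needs half the bitrate of the reference at every quality level yields a BD-rate of -50%.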
MPEG also announces completion of ISO/IEC 23002-7 “Versatile supplemental enhancement information for coded video bitstreams” (VSEI), developed jointly with ITU-T as Rec. ITU-T H.274. The new VSEI standard specifies the syntax and semantics of video usability information (VUI) parameters and supplemental enhancement information (SEI) messages for use with coded video bitstreams. VSEI is especially intended for use with VVC, although it is drafted to be generic and flexible so that it may also be used with other types of coded video bitstreams. Once specified in VSEI, different video coding standards and systems-environment specifications can re-use the same SEI messages without the need for defining special-purpose data customized to the specific usage context.
At the same time, the Media Coding Industry Forum (MC-IF) announced the start of VVC patent pool fostering, with an initial meeting on September 1, 2020. The aim of this meeting is to identify tasks and to propose a schedule for VVC pool fostering, with the goal of selecting a pool facilitator/administrator by the end of 2020. MC-IF itself is not facilitating or administering a patent pool.
At the time of writing this blog post, it is probably too early to make an assessment of whether VVC will share the fate of HEVC or AVC (w.r.t. patent pooling). AVC is still the most widely used video codec but with AVC, HEVC, EVC, VVC, LCEVC, AV1, (AV2), and probably also AVS3 — did I miss anything? — the competition and pressure are certainly increasing. Bitmovin is committed to supporting novel video codecs within its product portfolio and specifically enabled efficient multi-codec support (view Bitmovin’s analysis here). We will further evaluate, optimize, and integrate novel video codecs as they become available in order to offer the best possible Quality of Experience (QoE) for our customers and their end-users.

MPEG promotes a Video-based Point Cloud Compression Technology to the FDIS stage

At the 131st MPEG meeting, the group promoted its Video-based Point Cloud Compression (V-PCC) standard to the Final Draft International Standard (FDIS) stage. V-PCC addresses lossless and lossy coding of 3D point clouds with associated attributes such as colors and reflectance. Point clouds are typically represented by extremely large amounts of data, which is a significant barrier for mass-market applications. However, the relative ease to capture and render spatial information as point clouds compared to other volumetric video representations makes point clouds increasingly popular to present immersive volumetric data. With the current V-PCC encoder implementation providing compression in the range of 100:1 to 300:1, a dynamic point cloud of one million points could be encoded at 8 Mbit/s with good perceptual quality. Real-time decoding and rendering of V-PCC bitstreams have also been demonstrated on current mobile hardware.
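A back-of-the-envelope check makes the quoted compression range plausible. Assuming a raw representation of 10-bit x/y/z geometry plus 8-bit RGB colour per point at 30 fps (these parameters are illustrative assumptions, not taken from the standard):

```python
# Assumed raw point cloud representation (not from the V-PCC standard).
points_per_frame = 1_000_000
bits_per_point = 3 * 10 + 3 * 8      # 10-bit geometry + 8-bit RGB = 54 bits
fps = 30

raw_mbit_per_s = points_per_frame * bits_per_point * fps / 1e6  # 1620 Mbit/s
compressed_mbit_per_s = 8.0          # bitrate quoted for V-PCC above
ratio = raw_mbit_per_s / compressed_mbit_per_s                  # ~200:1
```

Under these assumptions the raw stream is about 1620 Mbit/s, so encoding at 8 Mbit/s corresponds to roughly 200:1, squarely within the 100:1 to 300:1 range stated for the current V-PCC encoder.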
The V-PCC standard leverages video compression technologies and the video ecosystem in general (hardware acceleration, transmission services, and infrastructure) while enabling new kinds of applications. The V-PCC standard contains several profiles that leverage existing AVC and HEVC implementations, which may make them suitable to run on existing and emerging platforms. The standard is also extensible to upcoming video specifications such as Versatile Video Coding (VVC) and Essential Video Coding (EVC).
The V-PCC standard is based on Visual Volumetric Video-based Coding (V3C), which is expected to be re-used by other MPEG-I volumetric codecs under development. MPEG is also developing a standard for the carriage of V-PCC and V3C data (ISO/IEC 23090-10) which has been promoted to DIS status at the 130th MPEG meeting.
By providing high-level immersion at currently available bandwidths, the V-PCC standard is expected to enable several types of applications and services such as six Degrees of Freedom (6 DoF) immersive media, virtual reality (VR) / augmented reality (AR), immersive real-time communication and cultural heritage.
Bitmovin collaborates with the Alpen-Adria-Universität Klagenfurt in the context of the ATHENA project where the paper on the “Objective and Subjective QoE Evaluation for Adaptive Point Cloud Streaming” won the best paper award at QoMEX 2020. This work was a collaboration between Ghent University, Alpen-Adria-Universität Klagenfurt, Bitmovin, Ozyegin University, Networked Media, and AIT Austrian Institute of Technology.

MPEG Systems related News

Finally, I’d like to share news related to MPEG systems and the carriage of video data as depicted in the figure below (taken from https://bit.ly/mpeg131). In particular, the carriage of VVC (and also EVC) has now been enabled in MPEG-2 Systems (specifically within the transport stream) and in the various file formats (specifically within the NAL unit file format). The latter is also used in CMAF and DASH, which makes VVC (and also EVC) ready for HTTP adaptive streaming (HAS).
[Figure: carriage of VVC across MPEG systems standards]

What about DASH and CMAF?

CMAF maintains the so-called technology under consideration document which contains — among other things — a proposed VVC CMAF profile. Additionally, there are two exploration activities related to CMAF, i.e., (i) multi-stream support and (ii) storage, archiving, and content management for CMAF files.
DASH is working on potential improvements for the first amendment to ISO/IEC 23009-1 4th edition related to CMAF support, the event processing model, and other extensions. Additionally, there is a working draft for a second amendment to ISO/IEC 23009-1 4th edition enabling a bandwidth change signaling track and other enhancements. Furthermore, ISO/IEC 23009-8, session-based DASH operations, has been advanced to Draft International Standard (see also the last report).
An overview of the current status of MPEG-DASH can be found in the figure below.
[Figure: MPEG-DASH status overview]
The next meeting will again be held online, in October 2020.
Finally, MPEG organized a Webinar presenting results from the 131st MPEG meeting. The slides and video recordings are available here: https://bit.ly/mpeg131.

To see these formats tested against existing Per-Title Encoding products, by industry expert Jan Ozer, Check out the following Whitepaper: Choosing the Best Per-Title Encoding Technology

The post 131st MPEG Meeting Takeaways – Future Codecs are now: VVC is finalized appeared first on Bitmovin.
