The Bitmovin Innovators Network “Better Together” Award Winners!
https://bitmovin.com/blog/bitmovin-innovators-network-winners/ | May 14, 2024

The dust has now settled from NAB, and I am still looking back in awe at the success of the Bitmovin Innovators Network and the community that we’ve built together. A personal highlight for me was our exclusive semi-annual Bitmovin Innovators Network Partner Executive Networking Event, which drew over 100 attendees to learn and network. The event included several customer success stories, including Quickplay presenting a “Better Together” customer success story about a large Regional Sports Network (RSN), as well as a fireside chat with OneFootball and Akamai.

We concluded the event with our first annual Bitmovin Innovators Network partner awards, recognizing and celebrating the amazing work of partners who embrace the fact that the industry is “Better Together” by creating joint solutions designed to simplify customers’ video workflows and advance the viewing experience for audiences.

I am incredibly proud to share the winners of the Bitmovin Innovators Network partner awards below, and the contributions they’ve made: 

Accenture – Global Systems Integrator of the Year:

Accenture and Bitmovin exemplify the “Better Together” approach through their close strategic partnership, including an ongoing collaboration with the world’s largest motorsports content owners that has led to joint engagements with several of the largest sports and media brands in the world.

Broadpeak – Global ISV Partner of the Year:

Broadpeak embodies the “Better Together” spirit through its unwavering strategic collaboration with Bitmovin. This powerful partnership has yielded several key benefits. Together, the two companies have developed solutions that integrate with Bitmovin’s encoder, player, and analytics, improving workflows for customers; established consistent two-way communication between sales teams, resulting in successful deals with European media brands; and run joint marketing and PR initiatives at local events to strengthen their shared brand presence.

MediaKind – Global Service Provider Partner of the Year:

MediaKind and Bitmovin have developed and maintained a robust strategic partnership that has launched sports applications for world-renowned sports leagues. These applications, including an Apple Vision Pro app for a sports league that garnered rave reviews at the Apple launch event, have significantly boosted market visibility for both brands.

Microsoft Azure Marketplace – Cloud Marketplace of the Year:

Bitmovin has had unprecedented success with the Microsoft Azure Marketplace, including more than 200 new customer wins since June 2023. Azure Marketplace has quickly become Bitmovin’s largest and most successful sales channel.

Nomad Media – Americas Regional Channel Partner of the Year:

Nomad Media has deployed over 30 customers on the Bitmovin Play platform in 2023 alone as part of its Nomad Media platform. Nomad Media has also innovated on the player capabilities with dynamic multi-view capabilities. These advancements were showcased to major US clients, propelling both companies forward. This collaboration not only built a strong pipeline but also significantly boosted brand recognition in the US market.

G&L Geißendörfer & Leschinsky – EMEA Regional Channel Partner of the Year:

G&L is a proactive and committed industry partner that has worked with Bitmovin on successful sales and marketing initiatives. The collaboration between the two companies resulted in joint revenue and a new logo, and G&L also exhibited on the Bitmovin stand at IBC 2023, where it highlighted how the two companies’ solutions work together. Bitmovin and G&L also hosted a joint CMCD webinar, which attracted attendees from key German broadcasters and various telecoms and content providers, and G&L recently published an e-commerce case study with Home Shopping Europe.

Viet Communications – APAC Regional Channel Partner of the Year:

Vietcoms was the first licensee for the Bitmovin Player in the Asia Pacific region. Vietcoms was selected for its hard work in securing our impressive player business in Vietnam and for developing agile operational models to meet specific customer and telco business needs and technical requirements.

Once again, I’d like to give huge congratulations to all the winners. A huge thank you to everyone who attended the Bitmovin Innovators Network Partner Executive Networking Event, and to every single one of our partners who continue to embrace the spirit of “Better Together.” IBC is just around the corner, and we will have some exciting initiatives and announcements coming soon to share with you ahead of the show.

The Essential Guide to SCTE-35
https://bitmovin.com/blog/scte-35-guide/ | January 20, 2024

Everything you need to know about SCTE-35, the popular event signaling standard that powers dynamic ad insertion, digital program insertion, blackouts and more for TV, live streams and on-demand video.

What is SCTE?

The acronym SCTE is short for the Society of Cable Telecommunications Engineers. SCTE is a non-profit professional organization that creates technical standards and educational resources for the advancement of cable telecommunications engineering and the wider video industry. When talking about it, you may hear people pronounce SCTE with the shorthand slang “Scutty”.

SCTE was founded in 1969 as the Society of Cable Television Engineers, but changed its name in 1995 to reflect a broader scope as fiber optics and high-speed data applications began playing a bigger role in the cable TV industry and became the responsibility of its engineers. Currently there are over 19,000 individual SCTE members and nearly 300 technical standards in their catalog, including SCTE-35, which is the focus of this post.

What is SCTE-35?

SCTE-35 was first published in 2001 and is the core signaling standard for advertising and program control for content providers and content distributors. It was initially titled “Digital Program Insertion Cueing Message for Cable” but recent revisions have dropped “for Cable” as it has proven useful and versatile enough to be extended to OTT workflows and streaming applications. There have been several revisions and updates published to incorporate member feedback and adapt to advancements in the industry, most recently on November 30, 2023.

SCTE-35 signals are used to identify national and local ad breaks as well as program content like intro/outro credits, chapters, blackouts, and extensions when a live program like a sporting event runs long. Initially, these messages were embedded as cue tones that dedicated cable TV hardware or equipment could pick up and enable downstream systems to act on. For modern streaming applications, they are usually included within an MPEG-2 transport stream PID and then converted into metadata that is embedded in HLS and MPEG-DASH manifests.

SCTE-35 markers and their applications for streaming video

While SCTE-35 markers are primarily used for ad insertion in OTT workflows, they can also signal many other events that allow an automation system to tailor the program output for compliance with local restrictions or to improve the viewing experience. Let’s take a look at some common use cases and benefits of using SCTE-35 markers.

Use cases and benefits of SCTE-35

Ad Insertion – As mentioned above, inserting advertisements into a video stream is the main use case for SCTE-35 markers. They provide seamless splice points for national, local and individually targeted dynamic ad replacement. This allows for increased monetization opportunities for broadcasters and content providers by enabling segmenting of viewers into specific demographics and geographic locations. When ad content can be tailored for a particular audience, advertisers are willing to pay more, leading to higher revenue for content providers and distributors. 

[Figure: Ad break example. Source: SCTE-35 specification]

Program boundary markers – Another common use case is to signal a variety of program boundaries. This includes the start and end of programs, chapters, ad breaks and unexpected interruptions or extensions. Many of these become particularly useful in Live-to-VOD scenarios. Ad break start/end markers can be used as edit points in a post-production workflow to automate the removal of ads for viewers with ad-free subscriptions. A program end marker can be used to trigger the next episode in a series for binge viewing sessions if so desired. All of these markers open new possibilities for improving the user experience and keeping your audience happy and engaged.  

Blackouts and alternate content – Another less common, but important use case is to signal blackouts, when a piece of content should be replaced or omitted from a broadcast. This often applies to regional blackouts for sporting events. Respecting blackout restrictions is crucial for avoiding fines and loss of access to future events. Using SCTE-35 allows your automation system to take control and ensure you are compliant.

[Figure: Workflow example with program boundaries and blackouts. Source: SCTE-35 specification]

Types of SCTE-35 markers

SCTE-35 markers are delivered in-band, meaning they are embedded or interleaved with the audio and video signals. There are five different command types defined in the specification. The first three are legacy commands: splice_null(), splice_schedule() and splice_insert(), though splice_insert() is still used quite often. The bandwidth_reservation() command may be needed in some satellite transmissions, but the most commonly used command in modern workflows is time_signal(). Let’s take a closer look at the two most important command types, splice_insert and time_signal.

splice_insert commands

Splice_insert commands are used to mark splice events: points where a new piece of content like an ad should be inserted in place of the program, or where playback switches from an ad break back into the main program. Presentation time stamps note the exact timing of the splice, enabling seamless, frame-accurate switching.

time_signal commands

Time_signal commands can also be used to insert new content at a splice point, but together with segmentation descriptors, they can handle other use cases like the program boundary markers mentioned above. This enables the segmenting and labeling of content sections for use by downstream systems.

Using SCTE-35 markers in streaming workflows

MPEG-2 transport streams

In MPEG-2 transport streams, SCTE markers are carried in-band on their own PID within the transport stream mux. These streams are usually used as contribution or backhaul feeds and in most cases are not directly played by the consumer. They may be delivered over dedicated satellite or fiber paths or via the public internet through the use of streaming protocols like SRT or proprietary solutions like Zixi.

HLS 

The Bitmovin Live Encoder supports a range of different HLS tags that are written when SCTE-35 triggers are parsed from the MPEG-TS input stream. Multiple marker types can be enabled for each HLS manifest. Which marker types to use depends on the consumer of the HLS manifest. An example consumer would be a Server Side Ad Insertion (SSAI) service. They usually state in their documentation which HLS tags they support for signaling SCTE-35 triggers.

  • EXT_X_CUE_OUT_IN: Ad markers will be inserted using #EXT-X-CUE-OUT and #EXT-X-CUE-IN tags.
  • EXT_OATCLS_SCTE35: Ad markers will be inserted using #EXT-OATCLS-SCTE35 tags. They contain the base64 encoded raw bytes of the original SCTE-35 trigger.
  • EXT_X_SPLICEPOINT_SCTE35: Ad markers will be inserted using #EXT-X-SPLICEPOINT-SCTE35 tags. They contain the base64 encoded raw bytes of the original SCTE-35 trigger.
  • EXT_X_SCTE35: Ad markers will be inserted using #EXT-X-SCTE35 tags. They contain the base64 encoded raw bytes of the original SCTE-35 trigger.
  • EXT_X_DATERANGE: Ad markers will be inserted using #EXT-X-DATERANGE tags as specified in the HLS specification. They contain the ID, start timestamp, and hex-encoded raw bytes of the original SCTE-35 trigger.

Example HLS manifest with Cue Out, duration and Cue In tags:

#EXTINF:4.0,
2021-07/video/hls/360/seg_18188.ts
#EXT-X-CUE-OUT:120.000
…
#EXTINF:4.0,
2021-07/video/hls/360/seg_18218.ts
#EXT-X-CUE-IN

Example HLS manifest using EXT-OATCLS-SCTE35 tag with base64 encoded marker:

#EXTINF:4.0,
2021-07/video/hls/360/seg_18190.ts
#EXT-OATCLS-SCTE35:/DBcAAAAAAAAAP/wBQb//ciI8QBGAh1DVUVJXQk9EX+fAQ5FUDAxODAzODQwMDY2NiEEZAIZQ1VFSV0JPRF/3wABLit7AQVDMTQ2NDABAQEKQ1VFSQCAMTUwKnPhdcU=

Note: You can copy the base64 encoded marker above (beginning with the first / after SCTE35:) and paste it into this payload parser to see the full message structure.
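If you just need the command type rather than the full structure, a few lines of TypeScript are enough. The sketch below is our own illustration (not Bitmovin or SCTE reference code) and assumes an unencrypted splice_info_section, where the splice_command_type byte sits 13 bytes into the section:

const SPLICE_COMMANDS: Record<number, string> = {
  0x00: "splice_null", 0x04: "splice_schedule", 0x05: "splice_insert",
  0x06: "time_signal", 0x07: "bandwidth_reservation", 0xff: "private_command",
};

function spliceCommandType(base64Payload: string): string {
  const bytes = Uint8Array.from(atob(base64Payload), (c) => c.charCodeAt(0));
  if (bytes[0] !== 0xfc) throw new Error("Not an SCTE-35 splice_info_section");
  // 13 bytes precede splice_command_type: table_id (1), section header (2),
  // protocol_version (1), encryption flags + pts_adjustment (5), cw_index (1),
  // tier + splice_command_length (3)
  return SPLICE_COMMANDS[bytes[13]] ?? "unknown";
}

Running it on the marker above returns "time_signal" (command type 0x06).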

MPEG-DASH 

In MPEG-DASH streams, SCTE-35 defined breaks and segments are added as new periods to the .mpd file.

<MPD>
  <Period start="PT0S" id="1">
    <!-- Content Period -->
  </Period>

  <Period start="PT32S" id="2">
    <!-- Ad Break Period -->
    <EventStream timescale="90000"
      schemeIdUri="urn:scte:scte35:2014:xml+bin">
      <Event duration="2520000" id="1">
        <Signal xmlns="urn:scte:scte35:2013:xml">
          <Binary>/DAlAAAAAAAAAP/wFAUAAAAEf+/+kybGyP4BSvaQAAEBAQAArky/3g==</Binary>
        </Signal>
      </Event>
    </EventStream>
  </Period>

  <Period start="PT60S" id="3">
    <!-- Content Period -->
  </Period>
</MPD>

With SCTE messages embedded in the stream, various forms of automation can be triggered, whether it’s server or client-side ad insertion, content switching, interactive elements in the application or post-production processing.
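To illustrate how a downstream component might consume these, here is a small TypeScript sketch (our own example, not part of any Bitmovin SDK) that pulls the base64 SCTE-35 payloads out of an MPD using the browser’s DOMParser:

function scte35PayloadsFromMpd(mpdXml: string): string[] {
  const doc = new DOMParser().parseFromString(mpdXml, "application/xml");
  const payloads: string[] = [];
  for (const stream of Array.from(doc.querySelectorAll("EventStream"))) {
    // Only the SCTE-35 xml+bin scheme shown in the example above
    if (stream.getAttribute("schemeIdUri") !== "urn:scte:scte35:2014:xml+bin") continue;
    for (const binary of Array.from(stream.querySelectorAll("Binary"))) {
      payloads.push((binary.textContent ?? "").trim());
    }
  }
  return payloads;
}

Each returned payload can then be decoded (for example, with the payload parser or the sketch shown earlier) to decide whether to trigger ad insertion, content switching, or other automation.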

Bitmovin Live Encoding SCTE Support

SCTE message pass-through and processing 

Bitmovin supports parsing SCTE-35 triggers from MPEG-TS input streams for Live Encodings; the triggers are treated as splice decisions, described below, and then mapped to HLS manifest tags.

Splice Decisions

Certain SCTE-35 triggers signal that an advertisement or break (away from, and back to, the original content) starts or ends. The following table describes how the Bitmovin Live Encoder treats SCTE-35 trigger types and SCTE-35 Segmentation Descriptor types as splice decision points, and the compatibility of those types with the two command types, Splice Insert and Time Signal.

✓= Supported

✖ = Not currently supported

Segmentation UPID Type (Start/End) | Descriptor Type Name              | SPLICE_INSERT | TIME_SIGNAL
–                                  | –                                 | ✓             | ✖
0x10, 0x11                         | PROGRAM                           | ✖             | ✓
0x20, 0x21                         | CHAPTER                           | ✖             | ✓
0x22, 0x23                         | BREAK                             | ✓             | ✓
0x30, 0x31                         | PROVIDER_ADVERTISEMENT            | ✓             | ✓
0x32, 0x33                         | DISTRIBUTOR_ADVERTISEMENT         | ✓             | ✓
0x34, 0x35                         | PROVIDER_PLACEMENT_OPPORTUNITY    | ✓             | ✓
0x36, 0x37                         | DISTRIBUTOR_PLACEMENT_OPPORTUNITY | ✓             | ✓
0x40, 0x41                         | UNSCHEDULED_EVENT                 | ✖             | ✓
0x42, 0x43                         | ALTERNATE_CONTENT_OPPORTUNITY     | ✖             | ✓
0x44, 0x45                         | PROVIDER_AD_BLOCK                 | ✖             | ✓
0x46, 0x47                         | DISTRIBUTOR_AD_BLOCK              | ✖             | ✓
0x50, 0x51                         | NETWORK                           | ✖             | ✓

Live cue point insertion API

In addition to the SCTE-35 pass-through mode, Bitmovin customers can insert new ad break cue points in real-time, using live controls in the user dashboard or via API. These can be inserted independently of existing SCTE-35 markers in the input stream and may be useful for live events when the time between ads is variable depending on breaks in the action. This allows streamers that don’t have SCTE-35 markers embedded in their source to take advantage of the same downstream ad insertion systems for increased monetization.

API Call:

POST /encoding/encodings/{encoding_id}/live/scte-35-cue
{
  "cueDuration": 60  // duration in seconds between cue tags (ad break length)
}
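For illustration, calling this endpoint from TypeScript might look like the sketch below; the base URL and X-Api-Key header follow Bitmovin’s public REST API conventions, and the IDs are placeholders:

const apiKey = "<YOUR_BITMOVIN_API_KEY>"; // placeholder
const encodingId = "<YOUR_ENCODING_ID>";  // placeholder

const response = await fetch(
  `https://api.bitmovin.com/v1/encoding/encodings/${encodingId}/live/scte-35-cue`,
  {
    method: "POST",
    headers: { "X-Api-Key": apiKey, "Content-Type": "application/json" },
    body: JSON.stringify({ cueDuration: 60 }), // 60-second ad break, as above
  },
);
if (!response.ok) throw new Error(`Cue insertion failed: ${response.status}`);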

The #EXT-X-CUE-OUT tag will be inserted into the HLS playlist, signaling the start and duration of a placement opportunity to the DAI provider. Based on the cueDuration and the segment length, the #EXT-X-CUE-IN tag will be inserted after the configured duration and the ad opportunity will end, continuing the live stream.

HLS manifest with Cue Out, duration and Cue In tags inserted via the API call above:

#EXTINF:4.0,
    2021-07-09-13-18-34/video/hls/360_500/segment_18188.ts
    #EXT-X-CUE-OUT:60.000
    #EXTINF:4.0,
    2021-07-09-13-18-34/video/hls/360_500/segment_18189.ts
    ...
    #EXTINF:4.0,
    2021-07-09-13-18-34/video/hls/360_500/segment_18203.ts
    #EXT-X-CUE-IN
    #EXTINF:4.0,
    2021-07-09-13-18-34/video/hls/360_500/segment_18204.ts


Want to get started using SCTE-35 in your streaming workflow? Get in touch to let us know how we can help.

Resources

Tutorial: Bitmovin Live Encoding with SCTE-35, HLS and SSAI

Guide: Bitmovin Live Encoding and AWS MediaTailor for SSAI

Guide: Bitmovin Live Encoding with Broadpeak.io for SSAI

SCTE website 

SCTE-35 specification

SCTE-35 payload parser

Bitmovin Live Encoding data sheet

144th MPEG Meeting Takeaways: Understanding Quality Impacts of Learning-based Codecs and Enhancing Green Metadata
https://bitmovin.com/blog/144th-mpeg-meeting-takeaways/ | January 7, 2024


Preface

Bitmovin has been “Shaping the Future of Video” for over 10 years now, and in addition to our own innovations, we’ve been actively taking part in standardization activities to improve the quality of video technologies for the wider industry. I have been a member and attendee of the Moving Picture Experts Group for 15+ years and have been documenting its progress since early 2010. Recently, we’ve been working on several new initiatives, including the use of learning-based codecs and enhanced support for more energy-efficient media consumption.

The 144th MPEG meeting highlights

The 144th MPEG meeting was held in Hannover, Germany! For those interested, the press release with all the details is available. It’s always great to see and hear about progress being made in person.

[Figure: Attendees of the 144th MPEG meeting in Hannover, Germany.]

The main outcome of this meeting is as follows:

  • MPEG issues Call for Learning-Based Video Codecs for Study of Quality Assessment
  • MPEG evaluates Call for Proposals on Feature Compression for Video Coding for Machines
  • MPEG progresses ISOBMFF-related Standards for the Carriage of Network Abstraction Layer Video Data
  • MPEG enhances the Support of Energy-Efficient Media Consumption
  • MPEG ratifies the Support of Temporal Scalability for Geometry-based Point Cloud Compression
  • MPEG reaches the First Milestone for the Interchange of 3D Graphics Formats
  • MPEG announces Completion of Coding of Genomic Annotations

This post will focus on MPEG Systems-related standards and visual quality assessment. As usual, the column will end with an update on MPEG-DASH.

Visual Quality Assessment

MPEG does not create standards in the visual quality assessment domain. However, it conducts visual quality assessments for its standards during various stages of the standardization process. For instance, it evaluates responses to calls for proposals, conducts verification tests of its final standards, and so on.

MPEG Visual Quality Assessment (AG 5) issued an open call to study quality assessment for learning-based video codecs. AG 5 has been conducting subjective quality evaluations for coded video content and studying their correlation with objective quality metrics. Most of these studies have focused on the High Efficiency Video Coding (HEVC) and Versatile Video Coding (VVC) standards. To facilitate the study of visual quality, MPEG maintains the Compressed Video for the study of Quality Metrics (CVQM) dataset.

With the recent advancements in learning-based video compression algorithms, MPEG is now studying compression using these codecs. It is expected that reconstructed videos compressed using learning-based codecs will have different types of distortion compared to those induced by traditional block-based motion-compensated video coding designs. To gain a deeper understanding of these distortions and their impact on visual quality, MPEG has issued a public call related to learning-based video codecs. MPEG is open to inputs in response to the call and will invite responses that meet the call’s requirements to submit compressed bitstreams for further study of their subjective quality and potential inclusion into the CVQM dataset.

Considering the rapid advancements in the development of learning-based video compression algorithms, MPEG will keep this call open and anticipates future updates to the call.

Interested parties are kindly requested to contact the MPEG AG 5 Convenor Mathias Wien (wien@lfb.rwth-aachen.de) and submit responses for review at the 145th MPEG meeting in January 2024. Further details are given in the call, issued as AG 5 document N 104 and available from the mpeg.org website.

Learning-based data compression (e.g., for image, audio, video content) is a hot research topic. Research on this topic relies on datasets offering a set of common test sequences, sometimes also common test conditions, that are publicly available and allow for comparison across different schemes. MPEG’s Compressed Video for the study of Quality Metrics (CVQM) dataset is such a dataset, available here, and ready to be used also by researchers and scientists outside of MPEG. The call mentioned above is open for everyone inside/outside of MPEG and allows researchers to participate in international standards efforts (note: to attend meetings, one must become a delegate of a national body).

Bitmovin and the ATHENA research lab have been working together on ML-based enhancements to boost visual quality and improve QoE. You can read more about our published research in this blog post.


At the 144th MPEG meeting, MPEG Systems (WG 3) produced three newsworthy items as follows:

  • Progression of ISOBMFF-related standards for the carriage of Network Abstraction Layer (NAL) video data.
  • Enhancement of the support of energy-efficient media consumption.
  • Support of temporal scalability for geometry-based Point Cloud Compression (G-PCC).

ISO/IEC 14496-15, a part of the family of ISOBMFF-related standards, defines the carriage of Network Abstraction Layer (NAL) unit structured video data such as Advanced Video Coding (AVC), High Efficiency Video Coding (HEVC), Versatile Video Coding (VVC), Essential Video Coding (EVC), and Low Complexity Enhancement Video Coding (LCEVC). This standard has been further improved with the approval of the Final Draft Amendment (FDAM), which adds support for enhanced features such as Picture-in-Picture (PiP) use cases enabled by VVC.

In addition to the improvements made to ISO/IEC 14496-15, separately developed amendments have been consolidated in the 7th edition of the standard. This edition has been promoted to Final Draft International Standard (FDIS), marking the final milestone of the formal standard development.

Another important standard in development is the 2nd edition of ISO/IEC 14496-32 (file format reference software and conformance). This standard, currently at the Committee Draft (CD) stage of development, is planned to be completed and reach the status of Final Draft International Standard (FDIS) by the beginning of 2025. This standard will be essential for industry professionals who require a reliable and standardized method of verifying the conformance of their implementation.

MPEG Systems (WG 3) also promoted ISO/IEC 23001-11 (energy-efficient media consumption (green metadata)) Amendment 1 to Final Draft Amendment (FDAM). This amendment introduces energy-efficient media consumption (green metadata) for Essential Video Coding (EVC) and defines metadata that enables a reduction in decoder power consumption. At the same time, ISO/IEC 23001-11 Amendment 2 has been promoted to the Committee Draft Amendment (CDAM) stage of development. This amendment introduces a novel way to carry metadata about display power reduction encoded as a video elementary stream interleaved with the video it describes. The amendment is expected to be completed and reach the status of Final Draft Amendment (FDAM) by the beginning of 2025.

Finally, MPEG Systems (WG 3) promoted ISO/IEC 23090-18 (carriage of geometry-based point cloud compression data) Amendment 1 to Final Draft Amendment (FDAM). This amendment enables the compression of a single elementary stream of point cloud data using ISO/IEC 23090-9 (geometry-based point cloud compression) and storing it in more than one track of ISO Base Media File Format (ISOBMFF)-based files. This enables support for applications that require multiple frame rates within a single file and introduces a track grouping mechanism to indicate multiple tracks carrying a specific temporal layer of a single elementary stream separately.

MPEG Systems usually provides standards on top of existing compression standards, enabling efficient storage and delivery of media data (among others). Researchers may use these standards (including reference software and conformance bitstreams) to conduct research in the general area of multimedia systems (cf. ACM MMSys) or, specifically on green multimedia systems (cf. ACM GMSys).

Enhancements to green metadata are welcome and necessary additions to the toolkit for everyone working on reducing the carbon footprint of video streaming workflows. Bitmovin and the GAIA project have been conducting focused research in this area for over a year now and, through testing, benchmarking and developing new methods, hope to significantly improve our industry’s environmental sustainability. You can read more about our progress in this report.

MPEG-DASH Updates

The current status of MPEG-DASH is shown in the figure below with only minor updates compared to the last meeting.

[Figure: MPEG-DASH status, October 2023.]

In particular, the 6th edition of MPEG-DASH is scheduled for 2024 but may not include all amendments under development. An overview of existing amendments can be found in the blog post from the last meeting. Current amendments have been (slightly) updated and progressed toward completion in the upcoming meetings. The signaling of haptics in DASH has been discussed and accepted for inclusion in the Technologies under Consideration (TuC) document. The TuC document comprises candidate technologies for possible future amendments to the MPEG-DASH standard and is publicly available here.

MPEG-DASH has been heavily researched in the multimedia systems, quality, and communications research communities. Adding haptics to MPEG-DASH would provide another dimension worth considering within research, including, but not limited to, performance aspects and Quality of Experience (QoE).


The 145th MPEG meeting will be online from January 22-26, 2024. Click here for more information about MPEG meetings and their developments.


Want to learn more about the latest research from the ATHENA lab and its potential applications? Check out this post summarizing the projects from the first cohort of finishing PhD candidates.


Notes and highlights from previous MPEG meetings can be found here.

Unlocking the Highest Quality of Experience with Common-Media-Client-Data (CMCD) – What Is It and What Are the Benefits
https://bitmovin.com/blog/cmcd-video-streaming-optimization/ | September 14, 2023


As video workflows get more detailed, companies face numerous challenges in delivering a seamless viewing experience to their audiences. One of the biggest hurdles is making sense of disjointed sets of information from different points in the video delivery workflow. When a client experiences buffering or other playback issues, it can be difficult to pinpoint the root cause. Do you rack your brain wondering if it’s a problem with the manifest, the client’s Adaptive Bitrate (ABR) algorithm, or the Content Delivery Network (CDN)? To create a clearer picture for streaming platforms and the CDNs delivering the content, this is where Common-Media-Client-Data (CMCD) comes into play.

What is CMCD and Why is it Important?

CMCD is an open specification and tool developed by the Web Application Video Ecosystem (WAVE) project launched by the Consumer Technology Association (CTA). Its focus is to allow media players to communicate data back to CDNs during video streaming sessions. It provides a standardized protocol for exchanging information between the client and the CDN, bridging the gap between client-side quality of experience (QOE) metrics and server-side quality of service (QOS) data. By providing the transmission of this detailed data and information, CMCD-enabled video streaming services can facilitate better troubleshooting, optimization, and dynamic delivery adjustments by CDNs.

With CMCD, media clients can send key-value pairs of data to CDNs, providing valuable insights into the streaming session. This data includes information such as encoded bitrate, buffer length, content ID, measured throughput, session ID, playback rate, and more. By capturing and analyzing this data, CDNs can gain a deeper understanding of the client’s streaming experience and make informed decisions to improve performance and address any issues.

What data is tracked and how is data sent and processed with CMCD?

The data points for CMCD are thorough, giving you the detailed metrics you need to verify your viewer’s experience along with how to optimize it. The metrics include:

  • Encoded bitrate
  • Buffer length
  • Buffer starvation
  • Content ID
  • Object duration
  • Deadline
  • Measured throughput
  • Next object request
  • Next range request
  • Object type
  • Playback rate
  • Requested maximum throughput
  • Streaming format
  • Session ID
  • Stream type
  • Startup
  • Top bitrate

There are three common methods for sending CMCD data from the client to the CDN: custom HTTP request headers, HTTP query arguments, or JSON objects independent of the HTTP request. The choice of method depends on the player’s capabilities and the CDN’s processing requirements, and could also differ by platform. In browsers, HTTP query arguments are preferred over HTTP request headers, as custom headers would trigger additional CORS preflight (OPTIONS) requests to check whether the CDN allows those headers, adding extra round-trip time. Other platforms like Android don’t have this limitation.

It is recommended to sequence the key-value pairs in alphabetical order to reduce the fingerprinting surface exposed by the player. Additionally, including a session ID (sid) and content ID (cid) with each request can aid in parsing and filtering through CDN logs for specific session and content combinations.
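As a rough illustration of those conventions, the sketch below (our own simplified example; the key names come from the CMCD specification, the values are made up) serializes a payload as a query argument with alphabetized keys:

function cmcdQuery(data: Record<string, string | number | boolean>): string {
  const pairs = Object.keys(data)
    .sort() // alphabetical order reduces the fingerprinting surface
    .map((key) => {
      const value = data[key];
      if (value === true) return key; // boolean true is sent as a bare key
      return typeof value === "string" ? `${key}="${value}"` : `${key}=${value}`;
    });
  return "CMCD=" + encodeURIComponent(pairs.join(","));
}

// Appended to a segment URL, this yields something like:
// seg_42.m4s?CMCD=bl%3D11300%2Cbr%3D3200%2Ccid%3D%22ep-42%22%2Csid%3D%22f3a1%22
const query = cmcdQuery({ br: 3200, bl: 11300, sid: "f3a1", cid: "ep-42" });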

The Role of CMCD in Video Streaming Optimization

CMCD plays a crucial role in optimizing video streaming by enabling comprehensive data analysis and real-time adjustments. Combining client-side data with CDN logs, CMCD allows for the correlation of metrics and the identification of issues that affect streaming performance. This holistic view empowers CDNs to take proactive measures to address buffering, playback stalls, or other quality issues.

With CMCD, CDNs can segment data based on Live and Video on Demand (VOD) content, monitor CDN performance, identify specific subscriber sessions, and track the journey of media objects from the CDN to the player and screen. This level of insight enables CDNs to optimize content delivery, manage bandwidth allocation, and ensure a smooth and consistent streaming experience for viewers.

Adoption of CMCD in the Industry

[Figure: Akamai and Bitmovin CMCD workflow]

The adoption and implementation of CMCD in video workflows are still developing. Many in the video streaming industry are evaluating it at the moment but haven’t made significant moves. However, there are notable players in the market who have taken the lead in incorporating CMCD into their platforms. One such example is Akamai, a prominent CDN provider. Akamai has been actively working on CMCD in collaboration with the Bitmovin Player.

Live Demo

Together, Akamai and Bitmovin have developed a demo presenting the capabilities and benefits of CMCD. The demo shows how CMCD data can be sent by the Bitmovin Player to the CDN.

What are the benefits of CMCD and how can it be implemented on the Bitmovin Player?

As listed above, there are clear benefits to implementing CMCD for video playback. Some of the benefits of CMCD that can be achieved with the Bitmovin player are: 

  • Troubleshooting errors and finding root causes faster
    • CMCD makes Player sessions visible in CDN logs so you can trace error sessions through the Player and CDN to quickly find the root cause, reducing the cost associated with users experiencing errors on your platform.
  • Combine Playback sessions and CDN logs with common session & content identifiers 
    • Improve your operational monitoring by giving a clearer view of content requests from Player and how those are handled by the CDN.
  • Improve the quality of experience and reduce rebuffering by enabling pre-fetching 
    • Through CMCD, the CDN is aware of the Player’s current state and the content it most likely needs next. This allows the CDN to prepare and deliver the next packet the Player needs faster, reducing the time your viewers are waiting.
  • Integration with Bitmovin’s Analytics
    • Monitor every single user session and gain granular data on audience, quality, and ad metrics that ensure a high quality of experience for viewers while helping you pinpoint error sessions rapidly with CMCD data.

As Bitmovin continues to explore CMCD’s capabilities, we’ve made it easy to set up and deploy into video workflows through our GitHub. If you’re wondering how it works or want to see it in action before implementing it, check out our Bitmovin Web Player Samples.
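As a rough sketch of what such an integration can look like, the snippet below appends a CMCD query argument to outgoing requests via the player’s network preprocessing hook; the hook is part of the Bitmovin Web Player API, but the hand-rolled CMCD string and IDs are placeholders, so treat this as illustrative rather than our reference integration:

const sessionId = "f3a1";  // placeholder: generate one per playback session
const contentId = "ep-42"; // placeholder: your content identifier

const playerConfig = {
  key: "<YOUR-PLAYER-KEY>", // placeholder
  network: {
    // Append a CMCD query argument to every outgoing request
    preprocessHttpRequest: async (_type: string, request: { url: string }) => {
      const cmcd = encodeURIComponent(`cid="${contentId}",sid="${sessionId}"`);
      request.url += (request.url.includes("?") ? "&" : "?") + "CMCD=" + cmcd;
      return request;
    },
  },
};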


Additionally, if you have any questions or feedback on your experience using it, join our Bitmovin Developer community and comment on the running dialog around our CMCD implementation.

Future Implications and Industry Outlook

While CMCD is still in its early stages of adoption, its potential impact on the video streaming industry is significant. As more companies embrace CMCD, the ability to gather and analyze comprehensive data will become standard practice and its benefits will become increasingly evident. This data-driven approach will enable continuous improvements in streaming performance and video workflows. This was a major reason we at Bitmovin took this project on: transparency is key, and CMCD makes issues easier to find and address, increasing viewer and client satisfaction.

Interest in CMCD will continue to grow with new implementations and use cases, leading the industry to realize the gains from reduced buffering and better streams for viewers. Our partnership with Akamai is just one step in our commitment to advancing video streaming technology for content providers and to providing a seamless viewing experience for audiences worldwide.

Completing the WebRTC Playback Experience – Enabling Rewind During Real-Time Live Streams
https://bitmovin.com/blog/webrtc-rewinding-real-time-streams/ | September 12, 2023


Live streaming has solidified its role as a pivotal component of modern video workflows, enabling platforms and media companies to captivate audiences with that sense of witnessing events as they happen. This trend has gained even greater momentum during and after the pandemic, as users craved live experiences that spanned a variety of interests – from sports enthusiasts catching their favorite games to at-home yoga classes on fitness platforms or students enrolling in online courses. To meet this demand, users sought the closest thing to real-time immersion, and this is where WebRTC came into play (pun intended).

What is WebRTC?

WebRTC is an open-source streaming technology that enables real-time data transport, whether that’s video, audio, or other data channels. Initially developed by Google in 2011, it has found widespread adoption in various industries that benefit from real-time communication, such as video conferencing, education, and gaming. The open-source, peer-to-peer technology allows for end-to-end encryption over an HTTPS connection and is compatible with all major browsers and platforms. The use of WebRTC skyrocketed through the use of video conferencing tools that have continued to grow in popularity since the pandemic, as well as in online gaming, where thousands of avid viewers could engage with their favorite content at near real-time latency.

Where does WebRTC fit in the OTT streaming industry?

The OTT streaming industry is currently dominated by two streaming protocols: DASH and HLS. However, DASH and HLS are not ideal for achieving low-latency live streaming. Typically, viewers experience between 8 and 30 seconds of latency due to the need to download segments before playback, meaning the closer to the live edge, the more potential for issues with video buffers and ABR (adaptive bitrate) decisions.

WebRTC takes streaming services a step further by enabling real-time (sub-second latency) streaming experiences. For live events, such as sports or education, it allows services to provide an opportunity for interactivity and contribution without fear of introducing latency in data transfer. Unlike DASH and HLS, WebRTC does not buffer; it prioritizes low latency so viewers can be assured that what they are seeing is happening in real-time.

To summarize the key benefits of WebRTC:

  • Ultra-Low Latency – WebRTC enables sub-second playback, ideal for live events, online gaming, and other interactive applications.
  • Cross-Platform Compatibility – WebRTC is supported by all major web browsers and platforms, ensuring broad compatibility and ease of adoption.
  • End-to-end Encryption –  WebRTC incorporates robust security features, including end-to-end encryption, which ensures the privacy and security of communications.
  • Open Source – WebRTC benefits from a growing developer community that collaborates and innovates to bring continuous improvement to the technology.

However, WebRTC’s benefits come with a drawback: the inability to rewind or start the event from the beginning. This limitation affects many industries and their applications that require content review or replay, particularly in sports, where users want the ability to review and relive key moments.

What industries does WebRTC affect with this issue?

As streaming technology evolves and viewer expectations shift, low and real-time latency become more important, along with the ability to go back and see what they saw a few seconds before. This major playback feature affects many of the industries where real-time streaming is already crucial to the viewer experience, including:

  • Sports Broadcasting and Betting – Viewers often want to rewatch critical moments, goals, or plays during a live event, which can also affect micro-betting and in-game wagering. 
  • Live selling and auctions – Buyers may want to check what was said about the product or previous items that were listed, requiring the need to browse back through the stream.
  • Webinars and Conferences – Webinars and virtual conferences may involve important presentations and discussions that can’t be revisited.
  • Gaming – Fans like to watch gameplay, or players can strategize by rewinding and analyzing previous actions.
  • Live Events and Performances – Live events, such as concerts or theater performances, need to provide instant replays of key moments or highlights.
  • Online Education – Students may need to rewind and review parts of a lecture or lesson for better understanding.
  • Emergency Services and Video Surveillance – Being able to analyze real-time video footage is crucial for making informed decisions and investigations.
  • Telemedicine – Medical professionals may need to go back to previous portions of a patient’s session to make accurate diagnoses or treatment recommendations.

This list highlights the importance of considering the specific requirements of an application when choosing a streaming technology. To address the replay/rewind issue, Bitmovin and Dolby.io collaborated to build a solution to enable these industries and use cases to dramatically improve the playback experience their viewers want and demand.

How we developed it – Dolby.io x Bitmovin Hackathon Project

During Bitmovin’s quarterly Hackathon in August 2023, Bitmovin engineers partnered with the team at Dolby.io to achieve the following objective:

Create a single live video player experience with real-time streaming and full rewind/review capabilities.

What tools did we use?

Bitmovin’s Player enables countless viewers to experience top-quality playback on all devices across the globe. With its rich feature set, streaming services can deliver their unique experience without compromising on quality.  

Bitmovin’s Live Encoder is a resilient live streaming software platform that takes RTMP, SRT, or Zixi inputs and outputs HLS and DASH for delivery to digital streaming services, paired with the Bitmovin CDN for delivery and storage.

Dolby.io’s Real-time Streaming (formerly Millicast) delivers a WebRTC-based CDN for large-scale streaming that is fast, easy, and reliable for delivering real-time video.

Videon EdgeCaster EZ Encoder is a portable appliance that brings cloud functionality on premises with LiveEdge. In this way, it combines the flexibility of software encoders with the power and reliability of hardware solutions. Regular software updates ensure support for the most advanced features and the latest industry standards.

What did we do?

[Figure: Workflow diagram showing the source journey from Videon EdgeCaster, to Dolby.io & Bitmovin Live Encoder, to Bitmovin Player]

Using a Videon Edgecaster to create a dual RTMP output of a live source input, one RTMP output was delivered to Dolby.io’s service to create a real-time WebRTC stream, while the other was delivered to Bitmovin’s Live Encoder to create a standard Live HLS stream.

Dolby.io’s Real-time Streaming service accepts SRT, RTMP, and WHIP/WebRTC, making it easy to convert broadcast-grade streams into WebRTC for sub-second distribution around the globe and at scale.

The stream URLs from both Dolby.io and Bitmovin Live Encoder are now available to the demo page hosting the Bitmovin Player. From here, the player can load either the Dolby.io stream as a WHEP/WebRTC source or the Bitmovin Live Encoder stream as a Live HLS source.

The Bitmovin Player’s open-source UI framework and extensive developer-friendly APIs allow development teams to create unique experiences. So, for the viewer experience, when the user selects the ‘LIVE’ control in the player UI and moves playback to the live edge, they would be viewing the WHEP/WebRTC source from Dolby.io. The user could then drag the timeline marker backward or use the custom “skip” control configured to timeshift back 30 seconds, in which case they would be viewing the live HLS source from the Bitmovin Live Encoder.

This gives the viewer the option to view their content in real-time with full review capability right back to the beginning of the live session. Additionally, by using Dolby.io’s Simulcasting solution, the viewer experience is always at the highest available quality, with advanced ABR logic working for both sources.
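In simplified form, the switching logic looked something like the sketch below. The load and timeShift calls are standard Bitmovin Player APIs, but the WHEP source shape, URLs, and UI wiring are assumptions for illustration, since WebRTC is not an official player source type:

const hlsSource = { hls: "https://cdn.example.com/live/stream.m3u8" };      // placeholder URL
const whepSource = { whep: "https://example.dolby.io/whep/stream" } as any; // hypothetical source shape

// offsetSeconds is 0 at the live edge, negative when the viewer scrubs back
async function onTimelineChange(player: any, offsetSeconds: number) {
  if (offsetSeconds === 0) {
    await player.load(whepSource); // live edge: real-time WebRTC from Dolby.io
  } else {
    await player.load(hlsSource);    // rewinding: timeshift within the HLS DVR window
    player.timeShift(offsetSeconds); // e.g. -30 for the custom "skip back" control
  }
}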

[Figure: Example of how playback on the Bitmovin Player works with Dolby.io]

What’s Next?

At Bitmovin, we are currently evaluating official support for WebRTC in the Bitmovin Player. While we’ve been able to address key playback issues, there is room for improvement and clear steps to elaborate on this very successful skunk-works project with Dolby.io. For example, we did not extend the project to use accurate timing information from the segments (like `prft` boxes) or playlists, so the solution could be more accurate and adaptive in understanding where the live edge of the live HLS stream was in comparison to the live encoding time to correctly synchronize with the real-time WebRTC stream. Using the Bitmovin Live Encoder, we could also extend the solution to include live-to-VOD workflows to allow users to watch the replay of a live event after it has ended or even reuse the content while a live event is still running.

Bitmovin and Dolby.io will continue the alliance to address market needs for live workflows where real-time streaming can provide an opportunity for services to enhance their viewers’ experience.

143rd MPEG Meeting Takeaways: Green metadata support added to VVC for improved energy efficiency
https://bitmovin.com/blog/143nd-mpeg-meeting-takeaways/ | August 22, 2023


Preface

Bitmovin is a proud member of and contributor to several organizations working to shape the future of video, including the Moving Picture Experts Group (MPEG), where I, along with a few senior developers at Bitmovin, am an active member. Personally, I have been a member and attendee of MPEG for 20+ years and have been documenting its progress since early 2010. Today, we’re working hard to further improve the capabilities and energy efficiency of the industry’s newest standards, such as VVC, while maintaining and modernizing older codecs like HEVC and AVC to take advantage of advancements in neural network post-processing.

The 143rd MPEG Meeting Highlights

The official press release of the 143rd MPEG meeting can be found here and comprises the following items:

  • MPEG finalizes the Carriage of Uncompressed Video and Images in ISOBMFF
  • MPEG reaches the First Milestone for two ISOBMFF Enhancements
  • MPEG ratifies Third Editions of VVC and VSEI
  • MPEG reaches the First Milestone of AVC (11th Edition) and HEVC Amendment
  • MPEG Genomic Coding extended to support Joint Structured Storage and Transport of Sequencing Data, Annotation Data, and Metadata
  • MPEG completes Reference Software and Conformance for Geometry-based Point Cloud Compression

In this report, I’d like to focus on ISOBMFF and video codecs and, as always, I will conclude with an update on MPEG-DASH.

ISOBMFF Enhancements

The ISO Base Media File Format (ISOBMFF) supports the carriage of a wide range of media data such as video, audio, point clouds, haptics, etc., which has now been further extended to uncompressed video and images.

ISO/IEC 23001-17 – Carriage of uncompressed video and images in ISOBMFF – specifies how uncompressed 2D image and video data is carried in files that comply with the ISOBMFF family of standards. This encompasses a range of data types, including monochromatic and colour data, transparency (alpha) information, and depth information. The standard enables the industry to effectively exchange uncompressed video and image data while utilizing all additional information provided by the ISOBMFF, such as timing, color space, and sample aspect ratio for interoperable interpretation and/or display of uncompressed video and image data.

ISO/IEC 14496-15, formerly known as MP4 file format (and based on ISOBMFF), provides the basis for “network abstraction layer (NAL) unit structured video coding formats” such as AVC, HEVC, and VVC. The current version is the 6th edition, which has been amended to support neural-network post-filter supplemental enhancement information (SEI) messages. This amendment defines the carriage of the neural-network post-filter characteristics (NNPFC) SEI messages and the neural-network post-filter activation (NNPFA) SEI messages to enable the delivery of (i) a base post-processing filter and (ii) a series of neural network updates synchronized with the input video pictures/frames.

Bitmovin has supported ISOBMFF in our encoding pipeline and API from day 1 and will continue to do so. For more details and information about container file formats, check out this blog.

Video Codec Enhancements

MPEG finalized the specifications of the third editions of the Versatile Video Coding (VVC, ISO/IEC 23090-3) and the Versatile Supplemental Enhancement Information (VSEI, ISO/IEC 23002-7) standards. Additionally, MPEG issued the Committee Draft (CD) text of the eleventh edition of the Advanced Video Coding (AVC, ISO/IEC 14496-10) standard and the Committee Draft Amendment (CDAM) text on top of the High Efficiency Video Coding standard (HEVC, ISO/IEC 23008-2).

These SEI messages include two systems-related SEI messages, (a) one for signaling of green metadata as specified in ISO/IEC 23001-11 and (b) the other for signaling of an alternative video decoding interface for immersive media as specified in ISO/IEC 23090-13. Furthermore, the neural network post-filter characteristics SEI message and the neural-network post-processing filter activation SEI message have been added to AVC, HEVC, and VVC.

The two SEI messages for describing and activating post-filters using neural network technology in video bitstreams could, for example, be used for reducing coding noise, spatial and temporal upsampling (i.e., super-resolution and frame interpolation), color improvement, or general denoising of the decoder output. The description of the neural network architecture itself is based on MPEG’s neural network representation standard (ISO/IEC 15938-17). As results from an exploration experiment have shown, neural network-based post-filters can deliver better results than conventional filtering methods. Processes for invoking these new post-filters have already been tested in a software framework and will be made available in an upcoming version of the VVC reference software (ISO/IEC 23090-16).

Bitmovin and our partner ATHENA research lab have been exploring several applications of neural networks to improve the quality of experience for video streaming services. You can read the summaries with links to full publications in this blog post.

The latest MPEG-DASH Update

The current status of MPEG-DASH is depicted in the figure below:

[Figure: MPEG-DASH status updates from the 143rd MPEG meeting (07/23)]

The latest edition of MPEG-DASH is the 5th edition (ISO/IEC 23009-1:2022) which is publicly/freely available here. There are currently three amendments under development:

  • ISO/IEC 23009-1:2022 Amendment 1: Preroll, nonlinear playback, and other extensions. This amendment has been ratified already and is currently being integrated into the 5th edition of part 1 of the MPEG-DASH specification.
  • ISO/IEC 23009-1:2022 Amendment 2: EDRAP streaming and other extensions. EDRAP stands for Extended Dependent Random Access Point and at this meeting the Draft Amendment (DAM) has been approved. EDRAP increases the coding efficiency for random access and has been adopted within VVC.
  • ISO/IEC 23009-1:2022 Amendment 3: Segment sequences for random access and switching. This amendment is at Committee Draft Amendment (CDAM) stage, the first milestone of the formal standardization process. This amendment aims at improving tune-in time for low latency streaming.

Additionally, MPEG Technologies under Consideration (TuC) comprises a few new work items, such as content selection and adaptation logic based on device orientation and signaling of haptics data within DASH.

Finally, part 9 of MPEG-DASH — redundant encoding and packaging for segmented live media (REAP) — has been promoted to Draft International Standard (DIS). It is expected to be finalized in the upcoming MPEG meetings.

Bitmovin recently announced its new Player Web X, which was reimagined and built from the ground up with structured concurrency. You can read more about it and why structured concurrency matters in this recent blog series.

The next meeting will be held in Hannover, Germany, from October 16-20, 2023. Further details can be found here.

Click here for more information about MPEG meetings and their developments.

Are you currently using the ISOBMFF or CMAF as a container format for fragmented MP4 files? Do you prefer hard-parted fMP4 or single-file MP4 with byte-range addressing? Vote in our poll and check out the Bitmovin Community to learn more. 


The post 143rd MPEG Meeting Takeaways: Green metadata support added to VVC for improved energy efficiency appeared first on Bitmovin.

Everything you need to know about Apple’s new Managed Media Source
https://bitmovin.com/blog/managed-media-source/ (Tue, 20 Jun 2023)

At their 2023 Worldwide Developer conference, Apple announced a new Managed Media Source API. This post will explain the new functionality and improvements over prior methods that will enable more efficient video streaming and longer battery life for iOS devices. Keep reading to learn more.

Background and the “old” MSE

The first internet videos of the early 2000s were powered by plugins like Flash and QuickTime, separate software that needed to be installed and maintained in addition to the web browser. In 2010, HTML5 was introduced, with its <video> tag that made it possible to embed video without plugins. This was a much simpler and more flexible approach to adding video to websites, but had some limitations. Apple’s HTTP Live Streaming (HLS) made adaptive streaming possible, but developers wanted more control and flexibility than native HLS offered, like the ability to select media or play DRM-protected content. In 2013, the Media Source Extensions (MSE) specification was published by the W3C, providing a low-level toolkit that gave developers more control over buffering and resolution for adaptive streaming. MSE was quickly adopted by all major browsers and is now the most widely used web video technology…except on iPhones. MSE has some inefficiencies that lead to greater power use than native HLS, and Apple’s testing found that adding MSE support would have meant reducing battery life, so all the benefits of MSE have been unavailable on iPhone…until now.

New Managed Media Source in Safari 17

With MSE, it can be difficult to achieve the same quality of playback possible with HLS, especially with lower power devices and spotty network conditions. This is partly because MSE transfers most control over the streaming of media data from the User Agent to the application running in the page. But the page doesn’t have the same level of knowledge or even goals as the User Agent, and may request media data at any time, often at the expense of higher power usage. To address those drawbacks and combine the flexibility provided by MSE with the efficiency of HLS, Apple created a new Managed Media Source API (MMS).

Advantages of Managed Media Source over MSE: lower power usage, better memory handling, less buffer management, access to 5G connectivity, and you are still in control. (Image source: WWDC23 presentation)

The new “managed” MediaSource gives the browser more control over the MediaSource and its associated objects. It makes it easier to support streaming media playback on mobile devices, while allowing User Agents to react to changes in memory usage and networking capabilities. MMS can reduce power usage by telling the webpage when it’s a good time to load more media data from the network. When nothing is requested, the cellular modem can go into a low power state for longer periods of time, increasing battery life. When the system gets into a low memory state, MMS may clear out buffered data as needed to reduce memory consumption and keep operations of the system and the app stable. MMS also tracks when buffering should start and stop, so the browser can detect low buffer and full buffer states for you. Using MMS will save your viewers bandwidth and battery life, allowing them to enjoy your videos for even longer. 

AirPlay with MMS

One of the great things about native HLS support in Safari is the automatic support for AirPlay that lets viewers stream video from their phone to compatible Smart TVs and set-top boxes. AirPlay requires a URL that you can send, but no such URL exists in MSE, making the two incompatible. Now with MMS, you can add an HLS playlist to a child source element of the video, and when the user AirPlays your content, Safari will switch away from your Managed Media Source and play the HLS stream on the AirPlay device. It’s a slick way to get the best of both worlds, as sketched below.

Code snippet for adding AirPlay support with Managed Media Source. (Image source: WWDC23 presentation)
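
As a rough sketch of that pattern: the playlist URL below is a placeholder, and attaching the source via srcObject follows Apple’s published examples at the time of writing, so treat the details as subject to change while MMS is experimental:

const video = document.querySelector("video");
const mediaSource = new ManagedMediaSource();

// Attach the managed source for regular MSE-style playback.
video.srcObject = mediaSource;

// Provide an HLS playlist as a child <source> element. When the user
// starts AirPlay, Safari switches from the Managed Media Source to this
// HLS stream on the remote device.
const airplaySource = document.createElement("source");
airplaySource.type = "application/x-mpegURL";
airplaySource.src = "https://cdn.example.com/stream.m3u8"; // placeholder URL
video.appendChild(airplaySource);

// Remote Playback must remain enabled, otherwise no AirPlay target is offered.
video.disableRemotePlayback = false;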

Migration from MSE to MMS

The Managed Media Source API is designed to be backwards compatible. The first step is therefore to check whether the API is available and create a ManagedMediaSource object instead of a plain MediaSource object:

function getMediaSource() {
    // Prefer the new managed variant when the browser provides it.
    if (window.ManagedMediaSource) {
        return new window.ManagedMediaSource();
    }
    // Fall back to classic MSE.
    if (window.MediaSource) {
        return new window.MediaSource();
    }
    throw new Error("No MediaSource API available");
}
const mediaSource = getMediaSource();
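
From there, wiring the source to a video element follows the familiar MSE flow. A minimal sketch, assuming a placeholder codec string and segment URL:

const video = document.querySelector("video");

mediaSource.addEventListener("sourceopen", async () => {
    // The codec string is an example; use the one matching your content.
    const sourceBuffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.4d401f, mp4a.40.2"');
    const response = await fetch("/segments/init.mp4"); // placeholder URL
    sourceBuffer.appendBuffer(await response.arrayBuffer());
}, { once: true });

// Apple's examples attach a ManagedMediaSource via srcObject; a blob URL
// created with URL.createObjectURL() is the classic MSE way.
if (window.ManagedMediaSource && mediaSource instanceof window.ManagedMediaSource) {
    video.srcObject = mediaSource;
} else {
    video.src = URL.createObjectURL(mediaSource);
}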

As MMS supports all the methods the “old” MSE does, this is enough to get you started, but it doesn’t unleash the full power of the new API. For that, you need to handle several new events:

mediaSource.addEventListener("startstreaming", onStartStreamingHandler);

The startstreaming event indicates that more media data should now be loaded from the network.

mediaSource.addEventListener("endstreaming", onStopStreamingHandler);

The endstreaming event is the counterpart of startstreaming and signals that, for now, no more media data should be requested from the network. This status can also be checked via the streaming attribute on the MMS instance. On devices like iPhone (once fully available) and iPad, request scheduling that follows these two hints benefits from the fast 5G network and allows the device to drop into a low power mode between request batches.
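
Together, the two events can gate a simple loading loop. In this minimal sketch, loadNextSegment() is a hypothetical helper standing in for your fetch-and-append logic:

mediaSource.addEventListener("startstreaming", pump);

async function pump() {
    // Keep requesting media data only while the UA wants us to stream.
    // Once "endstreaming" fires, mediaSource.streaming flips to false, the
    // loop stops, and the cellular modem can enter a low power state.
    while (mediaSource.streaming) {
        await loadNextSegment(); // hypothetical: fetch + appendBuffer one segment
    }
}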

In addition, the current implementation also offers a hint about the preferred, suggested quality to download. The browser suggests whether a high, medium, or low quality should be requested. The user agent may base this on facts like network speed, but also on additional details like user settings, such as an enabled data saver mode. The hint can be read from the MMS instance’s quality property, and any change is signaled via a qualitychange event:

mediaSource.addEventListener("qualitychange", onQualityChangeHandler);

It remains to be seen whether the quality hint will still be available in the future, as it carries some risk of fingerprinting.
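
A handler might translate the hint into a cap for the player’s adaptation logic. In this sketch, abr stands in for whatever ABR controller your player uses, and the bitrate values are purely illustrative:

function onQualityChangeHandler() {
    // The hint is "high", "medium", or "low" in the current proposal.
    switch (mediaSource.quality) {
        case "low":
            abr.maxBitrate = 1500000; // e.g., a data saver mode is enabled
            break;
        case "medium":
            abr.maxBitrate = 4000000;
            break;
        default:
            abr.maxBitrate = Infinity; // no cap, let the ABR logic decide
    }
}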

As the MMS may remove any buffered data range at any given time (as opposed to MSE’s behavior, where this could only happen during the process of appending data), it is strongly recommended to check whether the data needed next is still present or needs to be re-downloaded.
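
A simple guard before scheduling the next append could look like the following sketch, where segmentStart, segmentEnd, and redownloadSegment() are hypothetical placeholders for your player’s state:

function isBuffered(sourceBuffer, from, to) {
    // TimeRanges has no built-in lookup, so scan the ranges manually.
    const ranges = sourceBuffer.buffered;
    for (let i = 0; i < ranges.length; i++) {
        if (ranges.start(i) <= from && to <= ranges.end(i)) {
            return true;
        }
    }
    return false;
}

// Before appending the next segment, verify that the previous one has not
// been evicted by the user agent in the meantime.
if (!isBuffered(sourceBuffer, segmentStart, segmentEnd)) {
    redownloadSegment(segmentStart, segmentEnd); // hypothetical helper
}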

Next Steps

Managed Media Source is already available in the current Safari Tech Preview on macOS and Safari 17 on iPadOS 17 beta and can be enabled as an experimental feature on iOS 17 beta. Once generally available on iOS, without being an experimental feature, this will finally bring lots of flexibility and choices to Safari, other browsers, and Apps with WebViews on iOS. It would even be possible to finally support DASH streams on iOS, while keeping web apps power efficient. 

Apple has already submitted the proposal of the Managed Media Source API to the World Wide Web Consortium (W3C), which is under discussion and might lead to an open standard other browser vendors could adopt.

Bitmovin will be running technical evaluations to fully explore and understand the benefits of MMS, including how it performs in various real-world environments. We will closely follow the progress from Apple and consider introducing support for MMS into our Web Player SDK once it advances from being an experimental feature on iOS. Stay tuned!

If you’re interested in more detail, you can watch the replay of the media formats section from WWDC23 here and read the release notes for Safari 17 beta and iOS & iPadOS 17 beta. You can also check out our MSE demo code and our blog about developing a video player with structured concurrency.

142nd MPEG Meeting Takeaways: MPEG issues Call for Proposals for Feature Coding for Machines
https://bitmovin.com/blog/142nd-mpeg-meeting-takeaways/ (Wed, 24 May 2023)

Preface

Bitmovin is a proud member and contributor to several organizations working to shape the future of video, none for longer than the Moving Picture Experts Group (MPEG), where I and a few senior developers at Bitmovin are active members. Personally, I have been a member and attendant of MPEG for 20+ years and have been documenting the progress since early 2010. Today, we’re working hard to further improve the capabilities and efficiency of the industry’s newest standards, while exploring the potential applications of machine learning and neural networks.

The 142nd MPEG Meeting – MPEG issues Call for Proposals for Feature Coding for Machines

The official press release of the 142nd MPEG meeting can be found here and comprises the following items:

  • MPEG issues Call for Proposals for Feature Coding for Machines
  • MPEG finalizes the 9th Edition of MPEG-2 Systems
  • MPEG reaches the First Milestone for Storage and Delivery of Haptics Data
  • MPEG completes 2nd Edition of Neural Network Coding (NNC)
  • MPEG completes Verification Test Report and Conformance and Reference Software for MPEG Immersive Video
  • MPEG finalizes work on metadata-based MPEG-D DRC Loudness Leveling

In this report, I’d like to focus on Feature Coding for Machines, MPEG-2 Systems, Haptics, Neural Network Coding (NNC), MPEG Immersive Video, and a brief update about DASH (as usual).

Feature Coding for Machines

At the 142nd MPEG meeting, MPEG Technical Requirements (WG 2) issued a Call for Proposals (CfP) for technologies and solutions enabling efficient feature compression for video coding for machine vision tasks. This work on “Feature Coding for Video Coding for Machines (FCVCM)” aims at compressing intermediate features within neural networks for machine tasks. As applications for neural networks become more prevalent and the neural networks increase in complexity, use cases such as computational offload become more relevant to facilitate widespread deployment of applications utilizing such networks. Initially as part of the “Video Coding for Machines” activity, over the last four years, MPEG has investigated potential technologies for efficient compression of feature data encountered within neural networks. This activity has resulted in establishing a set of ‘feature anchors’ that demonstrate the achievable performance for compressing feature data using state-of-the-art standardized technology. These feature anchors include tasks performed on four datasets.

9th Edition of MPEG-2 Systems

MPEG-2 Systems was first standardized in 1994, defining two container formats: program stream (e.g., used for DVDs) and transport stream. The latter, also known as MPEG-2 Transport Stream (M2TS), is used for broadcast and internet TV applications and services. MPEG-2 Systems was awarded a Technology and Engineering Emmy® in 2013, and at the 142nd MPEG meeting, MPEG Systems (WG 3) ratified the 9th edition of ISO/IEC 13818-1 MPEG-2 Systems. The new edition includes support for Low Complexity Enhancement Video Coding (LCEVC), the youngest member of the MPEG family of video coding standards, on top of the more than 50 media stream types already supported, including, but not limited to, 3D Audio and Versatile Video Coding (VVC). The new edition also supports new options for signaling different kinds of media, which can aid the selection of the best audio or other media tracks for specific purposes or user preferences. As an example, it can indicate that a media track provides information about a current emergency.

Storage and Delivery of Haptics Data

At the 142nd MPEG meeting, MPEG Systems (WG 3) reached the first milestone for ISO/IEC 23090-32, entitled “Carriage of haptics data”, by promoting the text to Committee Draft (CD) status. This specification enables the storage and delivery of haptics data (defined by ISO/IEC 23090-31) in the ISO Base Media File Format (ISOBMFF; ISO/IEC 14496-12). Since haptics data is composed of spatial and temporal components, a data unit containing various spatial or temporal data packets is used as the basic entity, analogous to an access unit of audio-visual media. Additionally, an explicit indication of silent periods has been introduced in this draft, reflecting the sparse nature of haptics data. The standard is planned to be completed, i.e., to reach the status of Final Draft International Standard (FDIS), by the end of 2024.

Neural Network Coding (NNC)

Many applications of artificial neural networks for multimedia analysis and processing (e.g., visual and acoustic classification, extraction of multimedia descriptors, or image and video coding) utilize edge-based content processing or federated training. The trained neural networks for these applications contain many parameters (weights), resulting in a considerable size. Therefore, the MPEG standard for the compressed representation of neural networks for multimedia content description and analysis (NNC, ISO/IEC 15938-17, published in 2022) was developed, which provides a broad set of technologies for parameter reduction and quantization to compress entire neural networks efficiently.

Recently, an increasing number of artificial intelligence applications, such as edge-based content processing, content-adaptive video post-processing filters, or federated training, need to exchange updates of neural networks (e.g., after training on additional data or fine-tuning to specific content). Such updates include changes of the neural network parameters but may also involve structural changes in the neural network (e.g., when extending a classification method with a new class). In scenarios like federated training, these updates must be exchanged frequently, such that much more bandwidth over time is required, e.g., in contrast to the initial deployment of trained neural networks.

The second edition of NNC addresses these applications through efficient representation and coding of incremental updates and extending the set of compression tools that can be applied to both entire neural networks and updates. Trained models can be compressed to at least 10-20% and, for several architectures, even below 3% of their original size without performance loss. Higher compression rates are possible at moderate performance degradation. In a distributed training scenario, a model update after a training iteration can be represented at 1% or less of the base model size on average without sacrificing the classification performance of the neural network. NNC also provides synchronization mechanisms, particularly for distributed artificial intelligence scenarios, e.g., if clients in a federated learning environment drop out and later rejoin.

Verification Test Report and Conformance and Reference Software for MPEG Immersive Video

At the 142nd MPEG meeting, MPEG Video Coding (WG 4) issued the verification test report of ISO/IEC 23090-12 MPEG immersive video (MIV) and completed the development of the conformance and reference software for MIV (ISO/IEC 23090-23), promoting it to the Final Draft International Standard (FDIS) stage.

MIV was developed to support the compression of immersive video content, in which multiple real or virtual cameras capture a real or virtual 3D scene. The standard enables the storage and distribution of immersive video content over existing and future networks for playback with 6 degrees of freedom (6DoF) of view position and orientation. MIV is a flexible standard for multiview video plus depth (MVD) and multiplane images (MPI) that leverages strong hardware support for commonly used video formats to compress volumetric video.

ISO/IEC 23090-23 specifies how to conduct conformance tests and provides reference encoder and decoder software for MIV. This draft includes 23 verified and validated conformance bitstreams spanning all profiles and encoding and decoding reference software based on version 15.1.1 of the test model for MPEG immersive video (TMIV). The test model, objective metrics, and other tools are publicly available at https://gitlab.com/mpeg-i-visual.

The latest MPEG-DASH Update

Finally, I’d like to provide a quick update regarding MPEG-DASH, which has a new part, namely redundant encoding and packaging for segmented live media (REAP; ISO/IEC 23009-9). The following figure provides the reference workflow for redundant encoding and packaging of live segmented media.


The reference workflow comprises (i) Ingest Media Presentation Description (I-MPD), (ii) Distribution Media Presentation Description (D-MPD), and (iii) Storage Media Presentation Description (S-MPD), among others; each defining constraints on the MPD and tracks of ISO base media file format (ISOBMFF).

Additionally, the MPEG-DASH Break-out Group discussed various technologies under consideration, such as (a) combining HTTP GET requests, (b) signaling common media client data (CMCD) and common media server data (CMSD) in an MPEG-DASH MPD, (c) image and video overlays in DASH, and (d) updates on lower latency.
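
For context, CMCD (CTA-5004) already defines a small dictionary of client-state keys that players attach to media requests, for example as a query argument; the TuC item concerns advertising such support in the MPD itself. Below is a rough sketch of the request side only, with illustrative keys and values and a simplified serialization (see CTA-5004 for the exact quoting, ordering, and encoding rules):

// A few illustrative CMCD keys describing the client's state.
const cmcd = {
    br: 3200,   // encoded bitrate of the requested rendition, in kbps
    bl: 12000,  // current buffer length, in ms
    mtp: 25000, // measured throughput, in kbps
    sf: "d",    // streaming format: DASH
    sid: "6e2fb550-c457-11e9-bb97-0800200c9a66", // per-session ID (example)
};

// Serialize as comma-separated key=value pairs (simplified on purpose).
const value = Object.entries(cmcd).map(([k, v]) => `${k}=${v}`).join(",");

const segmentUrl = new URL("https://cdn.example.com/seg-42.m4s"); // placeholder
segmentUrl.searchParams.set("CMCD", value);
fetch(segmentUrl); // the CDN or origin can now log or act on the client data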

An updated overview of DASH standards/features can be found in the Figure below.

MPEG-DASH Status – April 2023

The next meeting will be held in Geneva, Switzerland, from July 17-21, 2023. Further details can be found here.

Click here for more information about MPEG meetings and their developments.

Have any thoughts or questions about neural networks or the other updates described above? Check out Bitmovin’s Video Developer Community and join the conversation!


139th MPEG Meeting Takeaways: MPEG issues Call for Evidence for Video Coding for Machines
https://bitmovin.com/blog/139th-mpeg-meeting-takeaways/ (Wed, 24 Aug 2022)

Preface

Bitmovin is a proud member and contributor to several organizations working to shape the future of video, none for longer than the Moving Picture Experts Group (MPEG), where I and a few senior developers at Bitmovin are active members. Personally, I have been a member and attendant of MPEG for 15+ years and have been documenting the progress since early 2010. Today, we’re working hard to further improve the capabilities and efficiency of the industry’s newest standards, such as VVC, LCEVC, and MIV.

The 139th MPEG Meeting – MPEG issues Call for Evidence to drive the future of computer vision and smart transportation

The past few months of research and progress in the world of video standards setting at MPEG (and at Bitmovin alike) have been quite busy. Although we didn’t publish a quarterly blog for the 138th MPEG meeting, it’s worth sharing again that MPEG was awarded two Technology & Engineering Emmy® Awards for its MPEG-DASH and Open Font Format standards. The latest developments in the standards space have, as expected, focused on improvements to VVC & LCEVC; however, there have also been recent updates to CMAF and progress on energy-efficiency standards and immersive media codecs. I’ve addressed most of the recent updates below. The official press release of the 139th MPEG meeting can be found here and comprises the following items:

  • MPEG Issues Call for Evidence for Video Coding for Machines (VCM)
  • MPEG Ratifies the Third Edition of Green Metadata, a Standard for Energy-Efficient Media Consumption
  • MPEG Completes the Third Edition of the Common Media Application Format (CMAF) by adding Support for 8K and High Frame Rate for High Efficiency Video Coding
  • MPEG Scene Descriptions adds Support for Immersive Media Codecs
  • MPEG Starts New Amendment of VSEI containing Technology for Neural Network-based Post Filtering
  • MPEG Starts New Edition of Video Coding-Independent Code Points Standard
  • MPEG White Paper on the Third Edition of the Common Media Application Format

In this report, I’d like to focus on VCM, Green Metadata, CMAF, VSEI, and a brief update about DASH (as usual).

Video Coding for Machines (VCM)

MPEG’s exploration work on Video Coding for Machines (VCM) aims at compressing features for machine-performed tasks such as video object detection and event analysis. As neural networks increase in complexity, architectures such as collaborative intelligence, whereby a network is distributed across an edge device and the cloud, become advantageous. Deployed amongst a heterogeneous population of edge devices, such architectures bring flexibility to systems implementers, but they also create a need to efficiently compress intermediate feature information for transport over wide area networks (WANs). As feature information differs substantially from conventional image or video data, coding technologies and solutions for machine usage could differ from conventional human-viewing-oriented applications in order to achieve optimized performance. With the rise of machine learning technologies and machine vision applications, the amount of video and images consumed by machines has grown rapidly.

Typical use cases include intelligent transportation, smart city technology, intelligent content management, etc., which incorporate machine vision tasks such as object detection, instance segmentation, and object tracking. Due to the large volume of video data, extracting and compressing the feature from a video is essential for efficient transmission and storage. Feature compression technology solicited in this Call for Evidence (CfE) can also be helpful in other regards, such as computational offloading and privacy protection.

Over the last three years, MPEG has investigated potential technologies for efficiently compressing feature data for machine vision tasks and established an evaluation mechanism that includes feature anchors, rate-distortion-based metrics, and evaluation pipelines. The evaluation framework of VCM depicted below comprises neural network tasks (typically informative) at both ends, as well as a VCM encoder and a VCM decoder. The normative part of VCM typically includes the bitstream syntax, which implicitly defines the decoder, whereas other parts are usually left open for industry competition and research.

The VCM evaluation framework.

Further details about the CfE and how interested parties can respond can be found in the official press release here.

Green Metadata

MPEG Systems has been working on Green Metadata for the last ten years to enable the adaptation of the client’s power consumption according to the complexity of the bitstream. Many modern implementations of video decoders can adjust their operating voltage or clock speed to adjust the power consumption level according to the required computational power. Thus, if the decoder implementation knows the variation in the complexity of the incoming bitstream, then the decoder can adjust its power consumption level to the complexity of the bitstream. This allows less energy use in general and extended video playback for battery-powered devices.

The third edition enables support for Versatile Video Coding (VVC, ISO/IEC 23090-3, a.k.a. ITU-T H.266) encoded bitstreams and enhances the capability of this standard for real-time communication applications and services. While finalizing the support of VVC, MPEG Systems has also started the development of a new amendment to the Green Metadata standard, adding the support of Essential Video Coding (EVC, ISO/IEC 23094-1) encoded bitstreams.

Making video coding and systems sustainable and environmentally friendly will become a major issue in the years to come, specifically as more and more video services become available. However, we need a holistic approach considering all entities from production to consumption, and Bitmovin is committed to contributing its share to these efforts.

Third Edition of Common Media Application Format (CMAF)

The third edition of CMAF adds two new media profiles for High Efficiency Video Coding (HEVC, ISO/IEC 23008-2, a.k.a. ITU-T H.265), namely for (i) 8K and (ii) High Frame Rate (HFR). Regarding the former, the media profile supporting 8K resolution video encoded with HEVC (Main 10 profile, Main Tier with 10 bits per colour component) has been added to the list of CMAF media profiles for HEVC. The profile will be branded as ‘c8k0’ and will support videos with up to 7680×4320 pixels (8K) and up to 60 frames per second. Regarding the latter, another media profile has been added to the list of CMAF media profiles, branded as ‘c8k1’ and supports HEVC encoded video with up to 8K resolution and up to 120 frames per second. Finally, chroma location indication support has been added to the 3rd edition of CMAF.

CMAF is an integral part of the video streaming system and enabler for (live) low-latency streaming. Bitmovin and its co-funded research lab ATHENA significantly contributed to enable (live) low latency streaming use cases through our joint solution with Akamai for chunked CMAF low latency delivery as well as our research projects exploring the challenges of real-world deployments and the best methods to optimize those implementations.

New Amendment for Versatile Supplemental Enhancement Information (VSEI) containing Technology for Neural Network-based Post Filtering

At the 139th MPEG meeting, the MPEG Joint Video Experts Team with ITU-T SG 16 (WG 5; JVET) issued a Committee Draft Amendment (CDAM) text for the Versatile Supplemental Enhancement Information (VSEI) standard (ISO/IEC 23002-7, a.k.a. ITU-T H.274). Beyond the SEI message for shutter interval indication, which is already known from its specification in Advanced Video Coding (AVC, ISO/IEC 14496-10, a.k.a. ITU-T H.264) and High Efficiency Video Coding (HEVC, ISO/IEC 23008-2, a.k.a. ITU-T H.265), and a new indicator for subsampling phase indication which is relevant for variable-resolution video streaming, this new amendment contains two Supplemental Enhancement Information (SEI) messages for describing and activating post filters using neural network technology in video bitstreams. These filters could, for example, be used for reducing coding noise, for upsampling, for colour improvement, or for denoising. The description of the neural network architecture itself is based on MPEG’s neural network coding standard (ISO/IEC 15938-17). Results from an exploration experiment have shown that neural network-based post filters can deliver better performance than conventional filtering methods. Processes for invoking these new post-processing filters have already been tested in a software framework and will be made available in an upcoming version of the Versatile Video Coding (VVC, ISO/IEC 23090-3, a.k.a. ITU-T H.266) reference software (ISO/IEC 23090-16, a.k.a. ITU-T H.266.2).

Neural network-based video processing (incl. coding) is gaining momentum, and end-user devices are becoming more and more powerful for such complex operations. Bitmovin and its co-funded research lab ATHENA have investigated such options and recently proposed LiDeR, a lightweight dense residual network for video super-resolution on mobile devices that can compete with other state-of-the-art neural networks while executing ~300% faster.

The latest MPEG-DASH Update

Finally, I’d like to provide a brief update on MPEG-DASH! At the 139th MPEG meeting, MPEG Systems issued a new working draft related to Extended Dependent Random Access Point (EDRAP) streaming and other extensions, which will be further discussed during the Ad-hoc Group (AhG) period (please join the DASH email list for further details/announcements). Furthermore, Defects under Investigation (DuI) and Technologies under Consideration (TuC) have been updated. Finally, a new part (ISO/IEC 23009-9), called encoder and packager synchronization, has been added, for which a working draft has also been produced. Publicly available documents (if any) can be found here.

An updated overview of DASH standards/features can be found in the Figure below.

MPEG-DASH standard status – July 2022

The next meeting will be face-to-face in Mainz, Germany from October 24-28, 2022. Further details can be found here.

Click here for more information about MPEG meetings and their developments.

Have any questions about the formats and standards described above? Do you think MPEG is taking the first step toward enabling Skynet and Terminators by advancing video coding for machines? 🦾 Check out Bitmovin’s Video Developer Community and let us know your thoughts.


A Brief History of MPEG-DASH: From Early Development to Emmy® Award Win
https://bitmovin.com/blog/mpeg-emmy-award/ (Thu, 19 May 2022)

Video streaming is ubiquitous. It permeates every aspect of our lives. We watch viral videos on TikTok, attend work conferences via Zoom, use it to supplement our education at academic institutions and even use it to work up a sweat via connected gym equipment. Netflix had a transformative impact on how we access our favorite content by delivering it over the Internet and providing consumers with the flexibility to watch their favorite films and TV shows from anywhere and on any device. What makes the impact of video streaming on our day-to-day lives even more astonishing is that it’s still a nascent industry that only took off at the turn of the century. However, it wouldn’t have advanced so quickly without the Moving Picture Experts Group (MPEG), which recently won a Technology & Engineering Emmy® Award for its groundbreaking MPEG-DASH standard.
The development of MPEG-DASH began in 2010 when the likes of YouTube and Netflix laid the framework for the popularization of video streaming among consumers. However, the quality of streams was often sub-par and plagued with stalls, buffering, missing/wrong plug-ins, and poor image quality. MPEG-DASH aimed to create a new video streaming standard to deliver high-quality streams to users with minimal issues. MPEG-DASH uses adaptive bitrate technology to break down videos into smaller chunks and encode them at different quality levels. Adaptive bitrate streaming detects the user’s bandwidth in real-time and adjusts the quality of the stream. 
MPEG-DASH was standardized in 2012, and it is the first adaptive bitrate streaming solution that is an international standard. What makes MPEG-DASH groundbreaking is that it allows internet-connected devices to receive high-quality streams, regardless of bandwidth quality. Its standardization was significant because it gave the industry confidence that it could universally adopt its capabilities compared to proprietary solutions. Furthermore, the fact it is codec agnostic means content can be encoded with any encoding format – making it possible for the entire media industry to improve the quality of their streams. The first live MPEG-DASH demonstration took place in August 2012. VRT offered its audience the chance to experience the Olympic Games broadcast on their devices via the newly standardized streaming standard. 
The impact of MPEG-DASH is far-reaching and has completely transformed the entire video streaming industry, including on-demand, live and low latency streaming – even 5G. It’s relied on by Hulu, Netflix and YouTube to deliver superior viewing experiences and accounts for more than 50% of worldwide internet traffic today. Currently, MPEG is working on the 5th edition to meet the needs of the constantly evolving video streaming ecosystem and ensure its compatibility with new technologies.
MPEG-DASH is also deeply embedded in the DNA of Bitmovin, which was founded in 2013, and provided the springboard for the company’s success. MPEG-DASH was co-created by my fellow Bitmovin co-founders, Stefan Lederer and Chris Mueller, which sparked the development of the Bitmovin Player and Bitmovin Encoder – the first commercial solutions made for this video streaming standard. Bitmovin’s solutions were, and continue to be, backed by strong academic research, and it is one of the primary drivers behind our rapid growth. We have outpaced our competitors in under ten years and become the category leader for video streaming infrastructure. The competitiveness of our solutions is exemplified by the fact that we are powering the world’s largest OTT online video providers, including the BBC, ClassPass, discovery+, Globo, The New York Times, Red Bull Media House, and many more.
MPEG’s Technology & Engineering Emmy® Award win is the culmination of years of hard work dedicated to optimizing video streams and providing audiences worldwide with superior viewing experiences. MPEG has been instrumental in some of the most significant technological advancements in the video streaming ecosystem. It is a fantastic achievement for the team, comprising over 90 researchers and engineers from around 60 companies worldwide, to receive this tremendous accolade. Congratulations again to the team!
