Low Latency vs. Target Latency: Why there isn’t always a need for speed
https://bitmovin.com/blog/low-latency-vs-target-latency/


Low latency has been a hot topic in video streaming for a while now. It’s the trendy keyword you hear at every trade show throughout the year; it even ranks high in our annual Video Developer Report, both as one of the biggest challenges developers face and as one of the capabilities they are most interested in deploying.

However, despite the huge amount of conversation low latency generates, it’s also one of the most difficult terms to define. Depending on the use case, the required playback delay can range from a few hundred milliseconds (ultra-low latency) to 1-5 seconds (low latency), so one person’s idea of what counts as low latency can differ significantly from another’s. Additionally, low latency is limiting because there is a high probability you’re sacrificing quality for a fast video startup time. You are also likely to pay higher prices and work with multiple vendors to get the specific hardware or software you need to facilitate each step of the video workflow.

Furthermore, not every video streaming service needs low latency, even though it’s constantly requested by everyone from startups to enterprise-level businesses. A better question may not be “How do I minimize my live stream’s delay?” but “What is the target latency I want my audience to have?” Target latency is a feature that isn’t mentioned often and one we will explore in this blog, as it can make a world of difference to the playback experience you’re offering your viewers.

Back to basics – What are Low and Target Latency, and how are they achieved?

If you are unfamiliar with low latency, it essentially refers to minimizing the delay between a live on-site production of an event and a specific viewer watching it over the Internet. Standard HLS and DASH streams have a delay of 8 to 30 seconds, depending on stream settings and a particular viewer’s streaming environment (e.g., the protocol used, buffer size, bandwidth connection, device, and location). For a stream to be considered low latency, it can’t have more than 5 seconds of broadcast delay, with some workflows needing as low as a few hundred milliseconds for ultra-low latency, as stated above. There are several ways to achieve this very low broadcast delay, each with its benefits and costs. However, none of the methods available in the market today is standardized, and each requires every piece of your video supply chain to support the chosen low-latency streaming technology, from the live encoder and packager to the CDN and player. This is important as it drives up costs and limits your flexibility in selecting a best-of-breed technology stack.

On the other hand, target latency is a predefined time delay so the entire audience can watch the same stream simultaneously. The stream is not affected by the likely differences between individual viewers’ circumstances, meaning that everyone in that group can experience the same live event at the same time or very close to it. This stream synchronization can be achieved by choosing a specific buffer size across the target audience and managing playback to a target delay while attempting to cater to viewers who represent the lowest common denominator (e.g., slowest to fill buffer). You can set the target latency directly in the Bitmovin Player using the targetLatency property, enabling you to design the user experience as you want.
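
For illustration, here is a minimal sketch of what that could look like with the Bitmovin Web Player. Only the targetLatency property is taken from this post; the surrounding configuration shape, license key, and stream URL are assumptions to verify against the current Player documentation.

import { Player } from 'bitmovin-player';

// Sketch only: the config structure around `targetLatency` is an assumption.
const player = new Player(document.getElementById('player') as HTMLElement, {
  key: 'YOUR-PLAYER-KEY', // hypothetical license key
  live: {
    lowLatency: {
      targetLatency: 8, // seconds behind the live edge for the whole audience
    },
  },
});

// Hypothetical live stream URL.
player.load({ hls: 'https://example.com/live/stream.m3u8' });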

How do both affect the viewer experience?

The benefits of low latency revolve around getting viewers their content fast, similar to broadcast speeds, which helps make them feel more connected with the live event. Live sports is an excellent example of where low latency plays a prominent role in the viewing experience. It helps combat the “Noisy Neighbor Effect,” where your audience can be negatively affected when seeing notifications or hearing cheers from neighbors when something happens before they see it on their screen. This also applies to real-time betting, which requires a stream to be available in ultra-low latency to see real-time results. Low latency is also critical for live seminars, esports, fitness classes, and many other live interactive use cases to help keep your audience engaged and up-to-date with what’s happening at that moment.

The biggest downside of the available low latency solutions is that they do not permit players to buffer enough content, which leads to playback interruptions when streaming conditions are less than ideal (e.g., poor wifi, an ISP problem, device performance). This alone can quickly lead to slow video starts, rebuffering, decreased stream quality, and other performance issues, creating a terrible experience for the user.

Any video streaming service can use target latency in a way that minimizes any downside to the viewer experience. This is because you can set the delay for a consistent experience for the entire audience or for a predefined audience, ensuring your viewers will have a better quality of experience due to increased stream stability and control during playback. For example, if you offer a second-screen experience like a chat feature within the live event, target latency will keep everyone at the same live point so the event feels shared. The only potential downside of the target latency solution is for viewers who may be using different video streaming services, which may cause them to be at different live points relative to their neighbors.

What does this mean for a business’s bottom line?

Pricing concerns are one of the top priorities when evaluating what is best for your business. From each part of your setup to the encoding and bandwidth requirements, low-latency workflows have the potential to be more expensive. This is because each component of your video supply chain must support the low-latency streaming technology you choose, potentially expanding to multiple technologies if you’re offering low-latency streaming across different platforms (e.g., iOS and Android). Due to this complexity, meeting low-latency requirements can take numerous vendors and a lot of integration work. This is a fundamental challenge, as the high costs inevitably limit who can realize these capabilities, especially in tough economic times.

Target latency, on the other hand, requires only client-side software changes, so implementation and operational costs are relatively low, as you won’t need to buy and integrate specialized components.

Wrapping up

Reduced latency of 8-10 seconds is already achievable for most video streaming services today using the standardized HLS and DASH protocols, which support a far broader range of devices than (ultra) low latency solutions do. Video streaming services should carefully consider the real-world pros and cons of (ultra) low latency vs. target latency solutions as they continue to push the limits in delivering the best viewer experience to their audiences.

Video Tech Deep-Dive: Live Low Latency Streaming Part 3 – Low-Latency HLS
https://bitmovin.com/blog/live-low-latency-hls/

This blog post is the final piece of our Live Low-Latency Streaming series, where we previously covered the basic principles of low-latency streaming in OTT and LL-DASH. This final post focuses on latency when using Apple’s HTTP Live Streaming (HLS) protocol and how the latency time can be reduced. This article assumes that you are already familiar with the basics of HLS and its manifest/playlist mechanics. You can find the first two posts of the series on the Bitmovin blog.

Why is latency high in HLS?

HLS in its current specification favors stream reliability over latency. Higher latency is accepted in exchange for stable playback without interruptions. In section 6.3.3, Playing the Media Playlist File, the HLS specification states that a playback client

SHOULD NOT choose a segment that starts less than three target durations from the end of the playlist file

Figure: earliest stream segment a client should join in a live HLS playlist
Honoring this requirement results in having a latency of at least 3 target durations. Given typical target durations for current HLS deployments of 10 or 6 seconds, we would end up with a latency of at least 30 or 18 seconds, which is far from low. Even if we choose to ignore the above requirement, the fact that segments are typically produced, transferred, and consumed in their entirety poses a high risk of buffer underruns and subsequent playback interruptions, as described in more detail in the first part of this blog series.
The HLS media playlist for the live stream depicted above would look something like this:
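
(The playlist was shown as a screenshot in the original post; the following is a hand-written reconstruction of its shape, with illustrative segment names and durations.) With a 6-second target duration, a client honoring the rule above would join no closer than three segments, i.e. 18 seconds, from the live edge:

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:1
#EXTINF:6.000,
1.ts
#EXTINF:6.000,
2.ts
#EXTINF:6.000,
3.ts
#EXTINF:6.000,
4.ts
#EXTINF:6.000,
5.ts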

Road to Low-Latency HLS

In 2017, Periscope, then the most popular platform for live streaming of user-generated content, investigated streaming solutions to replace its RTMP- and HLS-based hybrid approach with a more scalable one. The requirement was to offer end-to-end latency similar to RTMP but in a more cost-effective way, considering that its use case was streaming to large audiences. Periscope’s solution to the high-latency problem took Apple’s HLS protocol, made two fundamental changes, and called the result Low-Latency HLS (LHLS):

  1. Segments are delivered using HTTP/1.1 Chunked Transfer Coding
  2. Segments are advertised in the HLS playlist before they are available

If you read our previous blog posts about low-latency streaming, you might recognize these simple concepts as the key ingredients of today’s OTT-based low-latency streaming approaches, like LL-DASH. Periscope’s work likely sparked and influenced subsequent developments around low-latency streaming, such as LL-DASH and a community-driven initiative, started at the end of 2018, to define modifications to HLS aimed at reducing streaming latency.
The core of the community proposal for LHLS was the same as the aforementioned concepts. Segments should be loaded in chunks using HTTP CTE and early availability of incomplete segments should be signaled using a new #EXT-X-PREFETCH tag in the playlist. In the example below, the client can already load and consume the currently available data of 6.ts and continue to do so as the chunks become available over time. Furthermore, the request for the segment 7.ts can be made early on to save network round-trip time, even though production had not started yet. It is also worth mentioning that the LHLS proposal preserves full backward-compatibility allowing standard HLS clients to consume such streams. This was the gist of the proposed implementation; you can find the full proposal in the hlsjs-rfcs GitHub repository.
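
(This playlist, too, was a screenshot in the original post; the reconstruction below follows the proposal and matches the 6.ts/7.ts example above, with illustrative values elsewhere.)

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:2
#EXTINF:6.000,
2.ts
#EXTINF:6.000,
3.ts
#EXTINF:6.000,
4.ts
#EXTINF:6.000,
5.ts
#EXT-X-PREFETCH:6.ts
#EXT-X-PREFETCH:7.ts
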
Individuals across several companies in the media industry came together to work on this proposal in the hope that Apple, the driving force behind HLS, would join in and work the proposal into the official HLS specification. However, things turned out very differently than expected: Apple presented its own preliminary specification, a very different approach, at its 2019 Worldwide Developers Conference.
Despite LHLS being (and staying) a proprietary approach, some companies, like Twitch, use it successfully in their production systems.

Apple’s Low-Latency HLS

In this section we’ll cover the principles of Apple’s preliminary specification for Low-Latency HLS.

Generation of Partial Media Segments

While HLS content is split into individual segments, in Low-Latency HLS each segment further consists of parts that are independently addressable by the client. For example, a segment of 6 seconds can consist of 30 parts of 200ms duration each. Depending on the container format, such parts can represent CMAF chunks or a sequence of TS packets. This partitioning of segments decouples the end-to-end latency from the long segment duration and allows the client to load parts of a segment as soon as they become available. LL-DASH achieves the same by using HTTP CTE, with the difference that the MPD does not advertise individual parts/chunks of segments.
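
(The original screenshot is not available; the sketch below reconstructs the idea with the segment and part names used in this section. Part durations and the number of parts shown are illustrative.)

#EXTM3U
#EXT-X-VERSION:6
#EXT-X-TARGETDURATION:6
#EXT-X-PART-INF:PART-TARGET=0.2
#EXT-X-MEDIA-SEQUENCE:271
#EXTINF:6.0,
fileSequence271.mp4
#EXT-X-PART:DURATION=0.2,INDEPENDENT=YES,URI="filePart272.0.mp4"
#EXT-X-PART:DURATION=0.2,URI="filePart272.1.mp4"
# ... parts 2 through 29 elided ...
#EXTINF:6.0,
fileSequence272.mp4
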
Partial segments are advertised using a new EXT-X-PART tag. Note that partial segments are only advertised for the most recent segments in the playlist. Furthermore, both the partial segments (filePart272.x.mp4) and the respective full segments (fileSequence272.mp4) are offered.
Partial segments can also reference the same file but at different byte ranges. Clients can thereby load multiple partial segments with a single request and save round-trips compared to making separate requests for each part (as seen below).
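
(Reconstructed sketch; byte offsets are illustrative.) Each BYTERANGE is given as length@offset into the growing segment file:

#EXT-X-PART:DURATION=0.2,INDEPENDENT=YES,URI="fileSequence273.mp4",BYTERANGE="25600@0"
#EXT-X-PART:DURATION=0.2,URI="fileSequence273.mp4",BYTERANGE="21300@25600"
#EXT-X-PART:DURATION=0.2,URI="fileSequence273.mp4",BYTERANGE="22100@46900"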

Preload hints and blocking of Media downloads

Soon-to-be-available partial segments are advertised prior to their actual availability in the playlist by a new EXT-X-PRELOAD-HINT tag. This enables clients to open a request early, and the server will respond once the data becomes available. This way, the client can “save” the round-trip time for the request.
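
(Reconstructed sketch with hypothetical names.) The end of a low-latency media playlist would then carry a hint for the next expected part:

#EXT-X-PART:DURATION=0.2,INDEPENDENT=YES,URI="filePart273.0.mp4"
#EXT-X-PART:DURATION=0.2,URI="filePart273.1.mp4"
#EXT-X-PRELOAD-HINT:TYPE=PART,URI="filePart273.2.mp4"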

Playlist Delta Updates

Clients have to refresh HLS playlists more frequently for low-latency HLS. Playlist Delta Updates can be used to reduce the amount of data transferred for each playlist request. A new EXT-X-SKIP tag replaces the content of the playlist that the client already received with a previous request.
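
As a rough sketch (per the preliminary specification; values illustrative), the client asks for a delta update via the _HLS_skip query parameter, and the server replaces already-delivered playlist content with an EXT-X-SKIP tag:

GET /media.m3u8?_HLS_skip=YES

#EXTM3U
#EXT-X-VERSION:9
#EXT-X-TARGETDURATION:6
#EXT-X-MEDIA-SEQUENCE:180
#EXT-X-SKIP:SKIPPED-SEGMENTS=88
#EXTINF:6.0,
fileSequence268.mp4
#EXTINF:6.0,
fileSequence269.mp4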

Blocking of Playlist reload

The discovery of new segments becoming available for an HLS live stream usually works by the client reloading the playlist file at regular intervals and checking for newly appended segments. In the case of low-latency streaming, it is desirable to avoid any delay between a (partial) segment becoming available in the playlist and the client discovering its availability. With the playlist-reloading approach, this discovery delay can, in the worst case, be as high as the reload interval.
With the new feature of blocking playlist reloads, clients can specify which future segment’s availability they are awaiting, and the server will hold onto that playlist request until that specific segment becomes available in the playlist. The awaited segment is specified using query parameters on the playlist request.
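
In the preliminary specification these are the _HLS_msn (media sequence number) and _HLS_part query parameters; for example, the following request would be held back by the server until part 2 of segment 273 appears in the playlist:

GET /media.m3u8?_HLS_msn=273&_HLS_part=2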

Rendition Reports

When playing at low latencies, fast bitrate adaptation is crucial to avoid playback interruptions due to buffer underruns. To save round-trips during playlist switching, playlists must contain rendition reports via a new EXT-X-RENDITION-REPORT tag that informs about the most recent segment and part in the respective rendition.
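
(Reconstructed sketch; the rendition URIs are hypothetical.) Such reports sit at the end of each media playlist:

#EXT-X-RENDITION-REPORT:URI="../1M/media.m3u8",LAST-MSN=273,LAST-PART=2
#EXT-X-RENDITION-REPORT:URI="../4M/media.m3u8",LAST-MSN=273,LAST-PART=2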

Conclusion

For more detailed information on Apple’s Low-Latency HLS, take a look at the Preliminary Specification and the latest IETF draft containing low-latency extensions for HLS.
We can conclusively say that Low-Latency HLS increases complexity quite significantly compared to standard HLS. The server’s responsibilities expand from simply serving segments to supporting several additional mechanisms that clients use to save network round-trips and speed up segment delivery, which ultimately enables lower end-to-end latency. Considering that the specification remains subject to change and is yet to be finalized, it might still take a while until streaming vendors pick it up and we finally see Low-Latency HLS in the wild. In short, live low-latency streaming using HLS is possible, but at a large cost in server complexity. Measures are being developed to reduce that complexity and server load, but it will take wider adoption by major streaming providers for this to happen.

Video Tech Deep-Dive: Live Low Latency Streaming Part 2
https://bitmovin.com/blog/live-low-latency-streaming-p2/

This blog post is a continuation of an ongoing blog and webinar technical deep-dive series. You can find the first blog post here. The first post covered the fundamentals of live low latency and defined chunked delivery methods with CMAF.
This blog post expands on chunked CMAF delivery by explaining its application with MPEG-DASH to achieve low latency. We’ll lay some foundations and cover the basic approaches behind low-latency DASH, then look into what future developments are expected, as low-latency streaming is a heavily researched subject and is quickly becoming a media industry standard.

Basics of MPEG-DASH Live Streaming

Before diving into how Low Latency Streaming works in MPEG-DASH we first need to understand some basic stream mechanics of DASH live streams, most importantly, the concept of segment availability.
The DASH Media Presentation Description (MPD) is an XML document containing essential metadata of a DASH stream. Among many other things, it describes which segments a stream consists of and how a playback client can obtain them. The main difference between on-demand and live streams in DASH is that for on-demand content all segments are available at all times, whereas for live streams segments are produced continuously, one after another, as time progresses. Every time a new segment is produced, its availability is signaled to playback clients through the MPD. It is important to note that a segment is only made available once it is fully encoded and written to the origin.

Fig. 1: Live stream with template-based addressing scheme (simplified)

The MPD would specify the start of the stream availability (i.e. the Availability Start Time) and a constant segment duration, e.g. 2 seconds. Using these values the player can calculate how many segments are currently in the availability window and also their individual availability start time. For example, the segment availability start time for the second segment would be AST + segment_duration * 2.
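
As a small illustration of this calculation, the following sketch (TypeScript; names are hypothetical) derives segment availability from the Availability Start Time and a constant segment duration:

const ast = Date.parse('2019-08-20T05:00:03Z') / 1000; // AST as Unix seconds
const segmentDuration = 2; // seconds

// Segment k (1-based) is fully encoded, and therefore becomes available,
// at AST + k * segmentDuration.
function segmentAvailabilityStart(k: number): number {
  return ast + k * segmentDuration;
}

// Number of segments whose availability has already started at wall-clock `now`.
function segmentsAvailable(now: number): number {
  return Math.max(0, Math.floor((now - ast) / segmentDuration));
}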

Low Latency Streaming with MPEG-DASH

In the first part of this blog post series, we described how chunked encoding and transfer enables partial loads and consumption of segments that are still in the process of being encoded. To make a player aware of this action, the segment availability in the MPD is adjusted to signal an earlier availability, i.e. when the first chunk is complete. This is done using the availabilityTimeOffset in the MPD. As a result, the player will not wait for a segment to be fully available and will load and consume it earlier.
Consider the example of Fig.1 with a segment duration of 2 seconds and a chunk duration of 0.033 seconds (i.e. one video frame duration with 29.97 fps). To signal the segment availability once the first chunk is completed we would set the availabilityTimeOffset to 1.967 seconds (segment_duration – chunk_duration). This would signal the greyed-out segment in Fig. 1 to become partially available.
The below MPD represents this example:

<?xml version="1.0" encoding="utf-8"?>
<MPD
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xmlns="urn:mpeg:dash:schema:mpd:2011"
  xmlns:xlink="http://www.w3.org/1999/xlink"
 xsi:schemaLocation="urn:mpeg:DASH:schema:MPD:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"
  profiles="urn:mpeg:dash:profile:isoff-live:2011"
  type="dynamic"
  minimumUpdatePeriod="PT500S"
  suggestedPresentationDelay="PT2S"
  availabilityStartTime="2019-08-20T05:00:03Z"
  publishTime="2019-08-20T12:42:07Z"
  minBufferTime="PT2.0S">
  <Period start="PT0.0S">
    <AdaptationSet
      contentType="video"
      segmentAlignment="true"
      bitstreamSwitching="true"
      frameRate="30000/1001">
      <Representation
        id="0"
        mimeType="video/mp4"
        codecs="avc1.64001f"
        bandwidth="2000000"
        width="1280"
        height="720">
        <SegmentTemplate
          timescale="1000000"
          duration="2000000"
          availabilityTimeOffset="1.967"
          initialization="1566277203/init-stream$RepresentationID$.m4s"
          media="1566277203/chunk-stream_t_$RepresentationID$-$Number%05d$.m4s"
          startNumber="1">
        </SegmentTemplate>
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>

To recap, for low-latency DASH we are mainly doing two things:

  • Chunked encoding and transfer (i.e. chunked CMAF)
  • Signaling early availability of in-progress segments

While the previous approach enables a basic low-latency DASH setup, there are additional considerations to be made to further optimize and stabilize the streaming experience. The DASH Industry Forum is working on guidelines for low-latency DASH to be released in the next version of the DASH-IF Interoperability Points (DASH-IF IOP), expected in early July 2020. The change request for that can be found here. The following sections explain key parts of these guidelines. Please note that some features were not officially finalized and standardized at the time of this post’s publication (June 2020).

Wallclock Time Mapping

For the purpose of measuring latency, a mapping between the media’s presentation time and the wall-clock time is needed. This is so that for any given presentation time of the stream the corresponding wall-clock time is known. The latency for a given playback position can then be calculated by determining the corresponding wall-clock time and subtracting it from the current wall-clock time.
This mapping can be achieved by specifying a so-called Producer Reference Time either in the segments (i.e. inband as prft box) or in the MPD. It essentially specifies the wallclock time at which the respective segment/chunk was produced. (as seen below)

<ProducerReferenceTime
  id="0"
  type="encoder"
  presentationTime="538590000000"
  wallclockTime="2020-05-19T14:57:45Z">
</ProducerReferenceTime>

The type attribute specifies whether the reference time was set by the capturing device or the encoder, allowing calculation of the End-to-End Latency (EEL) or the Encoder-Display Latency (EDL), respectively.
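
As a sketch of the resulting latency calculation (TypeScript; function and parameter names are hypothetical), the producer reference time anchors a media presentation time to a wall-clock time, and the current playback position can then be compared against the current wall-clock time:

// prtPresentationTimeSec: the PRT's presentationTime converted to seconds
// prtWallclockMs: the PRT's wallclockTime as Unix epoch milliseconds
function currentLatencySeconds(
  playbackPositionSec: number,
  prtPresentationTimeSec: number,
  prtWallclockMs: number
): number {
  // Wall-clock moment at which the currently played frame was produced:
  const producedAtMs =
    prtWallclockMs + (playbackPositionSec - prtPresentationTimeSec) * 1000;
  return (Date.now() - producedAtMs) / 1000;
}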

Client Time Synchronization

A precise time/clock at the playback client is necessary for calculations that involve the client’s wallclock time such as segment availability calculations and latency calculations. It is recommended for the MPD to include a UTCTiming element which specifies a time source that can be used to adjust for any drift of the client clock. (as seen below)

<UTCTiming
  schemeIdUri="urn:mpeg:dash:utc:http-iso:2014"
  value="https://time.akamai.com/?iso"
/>

Low Latency Service Description

A ServiceDescription element should be used to specify the service provider’s desired target latency and minimum/maximum latency boundaries in milliseconds. Furthermore, playback rate boundaries may be specified that define the allowed range for playback acceleration/deceleration by the playout client to fulfill the latency requirements.

<ServiceDescription id="0">
  <Latency target="3500" min="2000" max="10000" referenceId="0"/>
  <PlaybackRate min="0.9" max="1.1"/>
</ServiceDescription>

In most player implementations such parameters are provided externally using configurations and APIs.

Resynchronization Points

The previous post pointed out that chunked delivery decouples the achievable latency from the segment durations and enables us to choose relatively long segment durations to maintain good video encoding efficiency. In turn, this prevents fast quality adaptation of the player as quality switching can only be done on segment boundaries. In a low-latency scenario with low buffer levels, fast adaptation — especially down-switching — would be desirable to avoid buffer underruns and consequently playback interruptions.
To that end, Resync elements may be used that specify segment properties like chunk duration and chunk size. Playback clients can utilize them to locate resync points (a sketch of such signaling follows the list below) and:

  • Join streams mid-segment, based on latency requirements
  • Switch representations mid-segment
  • Resynchronize at mid-segment position after buffer underruns
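
A rough sketch of such signaling inside a Representation is shown below; as the feature was not finalized at the time of writing, the attribute names and values here are indicative only, and the change request linked above is authoritative.

<Resync type="0" dT="1000000" dImax="0.8"/>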

The previous sections were a glimpse of what to expect in the near future and show the great effort the media industry has put into kick-starting low-latency streaming with MPEG-DASH and getting it ready for production services.
Want to learn more? Check out Part 3: Video Tech Deep-Dive: Live Low Latency Streaming Part 3 – Low-Latency HLS
… or take a look at some of the supporting documentation below:
[Tool] DASH-IF Conformance Tool 
[Blog Post] Video Tech Deep-Dive: Live Low Latency Streaming Part 1 
[Demo] Low Latency Streaming with Bitmovin’s Player 

2019 Video Developer Report – The Future of Video: AV1 Codec, AI & Machine Learning, and Low Latency
https://bitmovin.com/blog/bitmovin-2019-video-developer-report-av1-codec-ai-machine-learning-low-latency/


Download now: The 2019 Video Developer Report reveals the immense growth opportunities today’s developers must evaluate and some of the major challenges they face.

In its third year, the Bitmovin Video Developer Report provides key insights into the evolving technology trends of the digital video industry; we’ve also revamped the look and feel of the report! This year, Bitmovin’s report resulted in 542 participants from 108 different countries, a major increase from previous years! This report acts as a handy reference for how the video streaming industry is shaped by consumer demands and technology challenges.
The insights from the gathered data reveal a combination of specific trends in video technology usage and a holistic picture of what developers are hoping (and planning) for in the coming year. This report identified that AV1, Artificial Intelligence & Machine Learning, and Low Latency are the latest trends in video development, but why? And how! The 2019 Video Developer report covers these hot topics with excellent insights and recommendations for the future of the industry. 
The report starts with an overview of various challenges that modern developers are facing, as well as the types of encoding solutions that they implement to help alleviate such problems. This is followed by an insightful look into video codecs, streaming formats, and the current/future implementations of AI & ML in video infrastructure.
That transitions into player insights, as well as a close look into platform/device usage – broken out by platform/device type and regional responses. To tie all player-oriented information together, the report continues with monetization models, Digital Rights Management, and the ad formats most sought after by industry experts.
Much like in the video development world, all the content is concluded with a section focusing on performance metrics and analytics models that tell a numerical story of a campaign or project’s success!
Now! Here’s a sneak peek into the results of the 2019 Video Developer Report:

Low latency and device playback are still major concerns for video developers

Latency or broadcast delay is the biggest problem being experienced with video technology today, according to over half of global respondents (54 percent). Delivery delays can be a particular pain point for online streaming compared to traditional broadcasters, especially for live sports events.
The next most prevalent issue is ensuring playback on all devices, identified as an issue for 41 percent of global respondents, a nine-point drop from the previous year’s report (50 percent).
The widespread challenges faced by respondents indicate that developers are tasked with an immensely complex set of responsibilities that attempt to deliver and protect high-quality video, at lower costs.

Codecs: The few dominate the many – H.264/AVC remains atop, AAC (audio) controls the market

AV1 has maintained consistent growth in interest, with planned usage set to triple in the coming year. One in five respondents (20 percent) expect to start using AV1 in the coming year, while H.264/AVC maintained a 91 percent usage rate; some respondents also indicate that they plan to implement it on other new projects in the following year (resulting in >100% total responses for the category).
Which video codecs are you currently using and planning to implement within the next 12 months?

Artificial Intelligence & Machine Learning are here to stay

One of the largest topics of discussion surrounding video technologies today is the implementation of Artificial Intelligence and/or Machine Learning in video workflows. 56 percent of this year’s respondents indicated that they expect to implement AI/ML-based video workflow solutions within the next two years. We believe that this is a topic that needs to be closely monitored and tested, as this number is predicted to increase. In fact, Bitmovin is already testing new AI/ML-based solutions today.
We had a lot of fun digging into the numbers this year, and we are thankful to the many developers who took our survey and made this report possible. Download the full report for a more detailed and insightful analysis of our findings on Low Latency, Codecs, AI/ML, and even Digital Rights Management! Other topics covered include encoding infrastructures, video monetization business models, and video analytics implementations. You can find the full report with all of our findings at the following link:
Download the Report

Low Latency Streaming: What is it and How can it be solved?
https://bitmovin.com/blog/cmaf-low-latency-streaming/


Latency is a major challenge for the online video industry. This article takes us through what latency is, why it’s important for streaming and how CMAF low latency streaming can help to solve the problems.

Live stream “latency” is the time delay between the transmission of actual live content from the source to when it is received and displayed by the playback device. Or to put it another way, the difference between the moment when the actual event is captured on camera or the live feed comes out of a playout server, and the time when the end user actually sees the content on their device’s screen.
Typical broadcast linear stream delay ranges anywhere from 3-5 seconds whereas online streaming has historically been anywhere from 30 seconds to over 60 seconds depending on the viewing device and the video workflow used.
The challenge for the online streaming industry is to reduce this latency to a range closer to linear broadcast signal latency (3-5 sec) or even lower, depending on the application needs. Therefore, many video providers have taken steps to optimize their live streaming workflows by rolling out new streaming standards like the Common Media Application Format (CMAF) and making changes to encoding, CDN delivery, and playback technologies to close the latency gap and to provide near real-time streaming experience for end-users. This reduced latency for online linear video streaming is commonly referred to as “Low Latency”.

Figure: Streaming Latency Continuum

Linear stream/signal latency represents a continuum, as indicated in the diagram above. This diagram illustrates the historic reality of online streaming protocols such as HLS and DASH exhibiting higher latency, and nonadaptive bitrate protocols like RTP/RTSP and WebRTC exhibiting much lower sub-second latency. The discussion here is based on the adaptive bitrate protocols, HLS and MPEG-DASH.

Why is this important for me?

The main goal of Low Latency streaming is to keep playback as close as possible to real-time broadcasts so users can engage and interact with content as it’s unfolding. Typical applications include sports, news, betting, and gaming. Another class of latency-sensitive applications includes feedback data as part of the interactive experience – an example is the ClassPass virtual fitness class, as announced by Bitmovin here.
Other interactive applications include game shows and social engagement. In these use-cases, synchronizing latency across multiple devices becomes valuable for viewers to have a similar chance to answer questions, or provide other interactions.

What is CMAF?

Common Media Application Format (CMAF) was introduced in 2016 and was co-authored by Apple and Microsoft to create a standardized transport container for streaming VoD and linear media using the MPEG-DASH or HLS protocols.
The main goals were to:
1) Reduce overhead/encoding and delivery costs through standardized encryption methods
2) Simplify complexities associated with video streaming workflows and integrations (e.g. DRM, advertising, closed captioning, caching)
3) Support a single format that can be used to stream across any online streaming device.
When we originally posted our thoughts on CMAF, adoption was still in its infancy. But, in recent months we have seen increased adoption of CMAF across the video workflow chain and by device manufacturers. As end-user expectations to stream linear content with latency equivalent to traditional broadcast have continued to increase, and content rights to stream real-time have become more and more commonplace, CMAF has stepped in as a viable solution.

What is CMAF Low Latency?

When live streaming, the media (video/audio) is sent in segments that are each a few seconds (2-6 sec) long. This inherently adds a few seconds of delay from transmission to playback as the segments have to be encoded, delivered, downloaded, buffered, and then rendered by the player client, all of which is limited at a minimum by the segment size.


CMAF now comes with a low-latency mode where each segment can be split up into smaller units called “chunks”, where each chunk can be 500 milliseconds or shorter depending on encoder configuration. With low-latency CMAF, or chunked CMAF, the player can request incomplete segments and get all available chunks to render instead of waiting for the full segment to become available, thereby cutting latency down significantly.

Figure: CMAF chunks for low latency

As shown in the diagram above, a “chunk” is the smallest referenceable media unit, by definition, containing a “moof” and “mdat” atom. The mdat holds a single IDR (Instantaneous Decoder Refresh) frame, which is required to begin every “segment”.  A “segment” is a collection of one or more “fragments”, and a “fragment” is a collection of one or more chunks. The “moof” box as shown in the diagram, is required by the player for decoding and rendering individual chunks.
At the transmit end of the chain, encoders can output each chunk for delivery immediately after encoding it, and the player can reference and decode each one separately.
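
Schematically, the hierarchy described above looks like this (the number of fragments and chunks per level is illustrative):

segment
├─ fragment
│   ├─ chunk: [moof][mdat]
│   └─ chunk: [moof][mdat]
└─ fragment
    ├─ chunk: [moof][mdat]
    └─ chunk: [moof][mdat]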

What are we doing to solve the latency problem?

The Bitmovin Player has supported CMAF playback for a while now. Recently, we also added support for CMAF low latency playback for HTML5 (web) and native apps (mobile) platforms. The Bitmovin Player can be configured to turn on low latency mode which then enables the player to allow chunk-based decoding and rendering without having to wait for the full segment to be downloaded.
The Bitmovin Player optimizes start-up logic, determines buffer sizes, and adjusts playback rate to achieve near to real live streaming latency. From our testing, this can go as low as 1.8 seconds while maintaining stream stability and good video quality.
CMAF low latency is compatible with the rest of the features that Bitmovin Player already supports today. (Ex: ads, DRM, analytics, closed captioning).

Figure: Standard vs chunked segmented streams

In the diagram shown above, player buffering and decoding behavior is shown, contrasting the standard segment (standard latency) mode with the chunked segment mode, corresponding to low latency streaming.
The diagram shows that with non-chunked segments, a segment size of 4×C (where C is the duration of the lowest-granularity unit, the chunk, measured in milliseconds) and three-segment buffering, a player latency of 14×C is typically achieved.
In contrast, chunked segments with CMAF are shown to achieve a latency of 2×C as opposed to 14×C, a 7-times improvement. For example, with C = 500 ms, that is 1 second of latency instead of 7 seconds.

Are there any trade-offs?

In short, yes. There are some considerations, and some tradeoffs when trying to achieve low latency while still providing a high-quality viewing experience.
Buffer Size: Ideally, we want to render frames as soon as the player receives them. This means we have to maintain a really small buffer size. But, this also introduces instability in the viewing experience especially when the player encounters any unexpected interruptions (like dropped frames or frame bursts) due to network or encoder issues. Without enough locally stored frames, the player stalls or freezes until the buffer refreshes with new frames. This in turn requires the player to re-synch its presentation timing and leads to perceived distortions in the playback experience. Therefore, it’s recommended to maintain at least a 1-second buffer to allow the player to provide a smoother playback experience for viewers that can withstand some network disruptions.
DRM is another factor that might introduce additional delay in start-up time: the license delivery turnaround time will block content playback even when low latency is turned on. In this case, the player adjusts to the latest live frame upon successful license delivery, and the latency is consistent with the set low-latency value.

How can I monitor these tradeoffs?

For all of the above reasons, balancing a robust, scalable online streaming platform with minimal re-buffering and stream interruptions against the time-sensitive behavior of low latency CMAF streaming can be challenging. The solution is a holistic view of the streaming experience, provided by Bitmovin Analytics.
Bitmovin Analytics provides insights into session quality so customers can monitor the performance of low latency streaming sessions and make real-time decisions to adjust player and encoding configurations to improve the experience. Bitmovin offers all existing video quality metrics (e.g. Startup time, Buffer Rate) and a few additional metrics to specifically monitor low latency streaming at a content level, such as:

  • Target Latency
  • Observed Latency
  • Playback Rate
  • Dropped Frames
  • Bandwidth Used

Besides the player, what else causes latency?

Chunked CMAF streams and low latency-enabled players are key elements in reducing latency in online streaming. However, there are other components in the video delivery chain that introduce latency at each step that need to be considered for further optimization:

  • Encoder: The encoder needs to be able to ingest live streams as quickly as possible with the encoding configuration optimized to produce the right size of chunks and segments that can then be uploaded to the Origin Server for delivery.
  • First Mile Upload: The upload time depends on the connection type at the upload facility (wired, wireless) and affects overall latency.
  • CDN: The CDN technologies need to allow for chunk-based transfers and to adopt the right caching strategies to propagate chunks across the different delivery nodes in a time-sensitive fashion.
  • Last Mile: The end user’s network conditions also influence overall latency i.e. if the user is on a wired or WiFi or cellular connection. It also depends on how close the user is to the CDN edge.
  • Playback: As discussed earlier, the player needs to optimize start behavior and balance buffering and playback rate to enable quick download and rendering to always be as close as possible to live time.

These steps are shown below in the end-to-end video flow diagram.

Figure: Chunked encoding flow

With chunked segments, from our testing, we’ve seen end-to-end latency as low as 1.8 seconds. However, customers need to consider their entire workflow setup to ensure latency is optimized along the full chain and achieve the lowest latency possible with their specific workflow and network.

In conclusion …

As viewers migrate from a large-screen, by-appointment TV experience to a time-shifted, place-shifted, multi-device online streaming experience, content producers and rights holders have responded by making more premium content available online, along with brand-new classes of media experiences involving interactivity and an emphasis on low-latency delivery and playback.
The Bitmovin low latency solution was shown here to consist of the Bitmovin Player and Bitmovin Analytics products working together to balance the needs of low latency live streaming on multi-devices while providing the level of insights needed to proactively determine the viewers’ quality of experience, and to take action in case undesired consequences appear as a result of low latency streaming.
