Embracing the Unpredictability of Live Streaming

CMMA Blog

In blackjack, even a player using “basic strategy” to maximize the odds of each hand is up against the house edge – and while that edge seems small, it becomes significant when enough players play for an extended duration. A player may judge their success – or lack thereof – from moment to moment based on an (un)lucky shoe or dealer, but even a disadvantage of less than half a percent matters.

When we talk about scale, zeros matter. And when we think about live streaming at scale, we need to reframe our metrics. But first, we need to understand the consumption characteristics of live and which factors make it unpredictable in nature and unpredictable in practice.

As an industry, we’ve had many years of experience with video on demand (VOD), delivering billions of hours of premium content to over a billion viewers across a multitude of devices. At the NAB Streaming Summit, one of the panel topics focused on best practices for managing and operating live events: what can companies leverage from delivering VOD, and conversely, what do we need to rethink?

If we were to think about video consumption as cars, VOD behavior is a Toyota Camry. It has decent speed, acceleration, turning, and braking – it’s predictable. For one customer that provides a premium SVOD service for millions of subscribers across dozens of devices, we created a viewer consumption model. Using historical consumption and factoring in recent trend patterns and expectations related to new content drops, marketing, and other factors that could affect usage, we could predict video traffic with a reasonable degree of confidence. This approach was helpful to model costs and infrastructure provisioning when analyzing from day-to-day, month-to-month, and year-over-year.

We could theoretically take this approach and apply it to live. However, we’re no longer dealing with a Toyota Camry.

Live is a Bugatti Chiron. Why? Live is fundamentally a transient, event- and moment-centric experience. A linear simulcast of pre-recorded content – e.g., primetime entertainment dramas and comedies – has more in common with traditional VOD, but truly live content – e.g., sports, news – can turn on a dime.

With live streaming, concurrent viewers can grow or shrink by thousands – or tens of thousands – in a matter of minutes. For example:

  • Viewers start watching an Olympic event, but within a minute, the athlete falls and is no longer in contention for a medal. Consequently, viewers are no longer interested in watching the remainder of the event and – with a click, swipe, or a whispered “Alexa…” – turn their digital eyeballs elsewhere.

  • On the other hand, if you’re a fan of the Pats or Crimson Tide, you may have given up on them during the previous Super Bowl or CFP title game at halftime. But what happens when you get texts from friends that there’s a hint of a comeback?

  • Or, if you’re a baseball fan and it’s 10-0 in the sixth, you’re probably not interested in the game if you’re not a Giants or ‘Stros fan or if there aren’t any players from your fantasy lineup. But when it’s the ninth inning and news breaks that Matt Cain has a perfect game on the line, the score is meaningless; this is about history.

Given the demand variance of live streaming consumption, there are considerable limitations when relying on legacy data centers and independently provisioned hardware. From both cost and operational perspectives, companies need to utilize a scalable service that can react to the ebb and flow of live streaming. Cloud infrastructure – whether public or private – becomes a requirement.

The fundamental value proposition of public cloud infrastructure has always been cost-effective scaling of resources against demand. But with live, if you correlate infrastructure allocation directly with demand, then by the time you react to a spike in audience – and the consequent demand for resources – spinning up more instances may take minutes, which is too late. And audiences can shrink just as quickly as they grow, leaving provisioned resources underutilized.

There is as much art as science in deciding when to scale – or pre-scale – resources and when to terminate them. From an architecture and cost perspective, allocation should take into account resource types (e.g., in the context of AWS, on-demand vs. reserved vs. spot), regional resource capabilities (e.g., public cloud regions, capacity of a single CDN or multiple CDNs, etc.), and most importantly, the content itself. There are many ways to model resource allocation and address the bias-variance tradeoff, but we should not ignore the value of having eyes on glass: someone watching the event and informing the resource allocation model as to the likely shape of the demand traffic. No one could predict Spieth’s 2016 Masters meltdown, but the combination of informed data – via machine learning or similar analysis – and eyes on glass – with a sprinkle of human intuition – is a reasonable compromise.
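To make the pre-scaling idea concrete, here is a minimal, hypothetical sketch: a capacity target computed from a demand forecast, with headroom and an “eyes on glass” operator boost layered on top. The function name, the viewers-per-instance figure, and the headroom values are all illustrative assumptions, not a production algorithm.

```python
# Hypothetical pre-scaling sketch. Because spinning up instances can take
# minutes, capacity must be warm *before* the spike, so the target is
# computed from predicted demand plus headroom, not current demand.

def target_instances(predicted_viewers: int,
                     viewers_per_instance: int = 50_000,
                     headroom: float = 0.3,
                     operator_boost: float = 0.0) -> int:
    """Instances to have warm ahead of an event.

    headroom       -- standing buffer over the model's prediction
    operator_boost -- extra fraction added by a human watching the event
    """
    demand = predicted_viewers * (1 + headroom + operator_boost)
    # Round up: partial instances don't exist.
    return -(-int(demand) // viewers_per_instance)

# The model predicts 900k concurrents; an operator senses a comeback
# brewing and adds 50% on top of the standing 30% headroom.
baseline = target_instances(900_000)                      # 24 instances
boosted = target_instances(900_000, operator_boost=0.5)   # 33 instances
```

The useful part of the design is that the human signal enters as a plain parameter, so the “sprinkle of human intuition” is auditable alongside the model’s own prediction.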

None of us has a crystal ball, but what we need to do is look at the holistic end-to-end workflow for live and understand the balance between user consumption behavior and the effort and cost to deliver a solution that scales accordingly.

In our next post, we will discuss how Quality of Service (QoS) plays into the need for proper instrumentation and telemetry to understand the overall health of the workflow and how to plan for when things don’t go according to plan.

To view our Partner blog, click here

Live at Scale: Zeros Matter

Several years ago, success for a live event was measured as ten thousand or one hundred thousand concurrent viewers. A million plus – the scale of a Super Bowl – was relegated to major broadcasters and once-a-year events. A couple of weeks ago, Hotstar supported over 10 million concurrent viewers during the Vivo IPL final, eclipsing the eight million concurrent viewers that watched Felix Baumgartner fall from the stratosphere.

Companies are allocating massive budgets to create or license the most compelling content for their audiences, with the aim of making that content accessible on platforms convenient to users, under reasonable commercial terms. With all that effort and money going into content acquisition, and with the growing appetite for live, the desire to reach seven figures is commonplace, regardless of the actual audience.

Today, the ability to deliver to one million concurrent viewers is neither a business nor a technical impracticality; it’s an expectation. And with scale at this level, zeros matter.

Average. Median. 95th percentile. You can throw those numbers out the window. The edge case is now the priority, as the impact on viewers and the loss of revenue become material.

Achieving Redundancy Redundancy

As Alfred Borden demonstrated in The Prestige, sometimes two is better than one.

With the desire to deliver a great experience to every viewer, redundancy is a common topic. The goal for any company is to ensure that the live viewing experience is on par with – or better than – broadcast. It’s a high bar, but it’s achievable. As we know, though, the internet is inherently built on the assumption of inefficiency and failure: bits get dropped, misdirected, and corrupted. So how do we think about redundancy in a world that is inherently prone to error? There’s no easy black-and-white answer; redundancy is a spectrum of gray.

To truly be redundant in a cloud-based architecture, you need to think holistically from glass to glass, from capture to consumption. This means:

  • Multiple cameras
  • Multiple “first mile” routes to upload content
  • Multiple encoding systems
  • If you’re monetizing with ads: multiple ad providers
  • If you’re monetizing via transaction: multiple commerce gateways
  • Multiple content origins
  • Multiple CDN providers
  • Manifest-level failover (i.e., alternate media in HLS, secondary BaseURL in DASH) supported by compatible players
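The manifest-level failover item above can be illustrated with HLS’s redundant variant streams: a backup variant with the same attributes is listed immediately after the primary, on a different CDN, and a compliant player fails over to it if the primary becomes unreachable. A minimal sketch, with hypothetical hostnames:

```text
#EXTM3U
# Primary 1080p variant on CDN A, followed by a backup with identical
# BANDWIDTH/RESOLUTION on CDN B. Players that support redundant streams
# switch to the second entry if the first cannot be loaded.
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
https://cdn-a.example.com/event/1080p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=5000000,RESOLUTION=1920x1080
https://cdn-b.example.com/event/1080p.m3u8
```

The DASH equivalent is listing multiple BaseURL elements in the MPD; either way, the failover logic only helps if the client player actually implements it, which is why the list above calls out compatible players.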

Not every company has the budget or the desire to be truly 100% redundant (content, systems, and people). While companies may want to eliminate risk, they need to make decisions based on economic, operational, and practical considerations. Is 100% redundancy even achievable? Well, for those of us who watched the power go out during Super Bowl XLVII, it was a reminder that we can’t control all the upstream dependencies.

Redefining Quality of Service in the Context of Live

Since redundancy is an issue of degree – instead of certainty – this changes how we should look at live. In the past, the common view was to think of live delivery from the nomenclature of a conventional NOC – network operations center – watching the bits go by, measuring latency, routing, and packets. As live streaming inherently becomes more complex, we need to expand the scope of what we measure and manage. We’re now in a world where we need a BOC – business operations center – to provide actionable intelligence across all the components of the live workflow, including the dependencies on third parties, from ad providers to CDN providers to client-side player behavior. And then there are social networks: it’s no surprise that the loudest voice is often the one complaining about an issue. Many of our customers hear about issues from social media more quickly than from their own programmatic notifications.

As a result, measuring quality of service is paramount. And the metrics for quality of service change in both depth and breadth. Even in today’s terms, quality of service is typically defined from the perspective of the client experience – i.e., the video player – the impact of delivery from origin over the last mile, and whether those factors result in rebuffering. With many vendors attempting to address this narrow definition of quality of service, the actionable result is often simply to “switch CDNs.”

Quality of service for live has a much greater scope than predicting “last mile” performance:

  • Monitoring the contribution feed and “first-mile” delivery to measure latency or dropped frames that could affect the quality of the content or create drift in timing.
  • Monitoring the transcoding process to ensure content is encoded with predictable and consistent throughput, since compute load varies with input and output settings: bitrate of the input stream, SD vs. HD renditions, video or audio processing (e.g., watermarks, audio channel mapping, caption and subtitle ingestion and/or transformation), 30 vs. 60 fps (or normalization to a specific frame rate), codec (e.g., H.264 vs. HEVC), and content protection (e.g., encryption or DRM).
  • Monitoring origin throughput: Content writes and CDN reads.
  • Due to the vast amount of data, one often-overlooked area is monitoring subnet and POP performance of CDN delivery for both hot and cold hits. It’s not uncommon for an errant group of edge servers – or an entire region – to exhibit degraded performance, but this is often only apparent when you inspect delivery through the lens of the 99th/99.5th/99.9th percentiles.
  • Measuring client-side performance remains important to understand if unexpected buffering or errors occur due to device limitations, CDN degradation, delayed manifest refreshes, or failures in DRM license acquisition.
  • If the content is based on any type of authentication or authorization workflow – e.g., TV Everywhere – those third parties should be monitored, especially for appointment-viewing experiences when there could be an influx of viewers. This applies to transactional use cases as well – e.g., PPV sports – where access to content may require handling a large volume of payment processing. And with ad supported models utilizing server-side ad insertion, both ad servers and the third parties receiving ad impressions should be closely monitored.
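The per-POP percentile monitoring described in the list above can be sketched in a few lines: look at tail latency rather than averages, since a POP can have a perfectly healthy mean while a slice of its requests degrades badly. The POP names and the 200 ms threshold are illustrative assumptions.

```python
# Illustrative tail-latency check for CDN POPs. Averages hide regional
# degradation; p99-style percentiles surface it.
from statistics import quantiles

def p99(samples):
    # quantiles(..., n=100) returns 99 cut points; the last one is
    # the 99th percentile.
    return quantiles(samples, n=100)[-1]

def degraded_pops(latency_by_pop, threshold_ms=200.0):
    """Return POPs whose p99 response time exceeds the threshold,
    even if their averages look healthy."""
    return sorted(pop for pop, samples in latency_by_pop.items()
                  if p99(samples) > threshold_ms)

# A POP can have a fine mean but a terrible tail: here 10% of one
# POP's requests take 900 ms while its median stays at 20 ms.
healthy = [20.0] * 99 + [30.0]
tail_heavy = [20.0] * 90 + [900.0] * 10
flagged = degraded_pops({"iad": healthy, "fra": tail_heavy})  # ["fra"]
```

The same shape of check applies to hot vs. cold cache hits: keep the two populations separate, because a healthy hot-hit tail can mask a collapsing cold-hit one.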

Just as CDNs have varying capacity and performance from region to region, many of the dependent third parties utilize public cloud infrastructure, which shares similar regional limitations. Only when measurement takes all internal components and external dependencies into account can we be confident that actionable data is available to deliver a better-than-broadcast experience for every viewer.

ORI TV Increases Subscribers by 10X with OTT Flow

How did Mongolia’s first OTT service get to market in a few short months and increase its subscribers tenfold in less than six months? Batka Gankhuyag, vice president of Mongol TV, the leading broadcasting company and content creator in Mongolia, shared the story of how he successfully created ORI TV on stage at PLAY.

Watch Batka explain firsthand how he and his team were able to get to market quickly and deliver a comprehensive, engaging video experience that streams shows like “The Voice,” “Shark Tank,” and “Mongolia’s Got Talent” to a population of three million people in Mongolia and hundreds of thousands of Mongolian expats around the world. Here are a few highlights from his presentation:

  • ORI TV was conceived in large part in response to the rise of mobile as the most important entertainment platform in the region, and the need to deliver content to audiences wherever and whenever they want it.
  • Batka and team were delighted with how “painless” it was to go live with ORI TV in just a few months thanks to Brightcove’s turnkey solution, OTT Flow.
  • A content producer at heart, Batka is thrilled that he doesn’t have to get into the technical weeds. In fact, ORI TV does not have any technical staff in the entire organization, and yet, the company was able to successfully launch premium apps across Web, iOS, Android, AppleTV, and Chromecast, leveraging Brightcove’s technical expertise.  

Later in the day, Batka provided deeper insights in the breakout session, “How We Went OTT.” Here he shared ORI TV’s remarkable audience growth – a tenfold increase in subscribers since the service launched in January. From his standpoint, content continues to be king – particularly if it’s niche or original – for attracting and retaining viewers. He also reiterated that with turnkey solutions like OTT Flow, content owners don’t need to invest in highly skilled IT staff to launch an OTT service and can get to market more quickly than ever, which is critical in today’s highly competitive OTT landscape.

PLAY 2018 – Momentum in Media

PLAY 2018’s keynote session starred Brightcove customers from around the world who are opening new markets with OTT offerings and live streaming news and sports at increasingly ambitious scale. Hear their stories, as well as announcements of the products in the pipeline to help them – and everyone else – drive even better business results with video in the months ahead.

Here’s the status of a few of the innovations covered in the keynote if you’d like to learn more:

5 Common Ad Errors & How to Fix Them

One of the most common concerns I hear from publishers is about the mysterious ad error codes that they sometimes receive when trying to serve a video ad. Ad error codes show up as anonymous numbers when ad playback is interrupted, and contain little to no information on causes or possible resolutions.

With some research and experimentation, as well as some help from our friends at WatchingThat, below is a list of the five most frequent video ad errors. While there is rarely a single foolproof solution for each error, I have included advice on possible resolutions for each error code. Keep in mind that not all ad errors are created equal, and not all are necessarily fatal to the ad serving process.

Error 303 – No Ads VAST Response After One or More Wrappers

This error is perhaps the one publishers experience most frequently. Error 303 indicates that no valid ad was returned after the player and SDK followed the wrappers in the ad response. This error is somewhat “expected”; it simply indicates that the third party is not able to fill the given ad request. The number of 303 errors should roughly match the percentage of your inventory that your ad provider does not expect to fill. In other words, if a third party’s expected fill is 70%, 30% of your requests may return this error. The best way to remedy this situation, outside of speaking to your ad providers about ways to increase demand, is to create a fallback/passback in DFP. This ensures that, in the worst case, you at least have a house ad to fall back to when there is no demand.
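The “303 rate should roughly track your provider’s no-fill rate” rule of thumb above is easy to automate. A small sketch, with an assumed 70% expected fill and an illustrative 5-point tolerance:

```python
# Sanity check: Error 303 ("no ads") responses are expected roughly in
# proportion to the inventory your provider does not fill. A rate far
# above that is worth investigating; a rate near it is business as usual.

def no_fill_rate(requests: int, error_303_count: int) -> float:
    return error_303_count / requests

def within_expectation(requests: int, error_303_count: int,
                       expected_fill: float = 0.70,
                       tolerance: float = 0.05) -> bool:
    """True if the observed 303 rate is close to (1 - expected fill)."""
    expected_no_fill = 1.0 - expected_fill
    observed = no_fill_rate(requests, error_303_count)
    return abs(observed - expected_no_fill) <= tolerance

# 70% expected fill -> roughly 30% of requests may legitimately 303.
ok = within_expectation(10_000, 3_100)        # 31% no-fill: expected
alarming = within_expectation(10_000, 5_500)  # 55% no-fill: investigate
```

When the check fails, the likely next steps are the ones named above: talk to the provider about demand, and confirm the DFP fallback/passback is actually firing.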

Error 301 – Timeout of VAST URI

Error 301 indicates that a URI within the VAST or VPAID creative timed out. This could be due to errors on requests, such as an invalid or unreachable URI, or security exceptions related to HTTP/HTTPS. The most common cause of Error 301 is a poor network connection or heavy latency.

The first step is to check the URI within the VAST/VPAID creative to make sure it is valid and reachable. This can easily be done by copying and pasting it into the browser address bar and following the redirect; a broken URI will trigger Error 301. Another factor is whether the ad creative uses a mix of HTTP/HTTPS URLs, or attempts to serve an insecure HTTP creative on a secure HTTPS webpage. The SSL protocols must match.
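The mixed-content case above can be caught before the creative ever ships: on an HTTPS page, any http:// creative URL will be blocked by the browser. A minimal sketch of that check, with hypothetical URLs:

```python
# Flag creative URLs whose scheme would break on a secure page.
# Browsers block plain-http subresources on https pages (mixed content),
# which surfaces in ad serving as timeouts and security exceptions.
from urllib.parse import urlparse

def insecure_creatives(page_url: str, creative_urls: list) -> list:
    """Return the creative URLs that a secure page cannot load."""
    if urlparse(page_url).scheme != "https":
        return []  # a plain-http page can load either scheme
    return [u for u in creative_urls if urlparse(u).scheme == "http"]

bad = insecure_creatives(
    "https://publisher.example.com/article",
    ["https://ads.example.net/creative.mp4",
     "http://ads.example.net/tracker.js"],
)
```

Running this over the full unwrapped VAST document (media files, trackers, wrapper URIs) catches the single stray http:// tracker that would otherwise only show up as a production 301.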

Finally, determine on which platform this error occurs most frequently. If the issue is largely on mobile, make sure your pages are well optimized for mobile traffic: reduce image sizes, lazy-load assets, cache aggressively, and make sure the ad integration is not blocked from rendering by other page elements.

Error 302 – Wrapper Limit Reached

This error is caused by daisy chaining, or overly complex advertising logic. The IAB recommends that no more than five wrapper redirects be used in an ad response; the IMA3 SDK has a limit of four by default. This limit can be raised manually if Error 302 is detected; however, as a first step, we recommend speaking to your ad providers about enforcing a lower number of wrappers. A higher number of redirects adds latency and makes for a poor user experience.
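To make the wrapper-limit mechanics concrete, here is a hypothetical sketch of what the SDK does internally: follow wrapper redirects until an inline ad is found or the hop limit is exceeded. The fetch callback and tag names are stand-ins; real SDKs parse VAST XML and follow the VASTAdTagURI element.

```python
# Sketch of wrapper-redirect resolution with a hop limit (IMA3 defaults
# to 4, per the text above). `fetch` is a stand-in for an HTTP request
# that parses the VAST response into ("wrapper", next_url) or
# ("inline", final_url).

def resolve_vast(tag_url, fetch, max_wrappers=4):
    """Follow wrapper redirects; raise on Error 302 (limit reached)."""
    url, hops = tag_url, 0
    while True:
        kind, target = fetch(url)
        if kind == "inline":
            return target          # a playable ad was found
        hops += 1
        if hops > max_wrappers:
            raise RuntimeError("Error 302: wrapper limit reached")
        url = target               # follow the daisy chain

# A one-hop chain resolves cleanly:
chain = {"a": ("wrapper", "b"), "b": ("inline", "b")}
final = resolve_vast("a", chain.get)
```

Seen this way, every extra wrapper is another full network round trip before any video can play, which is why enforcing a low wrapper count beats raising the limit.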

Error 402 – Timeout of Media File URI

Error 402 is caused by the ad creative taking longer to load than your current timeout setting and is directly related to the type or size of the media file used within the VAST or VPAID ad unit. The error implies that the file is too large, the bitrate is too high, or the file is incompatible with the platform on which it is attempting to render.

Steps you can take to reduce these errors include increasing the request timeout, optimizing your website to increase load speed, and most importantly, setting bitrate and size restrictions on creatives serving on mobile.

Error 901 – General VPAID Error

This famous (infamous?) error is every publisher’s favorite. It is notoriously vague and can indicate multiple problems related to VPAID. Because there are so many possible causes, it can be difficult to pinpoint the exact steps to address them. Issues may include Flash creatives trying to serve in an HTML5 environment, VPAID opt-outs, and viewability-solution wrappers that do not deliver an MP4 or other linear media file.

When you encounter Error 901, the first thing to check is whether you are attempting to use the VPAID adaptor instead of Brightcove’s IMA3 SDK plugin. When generating the tag in DFP, you can either elect to use this adaptor – which essentially creates the IMA SDK layer for the purpose of displaying the VPAID – or rely on the Brightcove player. Since in the majority of cases you will already be using our IMA3 SDK, you will not need the adaptor option. Trying to play the adaptor tag in the IMA3 plugin leads to duplication, resulting in Error 901.

This is not the only possible solution. Other things to check include making sure the VPAID creative is not Flash, inquiring with the ad provider about opt-outs, and verifying that the VPAID tag works in a test environment (such as the Google VAST tester).

————

While this article does not cover every ad error and resolution you may encounter, it should give you a leg up on the most frequent problems publishers experience. The most important thing to remember is that you cannot resolve a problem you aren’t able to identify in the first place – and that is where tools such as the one provided by Brightcove’s new partner, WatchingThat, can help.