Hive.js: Browser-Based Distributed Caching for Adaptive Video Streaming


Peer-to-Peer Technology Becomes the Natural Complement to Standard CDN Video Streaming

Authors: Roberto Roverso and Mikael Hoegqvist

INTRODUCTION
As of December 2013, video streaming constitutes 29% of Internet traffic worldwide, and it is expected to quadruple in volume over the next three years. Given these growth figures, it is unclear whether current Content Delivery Network (CDN) technology will be able to efficiently support such a large amount of traffic in a cost-effective manner.

Peer-to-peer (P2P) technology has been proposed as a complement to standard CDN infrastructure to greatly reduce distribution costs and improve quality of user experience for viewers. In an industrial setting, we already find instances of such P2P platforms in Akamai's NetSession and Adobe's RTMFP, which is part of the Adobe Flash plugin.

However, up until now, adoption of P2P technology for content distribution has been hindered by the fact that available platforms require the installation of client-side software, as in NetSession, or a browser plugin, as in RTMFP.

Fortunately, with the advent of the WebRTC standard, this limitation is bound to disappear. WebRTC defines a set of APIs, to be included in all browsers, designed to facilitate the development of browser-based real-time P2P applications. WebRTC therefore completely removes the need for third-party plugins or client-side software and allows frictionless deployment of new P2P-based services as Javascript applications. To achieve that goal, functionality necessary for real-time applications, such as NAT traversal and real-time transport protocols, has been built directly into the browser. Currently, stable WebRTC implementations exist in three major browsers: Chrome, Firefox and Opera.

In this paper, we evaluate the use of WebRTC for video distribution (update: see the Hive WebRTC video distribution solution). We do so by developing a browser-based distributed caching application, called Hive.js, that leverages a P2P overlay network created with WebRTC to significantly reduce the load on traditional CDNs. We implement our solution considering the realities of the streaming market by adopting adaptive HTTP streaming and, specifically, MPEG-DASH as the streaming protocol of choice. DASH is already widely embraced by providers such as YouTube and is expected to become the de-facto standard for the distribution of video content on the Internet.

Hive.js is the plugin-less version of the commercial P2P streaming application Hive Streaming. Initial results obtained in a controlled yet realistic environment show that Hive.js, albeit still in a prototype phase, is able to significantly offload a traditional CDN while maintaining a good level of quality of user experience. The current version of Hive.js is open source and can be found here.


BACKGROUND AND RELATED WORK
The WebRTC standard mandates that the multimedia and peer-to-peer connectivity stacks for real-time services, once provided by external plugins, be built directly into the browser. The functionality of those stacks, such as video acquisition or noise cancelling, is made available to developers through HTML5 APIs. The most prominent functionality for our use case is NAT traversal, provided through the inclusion of libjingle, which implements the ICE, STUN and TURN protocols. As a consequence, when building a WebRTC application, all complex NAT traversal operations are entrusted to the browser itself and hidden from the developer. However, as a design choice, WebRTC does not provide a signaling protocol for discovering and managing nodes in the peer-to-peer network, but rather assumes that implementers either make use of preexisting protocols (SIP, Jingle, etc.) or provide their own. Another important feature recently added to the standard is the DataChannel abstraction, which allows the transfer of arbitrary data; earlier, WebRTC communication was limited to video conferencing or VoIP traffic. DataChannels enable the implementation of generic P2P applications on top of WebRTC, such as peer-to-peer streaming. The DataChannel APIs make use of the SCTP protocol, a message-oriented protocol that provides configurable transfer reliability and TCP friendliness.
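
To make this concrete, the following minimal sketch shows how a DataChannel's reliability is selected per channel. The channel labels, the STUN server address and the reliability settings are illustrative assumptions, not Hive.js internals:

    // Create a peer connection; NAT traversal (ICE/STUN) is handled by the browser.
    const pc = new RTCPeerConnection({ iceServers: [{ urls: 'stun:stun.example.org' }] });

    // Reliable, ordered delivery (TCP-like semantics) -- the default configuration.
    const reliable = pc.createDataChannel('hive-fragments', { ordered: true });

    // Partially reliable: unordered, give up after 3 retransmissions.
    const lossy = pc.createDataChannel('hive-control', { ordered: false, maxRetransmits: 3 });

    reliable.binaryType = 'arraybuffer';
    reliable.onmessage = (e) => console.log('received ' + e.data.byteLength + ' bytes');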

Besides WebRTC, our platform makes use of a Javascript-based player for DASH. DASH defines a protocol for the delivery of video and audio streams over the Internet using HTTP as transport. It also employs adaptive bitrate switching, which adapts the stream to the bandwidth and computational power available at the host in order to support different bandwidth scenarios and types of devices, such as phones, tablets and PCs. DASH is an effort to supersede current proprietary HTTP-based streaming protocols such as HLS, Smooth Streaming and HDS, which are all based on the concepts of HTTP streaming and adaptive bitrate switching but are not interoperable.

DASH is based on a pull model, where the player requests content chunks, or fragments, from the CDN using HTTP at the pace it deems suitable. The same content fragment exists in many versions, one for each bitrate, and the choice of which bitrate to retrieve and play is left to the player. The available bitrates, the duration of the fragments and other playback-specific information are provided in a metadata descriptor called the manifest, which is passed as input to the player upon the start of the playback. The DASH Industry Forum provides a reference implementation of a DASH player written in Javascript over HTML5 APIs. The player makes use of the W3C Media Source Extensions (MSE), which allow client-side scripts to parse manifests, request fragments and apply bitrate-switching logic while leaving the tasks of content decoding and rendering to the browser.
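
A minimal sketch of this MSE flow, assuming a browser with Media Source Extensions; the codec string and fragment URL are placeholders:

    const video = document.querySelector('video');
    const mediaSource = new MediaSource();
    video.src = URL.createObjectURL(mediaSource);

    mediaSource.addEventListener('sourceopen', () => {
      // The script, not the browser, decides which fragment (and bitrate) to fetch.
      const buffer = mediaSource.addSourceBuffer('video/mp4; codecs="avc1.4d401f"');
      fetch('video-2000kbps-seg1.m4s')
        .then((res) => res.arrayBuffer())
        .then((data) => buffer.appendBuffer(data)); // browser decodes and renders
    });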

In order to stream DASH-generated content over a peer-to-peer overlay created with WebRTC, we employ a Distributed Caching approach. Each fragment retrieved by a peer is stored and made available for later retrieval by its neighbours. In this way, when the player requests a fragment that is available at a neighbour, the peer retrieves it over P2P rather than from the source. This significantly decreases the load on the CDN infrastructure, and therefore the cost of distribution. Previous literature on the subject has shown that this method is particularly effective for distributing adaptive HTTP streaming content at large scale. The same work has also shown that classical P2P streaming approaches, such as “Inside the New Coolstreaming: Principles, Measurements and Performance Implications”, are not applicable to adaptive bitrate HTTP streaming.

Concerning browser-based P2P content distribution, we find very little related work, and the few existing systems address the more general problem of optimising the distribution of generic web content rather than video. In Maygh, the authors present a system that uses both WebRTC and the RTMFP framework for peer-to-peer operations, and show that Maygh can offload up to 75% of traffic from the CDN in a standard web-portal scenario. Two commercial offerings, PeerCDN and Swarmify, try to achieve the same goal using only WebRTC. Although relevant to content distribution, the aforementioned systems are not specifically designed for streaming, and it is unclear whether they would yield any particular advantage in that use case.

Regarding browser-based P2P streaming, to the best of our knowledge, the only related work is “P2P media streaming with HTML5 and WebRTC (Demo)”. The paper briefly outlines the architecture of a live streaming platform based on WebRTC that employs a structured overlay network (DHT) for locating video fragments; however, no implementation details or evaluation are provided. Consequently, to the best of our knowledge, we are the first to develop and evaluate a fully functional browser-based peer-to-peer platform that does not require any plugin and is specifically designed for video distribution.

ARCHITECTURE
The design of our system is based on a Distributed Caching model that works as follows: upon a content fragment request from the DASH player, the distributed cache tries to retrieve the requested data from other peers in the overlay in time. If a fragment cannot be retrieved from any other peer in time, the agent downloads it from the source of the stream, i.e. the CDN. By falling back to the source, we guarantee that all fragments are delivered to the player even if the overlay network cannot retrieve them, thereby guaranteeing the desired level of quality of user experience (QoE). We evaluate the QoE by measuring the buffer size reported by the DASH player. The rationale is that, if a player manages to maintain a sufficiently large buffer by retrieving data in time, either from P2P or the CDN, the playback is less likely to be disrupted by network issues.
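
As a player-independent sketch, a comparable buffer metric can be derived from the standard HTML5 video element (the reference player reports its own equivalent figure):

    // Seconds of media buffered ahead of the playhead; 0 means an imminent stall.
    function bufferAhead(video) {
      const t = video.currentTime;
      for (let i = 0; i < video.buffered.length; i++) {
        if (video.buffered.start(i) <= t && t <= video.buffered.end(i)) {
          return video.buffered.end(i) - t;
        }
      }
      return 0;
    }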

Regarding the P2P overlay, we create a random graph between all peers watching a stream. Each node selects a fixed number of neighbours and connects to them. Periodically, the peer randomly selects one of its neighbours and terminates the connection to it; it then connects to a new peer chosen at random from a list of potential neighbours provided by a central tracker. In Figure 1, we show the architecture of our application. We make use of the reference DASH player implementation combined with an implementation of our caching abstraction, the Hive.js layer in the figure. When the web page loads, the DASH player is instructed to load a stream. After that, the Discovery component initiates a connection to a well-known tracker address using the WebSocket Client APIs. Once a successful connection is established, the node asks for a list of peers to connect to, and the tracker returns a random subset of all peers watching the same stream.
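
The following sketch illustrates the discovery step and the periodic neighbour rotation. The tracker URL, message format and rotation period are assumptions; connectTo() is the connection routine sketched further below:

    const neighbours = [];
    const pickRandom = (list) => list[Math.floor(Math.random() * list.length)];

    const tracker = new WebSocket('wss://tracker.example.org/hive');
    tracker.onopen = () =>
      tracker.send(JSON.stringify({ type: 'join', streamId: 'demo-stream' }));
    tracker.onmessage = (msg) => {
      const { type, peers } = JSON.parse(msg.data);
      if (type === 'peer-list') peers.forEach(connectTo);
    };

    // Periodically drop a random neighbour and request a replacement,
    // keeping the overlay close to a random graph.
    setInterval(() => {
      if (neighbours.length === 0) return;
      const victim = pickRandom(neighbours);
      victim.channel.close();
      neighbours.splice(neighbours.indexOf(victim), 1);
      tracker.send(JSON.stringify({ type: 'need-peers', count: 1 }));
    }, 30000);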

At this point, the peer attempts to establish a WebRTC DataChannel connection with the returned peers. The phases of the connection process are the following: i) the peer relays a connection request, together with platform-specific characteristics of its host, to a chosen peer through the tracker, ii) both peers execute the NAT traversal process using the ICE protocol to achieve direct connectivity between the hosts, iii) the peers create an encrypted channel between the two parties. If all phases are successful, the chosen peer becomes a neighbour; otherwise, if WebRTC fails to achieve direct connectivity between the two peers, the connection is dropped.
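
A sketch of the initiating side of this process; the message relaying through the tracker and its field names are assumptions, while ICE and the encrypted channel setup (phases ii and iii) are carried out by the browser once descriptions and candidates have been exchanged:

    function connectTo(peerId) {
      const pc = new RTCPeerConnection({ iceServers: [{ urls: 'stun:stun.example.org' }] });
      const channel = pc.createDataChannel('hive');

      // Phase i): relay the offer and ICE candidates through the tracker.
      pc.onicecandidate = (e) => {
        if (e.candidate) relay(peerId, { candidate: e.candidate });
      };
      pc.createOffer()
        .then((offer) => pc.setLocalDescription(offer))
        .then(() => relay(peerId, { sdp: pc.localDescription }));

      // The channel opens only after NAT traversal and encryption succeed.
      channel.onopen = () => neighbours.push({ peerId, pc, channel });
      return { peerId, pc, channel };
    }

    // Forward a signaling message to the target peer via the tracker WebSocket.
    function relay(to, payload) {
      tracker.send(JSON.stringify({ type: 'signal', to, payload }));
    }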

Once the connection is established, the Peer component stores information about the neighbour in a dedicated Channel instance. The Channel module interacts with the WebRTC DataChannel API to send data to and receive data from the corresponding peer. We choose a DataChannel configuration that provides reliable, message-oriented transfer of data over the SCTP protocol. However, the browser implementations of SCTP do not allow the transmission of arbitrarily long messages. Since video fragments can be fairly large, on the order of megabytes, we implement a transport protocol that transfers fragments in smaller chunks of fixed size. We append to each chunk a hash of its content calculated on the fly, which is used to check the integrity of the data once it reaches the destination. If the check is successful, an acknowledgement is sent to the sender, which in turn sends the next chunk of fragment data, if any. If no chunk is left, the transfer is terminated and the sender notified. The protocol is implemented in the P2P transport sub-module.
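
A sketch of the sender side of this transport; the chunk size, the message framing and the use of SHA-256 via WebCrypto are assumptions:

    const CHUNK_SIZE = 16 * 1024; // stay below browser SCTP message limits

    // Resolves when the receiver acknowledges the last chunk sent.
    function waitForAck(channel) {
      return new Promise((resolve) => {
        channel.addEventListener('message', function onMsg(e) {
          if (e.data === 'ack') {
            channel.removeEventListener('message', onMsg);
            resolve();
          }
        });
      });
    }

    async function sendFragment(channel, fragment /* ArrayBuffer */) {
      for (let offset = 0; offset < fragment.byteLength; offset += CHUNK_SIZE) {
        const chunk = fragment.slice(offset, offset + CHUNK_SIZE);
        const hash = await crypto.subtle.digest('SHA-256', chunk);
        channel.send(hash);        // integrity digest for this chunk
        channel.send(chunk);       // the chunk itself
        await waitForAck(channel); // receiver verified the hash
      }
      channel.send('done');        // no chunks left: transfer complete
    }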

The transfer of a fragment is triggered by a request from the DASH player. The player is designed in a modular fashion, so each module can be easily replaced with a new implementation. In order to provide our distributed caching mechanism, we provide an alternative implementation of the Fragment Loader module. We instrument the module to forward fragment requests to the Peer component if the fragment is present at the peer's neighbours. If the fragment is not available, the content is requested from the source of the stream with an HTTP GET request through the XHR interface provided by the browser, as in the standard implementation of the Fragment Loader.
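
A sketch of the replaced loader logic; the Peer component's API (peer.index, peer.fetchFromOverlay) is an assumption for illustration:

    // Try the P2P overlay first; on a miss, timeout or interrupted transfer,
    // fall back to the CDN exactly as the stock Fragment Loader would.
    function loadFragment(request, onSuccess, onError) {
      if (peer.index.has(request.url)) {
        peer.fetchFromOverlay(request.url)
          .then(onSuccess)
          .catch(() => loadFromSource(request, onSuccess, onError));
      } else {
        loadFromSource(request, onSuccess, onError);
      }
    }

    function loadFromSource(request, onSuccess, onError) {
      const xhr = new XMLHttpRequest();
      xhr.open('GET', request.url);
      xhr.responseType = 'arraybuffer';
      xhr.onload = () => onSuccess(xhr.response);
      xhr.onerror = onError;
      xhr.send();
    }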

Upon successful retrieval of a fragment, either from the source or the overlay, a peer stores the fragment data in memory using the Cache module and then sends a Have notification to all its neighbours. On the receiving end, a peer records in the Index component which fragment is available at which neighbour. For the actual retrieval of a fragment from the overlay, we follow a pull-based approach. Once the player issues a fragment request, the peer checks whether the fragment is present in the Index component. If so, the Peer module chooses a neighbour uniformly at random among all holders and sends it a fragment request. If the fragment is retrieved successfully, the Peer component delivers the data to the DASH player. Otherwise, if the transfer was interrupted or timed out, the fragment is retrieved from the source of the content. By falling back to the source, we guarantee that all fragments are delivered to the player even if the overlay network cannot retrieve them, thereby preventing interruptions to the playback and loss of QoE.
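
A sketch of the Have/Index bookkeeping and the uniform neighbour choice; the data structures and message shapes are assumptions:

    const cache = new Map(); // fragment URL -> ArrayBuffer (the Cache module)
    const index = new Map(); // fragment URL -> neighbours known to have it (the Index component)

    // Called after a fragment lands in the cache: advertise it to all neighbours.
    function onFragmentStored(url, data) {
      cache.set(url, data);
      neighbours.forEach((n) =>
        n.channel.send(JSON.stringify({ type: 'have', url })));
    }

    // Called when a Have notification arrives from a neighbour.
    function onHave(neighbour, url) {
      if (!index.has(url)) index.set(url, []);
      index.get(url).push(neighbour);
    }

    // Pick a holder uniformly at random; null means "fall back to the CDN".
    function chooseSender(url) {
      const holders = index.get(url) || [];
      return holders.length
        ? holders[Math.floor(Math.random() * holders.length)]
        : null;
    }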

EVALUATION
In this evaluation, we validate the correct functioning of our platform and test its limits. In our experiments, each node executes a headless instance of Google Chrome 33.0.1750.149 on Linux using Xvfb as output interface. Each instance runs on a dedicated Microsoft Azure dual-core Virtual Machine with 3GB of memory and 200Mbps of available bandwidth capacity. When a node joins, Chrome is instructed to load a static test page that consists of the DASH player and Hive.js, as explained previously. We use version 1.1.2 of the DASH reference player, the latest version at the time of writing. The player watches a production video stream provided by Envivio in a single-bitrate configuration of 2Mbps for video, with 4-second segments, and 56kbps for audio. As in all adaptive HTTP streaming players, a buffer is kept by the player to compensate for temporary disruptions in the delivery of the stream caused by, for instance, network congestion or temporary loss of connectivity. For the chosen stream, the DASH player keeps a buffer of up to 10 seconds.

Our testbed also includes a central Tracker, implemented in Python, which serves both as a discovery facility that peers periodically query to obtain a list of potential neighbours, and as a broker for WebRTC connection setup. In our experiments, we observe two main metrics: the player's average buffer size, which is an indicator of QoE as described in the Architecture section, and the amount of savings towards the source of the stream.

The savings metric measures the performance of the system as a whole and indicates how much Hive.js managed to offload the CDN. Savings are calculated as the total amount of data retrieved from P2P during an experiment over the total amount of transferred data.
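
In other words, if bytes_P2P is the total amount of data retrieved from the overlay and bytes_SRC the total retrieved from the CDN:

    savings = bytes_P2P / (bytes_P2P + bytes_SRC)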

In our experiments, we use two different scenarios. The first one, which we call seeder-only, has the purpose of showing the maximum upload contribution that a single peer can provide to the overlay. It is constructed in the following way: a peer (the seeder) plays the entire video. After it finishes, the other peers (leechers) join uniformly at random within a 60-second window and start watching. Upon start, leechers receive notifications (Haves) from the seeder indicating which fragments the seeder has downloaded. However, in this experiment, we instruct leechers not to issue Haves themselves. As a consequence, there is no inter-leecher P2P traffic.

In the second scenario, which we call the swarm scenario, we aim at analyzing the performance of the platform as a whole. For that, we remove the seeder and let all peers behave normally, thus allowing peer-to-peer communication between leechers. Again, the peers join uniformly at random within a 60-second window. In both scenarios, peers watch a video lasting 4.2 minutes. Each experiment is repeated 3 times and we show the maximum, minimum and average.

a) Seeder-only: Figure 2 a) shows the amount of data (bytes) transferred by all peers in the seeder-only scenario from the seeder (P2P) and the source (SRC) with an increasing number of nodes. The amount of data transferred by a single peer is fixed: the size of the video (60MB). We observe that, for more than 20 peers, the leechers cannot retrieve data from the seeder fast enough; they start to compensate for the poor throughput of the seeder by downloading from the source. We also confirm that the total amount of transferred data grows linearly with the number of viewers. Figure 2 b) shows the same trend, where the savings start to drop after 20 leechers.

In Figure 2 c), we show the average buffer size across all peers with an increasing number of peers. There we observe that the average buffer size decreases from 10 seconds, the amount desired by the player, to 5 seconds. The smaller buffer may leave the leechers more exposed to disruptions in the playback, as they might not be able to absorb temporary congestion in the network. From this, we derive that a single seeder cannot serve more than 10 peers without impacting the buffer size at the player.

b) Swarm scenario: In comparison to the seeder-only scenario, in the swarm scenario the average buffer size remains stable over time, as depicted in Figure 2 c). This is because, when peers can retrieve data from each other, the load is balanced over all nodes, effectively preventing a single seeder from becoming overloaded. As peers join, they also contribute to the overall capacity of the system. This effect is also observed in the savings, which increase logarithmically with the number of viewers (Figure 2 b)). Finally, from Figure 2 c) we can see that in the swarm scenario the average buffer size remains close to 10 seconds, the amount of buffer desired by the player.

CONCLUSION
We have presented Hive.js, a browser-based, plugin-less video distribution solution layered over the WebRTC peer-to-peer APIs. Our platform utilizes a distributed caching approach designed to transport adaptive HTTP streaming content, specifically MPEG-DASH. Initial results obtained by evaluating Hive.js in a controlled environment show that our solution is viable and does not sacrifice quality of user experience while producing significant savings towards the source of the stream, that is, the CDN.

As future work, we plan to continue investigating this approach in more realistic scenarios with larger numbers of nodes and heterogeneous network environments.

ACKNOWLEDGEMENTS

This work was partially supported by the European Union, specifically by the FP7 EU project iSocial (FP7-ITN-316808).