Measurement-based characterisation of application and user behaviour

This deliverable presents our work on the characterisation of the data collected for the two main applications envisioned for NANODATACENTERS: Video-on-Demand applications and multiplayer games.

Video on Demand

Two aspects greatly impact the NANODATACENTERS architecture with respect to Video-on-Demand: (i) the distribution of content popularity, and (ii) its dynamics over time. The extent to which popularity distributions are skewed, that is, dominated by a few highly popular items, directly influences the ability to cache content, and hence the memory requirements for efficient video delivery. Moreover, the dynamics of popularity determine whether the system has enough time to adjust to changes in popularity. A further aspect is whether users fall into markedly different profiles. Identifying each user’s profile and managing their dedicated NANODATACENTERS storage accordingly can improve the performance of the NANODATACENTERS approach. So far, three types of content have been identified for the applications envisioned in the NANODATACENTERS gateway: User Generated Content (UGC), Catch-up TV and traditional pay-per-view Video-on-Demand. Our findings for each of these three types of content, and their implications for NANODATACENTERS, are addressed below.

User Generated Content

UGC websites such as YouTube are re-shaping the way that people generate and watch video and TV, with millions of video producers and consumers. In particular, UGC sites are creating new viewing patterns and social interactions, empowering users to be more creative, and developing new business opportunities.


To better understand the impact of UGC systems, YouTube (the world’s largest UGC Video-on-Demand system) was analysed. Based on a large amount of collected data, an in-depth study [1] of YouTube and other similar UGC systems was performed, focusing on popularity distributions and their dynamics. Two properties that directly impact the NANODATACENTERS architecture were observed. First, the popularity distribution is highly skewed, which implies that caching can be efficient. Second, the popularity distribution varies on a timescale of days, which is slow enough for cache contents to be adjusted to reflect the changes.
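
The link between skew and caching efficiency can be illustrated with a minimal sketch. It assumes a Zipf-like popularity law and independent requests, neither of which is taken from the study [1]; the catalogue size, cache size and exponent `alpha` are purely illustrative. Under these assumptions, a static cache holding the `cache_size` most popular items serves a fraction of requests equal to the sum of their request probabilities.

```python
def zipf_popularity(n_items, alpha):
    """Request probabilities for items ranked 1..n_items.

    Assumption: popularity follows a Zipf-like law with exponent alpha
    (alpha = 0 gives a uniform distribution, larger alpha means more skew).
    """
    weights = [1.0 / (rank ** alpha) for rank in range(1, n_items + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def top_k_hit_ratio(popularity, cache_size):
    """Fraction of requests served by a static cache of the most popular items.

    Assumes independent requests; item sizes and popularity changes
    over time are ignored.
    """
    ranked = sorted(popularity, reverse=True)
    return sum(ranked[:cache_size])


if __name__ == "__main__":
    # Hypothetical catalogue of 1000 items, cache holding 100 of them.
    for alpha in (0.0, 0.5, 1.0):
        ratio = top_k_hit_ratio(zipf_popularity(1000, alpha), 100)
        print(f"alpha={alpha}: hit ratio {ratio:.3f}")
```

With a uniform distribution, caching 10% of the catalogue serves 10% of requests; the more skewed the distribution, the larger the share served from the same cache.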

Our collected YouTube workload has enabled us to run trace-driven simulations and assess the memory requirements of a NANODATACENTERS node for supporting such workloads (these results are further detailed in Deliverable D1.2). As an example, 128GB of storage per NANODATACENTERS node suffices to serve a YouTube catalogue under standard assumptions about uplink bandwidth capacity and video encoding. Such dimensioning information is essential for assessing the economic viability of the NANODATACENTERS model. Given the price of memory, it is certainly feasible to provision each gateway with a 128GB flash drive, and hence to enable NANODATACENTERS to support YouTube-like UGC applications with almost no assistance from the traditional datacenter infrastructure.

Catch-up TV

Whilst Catch-up TV is not yet widely adopted, it is expected to become an important source of multimedia traffic in the coming years. Catch-up TV usage has not yet been characterised, but several of its key features can be inferred from current TV viewing habits. The first measurement analysis of IPTV workloads [2] was conducted based on network traces from one of the world’s largest IPTV systems. The dataset captures the channel change activities of 250,000 households over a six-month period. The properties of viewing sessions, channel popularity dynamics, geographical locality, and channel switching behaviours were characterised. In particular, the workload was characterised by the overall user viewing time (around 90 minutes per day) and the length of consecutive viewing of given channels. The overall popularity distribution of content episodes was also determined; it is markedly skewed, though less so than that of YouTube content.


Using this user behaviour characterisation, a synthetic Catch-up TV workload was created, adapted to match the episode lengths, popularities and daily user watching times. This workload is again an essential ingredient in assessing the memory required on NANODATACENTERS nodes to support such Catch-up TV applications. The trace-driven simulations reported in Deliverable D1.2 exploit these workloads, and enable us to conclude that Catch-up TV can also be supported in NANODATACENTERS with under 128GB of flash memory per node.
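
A minimal sketch of how such a synthetic workload might be generated is given below. It is not the generator used in the project: the function name, the independent sampling of episodes and the input values are assumptions made for illustration. Only the daily viewing budget of around 90 minutes comes from the measurements described above.

```python
import random


def synthetic_daily_workload(episode_minutes, popularity, n_users,
                             daily_minutes=90.0, seed=0):
    """Generate one day of (user, episode) requests.

    Assumptions (illustrative only): each user draws episodes independently
    according to the given popularity weights, and keeps watching until the
    daily viewing budget is reached.
    """
    if any(length <= 0 for length in episode_minutes):
        raise ValueError("episode lengths must be positive")
    rng = random.Random(seed)  # seeded for reproducible workloads
    episode_ids = list(range(len(episode_minutes)))
    requests = []
    for user in range(n_users):
        watched = 0.0
        while watched < daily_minutes:
            episode = rng.choices(episode_ids, weights=popularity, k=1)[0]
            requests.append((user, episode))
            watched += episode_minutes[episode]
    return requests


if __name__ == "__main__":
    # Hypothetical catalogue: three episodes with skewed popularity.
    lengths = [30.0, 45.0, 60.0]
    weights = [0.6, 0.3, 0.1]
    for user, episode in synthetic_daily_workload(lengths, weights, n_users=3):
        print(user, episode)
```

The resulting request list can be replayed against a simulated node to estimate how much storage is needed to reach a target hit ratio.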

These synthetic workloads will be made available to evaluate the overall architecture, and in particular the adaptive content replication schemes which will be developed to manage each type of content, be it UGC, Catch-up TV, or other.

Pay-per-view Video-on-Demand

The last type of video content is pay-per-view Video-on-Demand. Understanding users’ tastes for content is key to offering targeted recommendations. In the case of NANODATACENTERS, it is also advantageous in deciding content preloading strategies that minimise the cost of accessing the content. A study [3] was also performed on the publicly available Netflix data trace. Besides characterising the popularity distribution (highly skewed), the study investigated the extent to which users behave according to distinct profiles.

Profiling techniques were developed which showed that the user population is far from homogeneous, and instead behaves according to distinct profiles. The benefits of using the proposed user profiling to inform content preloading strategies were also demonstrated empirically on the Netflix dataset. Specifically, profile-unaware content preloading strategies were compared with strategies which systematically preload the most popular content among users with the same profile. These preliminary results show that such profile-informed strategies improve the overall platform efficiency. Additional evaluations with various workloads need to be performed before determining whether such a profiling module is necessary in the NANODATACENTERS architecture.
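
The comparison described above can be sketched as follows. This is an illustrative example only: the profiles are assumed to be given (the profiling techniques of the study [3] are not reproduced here), and the function names and toy data are hypothetical.

```python
from collections import Counter, defaultdict


def top_k(views, k):
    """Set of the k most viewed items in a list of (user, item) views."""
    counts = Counter(item for _, item in views)
    return {item for item, _ in counts.most_common(k)}


def preload_sets(train_views, profile_of, k):
    """Build a profile-unaware preload set and one preload set per profile.

    profile_of maps each user to a profile label (assumed to be known).
    """
    global_set = top_k(train_views, k)
    by_profile = defaultdict(list)
    for user, item in train_views:
        by_profile[profile_of[user]].append((user, item))
    profile_sets = {p: top_k(views, k) for p, views in by_profile.items()}
    return global_set, profile_sets


def hit_rate(test_views, preloaded_for):
    """Fraction of test views already preloaded for the requesting user."""
    if not test_views:
        return 0.0
    hits = sum(1 for user, item in test_views if item in preloaded_for(user))
    return hits / len(test_views)


if __name__ == "__main__":
    # Toy data: two users with one profile, one user with another.
    profile_of = {"a": "drama", "b": "drama", "c": "sport"}
    train = [("a", "d1"), ("b", "d1"), ("b", "d1"), ("a", "d2"),
             ("c", "s1"), ("c", "s1"), ("c", "s2")]
    test = [("a", "d1"), ("c", "s1")]
    global_set, profile_sets = preload_sets(train, profile_of, k=1)
    print(hit_rate(test, lambda user: global_set))
    print(hit_rate(test, lambda user: profile_sets[profile_of[user]]))
```

In this toy example the profile-aware strategy preloads a different item for each profile, so it covers requests that the single globally most popular item misses.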

Multiplayer games

In the case of multiplayer games, the mobility patterns of user avatars determine the system requirements of user-to-user communications, via the frequency of user encounters and departures. These mobility patterns also determine how frequently a user’s view of the virtual world needs to be updated, and thus specify the constraints on the corresponding supporting system mechanisms. An extensive measurement study [4] was conducted on a large portion of SecondLife [SL]. SL has received a great deal of press coverage, with some major companies and governments even establishing a presence on it. One finding of our measurement study is that avatar mobility is driven by long periods during which avatars meet and remain in a single place, whilst almost 30% of the regions do not attract any visitors. These features make peer-to-peer approaches very promising for supporting SL-like applications, particularly given the poor scaling behaviour of SL’s current client-server architecture. This poor scalability is evidenced by the number of avatars per region being limited to 100. Even worse, servers experience peaks of high load with as few as 20 avatars, and become clearly overloaded beyond 50 avatars. This degrades user experience and limits social interaction in the virtual world.


The characteristics of avatar mobility will be used as a preliminary distribution model to calibrate the multiplayer game in the NANODATACENTERS project. In particular, the collected trace of avatar motion will be used to validate the multiplayer peer-to-peer application in the NANODATACENTERS architecture.

More details are available in the report.