Datasets

Dataset description

 

Corresponding publications

(please cite if using the dataset)

Video streaming datasets: Improving the Transfer of Machine Learning-Based Video QoE Estimation Across Diverse Networks

(download here)

The datasets in this repository consist of video-on-demand streaming data collected at two locations (Würzburg, Germany and Zagreb, Croatia) and across two years (2020 and 2021). We refer to the datasets using the following labels: Wue_2020, Wue_2021, Zag_2020, Zag_2021. The data includes network traffic features used to estimate Quality of Experience (QoE) and Key Performance Indicators (KPIs) of video streaming sessions using machine learning. The traffic features are annotated with QoE/KPI classes, with samples considered both at the session level (per-video) and in a real-time fashion (per-second). The datasets were collected for and are presented in the journal article entitled "Improving the Transfer of Machine Learning-Based Video QoE Estimation Across Diverse Networks", authored by Michael Seufert and Irena Oršolić, published in IEEE Transactions on Network and Service Management in 2023.

Michael Seufert, Irena Oršolić, "Improving the Transfer of Machine Learning-Based Video QoE Estimation Across Diverse Networks", IEEE Transactions on Network and Service Management, 2023

Questionnaire-based survey investigating the influence of various system-related factors on overall experience and quality perception of audiovisual calls on smartphones (download here).

This dataset contains the results of two surveys: Survey 1, with 272 participants, conducted in February 2020, and Survey 2, with 249 participants, conducted in October 2021. Data was collected on user opinions regarding the influence of various factors related to media quality, functional support of the service, usability, service design, and resource consumption. The focus was on audiovisual calls established in a leisure context, as opposed to business-related calls or meetings.

D. Vučić, S. Baraković, L. Skorin-Kapov, Survey on user perceived system factors influencing the QoE of audiovisual calls on smartphones. Multimedia Tools and Applications (2022). https://doi.org/10.1007/s11042-022-14173-4

OTT video streaming dataset containing user interactions (download here)

The dataset contains network traffic statistics and ground-truth application-layer data corresponding to 7424 video sessions from a global OTT video streaming provider. Data was collected over a four-month period, from December 2020 to March 2021. In total, 2030 videos had no user interactions during playback, 1749 videos were paused once during playback, 1808 videos were seeked forward at one point during playback, and 1837 videos were terminated before the video ended.
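As a quick sanity check on the figures above, the following minimal Python sketch confirms that the four interaction categories add up to the stated total and computes each category's share. The dictionary keys and the function name are illustrative labels chosen here; they are not field names from the dataset itself.

```python
# Session counts as stated in the dataset description.
# The labels are hypothetical names chosen for this sketch.
SESSION_COUNTS = {
    "no_interaction": 2030,
    "paused_once": 1749,
    "seeked_forward": 1808,
    "terminated_early": 1837,
}
TOTAL_SESSIONS = 7424  # total stated in the description


def interaction_shares(counts):
    """Return the fraction of all sessions that falls into each category."""
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


if __name__ == "__main__":
    # The four categories should account for every session.
    assert sum(SESSION_COUNTS.values()) == TOTAL_SESSIONS
    for label, share in interaction_shares(SESSION_COUNTS).items():
        print(f"{label}: {share:.1%}")
```

The categories are roughly balanced, which may be relevant when the data is used to train classifiers.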

I. Bartolec, I. Orsolic, and L. Skorin-Kapov. "Impact of User Playback Interactions on In-Network Estimation of Video Streaming Performance", IEEE Transactions on Network and Service Management, 2022

CGD: A Cloud Gaming Dataset with Gameplay Video and Network Recordings (download here)

CGD is a dataset consisting of 600 game streaming sessions corresponding to 10 games of different genres being played and streamed using the following encoding parameters: bitrate (5, 10, 20 Mbps), resolution (720p, 1080p), and frame rate (30, 60 fps). Every combination of parameters was repeated five times for each game. For each session, the dataset includes: 1) gameplay video recordings, 2) network traffic traces, 3) user input logs (mouse and keyboard), and 4) streaming performance logs.
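The session count follows from the parameter grid in the description above. The sketch below enumerates the encoding configurations and multiplies by the number of repetitions and games. The constant and function names are assumptions made for illustration, and the games are represented only by their count, since their titles are not listed here.

```python
from itertools import product

# Encoding parameters as listed in the dataset description.
BITRATES_MBPS = (5, 10, 20)
RESOLUTIONS = ("720p", "1080p")
FRAME_RATES_FPS = (30, 60)

# Counts taken from the description; game titles are not given there,
# so only the number of games is used in this sketch.
NUM_GAMES = 10
REPETITIONS = 5


def encoding_configurations():
    """Return every (bitrate, resolution, frame rate) combination."""
    return list(product(BITRATES_MBPS, RESOLUTIONS, FRAME_RATES_FPS))


def expected_session_count():
    """Configurations x repetitions x games."""
    return len(encoding_configurations()) * REPETITIONS * NUM_GAMES


if __name__ == "__main__":
    print(len(encoding_configurations()))  # 3 * 2 * 2 = 12 configurations
    print(expected_session_count())        # 12 * 5 * 10 = 600 sessions
```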

I. Slivar, K. Bacic, I. Orsolic, L. Skorin-Kapov, and M. Suznjevic. "CGD: A Cloud Gaming Dataset with Gameplay Video and Network Recordings", in 13th ACM Multimedia Systems Conference (MMSys ’22), Athlone, Ireland, June 14–17, 2022.

FPV dataset - subjective user scores of QoE, graphics quality, fluidity, and willingness to continue using the system (download here)

We performed a subjective study on using the Orqa FPV.SkyDive drone flight simulator with FPV goggles, gathering over 250 responses on various subjective QoE metrics from 14 participants. The dataset is analyzed and described in the corresponding paper.

M. Šilić, M. Sužnjević, and L. Skorin-Kapov. "QoE Assessment of FPV Drone Control in a Cloud Gaming Based Simulation", 13th International Conference on Quality of Multimedia Experience (QoMEX 2021), Montreal, Canada, June 2021.

YouTube QoE/KPI classification with user interactions - network traffic features annotated with MOS/KPIs (request access here)

Six datasets were collected during the last months of 2019. They contain sessions with either no user interaction or one manually triggered user interaction (e.g., pause, seek, abandon, playback speed), with one dataset containing a pair or a combination of these interactions. Each dataset contains YouTube video streaming sessions collected on an Android smartphone using the native YouTube application. A total of 17 machine learning-based models for per-video KPI classification were trained on various combinations of those datasets. A detailed list of features, the selected features per model, and feature importances are provided here.

I. Bartolec, I. Orsolic, L. Skorin-Kapov, "Inclusion of End User Playback-Related Interactions in YouTube Video Data Collection and ML-Based Performance Model Training", in proc. of the 12th International Conference on Quality of Multimedia Experience (QoMEX), June 2020.

YouTube per-video and per-second QoE/KPI classification (request access here)

Two datasets were collected in 2019, corresponding to the streaming of 400 YouTube videos on (1) the Android and (2) the iOS platform. Both datasets include per-video network traffic features, as specified in the corresponding publication, and MOS/KPI labels. Moreover, the Android dataset is available in a format appropriate for real-time (per-second) KPI classification, where 1 s intervals are described by a set of network traffic features and labelled with KPIs.

I. Orsolic, L. Skorin-Kapov, "A Framework for In-Network QoE Monitoring of Encrypted Video Streaming", IEEE Access, vol. 8, pp. 74691-74706, 2020, DOI: 10.1109/ACCESS.2020.2988735

Check out our datasets related to other projects on the MUEXlab webpage.