360 Scene Dataset

Overview

The QoEVAVE Scenes Database is an initial audiovisual database of 12 scenes captured in real-life nature and urban settings. The maximum video resolution is 7680x3840 (8K) at 60 frames per second, with 4th-order Ambisonics (4OA) spatial audio. All video sequences were recorded with a target duration of 60 seconds and are designed to represent real-life settings for systematically evaluating various dimensions of uni-/multimodal perception, cognition, behavior, and quality of experience (QoE) in a controlled virtual environment. The database is intended as high-quality reference material for the QoE community, with equal focus on auditory and visual sensory information. For more information, please see the publication below.

Recorded Scenes

You can download individual scenes from each of the scene pages, or use the download list below. Each scene page provides the following information:

  • Location information of recording.
  • Scene version notes describing variations in multiple 'takes'.
  • Preview link to YouTube (uses 1st-order Ambisonics).
  • Download links for the 4th-order Ambisonics audio, the 8K video file, and the muxed audiovisual file with 1st-order Ambisonics.
  • Spatial / Temporal indexing plots on video data.

Capture

Media | Device | Description
Audio | mhacoustics Eigenmike | 32-channel spherical microphone array capable of 4th-order Ambisonics output.
Video | Insta360 Pro 2 | 360 video camera capable of 8K video output. Comprises six F2.4 fisheye lenses, each capturing 4K video at up to 120 Mbps.

Specifications

Information | Description
Video Encoding | ffvhuff
Video Resolution | 7680x3840
Projection Map | Equirectangular
Video FPS | 59.94
Audio Sample Rate | 48,000 Hz; 24-bit PCM
4th-Order Ambisonics | Individual .wav files of 4th-order Ambisonics in ACN channel ordering with SN3D normalization. Scene representation has a -90° rotational offset against video files for playback with Unity. Visit Help for more information. Labelled as ambiX4 in file names.
1st-Order Ambisonics | 1st-order Ambisonics in ACN channel ordering with SN3D normalization. Encoded using AAC and muxed with the video files into an .MP4 container. Pre-processed with Google's spatial media metadata injector for uploading to YouTube. Labelled as ambiX1 in file names.
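
As a rough illustration of the ACN/SN3D layout named in the table, the Python sketch below computes the channel count for a given Ambisonics order and maps an ACN channel index to its spherical-harmonic degree and order. The function names are illustrative and not part of the database. The SN3D-to-N3D gain is included only as an aside for tools that expect N3D input; nothing in the database requires it.

```python
import math


def channel_count(order: int) -> int:
    """Number of channels in a full-sphere Ambisonics signal of this order."""
    return (order + 1) ** 2


def acn_to_degree_order(acn: int) -> tuple[int, int]:
    """Map an ACN channel index to (degree n, order m), using ACN = n^2 + n + m."""
    n = math.isqrt(acn)
    m = acn - n * n - n
    return n, m


def sn3d_to_n3d_gain(acn: int) -> float:
    """Gain that converts an SN3D-normalized channel to N3D: sqrt(2n + 1)."""
    n, _ = acn_to_degree_order(acn)
    return math.sqrt(2 * n + 1)


print(channel_count(4))         # 25 channels for the ambiX4 files
print(channel_count(1))         # 4 channels for the ambiX1 track
print(acn_to_degree_order(24))  # (4, 4), the last 4th-order channel
```

Under this layout, the first four channels of a 4th-order file form its 1st-order subset, which is why truncating a higher-order ACN signal to a lower order only requires dropping trailing channels.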

Audio post-production — all files are processed with IEM VST plugins:

  • 500 ms fade-in and fade-out
  • 60 Hz high-pass filter
  • 10 kHz −3 dB notch filter
  • Omnidirectional compression for make-up gain (max +5 dB)
  • Ambisonic B-Format rotation for audiovisual alignment
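
To give an idea of what a rotation of an Ambisonics scene involves, the sketch below rotates a single ACN-ordered frame about the vertical axis (yaw). It is not the IEM plug-in implementation, and it assumes the offset of interest is a pure yaw rotation; the axis and the sign needed to compensate the documented -90° offset are not stated here and should be checked against the Help page. The function name `rotate_yaw` is hypothetical.

```python
import math


def rotate_yaw(frame, angle_deg):
    """Rotate one ACN-ordered Ambisonics frame about the vertical axis.

    frame: list of (N + 1)^2 coefficients for a single audio sample.
    A positive angle moves sources counter-clockwise seen from above
    (front toward left), following the AmbiX azimuth convention.
    """
    order = math.isqrt(len(frame)) - 1
    if (order + 1) ** 2 != len(frame):
        raise ValueError("frame length must be a perfect square, e.g. 4 or 25")

    alpha = math.radians(angle_deg)
    out = list(frame)  # m = 0 channels (including W and Z) are unchanged by yaw
    for n in range(1, order + 1):
        centre = n * n + n  # ACN index of the m = 0 channel of degree n
        for m in range(1, n + 1):
            c, s = math.cos(m * alpha), math.sin(m * alpha)
            cos_ch = frame[centre + m]  # channel with cos(m * azimuth) dependence
            sin_ch = frame[centre - m]  # channel with sin(m * azimuth) dependence
            out[centre + m] = cos_ch * c - sin_ch * s
            out[centre - m] = sin_ch * c + cos_ch * s
    return out


# First-order frame of a source straight ahead: W = 1, Y = 0, Z = 0, X = 1.
front = [1.0, 0.0, 0.0, 1.0]
print([round(v, 6) for v in rotate_yaw(front, 90)])  # [1.0, 1.0, 0.0, 0.0]
```

Because the +m and -m channels of a degree share the same SN3D factor, the same code applies to a 25-channel ambiX4 frame without any renormalization. For whole files, the per-pair 2x2 rotation would normally be applied to full channel arrays rather than sample by sample.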

Download

Expand the list below to download audio, video, or muxed files for all scenes at once. For scene-specific notes and version history, visit the individual scene pages.

Download list

File Naming Conventions

Download filenames follow a consistent pattern encoding the key properties of each file.

Format | Pattern
4OA Audio (.wav) | SceneName+Version · A · AmbisonicsFormat+Order · Bit-depth
8K Video (.mkv) | SceneName+Version · V · Resolution · FPS · Duration
Muxed AV (.mp4) | SceneName+Version · AV · Resolution · FPS · Duration · Bitrate · AmbisonicsFormat+Order · Bit-depth

Example: Badminton01_A_ambiX4_24bit.wav — Scene Badminton, version 01, audio-only (A), 4th-order AmbiX format (ambiX4), 24-bit depth.
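
For batch processing, the naming pattern can be split into labelled fields. The following Python sketch is one possible way to do this; it assumes that all three file types use "_" as the separator, which is generalised from the single audio example above, and it keeps every field as a raw string because the exact token formats for resolution, FPS, duration, and bitrate are not documented here. The function name `parse_scene_filename` and the field keys are hypothetical.

```python
import re
from pathlib import Path

# Field names per file kind, in the order given by the pattern table.
# The "_" separator is an assumption based on the documented audio example.
FIELDS = {
    "A": ["ambisonics", "bit_depth"],
    "V": ["resolution", "fps", "duration"],
    "AV": ["resolution", "fps", "duration", "bitrate", "ambisonics", "bit_depth"],
}


def parse_scene_filename(filename):
    """Split a database filename into a dict of raw string fields."""
    path = Path(filename)
    tokens = path.stem.split("_")
    if len(tokens) < 2:
        raise ValueError(f"unexpected filename: {filename}")

    # The first token is the scene name followed directly by a numeric version.
    head = re.fullmatch(r"(?P<scene>.+?)(?P<version>\d+)", tokens[0])
    if head is None:
        raise ValueError(f"no scene name and version in: {tokens[0]}")

    kind = tokens[1]
    if kind not in FIELDS:
        raise ValueError(f"unknown file kind: {kind}")

    values = tokens[2:]
    if len(values) != len(FIELDS[kind]):
        raise ValueError(f"expected {len(FIELDS[kind])} fields after '{kind}'")

    info = {
        "scene": head.group("scene"),
        "version": head.group("version"),
        "kind": kind,
        "extension": path.suffix,
    }
    info.update(zip(FIELDS[kind], values))
    return info


print(parse_scene_filename("Badminton01_A_ambiX4_24bit.wav"))
```

The version is kept as a string so that leading zeros such as "01" survive a round trip back to a filename.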

Publication

When using the QoEVAVE Scenes Database, please cite the following work:

@inproceedings{robotham2022,
title = {Audiovisual Database with 360° Video and Higher-Order Ambisonics Audio for Perception, Cognition, Behavior, and QoE Evaluation Research},
author = {Robotham, Thomas and Singla, Ashutosh and Rummukainen, Olli S. and Raake, Alexander and Habets, Emanuël A. P.},
year = {2022},
booktitle = {14th International Conference on Quality of Multimedia Experience},
address = {Lippstadt, Germany},
pages = {1--6},
doi = {10.1109/QoMEX55416.2022.9900893}
}