
CGI Scenes Dataset

Overview

The QoEVAVE CGI Scene Database is a repository of three high-quality audiovisual scenes: the Cave, the Cinema, and the Mansion. Unlike the 360 Scenes Database, which features three-degrees-of-freedom (3DoF) video, the CGI scenes are designed for six-degrees-of-freedom (6DoF) VR with interactive and task-based elements. Each scene includes interactive audiovisual objects, static and triggered audio sources, and fully modelled acoustic geometry for advanced audio rendering. For more information, see the publication.

The Cave

Rendered view of the Cave scene
  • Dark scene with an interactive lantern as the primary light source.
  • Walkie-talkie way-finding task: locate the source at one of five positions.
  • Complex reverberant acoustic geometry with labyrinth-style corridors.
Cave details

The Cinema

Rendered view of the Cinema scene
  • Modelled after the Fraunhofer IIS cinema — 70 seats (10×7).
  • Import custom audiovisual content via the cinema manager.
  • Human avatars, interactable props, and a toggleable lighting button.
Cinema details

The Mansion

Rendered view of the Mansion scene
  • Multi-room mansion with balcony level and diverse surface materials.
  • Impact-sound interaction across multiple object types and materials.
  • Audio localization task: triggered cues fire only when the source is out of view.
Mansion details
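
The "out of view" condition in the Mansion localization task can be pictured as an angle test between the viewing direction and the direction to the source. The following is a minimal Python sketch of that idea, not the project's implementation: the function name `is_out_of_view`, the cone-shaped approximation of the field of view, and the 90-degree default are all assumptions, and occlusion by walls is ignored.

```python
import math


def is_out_of_view(cam_pos, cam_forward, source_pos, fov_deg=90.0):
    """Return True when the source lies outside a viewing cone.

    The cone is centred on cam_forward with a full opening angle of
    fov_deg (hypothetical default; not a value taken from the scenes).
    """
    to_src = [s - c for s, c in zip(source_pos, cam_pos)]
    dist = math.sqrt(sum(v * v for v in to_src))
    fwd_len = math.sqrt(sum(v * v for v in cam_forward))
    if dist == 0.0 or fwd_len == 0.0:
        return False  # degenerate input: treat the source as visible

    # Angle between the view direction and the direction to the source.
    cos_angle = sum(a * b for a, b in zip(to_src, cam_forward)) / (dist * fwd_len)
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding
    angle_deg = math.degrees(math.acos(cos_angle))
    return angle_deg > fov_deg / 2.0


# A cue would only be allowed to fire when the check returns True.
camera = (0.0, 0.0, 0.0)
forward = (0.0, 0.0, 1.0)
behind = is_out_of_view(camera, forward, (0.0, 0.0, -5.0))
ahead = is_out_of_view(camera, forward, (0.0, 0.0, 5.0))
```

In a real scene the test would more likely use the headset camera's frustum and a visibility check against scene geometry; the cone is only the simplest stand-in.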

Visual Design

Future-Proofing with HDRP

The Unity scenes are developed using Unity's High Definition Render Pipeline (HDRP). This pipeline offers high-fidelity rendering features such as ray tracing, volumetric lighting, and post-processing effects. However, rendering for real-time virtual reality remains computationally expensive, and enabling even a subset of these effects can cause severe frame drops.

A key benefit of HDRP is future-proofing the project. As VR headset capabilities and GPU rendering performance continue to improve, more computationally intensive effects become viable. Volumetric lighting, currently implemented in the Mansion and Cave scenes, is one example of an HDRP feature already in use.

Optimization

Texture atlas combining ambient occlusion, normals, and roughness maps into a single material composite

Visual design

Texture Atlas Material

To reduce the number of draw calls, textures across the scenes are combined into a texture atlas. Instead of each object keeping its own textures, all static-object textures are packed into a single atlas, so objects that share the atlas material can be batched together. The figure shows an example composite combining ambient occlusion, normals, and roughness maps into a single material.
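
To illustrate how an object can still address its own texture after packing, the sketch below remaps a per-object UV coordinate into the sub-rectangle of an atlas tile. It is a minimal Python illustration under stated assumptions: the function `atlas_uv` is hypothetical, and the uniform square grid is chosen for simplicity; the actual atlas layout used in the scenes is not described here.

```python
def atlas_uv(u, v, tile_col, tile_row, tiles_per_side):
    """Map a per-object UV in [0, 1] to its tile inside a square atlas.

    Assumes a uniform grid of tiles_per_side x tiles_per_side tiles
    (hypothetical layout), with tile (0, 0) at the UV origin.
    """
    scale = 1.0 / tiles_per_side
    return ((tile_col + u) * scale, (tile_row + v) * scale)


# Centre of the tile in column 1, row 0 of a 2x2 atlas.
centre = atlas_uv(0.5, 0.5, tile_col=1, tile_row=0, tiles_per_side=2)
```

Because every remapped coordinate points into the same atlas texture, the objects can share one material, which is what allows the renderer to batch them.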

Three LOD levels of a wall lamp model in the Mansion scene, showing increasing mesh detail from left to right


Levels of Detail (LODs)

Many models in the CGI Scene Database include three LOD levels to maintain rendering performance. The figure shows the wall lamps from the Mansion scene at three levels of mesh detail (left to right: lowest to highest). As the camera moves toward or away from a model, the appropriate LOD is swapped in to balance visual fidelity against rendering cost.
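
The swapping logic can be sketched as a threshold lookup. The Python example below is a simplified illustration, not the project's configuration: it switches on camera distance, whereas Unity's LOD Group component switches on the object's screen-relative size, and the function name `select_lod` and the threshold values are placeholders.

```python
def select_lod(distance, thresholds=(5.0, 15.0)):
    """Return the LOD index for a given camera distance.

    Index 0 is the most detailed mesh; the last index is the least
    detailed. Two thresholds give the three levels described above.
    The threshold values are hypothetical.
    """
    for level, limit in enumerate(thresholds):
        if distance < limit:
            return level
    return len(thresholds)


# Near, mid-range, and far camera positions map to LOD 0, 1, and 2.
levels = [select_lod(d) for d in (2.0, 10.0, 40.0)]
```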

Audio Implementation

Audio

For more information on audio rendering and a complete asset catalogue for all scenes, see the Asset Information page.

Object-based Audio

The audio implementation uses an object-based workflow. The project includes the Meta XR Audio SDK to render the audio objects; it can be replaced with another Unity audio spatializer if desired.

Each scene features two types of audio playback:

  • Event-based playback: Triggered by a user interaction or scripted event. Represents repeatable occurrences such as impact sounds or doors opening and closing.
  • Continuous playback: Begins at scene start and runs for the duration of the scene. Examples include environmental ambience or a radio broadcast — sounds not strictly controlled by user interaction. A user may interact to mute such a source, but the underlying audio continues to play as though it were a live feed.
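
The difference between the two playback types can be sketched as two small state machines. This is a minimal Python illustration, not the Unity implementation: the class names are hypothetical, and looping the continuous clip is an assumption. The point it shows is that muting a continuous source changes only its gain, while its playhead keeps advancing.

```python
class ContinuousSource:
    """Starts at scene start; muting does not pause the playhead."""

    def __init__(self, clip_length_s):
        self.clip_length_s = clip_length_s
        self.playhead_s = 0.0
        self.muted = False

    def update(self, dt_s):
        # The playhead advances whether or not the source is audible.
        # Looping at the end of the clip is an assumption of this sketch.
        self.playhead_s = (self.playhead_s + dt_s) % self.clip_length_s

    def gain(self):
        return 0.0 if self.muted else 1.0


class EventSource:
    """Idle until triggered; each trigger restarts the clip."""

    def __init__(self, clip_length_s):
        self.clip_length_s = clip_length_s
        self.playhead_s = None  # None means the source is idle

    def trigger(self):
        self.playhead_s = 0.0

    def update(self, dt_s):
        if self.playhead_s is None:
            return
        self.playhead_s += dt_s
        if self.playhead_s >= self.clip_length_s:
            self.playhead_s = None  # clip finished, back to idle


radio = ContinuousSource(clip_length_s=60.0)
radio.update(10.0)
radio.muted = True   # user interaction mutes the radio
radio.update(5.0)    # the "live feed" keeps running while muted
```

When the radio is unmuted, it resumes from wherever the feed has reached rather than from where the user muted it.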

Acoustic Diversity

Mean opinion scores for rendering relevance across acoustic attributes for Cave, Cinema, and Mansion scenes

Acoustic design

Rendering Relevance Ratings

A large portion of the scene design was driven by acoustic diversity. The choice of Cave, Cinema, and Mansion provides contrasting acoustic environments in terms of scene tasks, audio stimuli, and the acoustic properties relevant to auralization. As an initial assessment, five audio experts (N = 5) rated the rendering relevance of a set of acoustic attributes for each scene. See the publication for full attribute descriptions.
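
For readers who want to aggregate similar ratings, a mean opinion score is the arithmetic mean of the raters' scores for each attribute. The Python sketch below shows that calculation only. The function name, the attribute labels, and the rating values are placeholders invented for illustration; they are not data from the study, and the rating scale used there is not assumed.

```python
from statistics import mean, stdev


def mean_opinion_scores(ratings):
    """Map each attribute to (mean, sample standard deviation).

    ratings: dict of attribute name -> list of scores, one per rater.
    """
    return {attr: (mean(scores), stdev(scores)) for attr, scores in ratings.items()}


# Placeholder input for five raters; NOT values from the publication.
placeholder_ratings = {
    "attribute_a": [1, 2, 3, 4, 5],
    "attribute_b": [3, 3, 3, 3, 3],
}
summary = mean_opinion_scores(placeholder_ratings)
```

With only five raters the spread around each mean is wide, so reporting the standard deviation (or a confidence interval) alongside the mean is advisable.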

Bespoke Acoustic Geometry

Each CGI scene includes bespoke acoustic meshes modelled in Blender. These meshes enable more accurate rendering of acoustic features — including occlusion, diffraction, and early reflections — matched to the visual geometry, as opposed to simplified shoebox-style rooms. Scene-specific geometry and acoustic requirements are described on each scene page and on the Asset Information page.

Requirements

PC Hardware and Unity Version

The CGI Scene Database was developed using Unity v2021.3 LTS (long-term support). The scenes can also be imported into newer Unity versions, although compilation errors may need to be resolved on import.

Scene performance was tested with the following PC specifications:

  • OS: Windows 10 (64-bit)
  • CPU: AMD Ryzen 7 5800X 8-core @ 3.80 GHz
  • RAM: 32 GB
  • GPU: NVIDIA GeForce RTX 3080

VR Input Compatibility

The Unity project ships with the Unity XR Interaction Toolkit, XR Plugin Management, and the Action-based Input System. Both the OpenXR and Oculus XR plugins are provided, making the project compatible with most modern VR headsets.

Version Download

v1.0.0

The Unity project download includes all three scenes along with the required SDKs for VR mechanics, the Meta XR Audio SDK, and the XR Interaction Toolkit. Tested with Unity 2021.3 LTS.

Publication

When using the QoEVAVE CGI Scene Database, please cite the following work:

@inproceedings{robotham2024,
  title     = {CGI Scenes for Interactive Audio Research and Development: Cave, Cinema, and Mansion},
  author    = {Robotham, Thomas and Rebmann, Daniela and Fintineanu-Anghelescu, Dominik O. and Raake, Alexander and Habets, Emanuël A. P.},
  year      = {2024},
  booktitle = {6th AES International Conference on Audio for Games},
  address   = {Tokyo, Japan},
  pages     = {1--11}
}