User-Centered Adaptive Streaming of Dynamic Point Clouds for Virtual Reality Remote Communication

The thesis by PhD student Shishir Subramanyam from the Distributed and Interactive Systems (DIS) group reports on research exploring adaptive delivery optimizations for real-time point cloud user reconstructions.

Publication date
9 Dec 2024

This thesis investigates a user-centered, low-complexity adaptive streaming method for point clouds to improve the quality of experience in VR remote communication. The method spatially segments point clouds and estimates surface orientation in real time, and combines this with an auxiliary utility function that allocates bandwidth across segments, optimizing the delivery of remote user reconstructions. The contributions include a novel subjective evaluation methodology for dynamic point clouds in immersive environments and a low-complexity adaptive streaming approach. Results demonstrate significant bitrate savings and improved user experience, highlighting the potential of the method for real-time VR communication.

Improving immersion and co-presence in remote communication

Remote communication applications have become a necessity in a globalised and connected world, as exemplified by the popularity of video conferencing applications. In recent years, virtual reality (VR) remote communication applications have emerged that aim to deliver a greater sense of co-presence and immersion in a shared virtual space, where users can navigate freely while employing both verbal and non-verbal communication. Such applications require a volumetric user representation, and point clouds have emerged as a popular format for real-time user reconstructions. However, volumetric point clouds are challenging to deliver over bandwidth-limited networks owing to the large volume of data required for dynamic streams. This challenge can be addressed by combining real-time compression and user-adaptive streaming. Adaptive streaming is the process of segmenting an object spatially and temporally in order to optimize the delivery of content by prioritizing the quality of spatial segments that are visible from a given viewport.
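
As a rough illustration of viewport-driven bandwidth allocation (a minimal sketch, not the method described in the thesis), the Python snippet below greedily upgrades the quality of spatial segments ("tiles") under a bitrate budget, preferring tiles that are visible from the current viewport. The tile bitrates, the visibility flags, and the greedy upgrade order are hypothetical assumptions for illustration.

```python
# Illustrative sketch (not the thesis implementation): greedy quality selection
# for spatial segments ("tiles") of a point cloud under a bandwidth budget.
# Tile bitrates and the visibility test are hypothetical.

def select_qualities(tiles, budget_bps):
    """tiles: list of dicts with 'bitrates' (ascending, bps) and 'visible' (bool).
    Returns one chosen quality index per tile."""
    # Start every tile at its lowest quality so nothing is dropped entirely.
    choice = [0] * len(tiles)
    spent = sum(t["bitrates"][0] for t in tiles)
    # Greedily upgrade visible tiles first, then the rest, while the budget allows.
    for prefer_visible in (True, False):
        for i, t in enumerate(tiles):
            if t["visible"] != prefer_visible:
                continue
            while choice[i] + 1 < len(t["bitrates"]):
                extra = t["bitrates"][choice[i] + 1] - t["bitrates"][choice[i]]
                if spent + extra > budget_bps:
                    break
                choice[i] += 1
                spent += extra
    return choice

tiles = [
    {"bitrates": [0.5e6, 1.5e6, 4.0e6], "visible": True},   # front-facing tile
    {"bitrates": [0.5e6, 1.5e6, 4.0e6], "visible": False},  # occluded tile
]
print(select_qualities(tiles, budget_bps=5e6))  # -> [2, 0]
```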

Figure 1: Point cloud capture setup and user reconstructions

Building the VR remote communication pipeline

The thesis "User-Centered Adaptive Streaming of Dynamic Point Clouds for Virtual Reality Remote Communication" by CWI PhD student Shishir Subramanyam from the Distributed and Interactive Systems (DIS) group reports on research exploring adaptive delivery optimizations for real-time point cloud user reconstructions. Previous work in the field mainly focused on using the entire point cloud object as the unit of bandwidth allocation in scenes containing multiple point clouds. Other recent work has relied on computationally complex surface estimation in order to spatially segment the point cloud that is unsuited to real-time applications.

Figure 2: Overview of the VR remote communication pipeline

During his PhD, Shishir and the DIS team developed a low-complexity adaptive streaming method, following a user-centered approach to improve the quality of experience. The method optimizes the delivery of a remote user's point cloud reconstruction by segmenting it spatially and estimating surface orientation in real time, combined with an auxiliary utility function that allocates the available bandwidth across the segments. The utility of each segment is defined based on its position and surface orientation and on the position and orientation of the user's viewport.
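
To make this concrete, the sketch below computes an illustrative view-dependent utility for a single segment from its centroid and estimated surface normal, together with the viewport position and gaze direction. The specific weighting (facing and gaze terms scaled down by distance) is an assumption for illustration, not the exact utility function from the thesis.

```python
import numpy as np

# Minimal sketch of a view-dependent segment utility. Assumes each segment carries
# a centroid and an estimated surface normal, and the viewport is given by a
# position and a unit forward (gaze) direction. The weighting is illustrative only.

def segment_utility(seg_center, seg_normal, view_pos, view_dir):
    to_viewer = view_pos - seg_center
    dist = np.linalg.norm(to_viewer)
    to_viewer /= dist
    # Facing term: segments whose surface points toward the viewer matter more;
    # back-facing segments (negative dot product) receive zero utility.
    facing = max(0.0, float(np.dot(seg_normal, to_viewer)))
    # Gaze term: segments near the center of the viewport matter more.
    gaze = max(0.0, float(np.dot(view_dir, -to_viewer)))
    # Nearby segments occupy more of the viewport, so scale down with distance.
    return facing * gaze / (1.0 + dist)

u = segment_utility(
    seg_center=np.array([0.0, 1.2, 2.0]),
    seg_normal=np.array([0.0, 0.0, -1.0]),  # surface facing back toward the viewer
    view_pos=np.array([0.0, 1.6, 0.0]),
    view_dir=np.array([0.0, 0.0, 1.0]),     # looking down +z toward the segment
)
print(round(u, 3))
```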

More Information

PhD Defence Shishir Subramanyam: Thursday, 12th December 2024 12:00-14:00

Location: Senate Hall, on the 2nd floor of the Aula Conference Centre, Mekelweg 5, 2628 CD Delft

Link to live stream

Promotor: Prof. dr. P. Cesar

Promotor: Prof. dr. A. Hanjalic

Copromotor: Dr. I. Viola

Relevant Publications

  • S. Subramanyam, I. Viola, J. Jansen, E. Alexiou, A. Hanjalic and P. Cesar. 2022. Evaluating the Impact of Tiled User-Adaptive Real-Time Point Cloud Streaming on VR Remote Communication. ACM International Conference on Multimedia (ACM MM).
  • S. Subramanyam, I. Viola, J. Jansen, E. Alexiou, A. Hanjalic and P. Cesar. 2022. Subjective QoE Evaluation of User-Centered Adaptive Streaming of Dynamic Point Clouds. International Conference on Quality of Multimedia Experience (QoMEX).
  • S. Subramanyam, I. Viola, A. Hanjalic, and P. Cesar. 2020. User Centered Adaptive Streaming of Dynamic Point Clouds with Low Complexity Tiling. ACM International Conference on Multimedia (ACM MM).
  • S. Subramanyam, J. Li, I. Viola and P. Cesar. 2020. Comparing the Quality of Highly Realistic Digital Humans in 3DoF and 6DoF: A Volumetric Video Case Study. IEEE Conference on Virtual Reality and 3D User Interfaces (IEEE VR).
  • N. Reimat, Y. Mei, E. Alexiou, J. Jansen, J. Li, S. Subramanyam, I. Viola, J. Oomen, P. Cesar. 2022. Mediascape XR: A Cultural Heritage Experience in Social VR. ACM International Conference on Multimedia (ACM MM).
  • N. Reimat, E. Alexiou, J. Jansen, I. Viola, S. Subramanyam, and P. Cesar. 2021. CWIPC-SXR: Point Cloud dynamic human dataset for Social XR. ACM Multimedia Systems Conference (ACM MMSys).
  • I. Revilla, S. Zamarvide, I. Lacosta, F. Perez, J. Lajara, B. Kevelham, V. Juillard, B. Rochat, M. Drocco, N. Devaud, O. Barbeau, C. Charbonnier, P. de Lange, J. Li, Y. Mei, K. Ɓawicka, J. Jansen, N. Reimat, S. Subramanyam, P. Cesar. 2021. A Collaborative VR Murder Mystery using Photorealistic User Representations. IEEE Conference on Virtual Reality and 3D User Interfaces Abstracts and Workshops (IEEE VR).
  • J. Jansen, S. Subramanyam, R. Bouqueau, G. Cernigliaro, M. Cabre, F. Perez, and P. Cesar. 2020. A pipeline for multiparty volumetric video conferencing: transmission of point clouds over low latency DASH. ACM Multimedia Systems Conference (ACM MMSys).
  • A. Chatzitofis, L. Saroglou, P. Boutis, P. Drakoulis, N. Zioulis, S. Subramanyam, B. Kevelham, C. Charbonnier, P. Cesar, D. Zarpalas, S. Kollias, P. Daras. 2020. HUMAN4D: A Human-Centric Multimodal Dataset for Motions and Immersive Media. IEEE Access.
