Time-Archival Camera Virtualization for Sports and Visual Performance
Author: Suryansh Kumar
Uploaded: 2026-02-26
Description:
Authors: Yunxiao Zhang, William Stone, Suryansh Kumar*
Visual and Spatial AI Lab, College of PVFA
Texas A&M University, College Station, TX, USA
*Corresponding author
Accepted for publication in the Journal of Computer Vision and Image Understanding (CVIU), 2026.
Preprint link: https://arxiv.org/abs/2602.15181
Project website link: https://yunxiaozhangjack.com/tacv/
Source code: https://github.com/JackZhang-SH/Time-...
Completed by students in the Master's Program in Visualization and the Bachelor's Program in Computer Science and Engineering as part of their thesis projects.
Abstract: Camera virtualization, an emerging solution to novel view synthesis, holds transformative potential for visual entertainment, live performances, and sports broadcasting by enabling the generation of photorealistic images from novel viewpoints using images from a limited set of calibrated, static physical cameras. Despite recent advances, achieving spatially and temporally coherent, photorealistic rendering of dynamic scenes with efficient time-archival capabilities, particularly in fast-paced sports and stage performances, remains challenging for existing approaches. Recent methods based on 3D Gaussian Splatting (3DGS) for dynamic scenes can offer real-time view-synthesis results. Yet they are hindered by their dependence on accurate 3D point clouds from structure-from-motion and by their inability to handle large, non-rigid, rapid motions of different subjects (e.g., flips, jumps, articulations, sudden player-to-player transitions). Moreover, independent motions of multiple subjects can break the Gaussian-tracking assumptions commonly used in 4DGS, ST-GS, and other dynamic splatting variants. This paper advocates reconsidering a neural volume rendering formulation for camera virtualization with efficient time-archival capabilities, making it useful for sports broadcasting and related applications. By modeling a dynamic scene as rigid transformations across multiple synchronized camera views at a given time, our method performs neural representation learning that provides enhanced visual rendering quality at test time. A key contribution of our approach is its support for time-archival: users can revisit any past temporal instance of a dynamic scene and perform novel view synthesis, enabling retrospective rendering for replay, analysis, and archival of live events, a functionality absent in existing neural rendering and novel view synthesis methods for dynamic scenes.
While, in principle, dynamic 3DGS approaches can also perform time-archival, doing so would require either a multi-view structure-from-motion (SfM) point cloud stored at every time step or some form of additional multi-body temporal modeling constraint, both of which are complex, computationally expensive, and potentially memory-intensive. We argue that a dynamic scene observed under a well-constrained, synchronized multiview setup, typical of sports and visual-performance scenarios, is already strongly constrained by geometry, so a temporally coupled constraint or 3D point-cloud initialization may not be needed. Extensive experiments and ablations on established benchmarks and on our newly proposed dynamic-scene datasets demonstrate that our method surpasses 4DGS-based baselines in rendered image quality and other performance metrics for time-archival view synthesis of dynamic scenes, setting a new standard for virtual camera systems in dynamic visual media. Furthermore, our approach could be an encouraging step towards compactly modeling the plenoptic function, allowing time-archival of a long video sequence.
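To give a concrete feel for the time-archival idea, the sketch below shows generic time-conditioned neural volume rendering: a field queried at scene time t is volume-rendered along a ray, so re-querying a past t re-renders an archived instant from any viewpoint. This is a minimal illustration of the general mechanism, not the authors' implementation; `render_ray` and `toy_field` (a moving soft sphere standing in for a learned network) are hypothetical names introduced here.

```python
import numpy as np

def render_ray(field, origin, direction, t_query, n_samples=64, near=0.1, far=4.0):
    """Volume-render one ray through a time-conditioned radiance field.

    `field(points, t)` returns (sigma, rgb) for the sample points at scene
    time `t`; passing a past `t` re-renders that archived instant.
    """
    ts = np.linspace(near, far, n_samples)          # depths along the ray
    pts = origin + ts[:, None] * direction          # (N, 3) sample positions
    sigma, rgb = field(pts, t_query)                # densities and colors

    delta = np.diff(ts, append=far)                 # spacing between samples
    alpha = 1.0 - np.exp(-sigma * delta)            # per-sample opacity
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alpha[:-1]]))  # transmittance
    weights = alpha * trans                         # compositing weights
    return (weights[:, None] * rgb).sum(axis=0)     # composited RGB color

def toy_field(pts, t):
    """Stand-in for a learned field: a soft sphere whose center moves with t."""
    center = np.array([0.3 * np.sin(t), 0.0, 2.0])
    d = np.linalg.norm(pts - center, axis=1)
    sigma = 5.0 * np.exp(-10.0 * d ** 2)            # Gaussian density bump
    rgb = np.tile([1.0, 0.5, 0.2], (len(pts), 1))   # constant orange color
    return sigma, rgb

# Same virtual camera ray, two different archived time instants.
origin, direction = np.zeros(3), np.array([0.0, 0.0, 1.0])
color_now = render_ray(toy_field, origin, direction, t_query=0.0)
color_past = render_ray(toy_field, origin, direction, t_query=1.5)
```

Because time enters only as a query argument to the field, any past instant can be revisited without storing per-timestep point clouds, which is the convenience the abstract contrasts with dynamic 3DGS pipelines.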