Technische Universität München

Sparse Camera Gaussian Splatting for Challenging (Operating Room) Scenes

09.09.2024, Final theses, Bachelor's and Master's theses

Overview

This project aims to generate a radiance field representation of complex operating room scenes from a sparse set of static cameras. The algorithm shall distinguish between static and dynamic parts of the scene and focus on reconstruction from sparse views.

Background & Motivation

Gaussian Splatting [1] is interesting for real-world applications because it matches the training times of InstantNGP while outperforming it in rendering quality, which makes it more applicable to virtual reality. Gaussian Splatting typically relies on capturing a continuous video and using SfM algorithms for camera pose estimation. Recently, camera-pose-free Gaussian splatting has been proposed [2]; however, it still relies on a larger set of images. In the operating room, the number of mountable cameras can be restricted by the room size or the sterile field, so a reconstruction algorithm that works on a sparse set of cameras is beneficial. Among generalizable approaches, one method builds upon two neighboring images [3]; in the operating room, however, cameras are unlikely to be placed directly next to each other. For sparse camera setups, inspiration can be found in related radiance field work such as NeRF-based methods [7]. Unlike in SfM-based approaches, the camera poses are known a priori because the cameras do not move.
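Because the cameras are fixed, each pose can be calibrated once and reused for every frame, removing the need for SfM. A minimal sketch of projecting world-space points into such a calibrated static camera (all numeric values below are hypothetical toy values, not calibration data from the datasets mentioned here):

```python
import numpy as np

def project_points(points_w, K, R, t):
    """Project Nx3 world-space points into pixel coordinates using a
    known static camera pose (world-to-camera rotation R, translation t)
    and pinhole intrinsics K."""
    pts_cam = R @ points_w.T + t[:, None]   # (3, N) camera-space coords
    pts_img = K @ pts_cam                   # homogeneous pixel coords
    uv = pts_img[:2] / pts_img[2]           # perspective divide
    return uv.T, pts_cam[2]                 # pixel positions, depths

# toy example: identity pose, simple intrinsics
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)
t = np.zeros(3)

uv, depth = project_points(np.array([[0.0, 0.0, 2.0]]), K, R, t)
# a point on the optical axis lands at the principal point (320, 240)
```

With fixed cameras, this projection can be used both to initialize splats from streamed point clouds and to check reprojection consistency across views.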

Student’s Task

The student should start with the static background of the operating room and reconstruct it in a sparse camera setup. Using optical flow, the dynamic parts in later frames can be masked out and the point cloud of these parts can be streamed directly; similar ideas can be found in related literature [8, 9]. For this, scenes from the 4D-OR dataset [5] (GitHub: https://github.com/egeozsoy/4D-OR) or the SegmentOR dataset [6] (GitHub: https://github.com/bastianlb/segmentOR) can be used. The idea is to stream a point cloud to the radiance field for the dynamic parts and to reconstruct the background as fast as possible, working towards a real-time capable approach. Ideally, the final approach can be explored interactively or visualized in virtual reality, either directly in Unity or via WebGL.
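As an illustration of the masking step, the sketch below thresholds a per-pixel optical-flow field to separate dynamic from static pixels. The flow itself could come from any estimator (e.g. Farneback or a learned method), and the pixel threshold is a hypothetical tuning parameter:

```python
import numpy as np

def dynamic_mask(flow, threshold=1.0):
    """Return a boolean mask marking dynamic pixels.

    flow      -- (H, W, 2) per-pixel displacement field in pixels,
                 produced by any optical-flow estimator
    threshold -- displacement magnitude (pixels) above which a pixel
                 is treated as dynamic (tuning parameter)
    """
    magnitude = np.linalg.norm(flow, axis=-1)   # per-pixel flow length
    return magnitude > threshold                # True = dynamic pixel

# toy flow field: only the centre pixel of a 4x4 image moves
flow = np.zeros((4, 4, 2))
flow[2, 2] = (3.0, 0.0)
mask = dynamic_mask(flow)
# exactly one pixel, at (2, 2), is classified as dynamic
```

Static pixels (mask False) would feed the background reconstruction, while dynamic pixels would be excluded and covered by the streamed point cloud instead.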

Technical Prerequisites

The student should have some experience in data processing and deep learning. Knowledge of 3D computer vision is beneficial. Additional knowledge of different sensor types etc. is beneficial as well, but not mandatory.

Please send your transcript of records, CV and motivation to: Hannah Schieber (hannah.schieber@tum.de) with CC to hex-thesis.ortho@mh.tum.de


Literature
[1] Kerbl, B., Kopanas, G., Leimkühler, T., & Drettakis, G. (2023). 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Transactions on Graphics, 42(4).
[2] Fu, Y., Liu, S., Kulkarni, A., Kautz, J., Efros, A. A., & Wang, X. (2023). COLMAP-Free 3D Gaussian Splatting. arXiv preprint arXiv:2312.07504.
[3] Charatan, D., Li, S., Tagliasacchi, A., & Sitzmann, V. (2023). pixelSplat: 3D Gaussian Splats from Image Pairs for Scalable Generalizable 3D Reconstruction. arXiv preprint arXiv:2312.12337.
[4] Wu, G., Yi, T., Fang, J., Xie, L., Zhang, X., Wei, W., ... & Wang, X. (2023). 4D Gaussian Splatting for Real-Time Dynamic Scene Rendering. arXiv preprint arXiv:2310.08528.
[5] Özsoy, E., Örnek, E. P., Eck, U., Czempiel, T., Tombari, F., & Navab, N. (2022, September). 4D-OR: Semantic Scene Graphs for OR Domain Modeling. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 475-485). Cham: Springer Nature Switzerland.
[6] Bastian, L., Derkacz-Bogner, D., Wang, T. D., Busam, B., & Navab, N. (2023, October). SegmentOR: Obtaining Efficient Operating Room Semantics Through Temporal Propagation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (pp. 57-67). Cham: Springer Nature Switzerland.
[7] Naumann, J., Xu, B., Leutenegger, S., & Zuo, X. (2023). NeRF-VO: Real-Time Sparse Visual Odometry with Neural Radiance Fields. arXiv preprint arXiv:2312.13471.
[8] Karaoglu, M. A., Schieber, H., Schischka, N., Görgülü, M., Grötzner, F., Ladikos, A., ... & Busam, B. (2023). DynaMoN: Motion-Aware Fast And Robust Camera Localization for Dynamic NeRF. arXiv preprint arXiv:2309.08927.
[9] Zollmann, S., Dickson, A., & Ventura, J. (2020, November). CasualVRVideos: VR Videos from Casual Stationary Videos. In Proceedings of the 26th ACM Symposium on Virtual Reality Software and Technology (pp. 1-3).

Contact: hannah.schieber@tum.de, hex-thesis.ortho@mh.tum.de

More information

https://hex-lab.io