UROP Proceedings 2022-23

School of Engineering
Department of Computer Science and Engineering

Scene Depth Diffusion
Supervisor: XU, Dan / CSE
Student: LIU, Minghao / SENG
Course: UROP1000, Summer

This report presents an implementation of diffusion models for monocular depth estimation (Saxena et al., 2023). We adopted the approach described in the paper and used an L1 loss function to train the diffusion models on noisy and incomplete depth data. We also incorporated depth infilling techniques to address missing or incomplete depth information, improving the overall quality and completeness of the training dataset. Furthermore, we employed step-unrolled denoising diffusion (SUD) to reduce the latent distribution shift between the training and inference stages. By unrolling the diffusion process for one step, we improved the stability of training and enhanced the model’s ability to capture complex relationships between input images and depth structures.

Code on GitHub: Scene Depth Diffusion

Scene Depth Diffusion
Supervisor: XU, Dan / CSE
Student: NARAYAN, Aadityavardhan / COMP
Course: UROP1100, Spring

Depth prediction has a myriad of applications in self-driving vehicles, from SLAM to 3D object detection. The overarching objective of this UROP project is to investigate the potential of diffusion models in this domain. The method investigated takes the depth prediction output of a strong existing baseline model, such as a Dense Prediction Vision Transformer, and refines it using diffusion models. This forms a novel problem statement: diffusion models are stochastic in nature, so pixelwise precision, which is integral to a depth prediction task, may be hard to achieve. This project thus offers a promising development for the use of diffusion models, not only in depth prediction tasks but in all dense prediction tasks.
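
As an illustration of the L1 training objective mentioned in the first report above, the following is a minimal sketch of an L1 loss computed only over pixels with valid depth. It is not taken from the project code: the function name `masked_l1_loss`, the use of NumPy, and the choice to mask out missing pixels (rather than infill them first) are all assumptions made for this example.

```python
import numpy as np


def masked_l1_loss(pred, target, valid_mask):
    """Mean absolute error over pixels whose ground-truth depth is valid.

    Hypothetical helper for illustration only; the reports do not say
    how missing pixels enter the loss.
    """
    valid = np.asarray(valid_mask, dtype=bool)
    if not valid.any():
        # No supervised pixels in this sample: contribute zero loss.
        return 0.0
    abs_err = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    return float(abs_err[valid].mean())


# Toy 2x2 depth map; the bottom-left ground-truth pixel is missing.
pred = np.array([[1.0, 2.0], [3.0, 4.0]])
target = np.array([[1.0, 0.0], [0.0, 4.0]])
valid = np.array([[1, 1], [0, 1]])
loss = masked_l1_loss(pred, target, valid)  # (0 + 2 + 0) / 3
```

In this toy case the missing pixel is ignored, so the loss averages the three remaining absolute errors. An L1 penalty grows linearly with the error, which makes it less sensitive to outlier depth values than a squared-error loss.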