UROP Proceedings 2022-23

School of Engineering
Department of Computer Science and Engineering

Scene Depth Diffusion
Supervisor: XU, Dan / CSE
Student: TONG, Tsun Man / COMP
Course: UROP1000, Summer

The project “Scene Depth Estimation” was conducted during the Summer 2023 term under the supervision of Professor Dan Xu. Diffusion models have been widely used in image processing and computer vision in light of their generative nature. The project aimed to design a diffusion-based transformer model for robust scene depth estimation. The model has been implemented with custom modifications, based on the model proposed by Saxena et al. (2023). Due to time and equipment constraints, it has not yet been through a proper training process. However, the model has been tested with a few images from the NYUDv2 dataset and is expected to establish new state-of-the-art results over current baseline models and to generalize well to different scenarios, benefiting from the power of diffusion. The model is implemented in Python with PyTorch as the deep learning framework, along with other well-known open-source libraries such as OpenCV and NumPy. Please refer to https://github.com/mt1516/Monocular_Depth_Estimation_using_Diffusion_Models for the code implementation. Special thanks to Professor Dan XU for his guidance and support throughout the project.

Real-time 3D Scene Reconstruction from End-to-End Deep Learning
Supervisor: XU, Dan / CSE
Student: SUN, Zhuotao / COMP
Course: UROP1100, Spring

In this project, I focused on the paper NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video (Sun et al., 2021). During the semester, I read through the paper in detail and studied the related background. Although limits on my ability and time meant that I did not build the model during the semester, I gained a lot from the research experience.
In this report, I will first briefly describe the research process this semester, then describe what I gained from it, and finally discuss possible directions for future work.

Self-Supervised Scene Depth Estimation from the Wild
Supervisor: XU, Dan / CSE
Student: LU, Junchi / SENG
Course: UROP1100, Summer

For an AI system, scene depth is one of the most fundamental cues for perceiving the environment and collecting information. The aim of this project is to develop a deep learning system that estimates scene depth from videos in a self-supervised manner. This summer, I investigated a project called Monocular Depth Estimation using Diffusion Models. The DepthGen model proposed in that work achieves SOTA performance on the indoor NYU dataset and near-SOTA results on the outdoor KITTI dataset. Based on the group’s paper and their project webpage (by Saurabh Saxena, Abhishek Kar, Mohammad Norouzi, and David J. Fleet), I gained an overview of the approach and features of the DepthGen model. After further analysis, I will implement the model with similar methods and algorithms.