UROP Proceedings 2020-21

School of Engineering
Department of Computer Science and Engineering

Research on Mining Course Structure
Supervisor: WONG Raymond Chi Wing / CSE
Student: HSU Shang-ling / COSC
Course: UROP4100, Spring

Student performance prediction is a crucial task in education. However, many previous studies suffer from 1) insufficient utilization and fusion of structural and temporal information, and 2) a lack of explainability. To address these problems, we propose the Dynamic Heterogeneous Attention Network (D-HAN), which formulates student performance prediction as a link prediction problem in dynamic heterogeneous graphs, analogous to link prediction in Author-Paper-Venue graphs. D-HAN is mainly composed of heterogeneous graph neural network modules for structural message passing, multi-head self-attention modules for temporal message passing, and a structural-temporal aggregation module for link prediction, which yields the performance prediction along with importance scores for the neighbor nodes and thus provides some explainability.

Real-time 3D Scene Reconstruction from End-to-End Deep Learning
Supervisor: XU Dan / CSE
Student: KIM Jaehyeok / COMP
Course: UROP1000, Summer

3D scene reconstruction is the task of rendering the 3D details of a scene, such as depth, from a set of 2D images captured from different viewpoints. Real-time scene reconstruction is important in many real-life applications, such as self-driving, AR/VR, robotic navigation, and intelligent manufacturing. Recently, a model that represents a scene as Neural Radiance Fields (NeRF) has achieved state-of-the-art results on the 3D scene reconstruction task. NeRF-based variants have since been actively researched to address the drawbacks of the base model and to achieve new state-of-the-art results on different metrics. Although NeRF and its variants still have unresolved drawbacks, they remain a promising and interesting research topic in the computer vision community.
In this report, I review the current state-of-the-art models in NeRF-based 3D scene reconstruction and discuss my progress on different research directions.

Self-Supervised Scene Depth Estimation from the Wild
Supervisor: XU Dan / CSE
Student: YU Mukai / ISDN
Course: UROP1100, Summer

In recent years, deep learning has progressed rapidly in computer vision, with many practical applications, including simultaneous localization and mapping (SLAM) and pose estimation. However, annotated datasets for supervised deep learning remain far from adequate, at least compared with the NLP area, which pushes the computer vision community to further restructure networks and apply workarounds. This report investigates a novel method that simultaneously learns depth, ego-motion, and camera intrinsics from unannotated videos recorded with monocular cameras. The approach is similar to an earlier unsupervised learning method published at CVPR 2017, which learns only depth and ego-motion.
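
The abstract does not describe the training signal, so the following is only a rough illustration. Methods in this family commonly supervise themselves by reprojecting pixels from one video frame into a neighbouring frame using the estimated depth, ego-motion, and camera intrinsics, and then comparing the image content at the two locations. The sketch below shows that reprojection for a single pixel under an assumed pinhole camera model. The function name `reproject_pixel`, the intrinsic matrix `K`, and all numeric values are hypothetical and are not taken from the report.

```python
import numpy as np

def reproject_pixel(u, v, depth, K, R, t):
    """Map pixel (u, v) with known depth in the target frame into the source frame.

    K is a 3x3 pinhole intrinsic matrix (assumed shared by both frames);
    R (3x3) and t (3,) are the relative rotation and translation (ego-motion).
    """
    pixel_h = np.array([u, v, 1.0])
    # Back-project the pixel to a 3D point in the target camera frame.
    point_target = depth * (np.linalg.inv(K) @ pixel_h)
    # Move the point into the source camera frame using the relative pose.
    point_source = R @ point_target + t
    # Project with the intrinsics and divide by the resulting depth.
    projected = K @ point_source
    return projected[:2] / projected[2]

# Hypothetical intrinsics: focal length 500 px, principal point (320, 240).
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])

# With no camera motion, a pixel maps back onto itself.
identity_uv = reproject_pixel(100.0, 50.0, 2.0, K, np.eye(3), np.zeros(3))

# A translation along x of 0.1 units shifts the principal-point pixel sideways.
shifted_uv = reproject_pixel(320.0, 240.0, 2.0, K, np.eye(3), np.array([0.1, 0.0, 0.0]))
```

In a learning setting, `depth`, `R`, `t`, and (for the method investigated here) `K` would be network outputs rather than fixed values, and the reprojection would be applied to every pixel so that a photometric difference between frames can serve as the loss.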