UROP Proceedings 2020-21

School of Science
Department of Mathematics

Research in AI and Machine Learning
Supervisor: ZHANG Tong / MATH
Student: SHUM Ka Shun / DSCT
Course: UROP2100, Spring

Conventional context-aware text generation focuses only on What to generate given the surrounding text. Inspired by existing corrupted long-form narratives, we propose a new detection-and-generation setting in which the model must first find Where to generate and then decide What to generate. This task is more challenging because the model needs to detect where a semantic gap may exist and then infer and generate the missing content. We introduce a two-stage prompt-based generation framework and run experiments on the ROCStories dataset. The results show that our model handles this new task effectively and generates reasonable answers.

Research in AI and Machine Learning
Supervisor: ZHANG Tong / MATH
Student: RUBAB Tamzid Morshed / DSCT
Course: UROP1100, Summer

This is a progress report describing what I have learned over the summer: various neural network architectures, training algorithms, their implementation in PyTorch, and several 2D object detection algorithms such as R-CNN, Fast R-CNN, and Faster R-CNN. In this report, I first describe neural networks, covering data collection and pre-processing, designing a neural network model, choosing loss functions, back-propagation, training, and convolutional neural networks. I then describe 2D object detection algorithms by reviewing the R-CNN, Fast R-CNN, and Faster R-CNN papers. Faster R-CNN was the first step towards real-time detection, which is important for tackling many modern problems.

Research in AI and Machine Learning
Supervisor: ZHANG Tong / MATH
Student: XU Yanbo / COSC
Course: UROP1100, Summer

Object detection and localization have gained enormous attention in recent years owing to their potential applications in autonomous driving, robotics, and related fields.
Among the many methods, monocular three-dimensional object detection is a challenging problem because the input lacks depth information, yet humans are capable of estimating distance and locating objects using prior knowledge. With the development of deep learning and ongoing research on 2D object detection, depth estimation, and related tasks, 3D detection has become feasible. Meanwhile, 3D annotations are often hard to obtain, and other types of input data (such as LiDAR signals and multi-view images) are expensive to deploy, so supervised approaches are limited by data. In the context of autonomous driving, the most common type of data is monocular video. Therefore, in this research project, we propose a semi-supervised monocular 3D detection framework that makes good use of both annotated data and raw videos.
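The training pipeline named in the second abstract (forward pass, loss computation, back-propagation, gradient-descent update) can be illustrated with a minimal hand-rolled sketch. This is not any student's actual code; the abstract's projects use PyTorch, but the NumPy version below exposes the gradient computation that PyTorch's autograd would otherwise hide, using a one-parameter linear model on toy data as an assumed example.

```python
import numpy as np

# Toy regression data: y = 2x + 1, no noise, so the optimum is known exactly
X = np.linspace(-1, 1, 64)
y = 2 * X + 1

w, b = 0.0, 0.0   # parameters of the model y_hat = w*x + b
lr = 0.1          # learning rate for gradient descent

for epoch in range(500):
    y_hat = w * X + b                  # forward pass
    loss = np.mean((y_hat - y) ** 2)   # mean-squared-error loss
    # back-propagation: gradients of the loss w.r.t. w and b
    grad = 2 * (y_hat - y) / len(X)
    dw = np.sum(grad * X)
    db = np.sum(grad)
    w -= lr * dw                       # gradient-descent update
    b -= lr * db
```

After training, (w, b) converges to the true parameters (2, 1); in a framework like PyTorch, the two gradient lines would be replaced by a single `loss.backward()` call.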