UROP Proceedings 2022-23

School of Engineering
Department of Electronic and Computer Engineering

Robust and Generalized Methods for Medical Image Analysis
Supervisor: LI, Xiaomeng / ECE
Student: SUN, Haosen / DSCT
Course: UROP3100, Fall; UROP4100, Spring

Predicting survival outcomes in computational pathology is challenging because of the complex interactions within the tumor microenvironment captured in gigapixel whole slide images (WSIs). In this work, we propose a Multimodal Sparse-OT Co-Attention Transformer (MSOCAT) framework that learns an interpretable, sparse co-attention mapping between WSIs and genomic features. MSOCAT is inspired by approaches in Visual Question Answering (VQA) and uses optimal transport (OT) with novel constrained variants to jointly select and align elements, such as histology patches and genes, as a justification for the downstream prediction. The constraints produce highly sparse alignment patterns, which makes the alignments interpretable. Moreover, our co-attention transformation reduces the space complexity of WSI bags, enabling the adaptation of Transformer layers as a general encoder backbone in multiple instance learning. We apply the proposed method to five cancer datasets (4,730 WSIs, 67 million patches) and demonstrate that it consistently outperforms state-of-the-art methods while providing interpretable and sparse alignments. Our approach can be used for a wide range of future applications within computational pathology and beyond, where interpreting the learned alignment is crucial for decision-making.

Deep Learning for Multi-class Retinal Disease Classification in Real-world Setting
Supervisor: LI, Xiaomeng / ECE
Student: KRIUK, Boris / SENG; MA, Wanqin / ELEC
Course: UROP1100, Summer; UROP4100, Summer

This summer, we were assigned to an undergraduate research program on medical computer vision and machine learning, supervised by Professor Xiaomeng LI. KRIUK Boris focuses on loss function analysis in computer vision models; details are given in Part 2. MA Wanqin focuses on combining computer vision with natural language models, specifically the Medical CLIP model; details are given in Part 3.

Deep Learning for Multi-class Retinal Disease Classification in Real-world Setting
Supervisor: LI, Xiaomeng / ECE
Student: SHI, Haochen / COMP
Course: UROP1100, Fall

Image registration is the process of transforming different sets of data into one coordinate system. It is widely applied in medicine, for example in aligning anatomical images acquired from different views, at different times, or on different machines, and in computer-aided diagnosis, disease follow-up, computational model building, and radiation therapy. Traditional image registration obtains a spatial transformation by numerically optimizing an objective function separately for each pair of images. However, this is time-consuming for large datasets or high-dimensional data. In this report, I introduce traditional image registration methods, deep-learning-based image registration methods, and the denoising diffusion probabilistic model and its application to image registration.
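
The per-pair optimization that traditional registration relies on can be illustrated with a minimal sketch. It is not the method used in the report: the function name `register_translation`, the synthetic images, the restriction to integer translations, and the exhaustive search are all assumptions chosen for brevity. The objective is the mean squared error between the fixed image and the shifted moving image.

```python
import numpy as np

def register_translation(fixed, moving, max_shift=5):
    """Search integer (dy, dx) shifts and return the one with the lowest MSE.

    Hypothetical helper for illustration only; shifts wrap around the image
    border because np.roll is circular.
    """
    best_shift, best_cost = (0, 0), np.inf
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            # Apply the candidate transformation to the moving image.
            candidate = np.roll(moving, shift=(dy, dx), axis=(0, 1))
            # Objective function: mean squared intensity difference.
            cost = float(np.mean((fixed - candidate) ** 2))
            if cost < best_cost:
                best_shift, best_cost = (dy, dx), cost
    return best_shift, best_cost

# Synthetic pair: the moving image is the fixed image shifted by (3, -2).
fixed = np.zeros((32, 32))
fixed[10:20, 12:22] = 1.0
moving = np.roll(fixed, shift=(3, -2), axis=(0, 1))

shift, cost = register_translation(fixed, moving)
print(shift, cost)
```

The search has to be repeated from scratch for every image pair, and its cost grows quickly with the number of transformation parameters and the image size, which is the inefficiency the abstract points to for large or high-dimensional data.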