UROP Proceedings 2021-22

School of Science
Division of Life Science

Imaging Genomics: Developing Machine Learning Methods for Precision Cancer Diagnosis
Supervisor: WANG Jiguang / LIFS
Student: LIU Zhiyin / BCB
Course: UROP1100, Summer

The Connectivity Map (CMap) links genes, small molecules, and disease states through shared gene expression signatures. It is commonly used to produce and analyze perturbational datasets, advancing the identification of new medicines and the understanding of human disease. Because the small size of the original dataset limited its capabilities, researchers greatly enlarged the data volume in the next-generation Connectivity Map using the L1000 high-throughput gene expression profiling technology. In our previous research, a specific type of cerebral cavernous malformation (CCM) was defined by a somatic MAP3K3 mutation, and thirty up-regulated genes were found. These genes were used with the L1000-based next-generation Connectivity Map to analyze and identify candidate drugs for CCM.

The Application of BIG DATA Technologies in Precision Cancer Medicine
Supervisor: WANG Jiguang / LIFS
Student: LEUNG Kung Nam / BTGBM
Course: UROP1000, Summer

With the lifetime risk of cancer rising worldwide (Ahmad, A. S., Ormiston-Smith, N. & Sasieni, P. D., 2015), cancer prognosis, or the evaluation of cancer development and survival analysis, is playing a more significant role in treatment decisions. Clinical factors for prognosis include age, race, ethnicity, tumor histological type, and therapeutic history. This project aims to assess the feasibility of combining these clinical features with pan-cancer features, which are data on frequently mutated genes and other genomic abnormalities (Illumina, Inc., 2022), for survival time prediction using machine learning methods.
The project focuses on glioblastoma multiforme (GBM) and carries out a fundamental analysis of the provided GBM dataset, followed by an evaluation of different regression techniques for survival prediction. The goal is to provide a foundation for developing more advanced models on the dataset, including deep learning models.

Study of Blood Cell Development Using Zebrafish Model
Supervisor: WEN Zilong / LIFS
Student: FAN Yining / BCB
Course: UROP3100, Fall

mpeg1.1 has long been used as a marker gene for macrophages in mouse and zebrafish. Recently, however, there has been increasing evidence of heterogeneity among mpeg1.1+ cells in adult zebrafish. Therefore, to give a thorough classification of mpeg1.1+ cells, we sorted DsRed+ cells from the epidermis, gut, gill, kidney, and brain of adult Tg(mpeg1.1:loxP-DsRed-loxP-GFP) zebrafish using FACS (fluorescence-activated cell sorting) and conducted single-cell RNA sequencing. Six distinct cell populations were isolated, and further experiments were conducted to verify the distribution of these cell populations and explore the functions of each.