School of Science
Division of Life Science

Big Data: Bioinformatic Analysis of Single-cell Genomic Data
Supervisor: WU Angela Ruohao / LIFS
Student: SAPUTRA Alexander William / SENG
Course: UROP1000, Summer

The need to process high-throughput sequencing data is growing alongside the rapid adoption of next-generation sequencing technologies. One method of transcriptome assembly is de novo assembly, in which reads are assembled into transcripts without the use of a reference genome. This report discusses Trinity, a popular software package for this purpose. It covers the algorithms behind Trinity, an evaluation of its performance relative to two other de novo assemblers (Price and IVA), and a demonstration of a typical Trinity-centered workflow (assembly and annotation).

Big Data: Bioinformatic Analysis of Single-cell Genomic Data
Supervisor: WU Angela Ruohao / LIFS
Student: WU Ka Hei / GBUS
Course: UROP2100, Fall

As the global population continues to age, age-related neurodegenerative disorders are predicted to increase to 135 million cases worldwide by 2050, overtaking cancer as the second leading cause of death after cardiovascular disease (Gammon, 2014). In this study, we aim to create a single-nucleus RNA sequencing (snRNA-seq) library to facilitate the study of neurodegenerative disorders. We analyze gene expression in three regions of the mouse brain (cortex, hippocampus, and cerebellum) and identify aging-related molecular and cellular changes by comparing 3-month-old and 17-month-old mice. This progress report focuses on changes in cell composition within the cerebellum and their relevance to the development of neurodegenerative disorders.

Big Data: Bioinformatic Analysis of Single-cell Genomic Data
Supervisor: WU Angela Ruohao / LIFS
Student: YU Jiamu / SSCI
Course: UROP1100, Summer

Bioinformatics is an emerging scientific subdiscipline in which biological data are integrated and interpreted using demanding computational and statistical methods. By harnessing algorithms and data science, this interdisciplinary field can deepen the scientific understanding of biological processes, including COVID-19 pathogenesis. This report documents an application of single-cell analysis to a nasal swab sample from a paediatric patient infected with SARS-CoV-2. By visualising and annotating the nasal swab dataset, this report attempts to contribute to a reference transcriptome database of COVID-19 for future studies, which may provide hints for potential clinical and pharmaceutical applications.