UROP Proceedings 2022-23

School of Engineering
Department of Computer Science and Engineering

Efficient Queries over Database
Supervisor: WONG, Raymond Chi Wing / CSE
Student: LUI, Ka Kit / COMP
Course: UROP1000, Summer

Data visualizations (DVs) are important for analyzing and presenting a dataset. Declarative visualization languages (DVLs, e.g., Vega-Lite, ggplot2) have been used to generate DVs, but they are usually difficult to master and thus unsuitable for non-technical users. The task of transforming natural language queries into visualizations (NL2VIS), which generates visualizations in the form of DVLs given a database and natural language queries, has been proposed so that non-technical users can also generate visualizations easily. This report proposes a model based on generative large language models to address the NL2VIS problem. A literature review on the NL2VIS problem and on large language model-based recommendation systems is conducted to explore the feasibility of this approach.

Efficient Queries over Database
Supervisor: WONG, Raymond Chi Wing / CSE
Student: REN, Xiyu / MATH-GM
Course: UROP1100, Spring

Session-based recommendation (SR) aims to anticipate a user's next moves given the sequence of previous moves in the same session. It has shown convincing performance and has wide applications. However, several aspects of graph neural network (GNN)-based SR can still be improved to yield better performance. In this report, we introduce some basic machine learning theory and concepts, and then present the relevant methods for improving SR performance proposed in three theses.

Efficient Queries over Database
Supervisor: WONG, Raymond Chi Wing / CSE
Student: XIAO, Ziruo / COMP
Course: UROP1100, Spring

Given two locations in a spatial network, a distance (path) query returns the shortest network distance (path) between them.
This kind of query has various important applications and is one of the most fundamental operations in database and data mining algorithms. Many algorithms have been proposed over the past decades, but obtaining real-time data on large-scale dynamic spatial networks has become increasingly common. In practice, previous algorithms suffer from long update times when handling queries on dynamic networks. Our research seeks an algorithm with a promising bound on update time. This report presents the work done this semester, focusing mainly on the study of previous work.