UROP Proceedings 2022-23

School of Engineering
Department of Computer Science and Engineering

Using Large Language Models (LLMs) for Software Development
Supervisor: CHEUNG, Shing-Chi / CSE
Student: TSE, Ngo Chun / COGBM
Course: UROP1100, Summer

The use of Large Language Models (LLMs) in daily applications has become increasingly popular in the past few years, especially after the open release of ChatGPT. Yet LLMs are far from perfect in terms of accuracy and credibility. In this project, we focus on measuring the accuracy of application test cases generated with the Generative Pre-trained Transformer (GPT), and we compare the results under different settings. We found a possible way to improve accuracy under our experimental settings and will investigate it further.

Automated Program Synthesis
Supervisor: CHEUNG, Shing-Chi / CSE
Student: LI, Yijia / MATH-AM
Course: UROP1000, Summer

Large Language Models (LLMs) perform well on coding tasks that are prevalent in their training datasets, but not on less-trained domain-specific tasks. This project aims to leverage LLMs for domain-specific program synthesis with unstructured or semi-structured inputs. To learn from the latest research progress in the field and improve our method, I collected, read, and summarized this year's conference papers that focus on repository-level code completion via prompt engineering on LLMs. I also attempted to reproduce the results of several papers whose authors provided their source code and data. Since the datasets we collected may already have been used to train the LLMs, I also participated in data wrangling tasks, for example, mutating data in the test set.

Dynamic Maintenance of Alphabetic Search Trees
Supervisor: GOLIN, Mordecai Jay / CSE
Student: SHU, Tian / DSCT
Course: UROP1100, Fall

This UROP project studies the optimal dynamic alphabetic search tree problem and experiments with a heuristic that dynamically maintains a left-adjusted tree structure.
The heuristic rebuilds sufficiently large subtrees to keep every element at a depth within a constant additive error of the optimal depth. The experiments reveal that the current optimizations of this heuristic are still insufficient for worst-case scenarios. The remaining bottleneck is to eliminate the left-adjusted structure and to devise a heuristic that distributes the elements evenly across the tree.
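The general idea of maintaining near-optimal depth by rebuilding large subtrees can be illustrated with a simplified, scapegoat-style sketch. This is a generic rebalancing-by-rebuilding scheme, not the project's left-adjusted heuristic; the imbalance threshold ALPHA and all names here are assumptions chosen for illustration.

```python
# Illustrative sketch (not the project's exact heuristic): a binary search
# tree that rebuilds any sufficiently unbalanced subtree into a perfectly
# balanced one, so no element drifts far below its optimal depth.

class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None

def size(node):
    # Recomputed on demand for simplicity; a real implementation
    # would cache subtree sizes in the nodes.
    return 0 if node is None else 1 + size(node.left) + size(node.right)

def flatten(node, out):
    # In-order traversal: collects the keys in sorted order.
    if node is not None:
        flatten(node.left, out)
        out.append(node.key)
        flatten(node.right, out)

def build_balanced(keys):
    # Rebuild a sorted key list into a perfectly balanced subtree.
    if not keys:
        return None
    mid = len(keys) // 2
    node = Node(keys[mid])
    node.left = build_balanced(keys[:mid])
    node.right = build_balanced(keys[mid + 1:])
    return node

ALPHA = 0.7  # imbalance threshold (assumed value)

def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    # Rebuild this subtree if one child holds too large a fraction of it.
    n = size(node)
    if n > 2 and max(size(node.left), size(node.right)) > ALPHA * n:
        keys = []
        flatten(node, keys)
        node = build_balanced(keys)
    return node

def depth(node):
    if node is None:
        return 0
    return 1 + max(depth(node.left), depth(node.right))
```

Inserting keys in sorted order, the worst case for a plain BST, keeps the depth logarithmic here because every sufficiently skewed subtree is rebuilt on the way back up the insertion path. The recomputed sizes make this sketch quadratic overall; cached sizes would restore the usual amortized bounds.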