UROP Proceedings 2020-21

School of Business and Management
Department of Accounting

Machine Learning and Sentiment Classification in Chinese Financial Text
Supervisor: HUANG Allen Hao / ACCT
Co-supervisor: YOU Haifeng / ACCT
Student: SHIU Heng / QFIN
Course: UROP1100, Summer

Bidirectional Encoder Representations from Transformers, or BERT, is a transformer-based machine learning model developed by Google that has significantly changed the landscape of NLP. Compared to previous models that read text sequentially, BERT’s key innovation is bidirectional learning: it derives the context of a word from the entire sentence, both to its left and to its right. Furthermore, BERT is pre-trained on large amounts of data. For instance, the original BERT model released by Google is pre-trained on a large corpus that includes the roughly 2.5 billion words of English Wikipedia. Value Simplex FinBERT, a BERT model designed for Chinese financial text, is pre-trained on over 4 million finance-related articles. This extensive pre-training allows BERT to perform well even when applied to relatively small datasets. BERT is also designed to be highly versatile: users can fine-tune the model for different tasks by simply adding a few output layers. Downstream tasks that BERT can accomplish include extraction-based question answering, natural language inference, sentiment analysis, and others.

Machine Learning and Sentiment Classification in Chinese Financial Text
Supervisor: HUANG Allen Hao / ACCT
Co-supervisor: YOU Haifeng / ACCT
Student: TU Bingying / DSCT
Course: UROP1100, Summer

The goal of this research is to use the Chinese BERT model to perform sentiment classification on annual reports from the constituent companies of the CSI 300 index. The first stage of the research was to extract sentences from the annual reports and prepare them for training. A criterion was set for labeling the sentiment of each sentence, and basic information about each sentence was collected.
After the data were prepared, the Chinese BERT model was used to perform sentiment classification. We tried to use the model to analyze the sentiment of annual reports and thereby facilitate investors’ decision making.

Investment Analysis with Machine Learning
Supervisor: YOU Haifeng / ACCT
Student: CHAN Chak Him / ECOF, LU Xiaoyi / ECON, ZOU Haoxiang / RMBI
Course: UROP1100, Fall; UROP1100, Fall; UROP1100, Fall

The commodity futures market in China is distinctive in its regulations and style of transaction, and some important indicators of the market (e.g. hedging pressure, liquidity, momentum, term structure, etc.) might behave differently from those in the U.S. and European markets. Although the Chinese commodity futures market remains immature in terms of regulation and is shaped by the behavior of inexperienced individual investors, few studies have investigated it. This report aims to better understand commodity futures in China by developing features and using them to design a portfolio trading strategy. It also uses statistical and machine learning methods to test the effectiveness and consistency of the strategy and to examine the features’ impact on future returns.
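
To make the feature-based approach in the abstract above more concrete, the following is a minimal sketch of two of the indicators it names (momentum and term structure) and of a simple long-short portfolio rule. The report does not give its feature definitions or its strategy, so everything here is an assumption made for illustration: the function names, the lookback convention, the annualized log-price definition of the term-structure signal, and the equal-weight ranking rule are all hypothetical.

```python
from math import log


def momentum(prices, lookback):
    """Cumulative simple return over the last `lookback` periods.

    Assumed definition: last price divided by the price `lookback`
    periods earlier, minus one.
    """
    if lookback < 1 or len(prices) <= lookback:
        raise ValueError("need more than `lookback` prices")
    return prices[-1] / prices[-1 - lookback] - 1.0


def roll_yield(near_price, far_price, months_between):
    """Annualized log price gap between a near and a far contract.

    Assumed term-structure signal: positive when the near contract is
    priced above the far one (backwardation), negative in contango.
    """
    if months_between <= 0:
        raise ValueError("months_between must be positive")
    return (log(near_price) - log(far_price)) * 12.0 / months_between


def long_short_weights(signal, n_legs):
    """Equal-weight long the top `n_legs` and short the bottom `n_legs`.

    `signal` maps a contract name to its feature value. Contracts in the
    middle of the ranking receive zero weight.
    """
    ranked = sorted(signal, key=signal.get)  # ascending by signal value
    if n_legs < 1 or 2 * n_legs > len(ranked):
        raise ValueError("n_legs is too large for the number of contracts")
    weights = {name: 0.0 for name in ranked}
    for name in ranked[-n_legs:]:
        weights[name] = 1.0 / n_legs
    for name in ranked[:n_legs]:
        weights[name] = -1.0 / n_legs
    return weights
```

With a dictionary of per-contract signals, `long_short_weights(signal, n_legs=1)` would go long the highest-ranked contract and short the lowest-ranked one, so the weights sum to zero. A study like the one described would then evaluate the resulting returns out of sample, which this sketch does not attempt.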