School of Business and Management
Department of Accounting

Deep Learning in Natural Language Processing
Supervisor: HUANG Allen Hao / ACCT
Co-supervisor: YANG Yi / ISOM
Student: KHURANA Vinayak / CPEG
Course: UROP1000, Summer

In this paper, the performance of five machine learning algorithms was evaluated on corporate ESG text classification. Using a sample of 4500 manually labelled sentences from ESG and sustainability reports across 11 industries, it is shown that the Support Vector Machine (SVM) was the best overall choice among the five models, the other four being Naïve Bayes, XGBoost, Google's Bidirectional Encoder Representations from Transformers (BERT), and HKUST's FinBERT, a model adapted from BERT for financial text. Two metrics were used to evaluate performance: accuracy and execution time. Although BERT achieved the highest accuracy and Naïve Bayes the fastest execution time, SVM was chosen for its strongest combined performance across both metrics.

Deep Learning in Natural Language Processing
Supervisor: HUANG Allen Hao / ACCT
Co-supervisor: YANG Yi / ISOM
Student: LI Qiuru / IS
Course: UROP1000, Summer

In this paper, we fine-tune FinBERT, a deep learning NLP model pre-trained on finance-domain text, for an NLP classification task and compare its performance with that of another NLP model, BERT. First, we collected ESG reports from companies in different industries, extracted sentences from them, and labelled the sentences ourselves into 13 ESG categories, including non-ESG and non-sentence. Then, we randomly selected 4500 sentences for fine-tuning. In the fine-tuning step, we fine-tuned both the pre-trained FinBERT and the pre-trained BERT models for the ESG sentence classification task and compared their accuracy. We found that FinBERT performed slightly better than BERT.
Our results have implications for those who want to extract insights from financial texts.
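
Both abstracts compare classifiers on the same labelled sentences, and the first also compares execution time. The following is a minimal sketch of such a comparison harness, not the authors' code: the toy sentences, the three labels, and the two stand-in classifiers (MajorityBaseline and WordOverlap) are all hypothetical and take the place of the SVM, Naïve Bayes, XGBoost, BERT and FinBERT models named above.

```python
import time
from collections import Counter

# Hypothetical toy data for illustration only; not drawn from the study's 4500 sentences.
TRAIN = [
    ("We reduced carbon emissions at our plants", "environmental"),
    ("Water usage fell across all sites", "environmental"),
    ("Employee safety training was expanded", "social"),
    ("We support local community programmes", "social"),
    ("The board has an independent audit committee", "governance"),
    ("Directors are elected annually by shareholders", "governance"),
]
TEST = [
    ("Carbon emissions fell this year", "environmental"),
    ("Safety training for every employee", "social"),
    ("The audit committee met four times", "governance"),
]


def tokenize(text):
    """Lowercase and split on whitespace."""
    return text.lower().split()


class MajorityBaseline:
    """Stand-in model: always predicts the most frequent training label."""

    def fit(self, data):
        self.label = Counter(label for _, label in data).most_common(1)[0][0]
        return self

    def predict(self, sentence):
        return self.label


class WordOverlap:
    """Stand-in model: predicts the label whose training vocabulary shares the most words."""

    def fit(self, data):
        self.vocab = {}
        for sentence, label in data:
            self.vocab.setdefault(label, set()).update(tokenize(sentence))
        return self

    def predict(self, sentence):
        words = set(tokenize(sentence))
        return max(self.vocab, key=lambda label: len(words & self.vocab[label]))


def evaluate(model, train, test):
    """Return (accuracy, elapsed seconds) for fitting on train and predicting on test."""
    start = time.perf_counter()
    model.fit(train)
    predictions = [model.predict(sentence) for sentence, _ in test]
    elapsed = time.perf_counter() - start
    correct = sum(pred == gold for pred, (_, gold) in zip(predictions, test))
    return correct / len(test), elapsed


results = {
    type(model).__name__: evaluate(model, TRAIN, TEST)
    for model in (MajorityBaseline(), WordOverlap())
}
for name, (accuracy, seconds) in results.items():
    print(f"{name}: accuracy={accuracy:.2f}, time={seconds:.6f}s")
```

Any model exposing fit and predict in this shape could be passed to evaluate, which keeps the accuracy and timing measurements identical across models. Whether the timing should cover training, prediction, or both is a design choice that the abstracts do not specify; this sketch times both together.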