School of Engineering
Department of Computer Science and Engineering

Commonsense Reasoning with Knowledge Graphs
Supervisor: SONG Yangqiu / CSE
Student: WANG Yicheng / SENG
Course: UROP1100, Summer

This report is about the Attention technique and the Transformer architecture in Natural Language Processing. To demonstrate Attention, we first introduce a Neural Machine Translation model called seq2seq. Attention, as an improvement, aims to capture long-term information about the source sentence. We then turn to the Self-Attention technique, which reduces the number of unparallelizable operations and acts as a replacement for the Recurrent Neural Network. Details and applications of Self-Attention are then illustrated with the Transformer architecture. The building blocks of the Transformer, i.e. the encoders and decoders, consist of Self-Attention layers and various additional layers. The Transformer is shown to outperform some older models.

Commonsense Reasoning with Knowledge Graphs
Supervisor: SONG Yangqiu / CSE
Student: WONG Chi Ho / DSCT
Course: UROP1100, Fall

Commonsense knowledge graphs (CKGs) enhance the performance of many NLP tasks, ranging from question answering to text generation. When it comes to natural language, the commonsense knowledge that humans possess is indispensable for making more reasonable predictions. However, the difficulty of applying Graph Neural Networks (GNNs) to a CKG lies in the sparsity of the graph. At the same time, the higher-order structure between concepts is crucial for generating non-obvious concepts. In view of this, we propose a pretraining method for GNNs on the commonsense knowledge completion task, taking reference from the pretraining strategies of GPT-GNN and UniLMv2.
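The Self-Attention mechanism summarized in the first report can be sketched in a few lines. The following is a minimal didactic illustration of scaled dot-product attention (the variant used in the Transformer), not the report's own implementation; the function name and the toy dimensions are assumptions for the example.

```python
# Minimal sketch of scaled dot-product (Self-)Attention; illustrative only.
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v). Returns (output, weights)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # pairwise similarities
    # Row-wise softmax turns scores into attention weights.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Self-attention: queries, keys and values all come from the same sequence,
# so every token can attend to every other token in one parallel step.
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 8))                        # 4 tokens, model dim 8
out, attn = scaled_dot_product_attention(X, X, X)
print(out.shape, attn.shape)                           # (4, 8) (4, 4)
```

Because the whole weight matrix is computed with one matrix product rather than a step-by-step recurrence, this is the sense in which Self-Attention "reduces the number of unparallelizable operations" compared with an RNN.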
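The knowledge completion task targeted by the second report can be illustrated at a very high level: hold out part of a knowledge graph and score candidate entities for the missing slot of a triple. The sketch below uses a toy DistMult-style scorer with random embeddings; all names, relations, and the scoring function are illustrative assumptions and do not reproduce the proposed GPT-GNN/UniLMv2-style pretraining.

```python
# Toy sketch of commonsense knowledge completion as link scoring;
# illustrative only, not the report's method.
import numpy as np

rng = np.random.default_rng(1)
entities = ["person", "cake", "eat", "hunger"]
E = {e: rng.standard_normal(4) for e in entities}  # entity embeddings
R = {"Desires": rng.standard_normal(4)}            # relation embedding

def score(h, r, t):
    """DistMult-style triple score: <e_h, w_r, e_t> (assumed scorer)."""
    return float(np.sum(E[h] * R[r] * E[t]))

# "Mask" the tail of (person, Desires, ?) and rank all candidate tails;
# pretraining would tune the embeddings so held-out tails rank highest.
ranked = sorted(entities, key=lambda t: -score("person", "Desires", t))
print(ranked)
```

In a real pipeline the embeddings would come from a GNN over the (sparse) CKG, which is precisely where the pretraining objective is meant to help.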