UROP Proceedings 2020-21

School of Engineering
Department of Electronic and Computer Engineering

Projects in Audio Signal Processing - A Multi-Band Compression Approach to Improving Speech Intelligibility Using Wavelet-Based Algorithm
Supervisor: CHAU Kevin / ECE
Student: DINH Anh Dung / COSC
Course: UROP1100, Spring

A paper by Yijia Chen et al. from our UROP team explored a method for compensating for the effects of hearing loss by applying a real-time wavelet-based algorithm to separate the speech signal into multiple sub-bands, which can then be amplified independently. Because hearing loss reduces the dynamic range of hearing differently at different frequencies, the wavelet method allows a more accurate restoration of the original signal than amplifying the whole speech signal uniformly, which may cause audio sections that are already loud to become too loud for listeners. The original paper tackled this problem by passing speech audio, with or without background interference, through a hearing loss simulator and then applying the wavelet-based modifications, using Google speech-to-text transcription for objective evaluation and optimization of speech intelligibility. This report attempts to improve the efficiency of the original algorithm by applying compression techniques to the wavelet sub-bands instead of the entire audio and then amplifying the result to the maximum allowable speech energy.

Projects in Audio Signal Processing - On Audio Signal Processing and Hearing Loss
Supervisor: CHAU Kevin / ECE
Student: LAW Jun Jie Johnathan / MATH-CS
Course: UROP1100, Spring

The study of hearing loss and the improvement of speech intelligibility rest on a set of fundamental notions that lay the groundwork for further development. This paper describes some of the key ideas involved in the UROP project, including the Fourier Transform and its variants, the Wavelet Transform, and some aspects of audiology such as loudness and loudness recruitment. We introduce the Discrete Fourier Transform along with the computationally efficient Fast Fourier Transform. We then motivate and introduce the Short-Time Fourier Transform and highlight its shortcomings. We also present wavelets and the Wavelet Transform, a relatively new area of signal processing. Finally, we touch upon loudness and loudness recruitment to provide a well-rounded understanding of the subject matter related to audiology.

Projects in Audio Signal Processing - The Effects of Audio Compression on Google Speech-to-Text Transcription Accuracy
Supervisor: CHAU Kevin / ECE
Student: WANG Binghong / ELEC
Course: UROP1100, Spring

This report studies the relationship between the degree of speech audio compression and the accuracy of Google Speech-to-Text transcription. Lossless speech samples in English, Chinese, French and German, by both male and female speakers, are converted into MP3 format at different bit rates, and the corresponding Google Speech-to-Text transcriptions are compared with the correct scripts. Transcription accuracy fluctuates at low bit rates, whereas it generally improves and remains high as the bit rate increases. It is also observed that Google transcription can stop at times regardless of the bit rate, resulting in incomplete transcription and poor accuracy. Overall, the Google Speech-to-Text converter is highly reliable for compressed speech audio files at relatively high bit rates.
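
The abstract above does not state how transcription accuracy is scored against the correct script. As an illustration only, the following is a minimal sketch of one common choice, the word error rate (word-level edit distance divided by the number of reference words). The function name `word_error_rate` and the lowercase, whitespace-based tokenization are assumptions made for this sketch, not details taken from the report.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Illustrative metric only; the report's actual scoring method is not stated.
    """
    # Assumed tokenization: lowercase and split on whitespace.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions turn i reference words into an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

Under this measure, a transcription that stops partway through is penalized through deletions: a hypothesis containing only the first half of the reference words scores 0.5, whatever the quality of the words that were transcribed.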
