Title: The Study of Automobile-Used Voice-Activity Detection System Based on Two-Dimensional Long-Time and Short-Frequency Spectral Entropy

Year of Publication: Sep - 2016
Page Numbers: 54-61
Authors: Kun-Ching Wang
Conference Name: The Fourth International Conference on Technological Advances in Electrical, Electronics and Computer Engineering (TAEECE2016), Malaysia


Noise-robust features are essential for developing an automobile voice-activity detection (VAD) system that detects human voices. Wu et al. were the first to use band-spectral entropy (BSE) to describe the characteristics of voiceprints, exploiting the fact that formant structure appears only on the speech spectrogram (commonly known as a voiceprint). However, the performance of BSE-based VAD degrades in colored-noise (or voiceprint-like noise) environments. To solve this problem and further improve the BSE-based VAD algorithm, we propose the two-dimensional part-band energy entropy (TD-PBEE) parameter, which is based on two variables: the part-band partition number along the frequency index and the long-term window size along the time index. The first variable efficiently represents the characteristics of voiceprints on each critical frequency band, and the second exploits long-term information in noisy speech spectrograms. The TD-PBEE parameter can be regarded as a PBEE parameter over time. Averaged over all noises and all SNR levels, the accuracy of the proposed TD-PBEE-based VAD algorithm is better than that of the other VAD algorithms considered.
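
The idea of a band-energy entropy evaluated over a long-term window can be illustrated with a minimal Python sketch. The abstract does not give the paper's formulas, so everything here is an assumption made for illustration: the function names, the equal-width band partition (the paper refers to critical bands), the pooling of band energies by summation over the window, and the use of Shannon entropy in nats. It should not be read as the author's TD-PBEE definition.

```python
import math


def part_band_energies(power_spectrum, num_bands):
    """Sum a power spectrum into num_bands contiguous bands.

    Assumption: equal-width bands; the paper's actual partition is not
    specified in the abstract.
    """
    n = len(power_spectrum)
    edges = [(i * n) // num_bands for i in range(num_bands + 1)]
    return [sum(power_spectrum[edges[i]:edges[i + 1]]) for i in range(num_bands)]


def band_entropy(energies):
    """Shannon entropy (in nats) of the normalized band-energy distribution."""
    total = sum(energies)
    if total <= 0.0:
        return 0.0  # silent input: no distribution to measure
    probs = [e / total for e in energies]
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def long_term_band_entropy(frames, num_bands, window):
    """Entropy per frame, using band energies pooled over the last `window` frames.

    Assumption: pooling is a plain sum over the trailing window; this is a
    hypothetical stand-in for the paper's long-term processing.
    """
    per_frame = [part_band_energies(f, num_bands) for f in frames]
    result = []
    for t in range(len(per_frame)):
        start = max(0, t - window + 1)
        pooled = [
            sum(per_frame[k][b] for k in range(start, t + 1))
            for b in range(num_bands)
        ]
        result.append(band_entropy(pooled))
    return result
```

In this sketch, a flat (noise-like) spectrum yields the maximum entropy for the chosen number of bands, while a spectrum whose energy is concentrated in a few bands, as with formant structure, yields a lower value. The two arguments `num_bands` and `window` play the roles of the two variables named in the abstract.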