Title: FEATURE SPACES AND MACHINE LEARNING REGIME FOR AUDIO CONTENT CLASSIFICATION AND INDEXING

Year of Publication: 2012
Page Numbers: 335-347
Authors: Muhammad Al-Maathidi, Francis F. Li
Conference Name: The International Conference on Computing, Networking and Digital Technologies (ICCNDT2012), Bahrain

Abstract:


Rapid advances in computer science and internet technology have produced a large volume of media files, including broadcast radio and television, recorded meetings, voice mails, and many others. The digitisation and archiving of old media content also contributes to the growth of the digital library. The usefulness of these collections depends largely on the availability of information retrieval and search tools. Soundtracks are thought to be information-rich; much content-related information can be extracted from them, enabling metadata generation and semantic search. Research over the past few decades has accumulated a large collection of algorithms for the recognition, scripting, and classification of speech, music, and event sounds. However, soundtracks in multimedia files are typically an overlapped mixture of different sounds, so segmentation and classification are essential pre-processing steps for audio-based information retrieval and metadata generation. This paper proposes a classification method, discusses a suitable machine learning regime for clustering and classification, and presents a generic structure for a pre-processing stage for automated metadata generation.