
Chemistry report: Research Article: Evolutionary Splines for Cepstral Filterbank Optimization in Phoneme Classification


Content extracted from the document:
Hindawi Publishing Corporation
EURASIP Journal on Advances in Signal Processing
Volume 2011, Article ID 284791, 14 pages
doi:10.1155/2011/284791

Research Article

Evolutionary Splines for Cepstral Filterbank Optimization in Phoneme Classification

Leandro D. Vignolo (1), Hugo L. Rufiner (1), Diego H. Milone (1), and John C. Goddard (2)

(1) Research Center for Signals, Systems and Computational Intelligence, Department of Informatics, National University of Litoral, CONICET, Santa Fe, 3000, Argentina
(2) Departamento de Ingeniería Eléctrica, Universidad Autónoma Metropolitana, Unidad Iztapalapa, Mexico D.F., 09340, Mexico

Correspondence should be addressed to Leandro D. Vignolo, leandro.vignolo@gmail.com

Received 14 July 2010; Revised 29 October 2010; Accepted 24 December 2010

Academic Editor: Raviraj S. Adve

Copyright © 2011 Leandro D. Vignolo et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Mel-frequency cepstral coefficients have long been the most widely used type of speech representation. They were introduced to incorporate biologically inspired characteristics into artificial speech recognizers. Recently, the introduction of new alternatives to the classic mel-scaled filterbank has led to improvements in the performance of phoneme recognition in adverse conditions. In this work we propose a new bioinspired approach for the optimization of the filterbanks, in order to find a robust speech representation. Our approach, which relies on evolutionary algorithms, reduces the number of parameters to optimize by using spline functions to shape the filterbanks. The success rates of a phoneme classifier based on hidden Markov models are used as the fitness measure, evaluated over the well-known TIMIT database. The results show that the proposed method is able to find optimized filterbanks for phoneme recognition, which significantly increases the robustness in adverse conditions.

1. Introduction

Most current speech recognizers rely on the traditional mel-frequency cepstral coefficients (MFCC) [1] for the feature extraction phase. This representation is biologically motivated and introduces the use of a psychoacoustic scale to mimic the frequency response in the human ear.

However, as the entire auditory system is complex and not yet fully understood, the shape of the true optimal filterbank for automatic recognition is not known. Moreover, the recognition performance of automatic systems degrades when speech signals are contaminated with noise. This has motivated the development of alternative speech representations, and many of them consist in modifications to the mel-scaled filterbank, for which the number of filters has been empirically set to different values [2]. For example, Skowronski and Harris [3, 4] proposed a novel scheme for determining filter bandwidth and reported significant recognition improvements compared to those ...

... speech representation so that phoneme discrimination is maximized for a given corpus. In this sense, the weighting of MFCC according to the signal-to-noise ratio (SNR) in each mel band was proposed in [5]. Similarly, [6] proposed a compression of filterbank energies according to the presence of noise in each mel subband. Other modifications to the classical representation were introduced in recent years [7–9]. Further, in [10], linear discriminant analysis was studied in order to optimize a filterbank. In a different approach, the use of evolutionary algorithms has been proposed in [11] to evolve speech features. An evolution strategy was also proposed in [12], but in this case for the optimization of a wavelet packet-based representation. In another evolutionary approach, for the task of speaker verification, polynomial functions were used to encode the parameters of the filterbanks, reducing the number of optimization parameters [13]. However, a complex relation between the polynomial coefficients and the filterbank parameters ...
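
For readers unfamiliar with the baseline that the introduction refers to, the following is a minimal sketch of a classic mel-scaled triangular filterbank followed by a log and a DCT to obtain cepstral coefficients. It is not the authors' spline-based method, whose details are not part of this extract. The function names, the choice of 23 filters, the 512-point FFT, the 16 kHz sampling rate, and the particular mel formula are all assumptions made for illustration.

```python
import numpy as np


def hz_to_mel(f_hz):
    # One common mel-scale formula (assumed here; other variants exist).
    return 2595.0 * np.log10(1.0 + np.asarray(f_hz, dtype=float) / 700.0)


def mel_to_hz(mel):
    # Inverse of hz_to_mel.
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)


def mel_filterbank(n_filters=23, n_fft=512, sample_rate=16000):
    """Return an (n_filters, n_fft // 2 + 1) matrix of triangular filters.

    All default values are illustrative assumptions, not taken from the paper.
    """
    n_bins = n_fft // 2 + 1
    freqs = np.linspace(0.0, sample_rate / 2.0, n_bins)
    # n_filters + 2 edge frequencies, equally spaced on the mel scale.
    mel_edges = np.linspace(0.0, hz_to_mel(sample_rate / 2.0), n_filters + 2)
    edges = mel_to_hz(mel_edges)
    bank = np.zeros((n_filters, n_bins))
    for i in range(n_filters):
        left, center, right = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - left) / (center - left)
        falling = (right - freqs) / (right - center)
        # A triangle is the pointwise minimum of both slopes, clipped at zero.
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank


def cepstral_coefficients(power_spectrum, bank, n_coeffs=13):
    """Log filterbank energies followed by an unnormalized DCT-II."""
    energies = np.log(bank @ power_spectrum + 1e-10)  # small floor avoids log(0)
    n = energies.shape[0]
    k = np.arange(n_coeffs)[:, None]
    j = np.arange(n)[None, :]
    dct_basis = np.cos(np.pi * k * (j + 0.5) / n)
    return dct_basis @ energies
```

In a sketch like this, each filter is fully described by three edge frequencies, so a bank of 23 filters already has many free parameters. That is the kind of search space the abstract says the spline encoding is meant to shrink.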
