Academic Journal

Multistream recognition of noisy speech with performance monitoring

Bibliographic Details
Title: Multistream recognition of noisy speech with performance monitoring
Authors: Ehsan Variani, Feipeng Li, Hynek Hermansky
Contributors: The Pennsylvania State University CiteSeerX Archives
Source: http://hltcoe.jhu.edu/uploads/publications/papers/16661_slides.pdf.
Publication Year: 2013
Collection: CiteSeerX
Description: A prototype multi-stream system with a performance monitor for stream selection is proposed to recognize speech in unknown noise. The speech signal is decomposed into seven band-limited streams. Posterior probabilities of phonemes are estimated by a multi-layer perceptron (MLP) in each of these band-limited streams. Estimated posterior vectors of all 127 combinations (processing streams) of the seven band-limited streams form inputs to a second-stage MLP that estimates posterior probabilities of phonemes in each processing stream. A performance monitor is designed to predict the reliability of individual processing streams based on the outputs from these streams. The top N streams that are least affected by noise are selected and their outputs are averaged to yield the final posterior probability vector used in Viterbi search for the best phoneme sequence. Experimental results show that the proposed technique is effective in dealing with noise. Index Terms: Multi-stream speech recognition, Performance monitoring
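The stream bookkeeping described in the abstract can be illustrated with a minimal Python sketch. It only shows where the figure of 127 processing streams comes from (every non-empty subset of seven bands, 2^7 - 1) and how the top N streams might be averaged. The function names, the `scores` input, and the "higher is more reliable" convention are assumptions made for illustration; the record does not describe how the performance monitor computes its reliability estimate, so the scores are simply taken as given.

```python
from itertools import combinations

NUM_BANDS = 7  # seven band-limited streams, as stated in the abstract


def processing_streams(num_bands=NUM_BANDS):
    """Enumerate every non-empty subset of band indices (2**7 - 1 = 127)."""
    bands = range(num_bands)
    return [combo
            for size in range(1, num_bands + 1)
            for combo in combinations(bands, size)]


def fuse_top_n(posteriors, scores, n):
    """Average the posterior vectors of the n highest-scoring streams.

    posteriors: one posterior vector (list of floats) per processing stream.
    scores: hypothetical reliability score per stream (higher = more
            reliable); a stand-in for the performance monitor's output.
    """
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:n]
    dim = len(posteriors[0])
    # Element-wise mean over the selected streams only.
    return [sum(posteriors[i][k] for i in chosen) / len(chosen)
            for k in range(dim)]
```

In this sketch, `processing_streams()` would supply the band subsets fed to the second-stage MLP, and `fuse_top_n` would produce the single posterior vector per frame that is passed on to the Viterbi search.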
Document Type: text
File Description: application/pdf
Language: English
Relation: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.644.7711; http://hltcoe.jhu.edu/uploads/publications/papers/16661_slides.pdf
Availability: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.644.7711
http://hltcoe.jhu.edu/uploads/publications/papers/16661_slides.pdf
Rights: Metadata may be used without restrictions as long as the oai identifier remains attached to it.
Accession Number: edsbas.F071A88F
Database: BASE