
ISSN 2063-5346

Enhancing Speech Recognition with Bidirectional Recurrent Neural Networks and LSTM Models


Ms. Sheetal Pandya, Ms. Krupa Chotai
DOI: 10.48047/ecb/2023.12.si4.1626

Abstract

One possible explanation for the effectiveness of RNN language models in speech recognition is their superiority over word n-gram models at modeling long-distance context. In an RNN, recurrent connections in the hidden layer carry forward a representation of all prior inputs. LSTM neural networks are RNNs whose units can retain values over arbitrarily long time spans, which lets them model a wide range of complex temporal behaviors. Bidirectional networks additionally condition on future inputs, allowing them to make more accurate predictions than unidirectional networks can alone. In this research, we apply both LSTM and bidirectional RNN architectures to language modeling for speech recognition. We first discuss the issues that arise when bidirectional models are used for speech, and then compare the performance of unidirectional and bidirectional models on an English Broadcast News transcription task. We find that bidirectional RNNs perform much better than their unidirectional counterparts, whereas bidirectional LSTMs show no performance gain over unidirectional LSTMs.
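To make the architectural distinction concrete, the sketch below shows a word-level LSTM language model in PyTorch with a flag that switches between unidirectional and bidirectional recurrence. This is a minimal illustration, not the paper's implementation; the class name, vocabulary size, and layer dimensions are all assumptions chosen for brevity. Note that a bidirectional model conditions on future words, so using it for next-word prediction requires care at training time, which is one of the issues alluded to above.

```python
# Minimal sketch (illustrative, not the paper's system) of a word-level
# LSTM language model that can be made bidirectional with one flag.
import torch
import torch.nn as nn

class LSTMLanguageModel(nn.Module):
    def __init__(self, vocab_size=10000, embed_dim=256, hidden_dim=512,
                 bidirectional=False):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                            bidirectional=bidirectional)
        # A bidirectional LSTM concatenates the forward and backward
        # hidden states, doubling the feature size fed to the softmax.
        out_dim = hidden_dim * (2 if bidirectional else 1)
        self.proj = nn.Linear(out_dim, vocab_size)

    def forward(self, tokens):
        # tokens: (batch, seq_len) integer word indices
        hidden, _ = self.lstm(self.embed(tokens))
        return self.proj(hidden)  # (batch, seq_len, vocab_size) logits

# Usage: run a batch of random word-index sequences through both variants.
batch = torch.randint(0, 10000, (4, 20))
for bidir in (False, True):
    model = LSTMLanguageModel(bidirectional=bidir)
    print(bidir, model(batch).shape)
```

The only structural difference between the two variants is the doubled projection input in the bidirectional case; the modeling difference is that its predictions at each position depend on the entire utterance rather than only on the words seen so far.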
