ISSN 2063-5346

CURSIVE HANDWRITTEN TEXT RECOGNITION USING RECURRENT NEURAL NETWORKS

Aejaz Farooq Ganai, Nasir Nabi
doi: 10.53555/ecb/2021.10.4.15

Abstract

Handwritten Urdu recognition is the process of reading characters or numerals that were written on paper and supplied as a scanned digital image. It has applications in many domains. With an offline OCR tool, large collections of handwritten Urdu literature by many different authors can be made available online. This range of applications, together with the convenience it offers to differently-abled people, motivates our research on an Urdu OCR system. State-of-the-art approaches employ Convolutional Neural Networks (CNNs) for handwritten Urdu text recognition. In these approaches, however, recognizing a long sequence of handwritten Urdu text is difficult and time-consuming because CNNs cannot learn long-term dependencies (that is, retain earlier information). We therefore propose an LSTM-based Recurrent Neural Network (RNN) model for recognizing unconstrained handwritten Urdu text. The method is holistic: features are extracted from handwritten Urdu ligature images using RNNs, and these features are used to train multi-dimensional LSTM (MDLSTM)-based RNNs. The trained MDLSTM-based RNN is then used to classify and recognize unconstrained handwritten Urdu text, because recurrent neural networks are better able to capture long-term dependencies in input sequences.
We use the benchmark handwritten Urdu dataset UNHD and the newly proposed handwritten Urdu dataset UHLD for both feature extraction and training of the LSTM-based RNN model. The trained model recognizes unconstrained handwritten Urdu text with a recognition rate of nearly 94% on a large handwritten Urdu ligature dataset of 1500 cluster classes, outperforming the state-of-the-art accuracy of 92.75% for handwritten Urdu recognition.
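
To illustrate what "capturing long-term dependencies" means for an LSTM, the following is a minimal sketch of a single scalar, one-dimensional LSTM cell in plain Python. It is not the MDLSTM model described in the abstract. The function names (`lstm_step`, `run_lstm`) and the hand-picked `WEIGHTS` are hypothetical and chosen only for illustration; no training is involved.

```python
import math


def sigmoid(x):
    """Logistic function used by the LSTM gates."""
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, w):
    """One time step of a scalar LSTM cell.

    w maps a gate name to a (w_x, w_h, bias) triple.
    Returns the new hidden state h and cell state c.
    """
    def pre(gate):
        w_x, w_h, b = w[gate]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))   # how much of the old cell state to keep
    i = sigmoid(pre("input"))    # how much new information to write
    o = sigmoid(pre("output"))   # how much of the cell state to expose
    g = math.tanh(pre("cand"))   # candidate value to write
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


def run_lstm(xs, w):
    """Run the cell over a sequence, starting from zero state."""
    h, c = 0.0, 0.0
    for x in xs:
        h, c = lstm_step(x, h, c, w)
    return h, c


# Hypothetical, hand-picked weights (not trained, not from the paper).
WEIGHTS = {
    "forget": (0.0, 0.0, 6.0),   # forget gate close to 1: retain memory
    "input": (8.0, 0.0, -4.0),   # input gate opens only for large x
    "output": (0.0, 0.0, 6.0),   # output gate close to 1
    "cand": (4.0, 0.0, 0.0),     # candidate follows the input
}

if __name__ == "__main__":
    # A single "stroke" at t = 0 followed by 50 blank steps.
    h_signal, _ = run_lstm([1.0] + [0.0] * 50, WEIGHTS)
    # The same length of input with no stroke at all.
    h_blank, _ = run_lstm([0.0] * 51, WEIGHTS)
    print(h_signal, h_blank)
```

With these weights the forget gate stays close to 1, so the value written at the first time step is still clearly visible in the hidden state 50 steps later, while an all-blank sequence leaves the state at zero. This gated carry-over of the cell state is the property the abstract refers to when it says recurrent networks retain earlier information.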

Article Details