
ISSN 2063-5346

EXPLORE TRANSFER LEARNING TECHNIQUES TO LEVERAGE KNOWLEDGE FROM PRE-TRAINED VOICE MODELS AND EFFECTIVELY ADAPT THEM TO NEW SPEAKERS OR LANGUAGES WITH LIMITED DATA


Muhammad Rafiq Khan, Dr. Inshaal Khalid, Dr. Ali Asghar Mirjat, Dr. Mahnoor Shabir, Dr. Nitasha Saddique, Maria Shahid, Dr. Fahmida Khatoon, Shahzad, Dr. Muhammad Farhan Nasir, Kashif Lodhi

DOI: 10.53555/ecb/2023.12.12.258

Abstract

Background: Transfer learning has proven effective across natural language processing tasks, where pre-trained models are fine-tuned on specific downstream tasks. In the context of voice models, it has gained traction as a means of leveraging the knowledge captured by pre-trained voice models and adapting them to new speakers or languages with limited data. This approach can substantially reduce data requirements and improve performance in scenarios where collecting large amounts of speaker- or language-specific data is challenging.

Aim: This study explores transfer learning techniques for voice models and investigates how effectively they adapt pre-trained voice models to new speakers or languages with limited data. We seek to understand how well these techniques capture and transfer speaker- or language-specific characteristics while preserving the general knowledge learned by the original pre-trained model.

Methodology: We conduct a series of experiments using several transfer learning techniques: fine-tuning, feature extraction, and adaptation layers. We take a voice model pre-trained on a large multilingual dataset and evaluate it on multiple downstream tasks involving new speakers or languages with limited data. The experiments use a diverse set of speakers and languages to ensure the robustness and generalizability of the findings.

Results: Transfer learning techniques effectively adapt pre-trained voice models to new speakers or languages with limited data. Fine-tuning with a small amount of speaker- or language-specific data yields substantial improvements in model performance. Feature extraction and adaptation layers also show promising results, indicating that the models capture and transfer relevant characteristics while retaining general knowledge.

Conclusion: Transfer learning is a powerful approach to leveraging pre-trained voice models when data availability is limited. These techniques offer an efficient way to adapt models to new speakers or languages, reducing the need for extensive data collection. Our findings support the utility of transfer learning for voice models and highlight its potential to enhance performance and extend voice technologies to diverse linguistic and speaker demographics.
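To make the adaptation-layer technique concrete, the following is a minimal PyTorch sketch of the general pattern the abstract describes: freeze a pre-trained voice encoder and train only a small residual bottleneck adapter plus a task-specific head on the limited speaker- or language-specific data. The paper does not publish its implementation, so every name here (PretrainedVoiceEncoder as a stand-in, SpeakerAdapter, build_adapted_model, the 64-unit bottleneck) is an illustrative assumption, not the authors' code.

    # Illustrative sketch only: the model and layer names below are
    # hypothetical stand-ins for a generic pre-trained voice encoder,
    # since the paper does not specify its architecture or code.
    import torch
    import torch.nn as nn

    class SpeakerAdapter(nn.Module):
        """Small bottleneck adaptation layer inserted after a frozen encoder."""
        def __init__(self, dim: int, bottleneck: int = 64):
            super().__init__()
            self.down = nn.Linear(dim, bottleneck)  # project down
            self.up = nn.Linear(bottleneck, dim)    # project back up
            self.act = nn.ReLU()

        def forward(self, x: torch.Tensor) -> torch.Tensor:
            # Residual connection keeps the pre-trained representation
            # intact; the adapter only learns a small correction.
            return x + self.up(self.act(self.down(x)))

    def build_adapted_model(encoder: nn.Module, hidden_dim: int,
                            n_classes: int) -> nn.Module:
        # Freeze every pre-trained parameter; only the adapter and the
        # task head are trained on the limited target-speaker or
        # target-language data.
        for p in encoder.parameters():
            p.requires_grad = False
        return nn.Sequential(
            encoder,                         # frozen pre-trained voice encoder
            SpeakerAdapter(hidden_dim),      # trainable adaptation layer
            nn.Linear(hidden_dim, n_classes) # trainable task head
        )

    # Usage (hypothetical): encoder = PretrainedVoiceEncoder(...)
    # model = build_adapted_model(encoder, hidden_dim=768, n_classes=10)

Because only the adapter and head are trained, typically a small fraction of the total parameters, this variant suits the low-data regime the study targets; full fine-tuning, by contrast, updates all weights and generally needs more speaker-specific data to avoid overfitting.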
