Our speech-to-text model is based primarily on NVIDIA's NeMo Citrinet. However, it has been modified and improved to meet our use case.

Updated on: 02 / 04 / 2022