Arun Baby

ML Researcher → AI Engineer → Architecting intelligent agent systems

Speech Technology Podcast

A deep-dive podcast series on the science and engineering of making machines listen and speak.

Each episode takes a single topic from the Speech Tech collection and explores it in depth – the theory, the implementation, the production trade-offs, and the research frontier.

What to Expect

Automatic speech recognition: streaming, multilingual, on-device
Text-to-speech synthesis: neural architectures, prosody, multi-speaker
Speech enhancement, separation, and noise reduction
Speaker recognition, verification, and voice biometrics
Real-time audio processing and production pipelines

Episodes

Coming soon.

In the meantime, explore the written deep-dives in the Speech Tech collection – 61 articles covering everything from streaming ASR to voice activity detection.