A deep-dive podcast series on the science and engineering of making machines listen and speak.

Each episode takes a single topic from the Speech Tech collection and explores it in depth – the theory, the implementation, the production trade-offs, and the research frontier.


What to Expect

  • Automatic speech recognition: streaming, multilingual, on-device
  • Text-to-speech synthesis: neural architectures, prosody, multi-speaker
  • Speech enhancement, separation, and noise reduction
  • Speaker recognition, verification, and voice biometrics
  • Real-time audio processing and production pipelines

Episodes

Coming soon.

In the meantime, explore the written deep-dives in the Speech Tech collection – 61 articles covering everything from streaming ASR to voice activity detection.