Speech | ASR | TTS | ER | VC

English Monologue Speech

Professional single-speaker recordings with word-level timestamps and emotion annotations

Languages

English

Pricing

Enterprise

Overview

Professional English monologue recordings featuring spontaneous free speech and formal presentations. Includes personal narratives, workplace communications, medical notes, educational talks, and motivational speeches. Features word-level timestamps and per-utterance emotion labels with confidence scores. Ideal for training ASR, TTS, emotion recognition, and voice cloning models.

Highlights

  • Word-level timestamps for each utterance
  • Per-utterance emotion labels with confidence scores
  • 18 emotion categories: Joy, Determination, Interest, Calmness, Confusion, and more
  • High-quality 48kHz 24-bit mono recordings
  • Diverse content: personal narratives, medical dictations, workplace communications, educational presentations
  • Single-speaker recordings ideal for TTS and voice cloning
  • Custom annotations available

Deliverables

Files

WAV audio files (48kHz 24-bit mono), JSON transcripts with word-level timestamps, per-utterance emotion labels with confidence scores, speaker metadata

Audio Specs

48kHz sample rate, 24-bit depth, 1152 kbps, mono
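
The 1152 kbps figure follows directly from the other three values: uncompressed PCM bitrate is sample rate × bit depth × channels. The sketch below shows that arithmetic and one possible way to check that a delivered file matches the listed specs using Python's standard wave module. The function names are illustrative and not part of the dataset, and the check assumes plain PCM WAV headers.

```python
import wave


def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Uncompressed PCM bitrate in kbps (1 kbps = 1000 bits per second)."""
    return sample_rate_hz * bit_depth * channels / 1000


def matches_spec(path: str, sample_rate_hz: int = 48_000,
                 bit_depth: int = 24, channels: int = 1) -> bool:
    """Return True if the WAV header at `path` matches the expected specs.

    Assumption: the file uses a plain PCM header that the standard
    library's wave module can read.
    """
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == sample_rate_hz
            and wav.getsampwidth() * 8 == bit_depth  # sample width is in bytes
            and wav.getnchannels() == channels
        )


# 48,000 samples/s * 24 bits * 1 channel = 1,152,000 bits/s
print(pcm_bitrate_kbps(48_000, 24, 1))  # 1152.0
```

At this bitrate, one minute of audio takes roughly 8.64 MB on disk (1152 kbps × 60 s / 8), which can help when estimating storage for a delivery.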

Transcription Format

JSON with word timestamps, speaker labels, emotion annotations
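
The exact JSON schema is not published on this page, so the sketch below is only an illustration of how a transcript with these three elements might be consumed. Every field name (speaker, utterances, words, start, end, emotion, label, confidence) and every value in the sample is an invented placeholder; request a sample file for the actual layout.

```python
import json

# Hypothetical transcript; field names and values are placeholders,
# not the dataset's actual schema or data.
SAMPLE = """
{
  "speaker": "spk_001",
  "utterances": [
    {
      "text": "Good morning everyone",
      "emotion": {"label": "Calmness", "confidence": 0.91},
      "words": [
        {"word": "Good", "start": 0.12, "end": 0.35},
        {"word": "morning", "start": 0.35, "end": 0.80},
        {"word": "everyone", "start": 0.80, "end": 1.40}
      ]
    },
    {
      "text": "Wait what",
      "emotion": {"label": "Confusion", "confidence": 0.42},
      "words": [
        {"word": "Wait", "start": 2.00, "end": 2.30},
        {"word": "what", "start": 2.30, "end": 2.60}
      ]
    }
  ]
}
"""


def confident_utterances(doc: dict, min_confidence: float) -> list:
    """Keep utterances whose emotion confidence is at least min_confidence."""
    return [
        utt for utt in doc["utterances"]
        if utt["emotion"]["confidence"] >= min_confidence
    ]


def word_durations(utterance: dict) -> list:
    """Return (word, duration in seconds) pairs from word-level timestamps."""
    return [
        (w["word"], round(w["end"] - w["start"], 3))
        for w in utterance["words"]
    ]


doc = json.loads(SAMPLE)
for utt in confident_utterances(doc, min_confidence=0.5):
    print(utt["emotion"]["label"], word_durations(utt))
```

Filtering on the confidence score like this is one way to build a cleaner subset for emotion recognition training, while the word-level timestamps support alignment tasks for ASR and TTS.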

Contact us for samples and volume pricing. See our conversational dataset for multi-speaker content.