Speech, ASR, TTS, ER, VC

Japanese Monologue Speech

Professional single-speaker Japanese recordings with word-level timestamps and emotion annotations

Languages

Japanese

Pricing

Enterprise

Overview

High-quality Japanese monologue recordings featuring spontaneous personal narratives and long-form presentations. Includes everyday technology discussions, extensive movie reviews, and detailed commentaries. Features word-level timestamps and per-utterance emotion labels. Ideal for training Japanese ASR, TTS, emotion recognition, and long-form content transcription models.

Highlights

  • Word-level timestamps for precise alignment
  • Per-utterance emotion labels with confidence scores
  • 18 emotion categories: Joy, Interest, Confusion, Amusement, Calmness, and more
  • Long-form content available (30+ minute recordings)
  • High-quality 48kHz 24-bit recordings
  • Native Japanese character filenames preserved
  • Custom annotations available
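
Because filenames keep their native Japanese characters, the same name can arrive in different Unicode forms depending on the file system or archive tool that handled it. The sketch below is one way to normalize names before matching audio files to transcripts. It assumes NFC is the form you want, and the function name `normalize_name` is hypothetical, not part of any delivered tooling.

```python
import unicodedata


def normalize_name(filename: str) -> str:
    """Return the filename in Unicode NFC form (composed characters)."""
    return unicodedata.normalize("NFC", filename)


# "か" followed by a combining voiced sound mark (U+3099) is the decomposed
# spelling of "が" (U+304C). Both render the same but compare unequal as strings.
decomposed = "\u304b\u3099.wav"
composed = "\u304c.wav"

same_after_normalizing = normalize_name(decomposed) == normalize_name(composed)
```

Normalizing both sides of a lookup (for example, the audio filename and the transcript filename) avoids missed matches caused only by encoding form.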

Deliverables

Files

  • WAV audio files (48kHz 24-bit mono)
  • JSON transcripts with word-level timestamps
  • Per-utterance emotion labels with confidence scores
  • Filenames preserved in native Japanese characters

Audio Specs

48kHz sample rate, 24-bit depth, 1152 kbps, mono
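
The bitrate follows directly from the other three figures: for uncompressed PCM it is sample rate × bit depth × channels. The short sketch below reproduces that arithmetic and estimates storage for a 30-minute recording. The function names are illustrative, and the size estimate ignores the WAV header and any extra metadata chunks.

```python
def pcm_bitrate_kbps(sample_rate_hz: int, bit_depth: int, channels: int) -> float:
    """Uncompressed PCM bitrate in kilobits per second (1 kbit = 1000 bits)."""
    return sample_rate_hz * bit_depth * channels / 1000


def pcm_size_mb(duration_s: int, sample_rate_hz: int = 48_000,
                bit_depth: int = 24, channels: int = 1) -> float:
    """Approximate audio payload size in megabytes (1 MB = 1,000,000 bytes)."""
    total_bytes = duration_s * sample_rate_hz * (bit_depth // 8) * channels
    return total_bytes / 1_000_000


bitrate = pcm_bitrate_kbps(48_000, 24, 1)   # 48,000 * 24 * 1 / 1000
size_30min = pcm_size_mb(30 * 60)           # payload for a 30-minute mono file
```

With these specs, a 30-minute recording carries roughly 259 MB of audio data, which is worth factoring into storage and transfer planning for long-form content.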

Transcription Format

JSON with word timestamps, speaker labels, emotion annotations
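
As a rough illustration of how such a transcript might be consumed, the sketch below filters utterances by emotion confidence and derives each utterance's time span from its word timestamps. The JSON layout and every field name (`speaker`, `utterances`, `emotion`, `confidence`, `words`, `start`, `end`) are assumptions made for this example; the actual schema should be confirmed against a delivered sample.

```python
import json

# Hypothetical transcript. The structure and field names are illustrative
# assumptions, not the dataset's documented schema.
SAMPLE = """
{
  "speaker": "S1",
  "utterances": [
    {"text": "今日は映画の話をします", "emotion": "Interest", "confidence": 0.82,
     "words": [{"word": "今日", "start": 0.00, "end": 0.42},
               {"word": "は", "start": 0.42, "end": 0.55}]},
    {"text": "えっと", "emotion": "Confusion", "confidence": 0.41,
     "words": [{"word": "えっと", "start": 1.10, "end": 1.60}]}
  ]
}
"""


def confident_utterances(transcript: dict, min_confidence: float = 0.5) -> list:
    """Keep utterances whose emotion label meets the confidence threshold."""
    return [u for u in transcript["utterances"]
            if u["confidence"] >= min_confidence]


def utterance_span(utterance: dict) -> tuple:
    """Start of the first word and end of the last word, in seconds."""
    words = utterance["words"]
    return words[0]["start"], words[-1]["end"]


transcript = json.loads(SAMPLE)
kept = confident_utterances(transcript)
spans = [utterance_span(u) for u in kept]
```

Thresholding on confidence is one simple way to build a cleaner subset for emotion recognition training; the 0.5 cutoff here is arbitrary and would need tuning against the real label distribution.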

Contact us for samples and volume pricing. See our conversational dataset for stereo speaker-separated multi-speaker content.