Speech · ASR · SD · ER · VC · NLP

Japanese Conversational Speech

Multi-speaker Japanese dialogue with stereo speaker separation and emotion annotations

Languages

Japanese

Pricing

Custom

Overview

Japanese conversational recordings covering both spontaneous discussions and formal, structured professional meetings. The recordings are stereo and speaker-separated: each speaker is isolated on the left or right audio channel, which suits diarization and speaker extraction training. The conversations include extensive discussion of Japanese regional dialects (北海道弁 Hokkaido, 津軽弁 Tsugaru, 旭川弁 Asahikawa) and carry high-density emotion annotations. Suited to training speaker separation algorithms, voice cloning, multi-speaker ASR, and conversational AI models.

Highlights

  • Stereo speaker separation: L/R channel isolation for clean speaker extraction
  • High-density, per-utterance emotion labels with confidence scores
  • 18 emotion categories: Joy, Interest, Confusion, Amusement, Calmness, and more
  • Regional dialect content (Hokkaido, Tsugaru, Asahikawa)
  • Native Japanese filenames preserved
  • Custom annotations available

Deliverables

Files

Stereo WAV files with L/R speaker separation (48 kHz, 16- to 24-bit); JSON transcripts with speaker diarization labels; per-utterance emotion labels with confidence scores; native Japanese-character filenames

Audio Specs

48 kHz sample rate, 16- to 24-bit depth, stereo with L/R channel separation
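
Because each speaker sits on their own channel, per-speaker audio can be obtained by de-interleaving the stereo file. The following is a minimal sketch using only Python's standard library. It assumes the files are uncompressed PCM WAV; the function name split_stereo and all file paths are hypothetical and not part of the dataset documentation. Note that 24-bit files written with a WAVE_FORMAT_EXTENSIBLE header need Python 3.12 or later to be read by the wave module.

```python
import wave


def split_stereo(path, left_path, right_path):
    """Write the left and right channels of a stereo PCM WAV to two mono files."""
    with wave.open(path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a stereo file")
        width = src.getsampwidth()  # bytes per sample: 2 for 16-bit, 3 for 24-bit
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    frame_size = 2 * width  # one frame = left sample followed by right sample
    n_frames = len(frames) // frame_size
    left = bytearray(n_frames * width)
    right = bytearray(n_frames * width)

    # Copy byte k of every sample with strided slices instead of a per-frame loop.
    for k in range(width):
        left[k::width] = frames[k::frame_size]
        right[k::width] = frames[width + k::frame_size]

    for out_path, data in ((left_path, left), (right_path, right)):
        with wave.open(out_path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(width)
            dst.setframerate(rate)
            dst.writeframes(bytes(data))
```

The sample width and sample rate are read from the source file rather than hard-coded, so the same function handles both the 16-bit and 24-bit deliveries.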

Transcription Format

JSON with speaker labels, emotion annotations, confidence scores
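
The transcript schema is not specified here, so the following is only an illustrative sketch of how confidence scores might be used to filter emotion labels. The layout (a top-level "utterances" list whose items have "speaker", "emotion", and "confidence" keys) and the function name emotion_counts are assumptions for illustration; check the technical documentation for the actual field names.

```python
import json
from collections import Counter


def emotion_counts(transcript_json, min_confidence=0.5):
    """Count emotion labels per speaker, skipping low-confidence labels.

    Assumes a hypothetical schema:
    {"utterances": [{"speaker": ..., "emotion": ..., "confidence": ...}, ...]}
    """
    counts = {}
    for utt in json.loads(transcript_json)["utterances"]:
        if utt["confidence"] >= min_confidence:
            counts.setdefault(utt["speaker"], Counter())[utt["emotion"]] += 1
    return counts
```

Thresholding on confidence is one simple way to trade label coverage for label reliability when building an emotion recognition training set.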

Stereo speaker separation makes these recordings well suited to speaker extraction and voice cloning. Contact us for samples and technical documentation.