Custom Grade

Speech

Tamil Conversational Speech

Stereo multi-speaker Tamil dialogue recordings with L/R speaker separation and emotion annotations

Languages

Tamil

Quality Check

100% Verified

Overview

Natural Tamil conversational recordings featuring spontaneous two-speaker dialogues between native Tamil speakers from major Tamil-speaking regions. Each recording is stereo, with one speaker on the left channel and the other on the right for clean speaker separation; sessions were recorded via LiveKit with isolated per-participant tracks. The dataset covers everyday conversations, personal discussions, and topical debates, with representation from Chennai, Madurai, Coimbatore, and other Tamil Nadu dialect regions. Annotations include speaker diarization, word-level timestamps, and per-utterance emotion labels. Ideal for training multi-speaker Tamil ASR, speaker extraction, diarization, and conversational AI models.

Key Highlights

Stereo speaker separation: L/R channel isolation for clean speaker extraction

Per-utterance emotion labels with confidence scores

Technical Specifications

Files

Stereo WAV files with L/R speaker separation (48kHz, 16-bit)

JSON transcripts with speaker diarization labels

Word-level timestamps per speaker

Per-utterance emotion labels with confidence scores

Speaker metadata with region information
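
Because each speaker occupies one channel, per-speaker audio can be recovered by de-interleaving the stereo samples. The following is a minimal sketch using only the Python standard library, assuming 16-bit PCM stereo input as listed above; the function name and file paths are hypothetical and not part of the dataset tooling.

```python
import array
import wave


def split_stereo(path, left_path, right_path):
    """Write the left and right channels of a 16-bit stereo WAV as two mono WAVs."""
    with wave.open(path, "rb") as src:
        if src.getnchannels() != 2 or src.getsampwidth() != 2:
            raise ValueError("expected 16-bit stereo input")
        rate = src.getframerate()
        # Stereo PCM frames are interleaved: L0, R0, L1, R1, ...
        samples = array.array("h")
        samples.frombytes(src.readframes(src.getnframes()))

    # Even indices hold the left channel, odd indices the right channel.
    for out_path, channel in ((left_path, samples[0::2]), (right_path, samples[1::2])):
        with wave.open(out_path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(2)
            dst.setframerate(rate)
            dst.writeframes(channel.tobytes())
    return rate
```

A call such as `split_stereo("session.wav", "left.wav", "right.wav")` (hypothetical file names) would produce one mono file per speaker. Which speaker sits on which channel is not stated here, so that mapping should be taken from the speaker metadata.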

Audio Specs

48kHz sample rate, 16-bit depth, 1536 kbps, stereo with L/R channel separation
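
The stated bitrate follows directly from the other three figures for uncompressed PCM. As a quick check, and as a rough guide to storage needs (the per-minute figure excludes the small WAV header):

```python
sample_rate_hz = 48_000
bit_depth = 16
channels = 2

# Uncompressed PCM bitrate = sample rate x bit depth x channel count
bitrate_kbps = sample_rate_hz * bit_depth * channels / 1000

# Raw audio payload for one minute of stereo audio, in bytes
bytes_per_minute = sample_rate_hz * (bit_depth // 8) * channels * 60

print(bitrate_kbps)      # 1536.0
print(bytes_per_minute)  # 11520000
```

That works out to roughly 11.5 MB of audio data per recorded minute.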

Transcription Format

JSON with speaker labels, word timestamps, emotion annotations
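
The exact JSON schema is not given here. The sketch below shows one way such a transcript could be consumed, assuming a hypothetical layout: every key name (`utterances`, `speaker`, `emotion`, `label`, `confidence`, `words`, `text`, `start`, `end`) and every value in the sample is an illustrative assumption, not the dataset's documented format.

```python
import json

# Hypothetical transcript: all key names and values are assumptions for
# illustration, not the dataset's documented schema or real data.
SAMPLE = """
{
  "utterances": [
    {
      "speaker": "L",
      "emotion": {"label": "neutral", "confidence": 0.91},
      "words": [
        {"text": "வணக்கம்", "start": 0.42, "end": 0.97}
      ]
    },
    {
      "speaker": "R",
      "emotion": {"label": "happy", "confidence": 0.58},
      "words": [
        {"text": "வணக்கம்", "start": 1.10, "end": 1.62},
        {"text": "நலமா", "start": 1.70, "end": 2.05}
      ]
    }
  ]
}
"""


def utterance_spans(transcript, min_confidence=0.0):
    """Return (speaker, start, end, text, emotion) for each utterance kept."""
    spans = []
    for utt in transcript["utterances"]:
        words = utt["words"]
        # Skip empty utterances and those below the emotion confidence threshold.
        if not words or utt["emotion"]["confidence"] < min_confidence:
            continue
        text = " ".join(w["text"] for w in words)
        # Utterance boundaries come from the first and last word timestamps.
        spans.append((utt["speaker"], words[0]["start"], words[-1]["end"],
                      text, utt["emotion"]["label"]))
    return spans


if __name__ == "__main__":
    for span in utterance_spans(json.loads(SAMPLE), min_confidence=0.6):
        print(span)
```

Filtering on the emotion confidence score, as `min_confidence` does here, is one way to keep only the more reliable labels when training an emotion classifier.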