Speech collection and ready-to-train transcripts

Recruitment, recording, transcription, QA, and dataset delivery, run as one project. We start from your language, speaker, and recording requirements and finish with training-ready files.

Workflow

Four phases, one partner, one timeline.

Collect, transcribe, deliver, and audit: the same four phases on every project, configured to your data spec.

01 · Speech collection

Recruit speakers and record sessions to your data spec.

Remote · On-site | DA · SV · NO · DE · FI | Monologue · Dialogue | Custom metadata
Contributors recruited by language, dialect, age, gender, and location.
Remote captures via our platform; on-site captures with calibrated rigs.
Prompts, device rules, and noise checks configured per project.

02 · Transcription

Machine-assisted or human-validated transcripts with annotation rules to your spec.

Manual · ASR-assisted | Word timestamps | Speaker IDs | JSON · VTT · TXT
[00:01.420 → 00:01.890] spk_02 · word-level alignment ✓
Multi-pass review for high-stakes domains, single-pass for fast turnaround.
Overlap, accent, and domain terminology handled by humans.

03 · Dataset delivery

Audio, transcripts, metadata, and manifests packaged for your training pipeline.

Bucket · API handoff | Naming conventions | Manifest + checksums | Consent linkage
/dataset/da-DK/spk_044/take_03.wav · sha256 · meta.json
Schema agreed up front; delivery matches your training format exactly.
Every utterance traceable to consent, contributor, and capture conditions.

04 · QA & oversight

QA runs inside production so issues surface during the project.

In-production review | Statistical sampling | Batch gates | Issue escalation
Reviewers inspect recordings while the project is live, not after.
Each batch passes a quality gate before it enters the final delivery.
Audio, transcript, metadata, and format checked against project specs.

Speech collection

We collect monologues, dialogues, wake words, commands, scripted prompts, roleplays, and natural conversations with speakers matched to your language and profile requirements.

Remote or on-site sessions, audio-only or synchronized audio plus video.

Transcription and annotation

We deliver machine-assisted or human-validated transcripts with timestamps, speaker labels, domain terminology, and annotation rules matched to your model training needs.

Word-level or segment-level timestamps, speaker labels, and human review on your QA criteria.
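
As an illustration of what word-level timestamps with speaker labels can look like in a delivery, the sketch below renders aligned words as WebVTT cues. The `Word` record, its field names, and the `spk_02` label are hypothetical; the actual transcript schema is agreed per project.

```python
from dataclasses import dataclass


@dataclass
class Word:
    """One aligned word. Field names are illustrative, not a fixed schema."""
    text: str
    start: float   # seconds from the start of the recording
    end: float     # seconds from the start of the recording
    speaker: str   # hypothetical speaker label, e.g. "spk_02"


def vtt_time(seconds: float) -> str:
    """Format seconds as a WebVTT timestamp (HH:MM:SS.mmm)."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def to_vtt(words: list[Word]) -> str:
    """Render one cue per word, with the speaker in a WebVTT voice tag."""
    lines = ["WEBVTT", ""]
    for w in words:
        lines.append(f"{vtt_time(w.start)} --> {vtt_time(w.end)}")
        lines.append(f"<v {w.speaker}>{w.text}")
        lines.append("")  # a blank line ends each cue
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_vtt([Word("hej", 1.42, 1.89, "spk_02")]))
```

Segment-level output would follow the same pattern, with one cue per segment instead of one per word.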

Dataset delivery

We package audio, transcripts, metadata, consent references, QA notes, and manifests in the format your engineering team needs.

WAV, JSON, JSONL, CSV, or custom formats. Bucket transfer, API handoff, or batch delivery.
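
A manifest with checksums could, for example, be a JSONL file with one record per audio file. The following is a minimal sketch under assumed conventions: the directory layout (`<language>/<speaker>/<take>.wav`), the field names, and the `consent_ids` lookup are hypothetical and would follow the schema agreed for the project.

```python
import hashlib
import json
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Hash a file in 1 MiB chunks so large recordings never load fully into memory."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path, consent_ids: dict[str, str]) -> list[str]:
    """Return one JSON line per WAV file found under root.

    Assumes each file sits in a folder named after its speaker, and that
    consent_ids maps speaker folder names to consent references.
    """
    lines = []
    for wav in sorted(root.rglob("*.wav")):
        speaker = wav.parent.name
        record = {
            "path": wav.relative_to(root).as_posix(),
            "sha256": sha256_of(wav),
            "bytes": wav.stat().st_size,
            "speaker": speaker,
            "consent_id": consent_ids.get(speaker),  # None if no consent is linked
        }
        lines.append(json.dumps(record))
    return lines
```

On the receiving side, the same `sha256_of` helper can recompute each digest after transfer and compare it with the manifest entry.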

QA and project oversight

Reviewers inspect recordings, transcripts, and metadata while the project is live. Each batch passes a quality gate before it joins the final delivery.

Cross-batch consistency, reviewer sampling, and human escalation on every project.

QA in detail

Six controls running on every project.

QA runs inside production, not at the end. These are the checks that run on every batch, every contributor, and every delivery.

In-production review

Reviewers inspect recordings and transcripts while the project is live, so issues are caught during production.

Measurable checks

Audio quality, transcript accuracy, metadata completeness, and format compliance are checked against the project spec.
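
To make "checked against the project spec" concrete, here is a rough sketch of a per-item check. Every threshold, field name, and the use of word error rate (WER) as the transcript accuracy measure are assumptions for illustration; real values come from the agreed data spec.

```python
from dataclasses import dataclass


@dataclass
class Spec:
    """Hypothetical project spec; all defaults are placeholders."""
    sample_rate_hz: int = 16_000
    min_duration_s: float = 1.0
    max_wer: float = 0.05
    allowed_formats: tuple[str, ...] = ("wav",)
    required_metadata: tuple[str, ...] = ("speaker_id", "language", "consent_id")


def check_item(item: dict, spec: Spec) -> list[str]:
    """Return a list of failed checks; an empty list means the item passes."""
    failures = []
    # Audio quality
    if item.get("sample_rate_hz") != spec.sample_rate_hz:
        failures.append("audio: wrong sample rate")
    if item.get("duration_s", 0.0) < spec.min_duration_s:
        failures.append("audio: too short")
    # Transcript accuracy (a missing WER measurement counts as a failure)
    if item.get("wer", 1.0) > spec.max_wer:
        failures.append("transcript: WER above threshold")
    # Metadata completeness
    metadata = item.get("metadata", {})
    missing = [key for key in spec.required_metadata if not metadata.get(key)]
    if missing:
        failures.append("metadata: missing " + ", ".join(missing))
    # Format compliance
    if item.get("format") not in spec.allowed_formats:
        failures.append("format: not an allowed file type")
    return failures
```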

Batch handling

Large projects run in batches. Each batch passes its own quality gate before it enters the final delivery.
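
A batch gate can be sketched as a simple rule: a batch joins the final delivery only if the share of reviewed items that failed stays under a threshold. The 2% threshold and the function names below are illustrative assumptions, not fixed project values.

```python
def batch_passes(review_results: list[bool], max_failure_rate: float = 0.02) -> bool:
    """True if the reviewed items in a batch fail no more often than allowed.

    review_results holds one boolean per reviewed item (True = passed review).
    """
    if not review_results:
        return False  # nothing reviewed yet, so the gate cannot pass
    failures = review_results.count(False)
    return failures / len(review_results) <= max_failure_rate


def release(batches: dict[str, list[bool]]) -> tuple[list[str], list[str]]:
    """Split batch IDs into those cleared for delivery and those held for rework."""
    accepted, held = [], []
    for batch_id, results in batches.items():
        (accepted if batch_passes(results) else held).append(batch_id)
    return accepted, held
```

Held batches would go back through escalation and rework before being reviewed again.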

Issue escalation

Detected issues are escalated and resolved during production rather than discovered after delivery.

Consistency controls

Cross-contributor and cross-batch consistency checks keep the full dataset to the same standard throughout.

Reviewer sampling

Statistical sampling of recordings and transcripts validates quality without bottlenecking production throughput.
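
One common way to size a review sample is Cochran's formula with a finite population correction. The sketch below uses it as an example only; the 95% confidence level (z = 1.96), 5% margin of error, and function names are assumptions, and the sampling plan actually used on a project may differ.

```python
import math
import random


def sample_size(population: int, margin: float = 0.05,
                z: float = 1.96, p: float = 0.5) -> int:
    """Items to review to estimate a defect rate within +/- margin.

    Uses Cochran's formula with finite population correction. p = 0.5 is the
    most conservative assumption about the true defect rate.
    """
    if population <= 0:
        return 0
    n0 = (z ** 2) * p * (1 - p) / margin ** 2
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


def pick_for_review(item_ids: list[str], seed: int = 0) -> list[str]:
    """Draw a reproducible simple random sample of item IDs for reviewers."""
    rng = random.Random(seed)
    return rng.sample(item_ids, sample_size(len(item_ids)))


if __name__ == "__main__":
    print(sample_size(10_000))  # 370 under the default assumptions
```

Under these defaults, a batch of 10,000 recordings needs 370 reviewed items, which is why sampling can validate quality without reviewing every file.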

How we run it

From brief to dataset, on a single timeline.

Three steps from your data spec to a delivered dataset your training pipeline can ingest.

01 · Scope

Agree the data spec before anyone records.

Languages, accents, speaker profiles, recording setup, transcript format, metadata, and delivery requirements. We agree the spec before the first session.

02 · Build

Recruit, capture, transcribe, review.

Recording and transcription run alongside human QA. Issues surface during the project, not at handoff.

03 · Deliver

Audited, manifested, training-ready.

Audio, transcripts, metadata, and manifests packaged for direct ingestion. Mid-project adjustments and follow-on iterations are supported.

Get started

Tell us what training data you need

Tell us the languages, speech type, speakers, recording setup, transcript format, and metadata you need. We aim to reply within 48 hours with an initial workflow and data plan.

10,000+ contributors in our recruitment network
50+ languages and dialects recruited for
QA workflows on every project
48h target response for project briefs