Scripted monologues, dialect-tagged
- Maria · Madrid · 32 · Done
- Javier · Sevilla · 41 · Recording
- Ana · Bilbao · 27 · Queued
Tell us the languages, accents, speaker profiles, recording setup, speech type, transcript format, and metadata you need. We recruit, record, QA, transcribe, and deliver the dataset in the structure your team needs.
Most teams do not just need more audio. They need the right speakers, the right speech, and the right structure.
Spirelight builds that dataset.
We source contributors by language, dialect, age, gender, region, device, or other project criteria. Speakers record through browser-based tools with prompts, consent, audio checks, and project guidelines built into the workflow.
Review progress during production and receive validated batches with audio, transcripts, metadata, and manifests. Your team can test early batches and adjust the collection before the full dataset is finished.
Need Danish conversations, German call-center speech, French dialect coverage, wake words, commands, or audio plus video recordings? We design the collection and annotation workflow around your data spec, from speaker recruitment to final delivery.
We collect monologues, dialogues, wake words, commands, scripted prompts, roleplays, and natural conversations with speakers matched to your language and profile requirements.
We deliver machine-assisted or human-validated transcripts with timestamps, speaker labels, domain terminology, and annotation rules matched to your model training needs.
We package audio, transcripts, metadata, consent references, QA notes, and manifests in the format your engineering team needs.
Four reasons teams use Spirelight for custom speech data collection.
We can run the session on your hardware so the data matches the production room your product ships into.
Define the speakers you need by language, dialect, region, age, gender, device, environment, or other project criteria.
Contributor recruitment across 50+ languages and 30+ markets.
We can run remote, on-site, or studio-style sessions using defined microphones, devices, rooms, scripts, and acoustic requirements.
From single-speaker sessions to multi-day collection projects.
Your team can test early deliveries, identify gaps, and update speaker targets, prompts, or guidelines before the full dataset is complete.
Mid-project adjustments without restarting production.
We regularly recruit and review speakers in markets where off-the-shelf datasets are limited, including Nordic languages, regional dialects, and smaller European language varieties.
Recruiters and reviewers across Denmark, Sweden, Norway, Finland, and Iceland.
Your speech data partner for fine-tuning and language expansion.
We combine contributor recruitment, recording workflows, transcription, QA, and delivery so your team can expand into new languages without building local operations from scratch.
The workflow is similar across projects, but the speakers, prompts, recording conditions, annotations, and deliverables change with each use case.
Collect commands, activation phrases, device instructions, and short utterances across languages, accents, microphones, and environments.
Build language, accent, and dialect coverage for speech recognition, synthetic voice, and speech evaluation datasets.
Collect conversations, roleplays, customer service scenarios, emotional speech, and domain-specific interactions for more natural voice systems.
Spirelight combines commercial project design, recruitment operations, platform engineering, transcription workflows, QA, and delivery management in one team.
Andreas works with clients to turn model requirements into concrete data collection projects.
He defines the project scope, speaker targets, recruitment approach, and delivery expectations before production starts.
Emil supports operations, documentation, compliance coordination, and project delivery.
He helps structure the process so recruitment, consent, production, and handoff stay aligned.
Gustav leads the platform architecture behind Spirelight.
He builds the systems used to manage recording, transcription, QA, metadata, contributor workflows, and dataset delivery.
Joyi manages project execution across contributors, reviewers, and delivery teams.
She keeps production moving, follows up on daily progress, and helps ensure each project meets its agreed requirements.
Mateo coordinates contributors, recording workflows, and production tasks.
He helps translate project requirements into daily execution and keeps the different parts of the workflow aligned.
Pekka supports recruitment strategy and contributor operations.
He helps source speakers for projects with specific language, dialect, regional, or profile requirements.
Tell us the languages, speech type, speakers, recording setup, transcript format, and metadata you need. We respond within 48 hours with an initial workflow and data plan.
We collect targeted wake-word and command speech with the speakers, accents, and acoustic environments your model needs to handle in production.
Where this fits: automotive, smart home, appliances, wearables, voice assistants.
Off-the-shelf data leaves most of the world blank. We close the gap with native speakers vetted by region, not just by language code.
Where this fits: companies launching voice products in new languages, accents, or dialects.
Real human speech carries far more than words. We deliver datasets where the affective signal is captured, annotated, and structured for training.
Where this fits: voice agents, call centers, automotive safety, health tech, gaming, and any product that needs natural human-machine interaction.