Open Voice Project

We're building an open dataset of tabletop RPG session recordings from real gaming groups on Discord. Every participant consents before recording, each speaker is captured on a separate track, and participants choose their own license restrictions. The dataset will be released for speech research, transcription, and AI.

The gap

No open, consent-based TTRPG audio dataset exists

Tabletop RPG sessions produce some of the richest multi-speaker conversation in existence — character voices, narration, cross-talk, improvisation. Anyone building speech recognition, transcription tools, or conversational AI has no good source for this kind of audio.

Dataset	Audio	Per-speaker	Consent	Open
CRD3 (Critical Role)	Text only	—	Scraped	Yes
Playing with Voices	YouTube links	Mixed	No	No
FIREBALL	Text only	—	Yes	CC BY 4.0
Open Voice Project	Raw PCM	Per-speaker	Explicit	CC BY-SA 4.0 +

Dataset goals

What makes this different

Per-speaker tracks

Each participant is recorded as a separate audio file, so no diarization is needed.

Consent-first collection

Every participant explicitly agrees before any audio is captured.

Multi-system coverage

D&D, Pathfinder, Call of Cthulhu, and more. Not limited to one show or system.

Character voice acting

Players alter their voices for characters — a uniquely challenging ASR benchmark.

How it works

Record, consent, contribute

A Discord bot joins your voice channel, collects consent from every participant, and records each speaker as a separate lossless audio track.

GM runs /record in a text channel. Bot detects voice participants and posts a consent prompt.

Each player clicks Accept or Decline. Recording starts once everyone responds.

After accepting, players can toggle "No LLM Training" or "No Public Release." Defaults to fully open.

Bot joins voice and announces "Recording has begun." Each speaker is captured as a separate stream.

GM runs /stop. Audio is pseudonymized and uploaded. Bot announces "Recording complete."

Participants can review transcripts, flag private info, correct lines, and manage consent anytime on the web portal.
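The consent step above can be sketched as a small state tracker. This is a minimal illustration, not the bot's actual code: `ConsentPrompt`, `Response`, and every method name are hypothetical, and the rule that only participants who clicked Accept get a track is our assumption, since the steps above only say that recording starts once everyone responds.

```python
from dataclasses import dataclass, field
from enum import Enum


class Response(Enum):
    PENDING = "pending"
    ACCEPTED = "accepted"
    DECLINED = "declined"


@dataclass
class ConsentPrompt:
    """One consent prompt posted after /record (hypothetical structure)."""

    participants: set[str]
    responses: dict[str, Response] = field(default_factory=dict)

    def respond(self, user_id: str, response: Response) -> None:
        # Only people detected in the voice channel may answer the prompt.
        if user_id not in self.participants:
            raise KeyError(f"{user_id} is not in the voice channel")
        self.responses[user_id] = response

    def everyone_responded(self) -> bool:
        # Recording may start only when no participant is still pending.
        return all(
            self.responses.get(p, Response.PENDING) is not Response.PENDING
            for p in self.participants
        )

    def tracks_to_record(self) -> set[str]:
        # Assumption: a track is captured only for participants who accepted.
        return {
            p for p in self.participants
            if self.responses.get(p) is Response.ACCEPTED
        }
```

For example, with two people in the channel, `everyone_responded()` stays false until both have clicked a button, and `tracks_to_record()` then lists only those who accepted.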

Your data, your rules

Participant portal

Every participant can sign in with Discord and manage their recordings. You stay in control of your voice data — before, during, and after publication.

Under construction — coming soon.

Review transcripts

Read through your session transcripts with per-line audio playback. See exactly what was captured.

Flag private info

Spot a phone number or address in pre-game chatter? One click to flag it. Flagged lines are excluded from the published dataset.

Correct transcriptions

Fix what the speech recognition got wrong. Every correction improves the dataset — and produces free ASR training data.

Change permissions

Toggle "No LLM Training" or "No Public Release" per session, anytime. Upgrade or restrict — your choice.

Download your data

Export all your sessions, transcripts, and audio files. Full GDPR data portability.

Withdraw consent

Changed your mind? Withdraw consent for any session. If it hasn't been published yet, your audio is permanently deleted from storage.

What gets collected

Session bundles

Each recording session produces a structured bundle uploaded to secure storage. Discord identities are pseudonymized — the public dataset contains no usernames or IDs.

Audio

Per-speaker raw PCM, 48 kHz, 16-bit, mono
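Raw PCM has no header, so most tools need the format spelled out before they can read a track. The sketch below shows the storage arithmetic for 48 kHz, 16-bit, mono audio and wraps a raw track in a WAV container using Python's standard `wave` module. The function names and file paths are hypothetical, and it assumes the samples are signed little-endian, which is what 16-bit WAV expects; the byte order is not stated above.

```python
import wave

SAMPLE_RATE_HZ = 48_000
SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
CHANNELS = 1            # mono, one track per speaker
CHUNK_BYTES = 1 << 16   # even size, so a sample is never split across chunks


def pcm_bytes_per_hour() -> int:
    """Uncompressed size of one hour of a single speaker track."""
    return SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS * 3600


def pcm_to_wav(pcm_path: str, wav_path: str) -> None:
    """Copy raw PCM into a WAV file, streaming so long sessions fit in memory.

    Assumes signed 16-bit little-endian samples (not confirmed by the source).
    """
    with open(pcm_path, "rb") as src, wave.open(wav_path, "wb") as dst:
        dst.setnchannels(CHANNELS)
        dst.setsampwidth(SAMPLE_WIDTH_BYTES)
        dst.setframerate(SAMPLE_RATE_HZ)
        while chunk := src.read(CHUNK_BYTES):
            dst.writeframes(chunk)
```

At these settings one speaker produces 96,000 bytes per second, or 345,600,000 bytes (about 346 MB) per hour, so a four-hour session with five speakers is roughly 6.9 GB before any compression.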

Metadata

Game system, duration, participant count

Consent

Pseudonymized records with two-flag license restrictions
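A consent record of this shape could look like the sketch below. It is only an illustration: `ConsentRecord`, `pseudonymize`, and the field names are hypothetical, and keyed HMAC-SHA256 is a stand-in technique, since the pseudonymization method is not described here. It also assumes the two flags are independent and default to off, matching the "defaults to fully open" behavior described above.

```python
import hashlib
import hmac
from dataclasses import dataclass


def pseudonymize(discord_user_id: str, secret_key: bytes) -> str:
    """Map a Discord ID to a stable pseudonym (assumed scheme: keyed HMAC).

    Without the secret key, the pseudonym cannot be recomputed from the ID.
    """
    digest = hmac.new(secret_key, discord_user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


@dataclass(frozen=True)
class ConsentRecord:
    """Per-speaker, per-session consent with two restriction flags."""

    speaker_pseudonym: str
    session_id: str
    no_llm_training: bool = False    # default: fully open
    no_public_release: bool = False  # default: fully open

    def allows_public_release(self) -> bool:
        return not self.no_public_release

    def allows_llm_training(self) -> bool:
        return not self.no_llm_training
```

A record created with no flags set allows both uses. Setting `no_llm_training=True` restricts only training in this sketch, and how the two flags interact in the real pipeline is not specified above.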

Status

Early collection

We're recording sessions and validating the pipeline. The dataset will be published on Hugging Face once we have enough validated contributions. Participants choose their own license restrictions per session.

Why I'm building this

I need this dataset too

I'm building Session Helper, an open-source, self-hostable GM assistant that transcribes sessions, generates notes, and maintains a living campaign wiki. This dataset is how I'm training and evaluating the transcription pipeline, and I want it to be useful to everyone else working on speech and language tools too.