
Montreal Forced Aligner (MFA) Alignment for Speech Corpora (Male + Female)
Upwork
Remote
About
Overview
We are preparing two English audiobook corpora for research purposes:
- Male voice: 12 hours total (native 44.1 kHz)
- Female voice: ~5 hours total (native 44.1 kHz)
We require professional alignment using the Montreal Forced Aligner (MFA v3+) to produce high-quality TextGrids and training-ready files.

Scope
- Male corpus: full 12 hrs aligned.
- Female corpus: full 5 hrs aligned.
- Align at word + phone level using MFA (long TextGrid format).
- Export .lab and .ctm files.
- Provide QC metrics and process logs.
- Maintain strict folder schema (MFA_Kit_V7 with scripts/templates will be provided).

Deliverables (per speaker)
- MFA TextGrids (word + phone tiers).
- .lab and .ctm exports with durations.
- Combined lexicon (CMUdict + overrides provided).
- QC report (JSON/CSV) with coverage & OOV stats.
- Process log (process_log.md) + MFA version info.
- Updated manifest CSV (manifest_base_(speaker).csv).

Requirements
- Proficiency with Montreal Forced Aligner (MFA v3+).
- Experience aligning 5–12 hrs of clean audiobook narration.
- Familiarity with ARPAbet/CMUdict lexicons and G2P for OOVs.
- Ability to meet accuracy thresholds:
  - At least 99% phone coverage
  - OOV rate of at most 0.3% (with lexicon overrides updated)
  - Median boundary error ≤12 ms (validated on plosives/stops)
- Must deliver in our provided folder structure.

Assets Provided
- MFA Kit V7 (scripts, lexicon overrides, manifests, QC tools).
- Full male (12 hrs) and female (5 hrs) audiobook corpora at 44.1 kHz.
- Curation protocol (if needed for trimming/verification).

Milestones
Pilot (fixed price):
- Align 30 minutes per speaker (~200 utterances).
- Provide TextGrids, .lab, .ctm, and QC report.
- Approval required before proceeding.
Full Run:
- Female: full 5 hrs.
- Male: full 12 hrs.
- Deliver complete aligned pack with QC.

Confidentiality & IP
- All deliverables are Work-for-Hire.
- Ownership transfers fully to client.
- NDA required.
- Freelancer cannot reuse or retain datasets, lexicons, or outputs.
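
QC threshold check (illustrative sketch)
For applicants who want to self-check a pilot batch, the OOV and boundary-error thresholds listed under Requirements could be verified roughly as follows. This is only a minimal sketch and not the QC tooling shipped in MFA Kit V7. The function names are hypothetical. It assumes the OOV rate is counted per word token and that boundary error is measured against hand-checked reference boundaries paired one-to-one with the aligned ones. Phone coverage is left out because its exact definition is set by the kit.

```python
from statistics import median


def oov_rate(transcript_words, lexicon):
    """Fraction of word tokens not found in the lexicon (case-insensitive).

    Assumption: the rate is per token, not per unique word type.
    """
    tokens = [w.lower() for w in transcript_words]
    if not tokens:
        return 0.0
    missing = sum(1 for w in tokens if w not in lexicon)
    return missing / len(tokens)


def median_boundary_error_ms(reference_s, aligned_s):
    """Median absolute difference between paired boundaries, in milliseconds.

    Both inputs are boundary times in seconds, in the same order.
    Assumption: reference boundaries come from manual verification.
    """
    if not reference_s or len(reference_s) != len(aligned_s):
        raise ValueError("need two non-empty boundary lists of equal length")
    return median(abs(r - a) * 1000.0 for r, a in zip(reference_s, aligned_s))


def passes_thresholds(oov, boundary_ms, max_oov=0.003, max_boundary_ms=12.0):
    """Compare against the posting's limits: OOV 0.3%, median error 12 ms."""
    return oov <= max_oov and boundary_ms <= max_boundary_ms


if __name__ == "__main__":
    # Toy data for illustration only; not drawn from either corpus.
    lexicon = {"the", "cat", "sat"}
    rate = oov_rate(["The", "cat", "sat", "zyx"], lexicon)
    err = median_boundary_error_ms([0.10, 0.50, 0.90], [0.11, 0.50, 0.92])
    print(f"OOV rate: {rate:.3f}, median boundary error: {err:.1f} ms")
    print("pass" if passes_thresholds(rate, err) else "fail")
```

With the toy data above, the boundary error is within the limit but one unknown token in four far exceeds the OOV limit, so the check reports a failure. That is the case the lexicon overrides are meant to resolve.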