EmoPilot MOS Listening Set

Emotion cue examples for zero-shot TTS

Five target-emotion cases were selected for subjective mean opinion score (MOS) evaluation. Each case includes the paired target audio reference, the retrieved emotion prompts, and the speech generated by each evaluated system.

Evaluation protocol

1-5 MOS scale (higher is better)

R-MOS

Score each retrieved prompt by its emotional similarity to the paired ground-truth (GT) reference. Ignore any mismatch in wording between the prompt and the reference.

G-MOS

Score each generated speech sample by its emotional similarity to the paired GT reference and its fit to the transcript.
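
The page does not say how individual ratings are aggregated. As a minimal sketch, assuming each R-MOS or G-MOS value is the plain mean of the listeners' 1-5 ratings for one item, the following computes that mean with a normal-approximation 95% confidence interval. The function name `mos` and the choice of interval are illustrative assumptions, not part of the EmoPilot protocol.

```python
import math
import statistics


def mos(ratings):
    """Return (mean, 95% CI half-width) for a list of 1-5 ratings.

    Assumption: simple averaging with a normal-approximation interval;
    the actual aggregation used for this listening set is not stated.
    """
    if not ratings:
        raise ValueError("no ratings given")
    for r in ratings:
        if not 1 <= r <= 5:
            raise ValueError(f"rating out of range: {r}")
    mean = statistics.fmean(ratings)
    if len(ratings) < 2:
        # A single rating has no sample variance to estimate.
        return mean, 0.0
    # Sample standard deviation divided by sqrt(n) gives the standard error.
    half_width = 1.96 * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mean, half_width


if __name__ == "__main__":
    # Hypothetical ratings from five listeners for one generated sample.
    mean, ci = mos([4, 5, 3, 4, 4])
    print(f"G-MOS = {mean:.2f} +/- {ci:.2f}")
```

With few listeners per item, a t-distribution interval would be wider and arguably more appropriate than the 1.96 factor used here.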