Comparing the single-word intelligibility of two speech synthesizers for small computers

Cochran, Paula Sue, Curry School of Education, University of Virginia
Bull, Glen, Curry School of Education, University of Virginia
Deese, James, Department of Psychology, University of Virginia
Short, Jerry, Curry School of Education, University of Virginia
Stoudt, Ralph, University of Virginia
Theodoridis, George, Department of Biomedical Engineering, University of Virginia

Previous research on the intelligibility of synthesized speech has placed emphasis on the segmental intelligibility (rather than word or sentence intelligibility) of expensive and sophisticated synthesis systems. There is a need for more information about the intelligibility of low-to-moderately priced speech synthesizers because they are the most likely to be widely purchased for clinical and educational use.

The purpose of the present study was to compare the word intelligibility of two such synthesizers for small computers, the Votrax Personal Speech System (PSS) and the Echo GP (General Purpose). A multiple-choice word identification task was used in a two-part study in which 48 young adults served as listeners. Groups of subjects in Part I completed one trial listening to taped natural speech followed by one trial with each synthesizer. Subjects in Part II listened to the taped human speech followed by two trials with the same synthesizer.

Under the quiet listening conditions used for this study, taped human speech was 30% more intelligible than the Votrax PSS, and 53% more intelligible than the Echo GP. A statistically significant difference in word intelligibility was observed between the synthesizers, with the Votrax PSS being 18% more intelligible. Listeners who heard human speech followed by two different synthesizers performed comparably to those who heard the more likely clinical combination of human speech followed by just one synthesizer.

The observed difference between these speech synthesizers is likely to be most noticeable in clinical applications in which other contextual cues are minimal, or in which listeners are unlikely or unable to take advantage of such cues. In considering the factors bearing on the purchase of speech synthesizers for such applications, clinicians are encouraged to increase the priority they give to intelligibility.

Note: Abstract extracted from PDF file via OCR.

PHD (Doctor of Philosophy)
Speech, Intelligibility of, Word, Intelligibility of, Votrax PSS (Synthesizer), Echo GP (Synthesizer)
All rights reserved (no additional license for public reuse)
Issued Date: