 MATTINGLY 1974, p. 2455 
 

which according to Dudley and Tarnoczy (1950) had variable pitch and sang 'God Save the Queen'.

But the next real step forward was Dudley's Voder (Dudley et al. 1939), an offshoot of his Vocoder, described earlier. The ten filters of the Vocoder synthesizer were widened to cover the range 0-7500 Hz and a set of manual controls was supplied. The output amplitudes of the filters were controlled from a keyboard; a wrist bar selected buzz or hiss excitation and a foot-pedal controlled Fo. There were also special keys to generate automatically the sequence of closure and release required for stops. A year or more of training was required before an operator could produce intelligible speech, and each utterance had to be carefully rehearsed. The Voder was demonstrated successfully at the 1939 New York World's Fair and the 1940 San Francisco World's Fair.

There were two important differences between Dudley's approach to speech synthesis and von Kempelen's. First, Dudley's Voder was an electrical rather than a mechanical simulation, so that the acoustic properties of synthesizer components were reasonably predictable and design changes could be readily made. Second, the Voder simulated acoustic properties of speech whereas von Kempelen's simulated articulatory properties as well. Dudley's model made it easier to improve the rendition of particular speech sounds, but made a weaker claim about the nature of speech. However, Dudley's system had one important feature in common with von Kempelen's: a human operator was used, and the rules for synthesis were part of his skill.

The human operator disappeared from speech synthesis as an indirect result of R. K. Potter's invention of the sound spectrograph during World War II (Koenig et al. 1946). After the War, the spectrograph opened the way to extensive research in acoustic phonetics because it made it easy to observe the correspondence between speech sounds and events in the acoustic spectrum, notably formant movements. The spectrograph also suggested a new way of synthesizing speech: 'playing back' a spectrogram. Potter himself built a playback synthesizer (Young 1948); Cooper (1950) developed a research version, the Pattern Playback, which is still in use at Haskins Laboratories. In the Pattern Playback, an optical representation of an excitation spectrum with 50 harmonics and Fo at 120 Hz is shaped by a spectrographic pattern painted on a moving, transparent acetate belt; and this optical representation is then converted to an acoustic signal. Thus the synthesis of an utterance is not a transient performance but is controlled by a pre-planned pattern, and can be repeated. Moreover, the close correspondence between the output of the analyzing tool (the spectrograph) and the input to the synthesizing tool (the Playback) is convenient experimentally and of great value conceptually.
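The Playback principle described above can be approximated digitally. The following is only a rough sketch, not a model of the actual optical hardware: it sums 50 sinusoidal harmonics of a 120 Hz fundamental and weights each by a painted pattern that changes frame by frame. The sample rate, the frame duration, the function names `playback` and `steady_formant_pattern`, and the idea of a 0-to-1 "transmission" weight per harmonic are all assumptions made for illustration.

```python
import math

SAMPLE_RATE = 16000   # Hz; assumed for this sketch, not from the text
F0 = 120.0            # Hz; fundamental stated in the text
N_HARMONICS = 50      # number of harmonics stated in the text


def playback(pattern, frame_dur=0.01, sample_rate=SAMPLE_RATE):
    """Render a painted pattern as a list of audio samples.

    pattern[t][k] is an assumed 'transmission' weight (0.0 to 1.0) applied
    to harmonic k + 1 during frame t, standing in for paint on the belt.
    """
    samples_per_frame = int(round(frame_dur * sample_rate))
    out = []
    for t, frame in enumerate(pattern):
        if len(frame) != N_HARMONICS:
            raise ValueError("each frame needs one weight per harmonic")
        for i in range(samples_per_frame):
            time = (t * samples_per_frame + i) / sample_rate
            # Sum only the harmonics that the pattern lets through.
            value = sum(
                w * math.sin(2.0 * math.pi * F0 * (k + 1) * time)
                for k, w in enumerate(frame)
                if w
            )
            out.append(value / N_HARMONICS)
    return out


def steady_formant_pattern(n_frames, formant_hz, bandwidth_hz=150.0):
    """Hypothetical helper: paint one horizontal bar around formant_hz."""
    frame = [
        1.0 if abs(F0 * (k + 1) - formant_hz) <= bandwidth_hz else 0.0
        for k in range(N_HARMONICS)
    ]
    return [list(frame) for _ in range(n_frames)]


if __name__ == "__main__":
    pattern = steady_formant_pattern(n_frames=20, formant_hz=600.0)
    samples = playback(pattern)
    print(len(samples), "samples rendered")
```

Because the output depends only on the stored pattern, rendering the same pattern twice gives identical samples, which mirrors the point that a Playback utterance is pre-planned and repeatable rather than a one-off performance.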

The Haskins investigators used the Playback to study the psychology of speech perception and to accumulate a body of knowledge about the 'speech cues' (Liberman et al. 1967). Experienced users of the Playback, for example the late Pierre Delattre, could readily paint intelligible utterances; like the operators of von Kempelen's
 

Smithsonian Speech Synthesis History Project
Archives Center
NATIONAL MUSEUM OF AMERICAN HISTORY
Smithsonian Institution - Washington, D.C. 20560