FIRST Soap Vox Lecture of the new year | Wed 19 September | Alice Turk

at 18:00 in Appleton Tower

RSVP on Facebook

Timing in talking: What is it used for, and how is it controlled?

Abstract: Timing is an integral part of every aspect of speech production: of individual movements of the rib cage, tongue, jaw, lips, velum and laryngeal structures, of their coordinated muscular activity, and of the speech sounds they produce. An understanding of speech production therefore requires an understanding of timing: 1) what it is used for, and 2) how it is controlled. In the first part of this talk, I review our current understanding of what speakers use timing for, and how this understanding was acquired. I propose that one of the main uses of speech timing is to make utterances easier to recognize: it is used to signal individual speech sounds (e.g. did vs. dad) [1], and also to signal, and compensate for, the relative predictability of syllables and words due to their context and frequency of use (e.g. [2], [3], [4], [5]). I propose that this recognition goal is balanced against other goals, such as the need to speak quickly, or in rhythm, to yield surface sound durations in speech. I highlight the important role of prosodic structure for speech timing: Prosodic structure serves as the interface between language and speech ([6], [7], [8]), and controls acoustic saliency so that it compensates for relative (un)predictability ([3], [4], [5]). In the second half, I focus on two different views of how timing is controlled, i.e. with and without a domain-general timekeeping mechanism. Theories such as DIVA [9], based on VITE ([10]), and many Optimal Control Theory approaches (e.g. [11]) assume a general timekeeping mechanism, whereas Articulatory Phonology/Task Dynamics ([12-15]) suggest mechanisms for achieving surface timing patterns without a domain-general timekeeper. I present timing phenomena that occur in both speech and non-speech, showing how they can be explained within each type of framework. I finish by presenting evidence that may be difficult to explain without a domain-general timekeeping mechanism.
This evidence includes greater timing variability for longer duration intervals compared to shorter duration intervals (e.g. phrase-final segments vs. phrase-medial segments, [16]), patterns of differential timing variability for movement onsets vs. target attainment ([17]), and data suggesting a constraint on maximum syllable durations for phonemically short vowels in Northern Finnish [18].
1. Peterson, G., & Lehiste, I. (1960). Duration of syllable nuclei in English. Journal of the Acoustical Society of America, 32(6), 693-703.
2. Lieberman, P. (1963). Some effects of semantic and grammatical context on the production and perception of speech. Language and Speech, 6(3), 172-187.
3. Aylett, M. (2000). Ph.D. thesis, University of Edinburgh.
4. Aylett, M., & Turk, A. (2004). Language and Speech, 47, 31-56.
5. Turk, A. (2010). Does prosodic constituency signal relative predictability? A Smooth Signal Redundancy hypothesis. Journal of Laboratory Phonology, 1, 227-262.
6. Selkirk, E. O. (1978). On prosodic structure and its relation to syntactic structure. In T. Fretheim (Ed.), Nordic Prosody II (pp. 111-140). Trondheim: TAPIR.
7. Shattuck-Hufnagel, S., & Turk, A. (1996). A prosody tutorial for investigators of auditory sentence processing. Journal of Psycholinguistic Research, 25(2), 193-247.
8. Keating, P., & Shattuck-Hufnagel, S. (2002). A prosodic view of word form encoding for speech production. UCLA Working Papers in Phonetics, 101, 112-156.
9. Guenther, F. H. (1995). Speech Sound Acquisition, Coarticulation, and Rate Effects in a Neural-Network Model of Speech Production. Psychological Review, 102(3), 594-621.
10. Bullock, D., & Grossberg, S. (1988). Neural Dynamics of Planned Arm Movements – Emergent Invariants and Speed Accuracy Properties during Trajectory Formation. Psychological Review, 95(1), 49-90.
11. Todorov, E., & Jordan, M. I. (2002). Optimal feedback control as a theory of motor coordination. Nature Neuroscience, 5(11), 1226-1235.
12. Browman, C. P., & Goldstein, L. (1985). Dynamic modeling of phonetic structure. In V. A. Fromkin (Ed.), Phonetic linguistics (pp. 35-53). New York: Academic Press.
13. Saltzman, E. L., & Munhall, K. (1989). A dynamical approach to gestural patterning in speech production. Ecological Psychology, 1(4), 333-382.
14. Byrd, D., & Saltzman, E. (2003). The elastic phrase: modeling the dynamics of boundary-adjacent lengthening. Journal of Phonetics, 31(2), 149-180.
15. Saltzman, E., Nam, H., Krivokapic, J., & Goldstein, L. (2008). A task-dynamic toolkit for modeling the effects of prosodic structure on articulation. Paper presented at Speech Prosody 2008, Campinas, Brazil.
16. Byrd, D., & Saltzman, E. (1998). Intragestural dynamics of multiple prosodic boundaries. Journal of Phonetics, 26(2), 173-199.
17. Perkell, J. S., & Matthies, M. L. (1992). Temporal measures of anticipatory labial coarticulation for the vowel /u/ – within-subject and cross-subject variability. Journal of the Acoustical Society of America, 91(5), 2911-2925.
18. Nakai, S., Turk, A., Suomi, K., Granlund, S. C., Ylitalo, R., & Kunnari, S. (in press). Quantity and constraints on the temporal implementation of phrasal prosody in Northern Finnish. Journal of Phonetics.


Entry is £1 and FREE to active members. Membership can be purchased on our EUSA profile or otherwise on the night.

The talk will start at 6:00 p.m. and last about 1 hour. It will be followed by a Q&A session (about half an hour). We will then go to a pub for food and drink with the speaker.

Our talks are public lectures open to all, whether or not you are a student and, if you are, whatever and wherever you study. We aim for all of our events to be accessible to all; please feel free to contact us beforehand if you require assistance or further information.
