The length of a uniform acoustic tube constitutes 1/4, 3/4, and 5/4 of the wavelength of its first three standing waves, respectively. If this length is L, then the wavelengths of the three standing waves are 4L, 4L/3, and 4L/5. Since frequency (f) is inversely proportional to wavelength (λ),
\[ f = \frac{c}{\lambda} \]
where c is the speed of sound, the three resonance frequencies are in the relation 1:3:5. It is for this reason that the neutral vowel, [ə], which has an articulatory configuration that approximates a uniform tube, has F1, F2, and F3 formant frequencies of, for example, 500, 1500, and 2500 Hz.
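The arithmetic behind the 1:3:5 relation can be sketched in a few lines. The tube length (17.5 cm) and speed of sound (350 m/s) below are illustrative assumptions, not values given in the text; they are chosen only because they reproduce the example formants of 500, 1500, and 2500 Hz. The function name `tube_resonances` is likewise hypothetical.

```python
def tube_resonances(length_m, speed_of_sound=350.0, n_formants=3):
    """Resonance frequencies (Hz) of a uniform tube whose length is
    1/4, 3/4, 5/4, ... of the wavelength of its standing waves."""
    freqs = []
    for n in range(1, n_formants + 1):
        # The n-th standing wave has wavelength 4L / (2n - 1).
        wavelength = 4.0 * length_m / (2 * n - 1)
        # f = c / wavelength
        freqs.append(speed_of_sound / wavelength)
    return freqs


# Assumed values: L = 0.175 m, c = 350 m/s (illustrative only).
formants = tube_resonances(0.175)
print([round(f) for f in formants])  # [500, 1500, 2500]
```

Whatever values are assumed for L and c, the ratios between the three frequencies remain 1:3:5, since both quantities cancel out of the ratio.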
If the formant structure of the uniform tube is also the average formant structure of the vocal tract, then we should find not just that [ə] has this pattern, but also that the means of F1, F2, and F3 should be in a 1:3:5 relation, and that the true mean of F1 can be estimated by dividing the mean of F2 by 3, or dividing the mean of F3 by 5.
A small experiment was carried out to test this hypothesis. The data came from a database constructed by an Introduction to Phonetics class. Eleven students individually produced 15 tokens each of the 11 vowels /i:/ (as in beat), /i/ (as in bit), /e:/ (as in bait), /e/ (as in bet), /æ/ (as in bat), /ɑ/ (as in pot), /ɔ/ (as in bought), /u:/ (as in boot), /ʌ/ (as in but), /ʊ/ (as in put), /o:/ (as in boat), in the environment ``Say {b,d,g}_t again''. The vowel set is nearly balanced across high (n=4), mid (n=4), and low (n=3), and across front (n=6) and back (n=5). The students also measured the first three formant frequencies for each token. For each student, I estimated the ``true'' mean F1 in the three ways mentioned: sample-mean F1, (sample-mean F2)/3, and (sample-mean F3)/5. To see how close these estimates are, I calculated the standard deviation of the three estimates for each speaker. These are given in the following sorted list, rounded to the nearest Hertz:
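As a rough illustration of the per-speaker calculation, the sketch below forms the three estimates of mean F1 and takes their standard deviation. The speaker means are hypothetical values invented for the example, not data from the class database, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, since the text does not say which form was used.

```python
import statistics


def f1_estimates(mean_f1, mean_f2, mean_f3):
    """Three estimates of the 'true' mean F1 under the 1:3:5 hypothesis."""
    return [mean_f1, mean_f2 / 3.0, mean_f3 / 5.0]


# Hypothetical speaker means in Hz (illustrative, not from the database).
estimates = f1_estimates(520.0, 1530.0, 2600.0)

# Sample standard deviation of the three estimates (assumed form).
spread = statistics.stdev(estimates)

print(estimates)
print(round(spread, 1))
```

A small spread means the three estimates agree, which is what the 1:3:5 hypothesis predicts for a speaker's mean formants.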
4 12 13 17 28 28 29 35 36 51 52
The results have a median of 28 Hz and a range of 4 to 52 Hz. Thus the three estimates of neutral F1 are generally within about 30 Hz of their mean; this is on the order of the hand measurement error for a single token (cf. Labov, Yaeger, and Steiner, 1972:29, 32, and Vol. II, Figures, p. 9). I conclude that the frequency relations of F1, F2, and F3 are fairly close to the 1:3:5 ratios that are predicted by a model in which the formant structure of the average vowel is identified with that of the uniform, lossless acoustic tube. We may reasonably suppose that this average is the norm, and that actual vowels may be thought of as deviating from it in various directions to various degrees.
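The summary figures quoted for the eleven per-speaker standard deviations can be checked directly from the sorted list. The sketch below uses only the values reported above; the 30 Hz threshold is taken from the text, and the variable names are illustrative.

```python
import statistics

# Per-speaker standard deviations (Hz), as reported in the sorted list.
sds = [4, 12, 13, 17, 28, 28, 29, 35, 36, 51, 52]

median_sd = statistics.median(sds)     # middle (6th) of 11 sorted values
low, high = min(sds), max(sds)         # range of the list
n_within_30 = sum(1 for s in sds if s <= 30)

print(median_sd, low, high)            # 28 4 52
print(n_within_30, "of", len(sds))     # 7 of 11
```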