Next we will use these principles to determine the effect on F1, F2, F3, etc., of constrictions at varying points in the acoustic tube formed by the vocal tract. Remember the first three standing waves in the ideal half-open acoustic tube: these standing waves constitute F1, F2, and F3 in the model, and their frequencies are related as 1:3:5.
Consider the locations of the velocity nodes and antinodes of these three standing waves in a uniform acoustic tube (the nodes and antinodes are numbered uniquely for later convenience of reference):
The boundary conditions, of zero velocity fluctuation at the closed end, and of zero pressure fluctuation at the open end, are signified by locating a node and antinode at the respective ends of the tube. Each higher-frequency standing wave has one additional node-antinode pair equally spaced along the length of the tube.
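The positions implied by these boundary conditions can be tabulated with a short sketch. It assumes an ideal uniform tube of unit length with the closed end at 0 and the open end at 1; the function name `mode_landmarks` is introduced here for illustration, and the node and antinode labels of the original figure are not reproduced.

```python
from fractions import Fraction

def mode_landmarks(n):
    """Velocity nodes and antinodes of the n-th standing wave (n = 1, 2, 3, ...)
    in an ideal tube closed at x = 0 and open at x = 1 (fractions of tube length)."""
    odd = 2 * n - 1  # frequency relative to the first resonance: 1, 3, 5, ...
    nodes = [Fraction(2 * k, odd) for k in range(n)]          # zero velocity
    antinodes = [Fraction(2 * k + 1, odd) for k in range(n)]  # maximum velocity
    return odd, nodes, antinodes

for n in (1, 2, 3):
    ratio, nodes, antinodes = mode_landmarks(n)
    print(f"F{n}: relative frequency {ratio}, "
          f"nodes at {[str(x) for x in nodes]}, "
          f"antinodes at {[str(x) for x in antinodes]}")
```

For the second standing wave this gives nodes at 0 and 2/3 and antinodes at 1/3 and 1, measured from the closed end.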
I will repeatedly make the analogy from this acoustic tube to the vocal tract. If the vocal tract in the position of [] is taken as a uniform acoustic tube, then the lips are at the open end, the glottis is at the closed end, the alveolar ridge is between A2 and N3, the palate or front velum is located around N3, and A3 is around the upper pharynx.
Consider first a simple case: a constriction at the open end of the tube. The result of this constriction, which is at an antinode for each of the three resonances, is the lowering of all three frequencies. Lip-rounding, and the movement towards labial closure that is associated with transitions into labial consonants, creates a localized constriction at the open end of the vocal tract. This predicts the well-established acoustic effect: In transitions from vowels to labial consonants, all the formants typically fall in frequency.
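The sign of this effect can be checked numerically. The sketch below is not taken from the source: it assumes a first-order perturbation of a uniform tube of unit length, in which a small constriction is weighted by cos((2n-1)πx), a weight that is +1 at velocity nodes and -1 at velocity antinodes. The function name, the 5% extent, and the 20% depth of the lip constriction are illustrative assumptions.

```python
import math

def shift_from_local_constriction(n, start, end, depth):
    """First-order relative frequency change dF/F of the n-th resonance when the
    tube area is reduced by the fraction `depth` between `start` and `end`
    (0 = closed end/glottis, 1 = open end/lips).

    Assumption: the constriction is weighted by cos((2n - 1) * pi * x),
    which is +1 at velocity nodes and -1 at velocity antinodes."""
    a = (2 * n - 1) * math.pi
    # Exact integral of cos(a * x) over [start, end], scaled by the depth.
    return depth * (math.sin(a * end) - math.sin(a * start)) / a

# Hypothetical lip constriction: outermost 5% of the tube, 20% area reduction.
for n in (1, 2, 3):
    shift = shift_from_local_constriction(n, 0.95, 1.0, 0.2)
    print(f"F{n}: dF/F = {shift:+.4f}")
```

Under these assumptions all three shifts come out negative, in line with the observation that the formants fall in transitions into labial consonants.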
What if the acoustic tube is constricted more globally, so that the constriction is not local to the opening (the lips), for example, but gradually increases along the length of the tube? The effect on the higher-frequency standing waves (F2, F3, F4, ...) will be relatively null, or washed out: the distributed, gradual constriction lowers the resonance frequencies by constricting at antinodes, and also raises them by constricting at nodes. These opposing effects largely cancel.
This fact is quite general: if a constriction is not local to one node or antinode, then its effect will wash out. One can consider the overall Rayleigh's-rule effect as the sum of constrictions:
Relation 1: The change in frequency of a standing wave is proportional to the sum of constrictions, weighted positively the closer they are to a node and negatively the closer they are to an antinode.
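One way to make this relation explicit is a first-order perturbation formula for a uniform tube. This is a sketch under stated assumptions, not the source's own formulation: the symbols x (distance from the closed end), L (tube length), A (unperturbed area), c(x) (fractional area reduction), and w_n(x) (weighting function) are notation introduced here, and the relation is read as a statement about the change in the frequency F_n of the n-th standing wave.

```latex
\frac{\Delta F_n}{F_n} \;\approx\; \frac{1}{L}\int_0^L c(x)\, w_n(x)\, dx,
\qquad
c(x) = -\frac{\Delta A(x)}{A},
\qquad
w_n(x) = \cos\!\left(\frac{(2n-1)\,\pi x}{L}\right)
```

The weight w_n equals +1 at each velocity node and -1 at each velocity antinode of the n-th standing wave, so a constriction (c > 0) raises the frequency near a node and lowers it near an antinode.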
Non-local constrictions for a given standing wave will add to both the node and antinode sums, and thereby cancel. Thus the higher formants will not be significantly affected by constrictions that are relatively long (that is, distributed along the length of the vocal tract). For this reason, for example, F3 is less well correlated with vowel quality than F1 or F2: tongue-body constrictions that affect vowel quality are not particularly localized. Thus theories of the exact location and degree of constriction for vowels seem to be overspecified (cf. Woods' theory, discussed in the next chapter). A gradually increasing constriction along the length of the tube will therefore primarily affect the 1/4-wavelength standing wave, namely F1, because its only node lies at the closed end, leaving no node in the interior of the tube by which the effect would be washed out. Thus the prediction is that the frequency of the 1/4-wavelength resonance, F1, will fall due to a constriction at its antinode. The other resonances will be much less affected, since the effects of simultaneous constriction at both nodes and antinodes, which result from a non-localized constriction, will cancel each other out.
Conversely, if there is a gradual opening of the vocal tract, which reaches maximum opening at the antinode at the open end, the frequency of the 1/4-wavelength wave will rise, and the other resonances will be relatively unaffected. The jaw, of course, has precisely this effect: as the jaw opens and closes, the acoustic tube formed by the vocal tract opens like a horn, and closes like a bottle. Note that jaw opening does not merely open the front of the mouth; it also moves the tongue body back into the pharynx, because the jaw is hinged on a point above and behind the curved tube formed by the vocal tract. Thus the result of jaw opening is that the vocal tract opens at the open end and closes at the closed end, resulting in an extremely gradual, articulatorily distributed effect, even more than would result solely from front-of-mouth opening. This explains what may be the main fact of acoustic phonetics: as the mouth opens and closes, F1 rises and falls in frequency. F1 measures vocal-tract openness.
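The washing-out of distributed constrictions can be illustrated numerically. The sketch below is not from the source: it assumes a first-order perturbation of a uniform tube of unit length, with each constriction weighted by cos((2n-1)πx), which is +1 at velocity nodes and -1 at velocity antinodes. The two area profiles, a linear ramp and a jaw-like tilt, and their 20% magnitudes are hypothetical.

```python
import math

def relative_shift(n, constriction, steps=10000):
    """First-order dF/F of the n-th resonance for a profile constriction(x):
    the fractional area reduction at x (0 = glottis, 1 = lips); negative
    values mean widening. Integrated with the midpoint rule."""
    a = (2 * n - 1) * math.pi
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * h
        # Assumed weight: +1 at velocity nodes, -1 at velocity antinodes.
        total += constriction(x) * math.cos(a * x)
    return total * h

def ramp_constriction(x):
    # Narrowing grows linearly from 0 at the glottis to 20% at the lips.
    return 0.2 * x

def jaw_opening(x):
    # 20% narrowing at the glottis end, tilting to 20% widening at the lips.
    return 0.2 * (1.0 - 2.0 * x)

for name, profile in (("ramp", ramp_constriction), ("jaw", jaw_opening)):
    shifts = [relative_shift(n, profile) for n in (1, 2, 3)]
    print(name, [f"{s:+.4f}" for s in shifts])
```

Under these assumptions the ramp lowers F1 about nine times more than F2 and about twenty-five times more than F3, and the jaw-like tilt raises F1 by the same factors relative to F2 and F3. The higher resonances are not exactly unaffected, but their shifts are small by comparison.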
Let us now consider the effects of constrictions which are rather more localized along the length of the tube. If there is a constriction at the medial antinode of the second resonance, i.e., at 1/3 of the distance from the closed end to the open end, the second resonance frequency will fall. If the constriction occurs at the node of the second resonance, namely 2/3 of the distance from the closed end, the second formant frequency will rise. These predictions are borne out by the observed association of F2 frequency with tongue body frontness. The location of the tongue body is roughly between 1/3 and 2/3 of the way from glottis to lips.
If the tongue body is closer to A3 than N3, forming a greater constriction at the antinode location than at the node location, then F2 will be lower in frequency than the uniform-open-tube F2 frequency. In other words the tongue will be relatively back, making the vowel articulatorily back, and F2 will fall.
Conversely, consider how the constriction might be greater at the node than the antinode. The tongue body would be fronted towards the point 2/3 of the way from glottis to lips, making a relatively greater constriction at node than at antinode, thereby raising this resonance frequency, namely F2. Indeed, the 2/3 location is roughly at the front velum or palate, and front vowels as a result have a relatively high F2 frequency.
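The back and front cases can be compared directly for the second resonance. This is a minimal sketch, not the source's method: it assumes a first-order perturbation of a uniform tube of unit length, with the constriction weighted by cos(3πx) for F2, and the 10% width and 20% depth of the tongue-body constriction are hypothetical values.

```python
import math

def shift_from_local_constriction(n, center, width, depth):
    """First-order dF/F of the n-th resonance for an area reduction of the
    fraction `depth` over a span of `width` centred at `center`
    (0 = glottis, 1 = lips). Assumed weight: cos((2n - 1) * pi * x)."""
    a = (2 * n - 1) * math.pi
    start, end = center - width / 2, center + width / 2
    # Exact integral of cos(a * x) over [start, end], scaled by the depth.
    return depth * (math.sin(a * end) - math.sin(a * start)) / a

back = shift_from_local_constriction(2, 1 / 3, 0.1, 0.2)   # at the F2 antinode
front = shift_from_local_constriction(2, 2 / 3, 0.1, 0.2)  # at the F2 node
print(f"constriction at 1/3 (back):  dF2/F2 = {back:+.4f}")
print(f"constriction at 2/3 (front): dF2/F2 = {front:+.4f}")
```

Under these assumptions the constriction at 1/3 lowers F2 and the same constriction at 2/3 raises it by an equal amount, matching the back and front cases described above.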
We have thus explained the second fundamental fact of acoustic phonetics: vowel frontness is precisely correlated with F2 frequency, where fronter vowels have higher F2 (closer to the node N3) and backer vowels, closer to antinode A3, have a lower F2. F2, to a first approximation, measures tongue-body position relative to A3 and N3, that is, within the middle third of the vocal tract.
We have here shown how to derive the frequencies of the first two formants directly from considerations of the shape of the acoustic tube. The main acoustic cues to vowel quality are direct correlates of the degree of mouth-opening and of the position of the tongue body relative to the node and antinode of the second resonance, that is, of the degree of tongue-body frontness.
These results show that F1 and F2 directly reflect articulatory configurations, and that the phonological and auditory dimensions of vowel space -- height and frontness -- are directly related to the two formant frequencies and to the articulatory configurations that they reflect. The acoustic measurements of F1 and F2 that form the body of this thesis are to be understood in the light of these results. These measurements are not merely acoustic, or articulatorily and auditorily uninterpretable features derived from the signal. These resonance frequencies are direct reflections of articulatory mouth-opening and tongue-body frontness.
These rather global articulatory properties of the shape of the vocal tract are themselves the fundamental dimensions of variation of vowel quality. Thus auditory vowel height and frontness, F1 and F2, and articulatory degree of mouth-opening and of tongue-body frontness are three physically interconvertible representations of phonetic vowel quality. For this reason, F1-F2 space is itself an excellent representation of vowel quality (given a fixed vocal tract length).
This argument justifies the methods of this thesis, which extensively uses and interprets F1-F2 measurements in terms of height and frontness. Since such heavy reliance is put on this argument, we will now consider how the theory it is based on performs with respect to the remaining main facts about the relationship between formants and articulation.