- A formant is a resonance frequency of the vocal tract.
- Chiba
& Kajiyama state that ``It was our original goal to write this book
without making use of mathematical equations'' (1941:i), though they
do not achieve it. For accessibility's sake, this goal remains a worthy one.
- Air particles are to be thought of as
infinitesimal volumes of air rather than as atoms per se. Air volumes
have a pressure, but atoms do not.
- By
velocity waveform, I mean the waveform which describes the velocity of
air along the length of the tube.
- This assumes
a vocal tract 17.1 cm in length, and a speed of sound, c, of 343
meters/second -- that is, a nearly average male vocal tract
under normal atmospheric conditions.
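The arithmetic behind these figures can be sketched as follows. The footnote does not state a formula, so I assume here the standard quarter-wavelength model of a uniform tube closed at one end and open at the other; the function name and the choice of three resonances are illustrative only.

```python
def tube_resonances(length_m, speed_of_sound=343.0, count=3):
    """Resonances (Hz) of a uniform tube closed at one end, open at the other.

    Assumed model: F_n = (2n - 1) * c / (4 * L), for n = 1, 2, 3, ...
    """
    return [(2 * n - 1) * speed_of_sound / (4.0 * length_m)
            for n in range(1, count + 1)]


formants = tube_resonances(0.171)   # 17.1 cm expressed in meters
print([round(f) for f in formants])  # [501, 1504, 2507]
```

Under these assumptions the resonances fall close to the familiar round values of 500, 1500, and 2500 Hz.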
- Though see Ohala (1985) for its use in
explaining the feature ``flat.''
- This relation is expressed mathematically in terms
of an area function of the vocal tract (which specifies how wide the
tube is at each point relative to its average width) and a weighting
function which weights constrictions positively at nodes and
negatively at antinodes. The frequency change may be found by
multiplying the values of the area function by the corresponding
values of the weighting function at each point along the length of the
tube, and adding (integrating) the results. The sum of positively
weighted constrictions is balanced against the sum of negatively
weighted constrictions; the larger one determines the direction of
change. If the constrictions are positively weighted, and the
widenings (negative constrictions!) are negatively weighted, the
effect is an even more positive one, and the frequency rises a great deal.
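The multiply-and-sum procedure described in this note can be sketched in a few lines. The particular weighting function below is my assumption, not something given in the text: it is the cosine form that follows from treating the tract as a tube closed at the glottis (x = 0) and open at the lips, so that it equals +1 at a velocity node and -1 at a velocity antinode. Only the sign and relative size of the result are meant to be informative.

```python
import math


def weighting(x, length, n):
    """Assumed weighting for resonance n at position x along the tube.

    +1 at velocity nodes (e.g. the glottis), -1 at velocity antinodes
    (e.g. the lips), for a tube closed at x = 0 and open at x = length.
    """
    k = (2 * n - 1) * math.pi / (2.0 * length)
    return math.cos(2.0 * k * x)


def frequency_shift(constriction, length, n):
    """Sum constriction * weight over equal-width sections of the tube.

    constriction[i] > 0 means section i is narrower than average,
    constriction[i] < 0 means it is wider (a "negative constriction").
    """
    dx = length / len(constriction)
    total = 0.0
    for i, c in enumerate(constriction):
        x = (i + 0.5) * dx  # midpoint of section i
        total += c * weighting(x, length, n) * dx
    return total


# Hypothetical ten-section tubes, 17.1 cm long, first resonance (n = 1).
at_lips = frequency_shift([0] * 9 + [1], 0.171, 1)        # negative: F1 falls
at_glottis = frequency_shift([1] + [0] * 9, 0.171, 1)     # positive: F1 rises
both = frequency_shift([1] + [0] * 8 + [-1], 0.171, 1)    # larger positive
```

The last case is the one the note ends on: a narrowing at a node combined with a widening at an antinode, whose effects add rather than cancel.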
- For
example, V. Zue makes this point in spectrogram-reading instructions.
- Thanks to Dave Graff for referring me to this
- Dental and retroflex sounds in these
contexts are not part of my native phonological inventory; I am a
native Californian English speaker raised in Thailand (1-4), Okinawa
(4-10), and California (10-23), and my parents were also raised on the
West Coast. However, two years of intensive Hindi study
in the U.S. and in New Delhi provided some training and experience in
producing them for the purpose of communication.
- The formant tracker used is part of the waves+
signal-processing package. Default parameters accurately tracked my
voice: The signal is preemphasized using a preemphasis coefficient of
0.7. Then it is divided into overlapping frames, 100 per second, with
each frame 49 ms long, and the frames are windowed with a cosine^4
windowing function. Then an autocorrelation algorithm is used to fit
12 LPC coefficients, from which spectral peaks and bandwidths are
calculated. Finally, a dynamic-programming algorithm is used to find
the globally optimum mapping between the LPC peaks and 4 formants,
where optimality is measured by narrowness of bandwidths and
continuity of formants over time.
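The first stages of this pipeline (preemphasis, framing, windowing, and the autocorrelation fit of 12 LPC coefficients) can be sketched as below. This is not the waves+ code: the function names are mine, a plain raised-cosine (Hann) window stands in for the package's window, and the later stages (peak extraction and the dynamic-programming assignment of peaks to formants) are left out.

```python
import math


def preemphasize(signal, coeff=0.7):
    """y[n] = x[n] - coeff * x[n-1]; boosts high frequencies before LPC."""
    return [signal[0]] + [signal[n] - coeff * signal[n - 1]
                          for n in range(1, len(signal))]


def frames(signal, rate, frame_rate=100, frame_dur=0.049):
    """Overlapping frames: one every 1/frame_rate s, each frame_dur s long."""
    step = int(round(rate / frame_rate))
    size = int(round(rate * frame_dur))
    return [signal[s:s + size]
            for s in range(0, len(signal) - size + 1, step)]


def hann_window(frame):
    """Raised-cosine taper (a stand-in for the tracker's own window)."""
    n = len(frame)
    return [v * 0.5 * (1.0 - math.cos(2.0 * math.pi * (i + 0.5) / n))
            for i, v in enumerate(frame)]


def autocorrelation(frame, max_lag):
    return [sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
            for lag in range(max_lag + 1)]


def lpc(frame, order=12):
    """Autocorrelation-method LPC via the Levinson-Durbin recursion.

    Returns [1, a1, ..., a_order] for the predictor polynomial
    A(z) = 1 + a1 z^-1 + ... + a_order z^-order.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # silent or degenerate frame
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        error *= (1.0 - k * k)
    return a
```

A frame would be processed as `lpc(hann_window(f))` for each `f` in `frames(preemphasize(x), rate)`; the roots of the resulting polynomial give the candidate spectral peaks and bandwidths.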
- If it is indeed a straight line, then
slope = (max(onsetF2)-min(onsetF2))/(max(nucleusF2)-min(nucleusF2)).
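The range-ratio formula above can be written out directly. The function name and the F2 values are invented for illustration; note also that the formula agrees with a fitted slope only when the points really do lie on a line with non-negative slope, which is the condition the note states.

```python
def extreme_value_slope(onset_f2, nucleus_f2):
    """Slope of onset F2 against nucleus F2, using only the extreme values."""
    return ((max(onset_f2) - min(onset_f2))
            / (max(nucleus_f2) - min(nucleus_f2)))


# Made-up measurements in Hz, chosen to lie on a straight line.
nucleus = [800.0, 1500.0, 2200.0]
onset = [1400.0, 1700.0, 2000.0]
slope = extreme_value_slope(onset, nucleus)  # 600 / 1400
```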
- For example, see Figure 3,
Pettersson & Wood (1987), which displays the nodes and antinodes of
F1 through F4 in the vocal tract, adapted from Chiba & Kajiyama
- Thanks to David Graff for this formulation.
- Ladefoged (1990:401) also suggests that the vowel
features (High, Low, and Back, for him) are acoustic features.
- The issues of temporal
and static structure obviously are interrelated, since the features
available at any point depend on the location within the syllable.
Consonant features and vowel features are not identical; the set
of features distinguishing among nuclei may well be different from
those distinguishing among glides; certain features (e.g., place of
articulation features) may be available only in certain parts of the
syllable onset and not in others. And so on.
- Apparently the
answer is both yes and no. Peterson
& Barney's (1952) study of General American deserves mention in this
context since the data from that study is continually being
recirculated (e.g., Watrous 1991). They collapse together speakers of
several dialects differing, for example, in whether they have or don't
have the phonological distinction between /ɑ/ and /ɔ/ as
in cot and caught. The rather anomalous overlap between
measurements of these classes (Figure 8) even in highly-monitored
speech is therefore not very surprising.
- I lack distinctive categories for /ɔ, a:,
'ærV/ as in THOUGHT, PALM, & marry.
- The layout in this listing is designed
so that sets that are not distinguished in one dialect or another are
relatively close to one another, where possible. The sequence of
columns may be described as: front-r-glides, front-glides, other front
vowels, other vowels, back-glides, other-r-glides.
Cf. Labov, Yaeger, & Steiner (1972, Chapter 6 and Appendix A).
- Of course, stating a merger using lexical sets does not
show what the resulting phonetic forms are.
- My monitored
pronunciations of these are [lei, le;
mo, mo, sii, si].
- Bailey (1985:162) makes the same point.
- Thus a secretary recently heard answering
the telephone as ``Peter Science'' was reinterpreted correctly as
saying ``Computer Science'' after normalizing for utterance-initial
syllable deletion, /uw/-fronting, and the physical location of the
speech event. Or for example in some speech recorded in Southampton,
England, an utterance-initial syllable was constituted by a single
pitch pulse, 40ms before the onset of the audible part of the
utterance. Such observations may be typical of naturally spoken
- As mentioned above,
these sound classes often have phonetic inglides in the Southern U.S.
-- cf. the Alabama chapter -- but not in the basically Northern
dialect of Reference American.
- Phonetically, these may be
written [phIt], [pht],
[pht], [phUt], respectively.
- [biy], [biyn]; [beI],
[beIn]; [buw], [buwn];
[boUn], respectively.
- See also Peterson
and Lehiste (1960) on ``intrinsic vowel length''.
- Los
Angeles Chicano, Jamaica Creole, and Wisconsin/Minnesota, to cite
three dialects, often have monophthongs for the ``long'' vowels. Cf.
also the duration measurements in Appendix 3.
- [b], [boI],
[bo], respectively.
- These may be written
phonetically as [hiyd], [hd],
[hiy], [h], respectively.
- These may be transcribed as
[le], [lei], [mo],
[mo], [fi], [fii],
[tu], [du].
- For me, these all end in [].
- flour and flower may
be a poor example, since they are historically derived from the same
word, as in ``the flower of the wheat.''
- [siy], [se],
[soU], [suw], respectively.
- [bier, be,
bo, b].
- For example, as pointed out in Chomsky &
Halle (1968), and Janda (1988), German on-glides /w,y/ become
fricatives, but off-glides do not. Thus /au/, /aI/ don't become [av],
- But see also, inter alia, Bloomfield (1934),
Whorf (1943), Swadesh (1947), Pike (1947).
- The lexical sets with no place
in this table are six: CLOTH, which goes with either THOUGHT or LOT,
depending on dialect; BATH, which goes with either TRAP or PALM,
depending on dialect; NORTH, which goes with FORCE in most dialects;
and HAPPY, LETTER, COMMA, which are lexically unstressed and therefore
excluded from T&B's consideration.
- These symbols are constructed from T&B's table. /
w/ is often also written as /ow/.
- However, /oy/ is raised to mid or higher in many
dialects including my own; I therefore analyse it, below, in Reference
American, as containing a mid-back nucleus.
- For
an early statement, see Kenyon and Knott's section, Variations, §92,
p. xxxviii, in the Pronouncing Dictionary which has served as the data
for much of the generative phonological treatment of English. For
recent discussion, cf. Halle & Mohanan 1985, Section 2.
- In many Southern U.S. dialects in various contexts,
/y/ has monophthongized, so the high-front glide is absent. In
Reference American, this glide is present.
- These may be pronounced as [pUt, pt, pæt,
pt, pt, pUt].
- cf. the split within the word horror ([hor]).
- This
merger did not occur in certain Celtic dialects, where /r/ was not a continuant and thus remained outside the
Nucleus-Glide positions.
- The exception is the
Los Angeles Chicano community, where Santa Ana found that /r/ behaves
like a consonant in its effects on /-t,-d/ deletion, an unusual and
striking fact that may be attributable to the influence of Spanish
contact with that English dialect.
- In Reference American of
these contain [].
- The
argument here must be restricted to English alone, since there are
languages in which the glide is the only segment which occupies the
- [fis], [fo
- Phonetically, [siy], [seI],
[s], [suw], [soU],
[soI], respectively.
- Bailey (1985) suggests that
English is one of these.
- Thanks
to David Graff and to William Labov for pointing this out to me.
- Part of
the answer may be that languages cannot distinguish [15#15round] back
- However, Krohn (1969) and Wang have
argued that diphthongs are [+high, +low].
- Grammars may incidentally overgenerate,
without being flawed. For example Russian requires the [±voice]
dimension to distinguish /p/ from /b/, and other features for /X/, and
/c/. But //, // are not distinctive, though
[], [] do occur (Halle 1957, 1959). The feature system can
hardly be at fault for this kind of incidental gap in the system.
- The speech pathologist, better known as the father of
Alexander Graham Bell, the inventor of the telephone.
- For a discussion of the history of this model
see Wood (1987).
- Stanley (1967)
objected that using an unspecified value for a feature amounts to
having three values for it. If [height] is taken as a feature, with
values [high height], [0 height] and [low height], then his objection
would apply. However, height is a tier, not a feature, and it
contains features, not values. Only if tiers are considered to be
identical to features is Stanley's objection relevant.
- Two other lexical sets which
go with these are CLOTH, which in the U.S. goes with THOUGHT, and BATH,
which in the U.S. goes with TRAP.
- We may locate Wells' lexical sets in
the slots of this structure, thus:
[Table: headings V, V:, Vr, Vy, Vw; cell contents, apart from ``(marry)'', not recoverable.]
Not displayed are BATH, CLOTH, FORCE, NURSE, and the unstressed sets
HAPPY, COMMA, LETTER. BATH goes together with TRAP; CLOTH with
with FLEECE (though in the South, HAPPY typically goes with KIT).
NURSE is analyzed separately from this system, as argued below.
- Southern dialects may have [æ] in
SQUARE, but this may be analysed as lowered /er/ rather than as /ær/,
as discussed in the Alabama chapter. If /ær/ were correct for
these dialects, then so much the better for this analysis, since /ær#/
would then fill this gap in the system.
- For the sake of consistency in features, I represent
the front-back dimension with a privative feature, either [front] or
[back]. The choice between these two is discussed below where the use
of the non-traditional feature, [front], instead of the usual feature,
[back], is justified.
- Another perspective on this issue is the following.
Notice that the glide position is used to encode four things: the
three glides /y, w, r/, and length. This requires a front/back
distinction, and a high/non-high distinction, but it appears that
there is no need for a three-height distinction among glides. The
multiple distinctions of height among long vowels may presumably be
dealt with at a lower, phonetic level, where the height of the glide
is phonetically assimilated to that of the nucleus. If the glide
position need not distinguish three underlying degrees of height, then
the [low] feature in glide position may be re-interpreted simply as
non-high. This would make better sense of the three instances where
``non-high'' was used in the preceding two paragraphs.
- r-less dialects retain /r/ in syllable
onsets, which may be specified as [rhotic] underlyingly. Where r-less
dialects have postvocalic ``intrusive'' /r/, as in ``A vodka or
two''[vdkthuw], the
intrusive /r/ must be analysed as non-underlying, since it cannot be
- In most English dialects ``glide'' is also a
phonetically appropriate term for long vowels, since they do generally
have phonetic glides.
- In phonological and phonetic
performance, errors are almost non-existent (Labov 1966), and
dysfluencies or ``false starts'' themselves are well-formed according
to a simple set of rules (Hindle 1983). Limitations of memory pose no
difficulty for native speakers in the production or perception of
sequences of sounds. Thus the flaws of performance data adduced for
syntactic data do not apply with any force to phonological and
phonetic performance. On the contrary, in phonetics and post-lexical,
surface phonology, a similar concern about errors in the data leads to
the opposite conclusions: speech errors are more frequent when speech
is self-conscious.
- Here
[y,w] are as Chomsky & Halle (1968) define them: higher than [i, u].
- Central /⋀/ is not only quite non-peripheral,
it also doesn't share the roundness feature with its counterpart /o:/,
as in Table
- This may be due to the fact that
sounds are generally louder when voiced.
- cf. the
Project on Linguistic Change and Variation, Labov (1980).
- What is ``easy'' varies across
languages. Just as what is physically easy for some is difficult for
others, because of the practice to which their muscles -- or
neuromuscular control systems -- are accustomed, different languages
and dialects may set the default energy-expenditure level of the
various speech articulators at different levels. In this way, the
physical sluggishness of the articulators may effectively vary across
languages. Cf. Sievers (1901), who offered the explanation of
phonological symmetry that there is a different rest position of the
organs in speakers of different languages. We may add to Sievers that
certain muscle movement patterns are highly practiced in one language
and not in another, and are therefore easier, both because of the
resulting strength and endurance of the muscles involved, and the more
redundant, robust, and fully-trained neural control systems behind the
movements involved.
- In an all-pole model, the representation of
a spectrum in terms of center frequency and bandwidth of spectral
peaks, or poles, is interconvertible with several other
representations. Thus formants may be considered as good as any other
analytical representation of the speech spectrum, under the
assumptions of this model.
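One such conversion, between a formant's center frequency and bandwidth and the location of the corresponding pole in the z-plane, can be sketched as follows. The formulas are the standard ones for a sampled all-pole filter; the function names and the sampling rate in the example are my own choices.

```python
import cmath
import math


def pole_to_formant(pole, sample_rate):
    """Center frequency and bandwidth (Hz) of a complex pole."""
    freq = abs(cmath.phase(pole)) * sample_rate / (2.0 * math.pi)
    bandwidth = -math.log(abs(pole)) * sample_rate / math.pi
    return freq, bandwidth


def formant_to_pole(freq, bandwidth, sample_rate):
    """Upper-half-plane pole for a given center frequency and bandwidth."""
    radius = math.exp(-math.pi * bandwidth / sample_rate)
    angle = 2.0 * math.pi * freq / sample_rate
    return cmath.rect(radius, angle)


# Round trip for an illustrative 500 Hz formant, 60 Hz wide, at 10 kHz.
pole = formant_to_pole(500.0, 60.0, 10000.0)
freq, bandwidth = pole_to_formant(pole, 10000.0)
```

Each pole has a complex-conjugate partner, and multiplying out the pole pairs recovers the LPC polynomial, which is the sense in which the representations carry the same information.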
- The tape recorder was a high-quality (Nakamichi)
cassette recorder; the tape used was Maxell XL-II cassette tape; the
microphone was a broadcast quality Shure 570S lavalier mike; the setting
was an isolated room in a quiet house, with no machine noise.
- Since a moving-average amplitude contour may be
spiky rather than smooth, some method is necessary for separating
``true'' crossings of the threshold signifying the beginning of the
acoustic vowel, from false crossings due to short-duration amplitude
spikes due to single glottal pulses, transient bursts, etc.
``Sloppy-crossing'' is an algorithm developed for this purpose which
requires the threshold to be crossed for more than a given fraction
of a given amount of time before a ``true'' threshold
crossing is considered to have occurred. If these parameters are set
to 80% and 0.1 seconds, for example, then if the amplitude contour
stays over the threshold for 80% of any 0.1-second segment, a
threshold-crossing is recorded at the beginning of that segment.
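A minimal sketch of sloppy-crossing, as I read the description above, follows. It is not the program actually used: the function and parameter names are illustrative, the contour is assumed to be sampled at a fixed rate, and a window is accepted when at least the given fraction of its samples lie above the threshold (matching the 80% example).

```python
def sloppy_crossing(contour, threshold, rate, fraction=0.8, duration=0.1):
    """Start time (s) of the first window that counts as a true crossing.

    contour: amplitude values sampled `rate` times per second.
    A window of `duration` seconds qualifies when at least `fraction`
    of its samples exceed `threshold`. Returns None if none qualifies.
    """
    size = max(1, int(round(duration * rate)))
    if len(contour) < size:
        return None
    above = [1 if v > threshold else 0 for v in contour]
    count = sum(above[:size])        # samples above threshold in window 0
    needed = fraction * size
    for start in range(len(contour) - size + 1):
        if count >= needed:
            return start / rate
        if start + size < len(above):
            # slide the window one sample to the right
            count += above[start + size] - above[start]
    return None


# A one-sample spike at 0.20 s is ignored; the sustained rise is found.
contour = [0] * 20 + [1] + [0] * 20 + [1] * 30
onset = sloppy_crossing(contour, 0.5, 100)
```

Because a window may begin with a few sub-threshold samples and still qualify, the recorded time can slightly precede the first sustained sample, which is the tolerance that the name of the algorithm suggests.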
- cf. Labov, Yaeger, and Steiner
1972:29,32, and Vol. II, Figures, p. 9.
- The computer software
used in this project for digitizing, for formant-tracking, for display
of waveforms, spectrograms, and formant-tracks, and for segmentation
is part of the waves+ package developed by David Talkin at
AT&T Bell Laboratories, and available commercially from Entropic
Speech, Inc., Washington, D.C.
- This post-processing technique is
described in Talkin (1987). Cf. also Secrest and Doddington (1983), and
Dupree (1984).
- Documented in Becker et al. (1988).
- Interestingly this is not a positive cue, but
rather the non-application of a regular process, which is in a sense a
negative cue. Another cue to the presence of this ``editing signal''
is phonotactically impermissible syllable endings, e.g., str-.
There are also explicit editing-signal morphemes, such as uh,
um, etc. I speculate
that another editing-signal cue is an apparent stress on closed class
items which shouldn't be stressed, where the appearance of stress
comes from a rapid change in the pitch contour at the boundary of the
- William T. Reynolds.
- Doddington estimates usual human error rate in
making categorical classifications of clear cases at around 4%.
However when highly motivated, the level improves to under 0.5%. In
the current case, the motivation of the coder (me) is undoubtedly
greater than that of a poorly paid, unmotivated experimental subject,
but not as great as the financially highly motivated subjects in
Doddington's study.
- Equally
well-known is a resampling technique known as the jackknife, so-called
because it's good for many purposes. The bootstrap is simpler and
even more useful (though it will not quite allow you to pull yourself
up by your bootstraps).
- A function in the S language
(described in Becker et al. 1988) to bootstrap resampled statistics
from a given dataset is given here and described below:
    bootstrap <- function(data, stat, nResamplings) {
        N <- length(data)
        result <- vector(mode="numeric", length=nResamplings)
        for (i in 1:nResamplings) {
            # draw N random indexes between 1 and N, with replacement
            result[i] <- stat(data[round(runif(N, min=0.5, max=N+0.5))])
        }
        result
    }
This function works thus: Inputs are the sample itself, data;
the function, stat, which generates the statistic (e.g., mean,
sum, variance, etc.); and the desired number of times to resample the
data, nResamplings. The length of the data vector is the
number of observations in the sample, N. The output of the
function, result, is a vector of numbers as long as the number
of resamplings desired (this should be on the order of 200 to 1000).
The substance of the function is a single line inside a loop, which
repeatedly does the resampling and the calculation of the statistic.
The key line works from the inside out, thus:
runif(n,min,max) generates a list of n random floating-point
numbers from a uniform probability distribution between min and max,
thus generating random floating-point numbers between 0.5 and N+0.5.
round(vector) rounds off each element of a list (or vector) of
numbers to integers; in this context it rounds off all the random
numbers to integers between 1 and N, inclusive, since S vectors are
indexed from 1. data[vector]
uses the vector of integers as a list of indexes into the sample data,
and extracts the indexed elements. Thus the random numbers are
interpreted as indexes into the data array. Note that multiple
references to the same index can occur freely, so that if data
was the vector (6, 7, 8, 9), then the expression, data[c(1,2,1,3,3,1)], would return the vector, (6, 7, 6, 8, 8,
6). stat(vector) interprets vector as a data sample, and
computes the given statistic using the numbers in that sample. In
this context, then, stat() takes the indexed data-points picked by the
random-number generator, and calculates the statistic using them.
Finally, result[i] <- stat(..) sets the i'th element in the
result[] vector to be the value that is returned from the
calculation, stat(..).
This code is not especially fast, and it is elaborated when applied to
multi-dimensional data, but it is effective both for doing the task
itself in the one-dimensional case and for showing how to do (and how
easy it is to do) bootstrap resampling.
- As was Jim
from Chicago, for example.
- Unstressed /ɝ/ becomes
/a/ in Jamaican, thus I might call him ``Roaster'', a nickname that
comes from his avocation, ``roasting'', or moonlighting with his
employer's equipment.
- This information is from the Statistical Yearbook of
Jamaica, 1986.
- LePage (1960),
Cassidy & LePage (1967), Wells (1973), Wells (1982).
- As
discussed in Phonological Preliminaries, Wells'
comparative categories are not the same as the historical word classes
whose mergers, splits, and phonetic changes resulted in the modern
form of the language. These categories are not primarily intended to
show the changes by which Middle English developed into Early Modern
English and into Modern Jamaican, but simply to show the lexical
correspondences that now hold between this dialect and
35#35 The uncertain, possibly unmerged status of the low back
phonemes /, :/, which extensionally correspond to the
lexical sets LOT, THOUGHT, NORTH, is discussed below and in Appendix
- Although avoid is /avaid/ rather than /avwaid/ in Roasta's speech, so merger
has indeed occurred in some cases.
- The offglide
could be [] or [], [o].
Distinguishing these possibilities is a difficult matter, since
[i, uo], and [i, u]
are all ingliding vowels. It may have been equally appropriate if
Cassidy had written these vowels as /i, u/.
- In Phonological Preliminaries, phonological height is
analysed as an autosegmental tier, which may contain a single
privative height feature or none, where [high] and [low] are the two
features available.
- If we may reason from synchronic rules to diachronic
rules, it would appear that the Great Vowel Shift did not proceed
symmetrically in the front and the back. It is indeed a fact that the
two vowels that underwent diphthongization and lowering from earlier
/iy/i:, u:/uw/ to modern /ay, aw/ in many dialects of English did not both fall to low in Jamaican Creole.
- The set of rules stated here could
possibly be simplified if the natural rules of nucleus-glide
differentiation (in height features) and nucleus-glide assimilation
(in backness features) were attributed to general principles, and if
they did not need to be stated explicitly in a grammar of Jamaican
- The lexical sets corresponding to the
slots in Table
are as follows:
[Table: headings V, V:, V+{i,u,r}; cell contents not recoverable.]
- The
numbers of measurements broken down by class and speaker are given in
Appendix 3, along with mean vowel durations.
- A display of single means would show the same
structure, but would not show how significant the relationships
between the means are.
- Liljencrantz and Lindblom's theory of maximal
dispersion may be criticized in two ways. Labov (1982) has pointed
out that their results are based on phonemic data, not the kind of
phonetic data which would be necessary to support their claims. If
i, a, and u are used in the phonological transcription of most of the
languages of the world, it cannot be inferred that most of the
languages of the world have vowel phonemes with the phonetic qualities
of IPA [i, a, u]. A language might well have, for example,
[i, , u] as the main allophones for
/i, a, u/. Any three-vowel system is likely to be transcribed with
these symbols, no matter what the phonetic targets are. Bessell's
(1991 and forthcoming) studies of the phonetics of vowels in Interior
Salish languages make this quite evident. The fundamental distinction
between phonetics and phonology is ignored when this inference is made
from phonology to phonetics.
L&L's functional theory suggests that sounds should be maximally far
apart, but a principle of minimum effort is also necessary, because
sounds are not always maximally separated. For example, languages
with just two tones often use the minimum degree of phonetic
difference to carry the contrast. Thus there is a balance between
maximal distinction and minimal effort. But this amounts to saying
simply that sounds are more or less distinct, which is vacuous.
Something must be salvaged from L&L's theory in order to use it in
deriving (2) and (3).
- That is, the Reference
American vowels /ay, aw, oy/, or Jamaican /ai, ou, ai/, corresponding
to the lexical sets PRICE, MOUTH, and CHOICE.
- See Harris (1969, 1983).
- The vernacular of
most African American speakers is substantially different, in Northern
U.S. cities, from the vernacular of white speakers in those cities.
African American English in the northern cities has its origins in
Southern U.S. speech and in the Caribbean creoles. It has been
imported via migration to the North, and is relatively homogeneous
throughout the U.S. Thus pen/pin are likely to be
homophonous, /y/ is variably a monophthong before voiced consonants
and in free position, and final consonants are frequently deleted (so
that told, toll, and toe may be homophonous). The de
facto segregation of the black and white communities has resulted in
continuing linguistic divergence. See Labov and Harris (1986) for a
discussion of this situation in Philadelphia. Thus it is necessary to
analyse the white and black dialects separately.
- Nagra produced the highest quality portable
tape-recorders available before digital audio tape.
- The interview was conducted
by Sharon Ash as part of the fieldwork done for the project on
Cross-Dialectal Comprehension with William Labov. I would like to
express again my thanks to her and to the members of the CDC project.
- Speakers from other
Northern Cities, such as Detroit and Buffalo, are equally advanced,
sometimes more advanced in some of the changes.
- Casual speech is the least-monitored
style of speech, in which speakers pay less attention to how they are
speaking than to the content of what they are saying. Casual speech
is formally defined as speech that is directed to people other than
the interviewer, together with narratives, in which
the telling of important events takes precedence over the constrained
self-consciousness of the interview situation. Careful speech is all
other extemporaneous speech.
- That is, phonetically
advanced along the paths of the sound changes in progress in her
- Including 5 from
Detroit, 4 from Buffalo, 1 from Rochester, and 4 from Chicago
- FLEECE-type words have lexically stressed /iy/,
while HAPPY-type words have a lexically unstressed vowel.
- The lexical sets corresponding to these
phonological classes are:
[Table: headings V, V:, Vr, marry, Vy, Vw, Unstressed; cell contents not recoverable.]
- See LYS for the first discussion of this shift
that I am aware of.
- Some
ingliding does occur with this vowel (4 of the first 30 stressed /æ/
tokens), but none of the first ten sounded ingliding to my ear,
perhaps because these were mostly non-final, thus not especially
lengthened tokens.
- In some
productions // moves downward towards [æ].
- Bloomfield
(1934), when he phonemicized the TRAP, BATH classes (here, /æ/) as
/æ/ , and the STRUT class (here /⋀/) as /o/, would seem to have
been prescient (unless he was influenced by the Northern Cities Chain
Shift already in progress, which is unlikely), since /æ/ rises
towards [e~ ] and /⋀/ backs towards /ɔ/.
- But cf. recent revisions in Labov, forthcoming.
- Fronting is relative to
the values in Reference American, where /u:, / are realized as
[w, ].
- The vowel /ow/
is sometimes monophthongal, as is more typical of Minnesota or
Wisconsin, but more often has an upglide.
- In an important study of the
effects of situational style-shifting on a speaker's vowel system,
Hindle (1980) measured even more tokens (33#33 10,000) in the speech of a
single Philadelphia speaker, recorded during work, at dinner with her
family, and during a bridge game with friends. The situation-related
vowel shifts are closely related to the historical changes ongoing in
the Philadelphian sound system, in that sounds in more vernacular,
informal-style speech are more historically advanced.
- Failure to
decode the sounds of speech does occur; language does not always
function successfully, as Labov's project on Cross-Dialectal
Comprehension shows in detail.
- However, according to the discussion on
, /æ/-raising is the second step, which follows
the loss of the Reference American /a:/ phoneme, broad A, by merger
with /ɑ/. It was shown above that /a:/ is not a distinct class
in Chicago.
- Chapter 3 argues that postvocalic
/r/ is a glide, and forms diphthongs with preceding nuclei.
- ``Clitics'' are defined in the Methods chapter.
- Feagin (1991) summarizes
the general social picture regarding the disappearance of Southern
r-lessness and gliding:
Anecdotal evidence supports this hypothesis. A young upper-class boy
from Anniston -- the son of a banker -- who was attending Amherst
College in Massachusetts told me recently that people often are
surprised that he is from the South. That is probably because of the
stereotype of Southern speech which is that Southern States English is
R-less and full of diphthongs on vowels which in Northern States
English have none. Actually, the stereotype is not altogether
incorrect, for a few older working class people, but it is out of
date. The majority of young Southerners -- of whatever social class
-- have R's now. What the stereotype is depicting for the
glides turns out to be working class speech, at least for
younger people -- or the speech of older women, of whatever class.
- Based on distributions of lexical items rather than
phonological features.
- To ease the usually difficult task of
understanding what sounds are being referred to, I present here the
lexical sets of Wells (1982) that correspond to the sound classes in
the given structure. (For further explanation, see the section on
English Lexical Sets in Phonological Preliminaries.)
[Table: headings V, Vy, Vw, Vr; cell contents not recoverable.]
- It has been claimed (for example, by C.-J.
Bailey 1985:205) that as dialects of English, Southern and Northern
(and by extension, Reference American) share an underlying vowel
system. The systems under discussion here are surface-phonological,
or post-lexical in the theory of Lexical Phonology, and they are not
the same across these dialects. Surface phonological structures are
the output of the lexical phonology, and their details of
implementation are assumed not to be subject to the direct influence
of morphological or grammatical features in specifying their phonetic
realization. The topic here is not lexical phonology, but
post-lexical phonology.
- In fact, 56% of non-native-born Tuscaloosans in
1850 were from the Carolinas or Virginia (Foley 1972:3).
- This is, or was, a Southern shibboleth:
monophthongs in the phrases ``nice white rice'', or ``bright light
tonight'' are stigmatized (Feagin, p.c., C.-J. Bailey, p.c.). Thus,
speakers that monophthongize /ay/ before voiceless obstruents are
marked as relatively lower-class speakers, though all speakers
monophthongize in some other environments (C.-J. Bailey 1980:171).
- Here as in SPE, () marks
an optional element, and {} marks a choice among elements. It may
be pointed out that this use of () notation is incompatible with C.-J.
Bailey's use of () within phonetic brackets, [..(X)..], which
signifies a particular kind of systematic ``optionality'', namely that
X is present in monitored styles, and absent in unmonitored styles.
- See also McDavid (1948)
for a very early sociolinguistic study of /r/ in another Southern
- Steven
Peloquin, an undergraduate linguistics student at the University of
Pennsylvania, has discovered a merger of chair and cheer,
bear and beer, hair and here, etc., in his
Rhode Island dialect. I have found two older Rhode Islanders that
do not share the merger, according to minimal pair tests. But
another one, a younger man, seems to share the merger, according to
minimal pair tests, and furthermore merges the corresponding back
vowels, the sets CURE and FORCE (also NORTH), so that bore and
boor, tore and tour, etc., sound the same.
- cf. Foley (1972:47, #13) and Kenyon, American
Pronunciation, 10th ed., pp. 366-372, among many references. For this
reason the Northern dialects which neutralize this distinction cannot
be used as a starting point for the description of Southern dialects:
it would be impossible to derive the membership of the FORCE and NORTH
classes in a dialect that distinguishes them from a dialect which
merges them. Wells' lexical sets are crucial in this context because
they can be used to define most of the relevant sound classes in all
English dialects, including this contrast in Alabama speech.
- The
merger of the Mary and merry classes occurs in many other
dialects, including Los Angeles Chicano English, Chicago White
English, and Jamaican Creole.
- C.-J. Bailey (1985:162) is
the source of the Ayre/heir/air triplet. Another
given there is they're:their:there.
- Here, $ represents a
syllable boundary, and /y/ represents not phonetic [y] but an abstract
feature that is sometimes realized by up-gliding and sometimes by
length. Surface phonetic [ay] sequences do occur, from phonological
/a$y/ sequences. But while onset /y/ may surface as [y], coda /y/
does not. This is illustrated by the (r-less) Southern minimal pair,
Maya [may], vs. Myer [ma:], pointed out to me
by C.-J. Bailey (see, for example, 1985:118).
- It should also be noted that /aw/ before /r/
remains disyllabic (as in ``flowers'', Foley 1972:36), as does
/y+r/, just as in Reference American.
- This is surprising, since
monophthongal /ay/ may be categorical in some environments.
- In r-less
Tidewater Southern speech (as in Boston), the /r/ class is pronounced
with a low-fronted vowel [a:].
- According to Foley
(1972:50), [] is more restricted than [],
occurring in /r, or/ words for blacks, but only in /or/ words for
- C.-J. Bailey points
out to me that Southern boy as a call to a dog, or a threat to a
youth, can undergo the identical shift as French moi: [bwa:],
and similarly that goin' may be rendered as gwine.
- The resident Mexicans, or
``Californios'', who remained in California after the Mexican war in
which California was taken away from Mexico, are a tiny population,
mostly assimilated to the Anglo community. The entire recorded
population of Los Angeles in 1820 was 615, and of California at the
time of the 1848 war, about 15,000. The populations discussed here,
both Chicanos and Anglos, are entirely composed of later immigrants.
- As in FLEECE, KIT,
GOOSE, FOOT, respectively.
- As in
- The issue here is whether or not
there was exposure to natively-spoken English in the household during
childhood. The first person in the family with native competence is
the first-generation speaker, in this view. All those in the family
with a native-English speaking parent are exposed during childhood to
natively spoken English, whether or not there is also non-native
English in the household, spoken by family members who are immigrants.
Thus if even a single grandparent is a native English speaker, the
grandchild of such a speaker is third-generation.
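The generation count described in this note can be sketched as a small recursive function. This is only an illustration under stated assumptions: the names `Person`, `native_english`, and `generation` are hypothetical, a non-native speaker is assigned generation 0 purely as a convention, and anyone with a native-English-speaking parent is assumed to be a native speaker as well.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical record; not a structure used in the original study.
    name: str
    native_english: bool
    parents: tuple = ()


def generation(person):
    """Return 0 for a non-native speaker (a convention assumed here);
    otherwise 1 plus the highest generation found among the parents."""
    if not person.native_english:
        return 0
    return 1 + max((generation(p) for p in person.parents), default=0)


# A single native-speaking line of descent is enough to raise the count,
# even when the other parent in each household is an immigrant.
immigrant = Person("immigrant", native_english=False)
grandparent = Person("grandparent", True, (immigrant,))
parent = Person("parent", True, (grandparent, immigrant))
grandchild = Person("grandchild", True, (parent, immigrant))
print(generation(grandchild))  # 3
```

Taking the maximum over parents reflects the statement that a single native-speaking grandparent suffices to make the grandchild third-generation.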
- In 1930, the population of East
L.A. was 70% of its 1970 population, whereas that of West L.A. was 3%.
- We have collaborated on the acoustic
analysis of this and other speakers, working together to some extent
since our interests are complementary. His work focuses on social
variation within the Los Angeles Chicano English speech community,
considering consonant deletion, and variation in the phonetic quality
of stressed vowels, while my work focuses on the internal sources of
variation in vowel quality, according to consonant environment and
- Using the term ``merger'' is
misleading because actual phonological mergers have not necessarily
taken place in each case in this particular speech community.
Nevertheless, this remains a convenient term when referring to lexical
sets that are not distinguished.
- Thus, my parents, both born and raised on the West
Coast, maintain the distinction, while I and my sisters do not.
Previous studies of this merger are discussed below.
- cf. Bailey, 1985:187.
- I view ambiguity as a greater sin
than lack of correspondence between graphemic and phonological
representation. The typography used, with /i:, I, u:, U/
as the high front and back vowels, is intended to avoid the ambiguity
of /i/ and /u/ which arises when these four vowels are alternately
symbolized with the sets /i, I, u, U/ or /i:, i, u:, u/.
Similarly, ``e'', ``o'' are avoided due to this ambiguity.
- The lexical sets (of Wells, 1982) that correspond to these
phonological classes are given below in order to clarify the
correspondences of these classes with those of other dialects,
including the reader's. See page
ff for
discussion of what lexical sets signify and how they are useful.
[Table: vowel classes arranged in columns V, V:, Vr, Vy, Vw; of the cell contents, only the lexical set LOT is recoverable.]
BATH is included with TRAP; FORCE is included with NORTH; PALM, CLOTH,
and THOUGHT are included with LOT. Unstressed vowel classes HAPPY, COMMA,
LETTER, and the /ɝ/ vowel, NURSE, are not shown here.
- This is
formalized on page
for Reference American: Stressed
rhymes branch.
- These vowels are
sometimes written more phonetically, as they occur in other dialects:
/iy, ey, ow, uw/. Both transcriptions can be accurate at the same
time, depicting different stages of the derivation of the phonetic
form. There is no problem of ambiguity among the classes referred to,
and the particular transcription chosen is an arbitrary decision.
However, in this dialect, these vowels are frequently monophthongal
-- probably an ethnic marker that the LA Anglos do not share.
- Note that $ signifies a syllable boundary.
- For example, a
near minimal pair would have been marry vs. sorry. The
contrast between them here is not simply front vs. back, but also mid
vs. low.
- This may have first been noticed by Labov
for Philadelphian white vernacular, and has been shown to occur much
more widely, especially throughout the American South, by Thomas,
Bailey, & Benson, 1990.
- See discussion of this point on
- For
justification of the phonetic importance of F1 and F2, see the
Acoustics chapter (e.g., page
). For the definition
of acoustic vowel, see page
. For the measurement
procedure, see the Methods chapter (page
- Of the
remaining back-round vowels, /:/ has merged with //, and
/ur, oy/ are too infrequent in this data to display in this form (n=0,
n=5, respectively).
- This supports the
point made with respect to Reference American (page
) that
// is not back but central.
- A pre-medical student at UC
San Diego, she is an upwardly mobile middle-class speaker. She
participates in, though probably is not among the leaders of, ongoing
Anglo sound changes.
- Acoustic vowel is defined on
- For the precise definition of this
distinction see page
- It should also be noted that the word you,
which occurs quite frequently, was phonologized as /y/. It never
sounds back or rounded, even when stressed. The single token of
stressed // that occurs in the clitics-included chart is from
the word, ``you''.
- Two-tailed t-test, 5% level
of significance, as described on page
- For example, // is
underlyingly a Glide slot specified with the feature [rhotic]. This
feature is then linked, after nucleus-insertion, to the nucleus timing
slot. See Phonological Preliminaries, page
- Herold, 1990:8,
quoting Trudgill, 1986:147.
- cf. DeCamp, 1959; Johnson,
1974; limitations of these studies are pointed out in Herold
(1990:8-11) and are overcome in systematic instrumental work by
Moonwomon, 1991.
- See,
for example, Labov, Yaeger, & Steiner (1972:94).
- See Labov,
- The wing movements of hummingbirds and
bees, for example, though more rapid, appear to be less complex (as
well as smaller in physical scale), since they follow a single,
repetitive pattern of motion, while the human tongue moves in leaps
and contortions that, while not entirely unpredictable, are not yet
well understood.
- Axiomatized: For
every time t1, there is a time t2 after t1, and a time t3 before t1.
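In symbols, the axiom can be sketched as follows. The relation symbol $\prec$ for "is before" is a notational assumption; the note itself states the axiom only in words.

```latex
\[
  \forall t_1 \;\, \exists t_2 \;\, \exists t_3 \; : \quad t_3 \prec t_1 \prec t_2
\]
```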
- Let
F(e) be a function from elements to sets of elements such that for any
pair of elements, e1 and e2, if e1 is in F(e2), then F(e1) is a proper
subset of F(e2). If such a function exists, the set is called a
linear sequence. Note that F(e1) does not contain e1, since if e1
were in F(e1), then F(e1) would be a proper subset of itself, which
cannot be true. We may define P(e) as a function from elements to
sets of elements: P(e) = the complement of ({e} ∪ F(e)). In words, P(e) is ``everything
else except e and F(e)''. The union of P(e), e, and F(e) thus
exhausts all the elements in the universe under discussion; further,
they do not intersect. We may say that elements in P(e) are 'before'
e, and elements in F(e) are 'after' e, though we could as well say
that P(e) is after and F(e) is before e, and linearity would still be
- In discrete time
staticity may be defined as the property that an event may occur at
more than one time, such that the times at which it occurs form a
``contiguous sequence''. A set of times T forms a contiguous sequence
iff there is no proper subset T1 of T for which T1 and T ∩ T1'
(where T1' is the complement of T1) have no ``adjacent'' members. Times
t1 and t2 are adjacent iff there is no time t3 such that either
t1 < t3 < t2 or t2 < t3 < t1, where ``<'' may be defined trivially
using the functions F() and P(): t1 < t2 iff t1 ∈ P(t2).
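The definitions of F(), P(), adjacency, and contiguity can be checked by brute force on a small finite universe of times. The sketch below is illustrative only: it assumes integer-valued times, reads "proper subset" as "non-empty proper subset", and uses hypothetical function names (`make_linear_order`, `precedes`, `adjacent`, `is_contiguous`) that do not come from the text.

```python
from itertools import combinations


def make_linear_order(universe):
    """Build F(e) (everything after e) and P(e) (everything except e and F(e))."""
    everything = frozenset(universe)
    F = {e: frozenset(x for x in everything if x > e) for e in everything}
    P = {e: everything - {e} - F[e] for e in everything}
    return F, P


def precedes(t1, t2, P):
    """t1 < t2 iff t1 is a member of P(t2)."""
    return t1 in P[t2]


def adjacent(t1, t2, universe, P):
    """Distinct times are adjacent iff no time t3 lies strictly between them."""
    return not any(
        (precedes(t1, t3, P) and precedes(t3, t2, P))
        or (precedes(t2, t3, P) and precedes(t3, t1, P))
        for t3 in universe
    )


def is_contiguous(T, universe, P):
    """T is contiguous iff every non-empty proper subset T1 of T has a member
    adjacent to some member of the rest of T (exhaustive, small sets only)."""
    T = set(T)
    for size in range(1, len(T)):
        for subset in combinations(sorted(T), size):
            T1 = set(subset)
            rest = T - T1
            if not any(adjacent(a, b, universe, P) for a in T1 for b in rest):
                return False
    return True


universe = range(6)
F, P = make_linear_order(universe)
print(is_contiguous({1, 2, 3}, universe, P))  # True
print(is_contiguous({1, 3}, universe, P))     # False: time 2 intervenes
```

The exhaustive subset check follows the wording of the definition directly; it is exponential in the size of T and is meant only to make the definition concrete.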
- As implied above, these terms are
synonymous within this discussion.
- Thanks to Richard Janda for this observation.
- This paper further opens up questions of the
cognitive representations of actual time which are not often
- At the same time, other aspects of
the GVS did not go as far in Jamaican Creole as elsewhere: the
lowering of Middle English /u:/ (as in MOUTH, discussed above) did not
go ``to completion'' as it did in other dialects.
- Herold (1991) provides a detailed examination
of the conditions of merger.
- For example, after
several years as a graduate student in a relic area where a merger
that occurs in my native dialect had not occurred (/
/), spending my time explicitly studying speech sounds, I may
have acquired the distinction, at least in some words where spelling
gives the correct clue. Thus I was recently unable to understand a
low-back-merger speaker claiming someone was a [frd], because
I expected [] in the word fraud and somehow couldn't
understand what a frodd could possibly be. So insofar as I can
be confused by the wrong phonetic form in a word, I may be
functioning rather like a relic-area speaker.
- Cf. Graff, Labov, and Harris,
1986, and other papers in that group, which showed that blacks
successfully imitating Philadelphia white speakers front /ow/ but not