The central statistical problem in these studies is to determine whether the distributions of two sound classes in F1-F2 space are the same or different. As argued earlier, different distributions in this acoustic space are evidence of some linguistic difference, whether phonetic or phonological. Characterizing a difference as significant or insignificant is therefore crucial.
If two categories of data are normally distributed, well-understood analytical methods are available to test the hypothesis that the two categories differ. Such methods exist even for multivariate data, and even for log-normal distributions. However, formant-frequency data are frequently not normally distributed. For example, the distribution of // in the Chicago chapter is quite non-normal, as are many of the charts of raw formant distributions.
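To make the normal-theory case concrete, the following is a minimal sketch of one such analytical method for two-dimensional data, the two-sample Hotelling T² test. The text above does not name a specific method, so the choice of Hotelling's test is an assumption, as are the function names and the formant values in the usage note, which are invented for illustration. The test assumes bivariate normality and a common covariance matrix, which is exactly the assumption that raw formant data often violate.

```python
def mean2(points):
    """Mean of a list of (F1, F2) pairs."""
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def scatter2(points, m):
    """Sums of squared and cross deviations about m: (sxx, sxy, syy)."""
    sxx = sum((x - m[0]) ** 2 for x, y in points)
    sxy = sum((x - m[0]) * (y - m[1]) for x, y in points)
    syy = sum((y - m[1]) ** 2 for x, y in points)
    return sxx, sxy, syy


def hotelling_t2_2d(a, b):
    """Two-sample Hotelling T^2 test for 2-D data (hypothetical helper).

    Returns (T^2, F statistic, p-value). Assumes both samples are
    bivariate normal with a common covariance matrix.
    """
    n1, n2 = len(a), len(b)
    ma, mb = mean2(a), mean2(b)
    sa, sb = scatter2(a, ma), scatter2(b, mb)
    dof = n1 + n2 - 2
    # Pooled covariance matrix entries.
    sxx, sxy, syy = ((u + v) / dof for u, v in zip(sa, sb))
    det = sxx * syy - sxy ** 2
    dx, dy = ma[0] - mb[0], ma[1] - mb[1]
    # Quadratic form d' S^-1 d, using the closed-form 2x2 inverse.
    quad = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
    t2 = n1 * n2 / (n1 + n2) * quad
    # With p = 2 dimensions, F = d2 / (2 * dof) * T^2 ~ F(2, d2).
    d2 = n1 + n2 - 3
    f = d2 / (2 * dof) * t2
    # For numerator df = 2 the F survival function has a closed form.
    p_value = (1 + 2 * f / d2) ** (-d2 / 2)
    return t2, f, p_value
```

For example, with two invented samples of (F1, F2) measurements in Hz, `hotelling_t2_2d([(300, 2200), (320, 2250), (310, 2150), (290, 2300), (330, 2180)], [(500, 1400), (520, 1450), (510, 1350), (490, 1500), (530, 1380)])` returns a very small p-value, while comparing a sample with itself returns a p-value of 1.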
One test addresses the question "Do two distributions have different means?" without depending on the normality assumption: the Wilcoxon test. Tests of this kind for multidimensional data, however, are less well known (though see Maekawa 1989 for a two-dimensional t-test, not used here). I use two methods to deal with this problem. The first is a technique for displaying differences visually in a way that is easily interpreted. The second numerically estimates the statistical significance of the difference between two sets of measurements.
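As an illustration of the one-dimensional case mentioned above, the following is a minimal sketch of the Wilcoxon rank-sum test applied to a single formant dimension. It is not either of the two methods used in these studies. The function names are hypothetical, and the p-value relies on the large-sample normal approximation with no tie or continuity correction, so it should be read as approximate for small samples.

```python
import math


def average_ranks(values):
    """Return 1-based ranks of values, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def rank_sum_test(a, b):
    """Wilcoxon rank-sum test (hypothetical helper).

    Returns (W, z, two-sided p), where W is the sum of the ranks of
    sample a in the pooled data. Uses the normal approximation
    without tie or continuity correction.
    """
    n1, n2 = len(a), len(b)
    ranks = average_ranks(list(a) + list(b))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2
    var_w = n1 * n2 * (n1 + n2 + 1) / 12
    z = (w - mean_w) / math.sqrt(var_w)
    # Two-sided p-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p_value
```

Called on the F1 values of two sound classes, `rank_sum_test(f1_class_a, f1_class_b)` depends only on the rank order of the measurements, which is why it needs no normality assumption. It handles only one dimension at a time, however, and so does not by itself answer the question about distributions in the full F1-F2 space.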