In the table of vowels each cell links to a list of minimal pairs involving the phonemes in the relevant column and row. The numbers in the north-eastern half of the table are the actual numbers of pairs identified. The numbers in the south-western half give an indication of the importance or difficulty of the pair, calculated as follows: from a maximum of 6, deduct 1 for a difference between vowel and diphthong, 1 for a difference of length within monophthongs, 1 for a difference of direction within diphthongs, 1 for a difference in lip-rounding, and then, for the distance apart of the starting tongue positions, deduct 1 for a distance of up to one cardinal vowel, 2 for up to two cardinal vowels, and 3 for any wider distance. Thus a score of 4 or 5 would show two very similar sounds, a contrast likely to be a cause of difficulty for some or all learners, while a score of 1 or 2 would be unlikely to cause problems.
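The scoring rule above can be sketched in a few lines of Python. This is only an illustration: the `Vowel` record, its field names and the feature values in the example are assumptions made for the sketch, and so is the choice to deduct nothing when the two starting positions coincide, which the description does not spell out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Vowel:
    # Hypothetical feature record; the field names are not from the source.
    symbol: str
    is_diphthong: bool
    is_long: bool = False            # only compared between two monophthongs
    direction: Optional[str] = None  # only compared between two diphthongs
    rounded: bool = False


def similarity_score(a: Vowel, b: Vowel, start_distance: float) -> int:
    """Score a vowel contrast; higher means more similar.

    start_distance is the gap between the starting tongue positions,
    measured in cardinal-vowel steps and supplied by the caller.
    """
    score = 6
    if a.is_diphthong != b.is_diphthong:
        score -= 1  # vowel versus diphthong
    if not a.is_diphthong and not b.is_diphthong and a.is_long != b.is_long:
        score -= 1  # length difference within monophthongs
    if a.is_diphthong and b.is_diphthong and a.direction != b.direction:
        score -= 1  # direction difference within diphthongs
    if a.rounded != b.rounded:
        score -= 1  # lip-rounding difference
    if start_distance > 2:
        score -= 3
    elif start_distance > 1:
        score -= 2
    elif start_distance > 0:  # assumption: zero distance costs nothing
        score -= 1
    return score


# Illustrative feature values only: a long and a short close front vowel,
# assumed to start one cardinal-vowel step apart.
sheep_vowel = Vowel("iː", is_diphthong=False, is_long=True)
ship_vowel = Vowel("ɪ", is_diphthong=False)
example_score = similarity_score(sheep_vowel, ship_vowel, start_distance=1)
```

With these assumed features the example loses one point for length and one for distance, which puts it in the "very similar" band.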
What are minimal pairs?
Minimal pairs are pairs of words whose pronunciation differs in only one segment, such as sheep and ship or lice and rice. They are often used in listening tests and pronunciation exercises. Theoretically it is the existence of minimal pairs which enables linguists to build up the phoneme inventory for a language or dialect, though the process is not without difficulty.
Each cell in the tables above is a link to a list of minimal pairs derived from a dictionary. Use the tables of vowels and consonants to retrieve the relevant lists. All the vowel and consonant lists have now been edited and commented on. Earlier versions of the lists included only one pair for each pronunciation, such as heal/hole. Newly revised versions have been added which include all the pairs which arise when one or both members of the pair have a homophone, so giving a better indication of how much confusion a given pair may cause. In the case of heal/hole, for instance, the new version of the list would include all of the following:
- heals holes
- healed holed
- healing holing
- heals wholes
- heels holes
- heeled holed
- heeling holing
- heels wholes
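The homophone expansion illustrated by this list amounts to pairing every spelling on one side of the contrast with every spelling on the other. A minimal sketch, assuming a hypothetical lookup table from pronunciation to spellings (the transcription strings are placeholders, not the dictionary's own notation):

```python
from itertools import product

# Hypothetical toy data: pronunciation string -> all spellings sharing it.
HOMOPHONES = {
    "hi:lz": ["heals", "heels"],
    "h@Ulz": ["holes", "wholes"],
}


def expand_pair(pron_a, pron_b, homophones):
    """Return every spelling pair realised by one pronunciation pair."""
    return list(product(homophones[pron_a], homophones[pron_b]))


expanded = expand_pair("hi:lz", "h@Ulz", HOMOPHONES)
for left, right in expanded:
    print(left, right)
```

Two spellings on each side give four written pairs for a single spoken contrast, which is why the revised lists are longer than the earlier ones.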
Please note that, as you move the mouse over a link, the name of the relevant document should appear at the bottom of the browser window and this gives a further indication of which sound contrast is featured in the list.
Source of the lists: Roger Mitton and the Advanced Learner's Dictionary
Hal Gleason (1955, p. 19), writing about minimal pairs before the era of widespread computing, said "Presumably by diligent search through the total vocabulary, minimal pairs might be found for all English consonant phonemes. But there is no guarantee that all will be found, and in any case it is hardly a feasible procedure."
I have not tried to search the total vocabulary, but I have tried to search a vocabulary which includes most of the words available in non-specialist contexts to everyday users of English. In putting together these lists I have used Roger Mitton's machine-readable version of the 1974 edition of the Advanced Learner's Dictionary, incorporating Mitton's 1990 additions to the word list (see Mitton 1996). The minimal pair lists have been prepared from the dictionary by means of a program which sorts the pronunciation field, identifies identical pairs (homophones), substitutes dummy characters for the symbols of the minimal pair, and then flags all the additional homophone pairs created by the process. This generates (fairly) complete lists of minimal pairs, though a certain amount of rather tedious post-editing is needed.
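The dummy-character idea can be sketched as follows. This is not the original program, only an illustration of the technique: the function name, the toy lexicon and the representation of a pronunciation as a tuple of phoneme symbols are all assumptions. Grouping by the masked form plays the role of sorting the pronunciation field, and the final check discards candidates that differ in more than one position.

```python
from collections import defaultdict
from itertools import combinations


def minimal_pairs(lexicon, sound_a, sound_b, dummy="#"):
    """Find minimal pairs for one contrast.

    lexicon: iterable of (spelling, pronunciation), where the pronunciation
    is a tuple of phoneme symbols (a hypothetical representation).
    """
    groups = defaultdict(list)
    for spelling, pron in lexicon:
        if sound_a in pron or sound_b in pron:
            # Replace both sounds of the contrast with the dummy symbol, so
            # words that differ only in those sounds become "homophones".
            masked = tuple(dummy if p in (sound_a, sound_b) else p for p in pron)
            groups[masked].append((spelling, pron))

    pairs = []
    for entries in groups.values():
        for (word1, pron1), (word2, pron2) in combinations(entries, 2):
            # Same masked form means same length; keep the pair only if
            # exactly one segment differs (true homophones differ in none).
            if sum(x != y for x, y in zip(pron1, pron2)) == 1:
                pairs.append((word1, word2))
    return pairs


# Toy lexicon for illustration only.
LEXICON = [
    ("sheep", ("ʃ", "iː", "p")),
    ("ship", ("ʃ", "ɪ", "p")),
    ("heal", ("h", "iː", "l")),
    ("heel", ("h", "iː", "l")),
    ("hill", ("h", "ɪ", "l")),
]
found_pairs = minimal_pairs(LEXICON, "iː", "ɪ")
```

Note that heal and heel fall into the same group but are not reported as a pair with each other, since they are ordinary homophones; each of them pairs separately with hill.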
Semantic loading and density
When this project (collecting and editing minimal pair lists for all the 510 theoretically possible contrasts) is complete, I hope to be in a position to measure the functional load of a pronunciation error, i.e. how much potential for confusion is created by a particular vowel or consonant error and therefore how important it is. Naturally this is not just a matter of counting the number of pairs, but also depends on other factors. One of these is the part of speech of the words and therefore their potential for appearing in the same contexts. Two nouns, such as beer and pier, are much more confusable than a noun and a preposition, such as frog and from. For this reason the edited lists draw a distinction between the number of pairs and the number of semantic contrasts realised by the pairs, and calculate a "semantic loading" figure. Thus if there were 100 pairs but they belonged to only 70 different pairs of headwords, the semantic loading would be 70%. For the longer lists the semantic loading tends to fall within the range 48% to 60%, but the very short lists involving rare sounds are often higher. Paradoxically, the lower the semantic loading, the more confusable pairs may exist for that contrast, since a smaller number shows there are many inflected forms in the list and signals a large number of words in the open classes: noun, verb or adjective. To some extent the figure is arbitrarily dependent on editorial decisions. I have, for instance, treated agent nouns as separate headwords from their verb roots, since there is often a large shift of meaning, as in wait/waiter.
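As a rough illustration of the semantic loading figure, the sketch below counts distinct headword pairs as a percentage of all listed pairs. The record layout and the headword assignments are hypothetical; in practice assigning headwords is an editorial decision of the kind just described.

```python
def semantic_loading(pairs):
    """Percentage of listed pairs that are distinct headword contrasts.

    pairs: list of (word_a, word_b, headword_a, headword_b) tuples,
    a hypothetical layout chosen for this sketch.
    """
    if not pairs:
        raise ValueError("semantic loading is undefined for an empty list")
    contrasts = {(head_a, head_b) for _, _, head_a, head_b in pairs}
    return 100.0 * len(contrasts) / len(pairs)


# Five listed pairs, but only two headword contrasts.
EXAMPLE = [
    ("heal", "hole", "heal", "hole"),
    ("heals", "holes", "heal", "hole"),
    ("healed", "holed", "heal", "hole"),
    ("healing", "holing", "heal", "hole"),
    ("beer", "pier", "beer", "pier"),
]
loading = semantic_loading(EXAMPLE)
```

Here the inflected forms of one verb pair pull the figure down, which is the effect described above: a low loading signals many inflected forms.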
It is also important to take into account the density of the minimal pair, namely how the actual total relates to the theoretically possible number if every word containing one of the sounds were matched by a word containing the other. This would show how the distribution of minimal pairs relates to the overall phoneme frequencies in the same dictionary. A 100% match could only occur if there were exactly the same number of words with each sound in the language, and that is clearly unlikely. But, if the number is unequal, the density depends on which sound you start with. There are 37,729 words in the dictionary containing the vowel /ɪ/ and only 784 containing the diphthong /ɔɪ/. There are 62 minimal pairs. For the diphthong this is a density of 7.9%, but for the monophthong the density is only 0.16%. For an average of diphthong plus monophthong it is 0.32% (calculated using the harmonic mean, of course). What I have decided to do is report the mean density, pointing out where, as in this case, there is a large discrepancy in frequency.
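The density figures quoted above can be reproduced with a short calculation. The function name is an assumption; the counts are the ones given in the text. The harmonic mean of the two one-sided densities simplifies algebraically to twice the number of pairs divided by the combined word count, which is the form used here.

```python
def pair_densities(n_pairs, count_a, count_b):
    """Return (density for sound A, density for sound B, harmonic mean),
    each as a fraction between 0 and 1."""
    density_a = n_pairs / count_a
    density_b = n_pairs / count_b
    # Harmonic mean of the two densities:
    # 2 / (1/density_a + 1/density_b) == 2 * n_pairs / (count_a + count_b)
    harmonic = 2 * n_pairs / (count_a + count_b)
    return density_a, density_b, harmonic


# Counts from the text: 62 pairs, 784 words with the diphthong,
# 37,729 words with the monophthong.
d_diphthong, d_monophthong, d_mean = pair_densities(62, 784, 37729)
print(f"{d_diphthong:.1%} {d_monophthong:.2%} {d_mean:.2%}")
```

The harmonic mean stays close to the smaller of the two densities, so a rare sound cannot inflate the reported figure.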
The O'Connor conjecture
It is also my ambition to examine the statistical data coming out of these lists and to see if it offers any evidence for or against what I call "the O'Connor conjecture" that language is self-repairing. I don't know if J.D. O'Connor was the first person to express this, but he presents a very simple and clear statement of it in his book Phonetics.
"A language can tolerate quite a lot of homophones provided they do not get in each other’s way, that is provided they are not likely to occur in the same contexts. This may be a grammatical matter: if the homophones are different parts of speech they are not likely to turn up in the same place in a sentence … If they are the same part of speech, e.g. site, sight; pear, pair they can be tolerated unless they occur in the same area of meaning and in association with a similar set of other words. Site may be ambiguous in It’s a nice site, though a wider context will usually make the choice plain. … If homophones do interfere with each other the language may react by getting rid of one or by modifying one."
What minimal pairs do is increase the potential number of homophones in a learner's speech or the potential for misunderstanding between speakers of different dialects. What we would expect, therefore, is for there to be more minimal pairs between sounds which differ greatly, such as peat/part or shake/wake, and fewer between sounds which are close enough to create problems for learners, such as cot/caught or pie/buy. So far the evidence I have collected does not support a strong form of the conjecture.
There are a number of problems waiting to be resolved:
- Can there be a minimal pair contrast between a vowel and a consonant? Theory would suggest not, since vowels and consonants have different functions in syllable structure. However, one can find pairs such as screen/serene which appear to contrast /k/ and /ə/, but then the syllable count and stress pattern seem to make such pairs differ by more than one sound. The dictionary is now being searched for such pairs, and the results are included in the "cons" column in the vowel tables and the "vowel" column in the consonant tables.
- Related to the previous question is the problem of syllabic versus non-syllabic consonants. Is the contrast between name and same of the same type as the contrast between button and butts? The computer program treats them as a minimal pair, though ordinary perception would deny it.
- The so-called dark -l is a particular problem in this respect. It often seems to be intermediate between syllabic and non-syllabic in function. Take the pair dial/file. These would appear to be both monosyllables and a minimal pair. But the pair dialling/filing no longer seems to be minimal. The -l in dialling remains a dark -l and makes the form into a three-syllable word, while the -l in filing becomes a clear -l so that the word has two syllables. This difference seems to be largely driven by the spelling.
- Can we admit minimal pairs where a sound is paired with a null? For example, could back and bank be a minimal pair? If so, the inventory of pairs would become much larger. I have begun to list these pairs, and they are shown in the "null" columns in the tables.
- In some cases the inflected forms of a base pair such as seep and scene appear to be non-minimal. Where the pronunciations /sip/ and /sin/ have only one difference, the -s ending turns them into /sips/ and /sinz/, apparently showing two differences. Some research carried out by Merwyn Torikian under my supervision in 1992 used sound analysis software to investigate this and found that the physical difference in the inflections is insignificant. In the case of pairs like docks/dogs or seeps/scenes, the whole syllable is affected by the voicing or devoicing, but the final /s/ or /z/ shows up on a spectrogram as almost identical. Therefore all such pairs (including past tense endings) have been added to the lists. This does lead to an anomaly in the case of the /t/ versus /n/ contrast, since a word like wits enters pairs with wins and wince. The interesting point here is that the wins/wince contrast is not so much between /z/ and /s/ as between a fully voiced /ɪn/ and a partially devoiced /ɪn/.
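The "null" pairs raised in the list above, such as back/bank, could be collected by deleting one segment at a time from each pronunciation and checking whether the shorter form is also a word. The sketch below is only an illustration under stated assumptions: the function name, the toy lexicon and the tuple-of-symbols representation are hypothetical and are not taken from the program used for the lists.

```python
from collections import defaultdict


def null_pairs(lexicon):
    """Find pairs where one word equals the other minus a single segment.

    lexicon: iterable of (spelling, pronunciation), where the pronunciation
    is a tuple of phoneme symbols (a hypothetical representation).
    Returns sorted (shorter_word, longer_word, extra_segment) triples.
    """
    by_pron = defaultdict(list)
    for spelling, pron in lexicon:
        by_pron[pron].append(spelling)

    found = set()  # a set avoids duplicates from doubled segments
    for pron, long_words in by_pron.items():
        for i, segment in enumerate(pron):
            shorter = pron[:i] + pron[i + 1:]
            # .get() does not add keys, so the mapping is safe to iterate.
            for short_word in by_pron.get(shorter, []):
                for long_word in long_words:
                    found.add((short_word, long_word, segment))
    return sorted(found)


# Toy lexicon for illustration only.
NULL_LEXICON = [
    ("back", ("b", "æ", "k")),
    ("bank", ("b", "æ", "ŋ", "k")),
    ("bat", ("b", "æ", "t")),
]
null_found = null_pairs(NULL_LEXICON)
```

Even this toy version suggests why admitting such pairs enlarges the inventory: every word is tested against as many shorter candidates as it has segments.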
Two related lists derived from the same dictionary source are also available.
References
Gleason, Hal (1955). An Introduction to Descriptive Linguistics. Holt Rinehart Winston.
Mitton, Roger (1996). English spelling and the computer. Longman.
O'Connor, J.D. (1973). Phonetics. Penguin Books.
Swan, Michael and Smith, Bernard (1987). Learner English; a teacher's guide to interference and other problems. Cambridge University Press.
Torikian, Merwyn (1992). “Watch your language; an account of Soundedit with reference to the validity of phonological rules.” System, 20, 4, pp. 471-480.
Minimal pairs for English RP: lists by John Higgins