You Think You Know English Phonetics and Pronunciation? Find Out!


The study of speech - Phonetics

What do we do when we speak? First, we use certain parts of our body to produce an airstream inwards and outwards. This airstream is created intentionally to produce certain sounds. These sounds are transmitted as sound waves, which are perceived by a hearer.

This hearer receives the sound wave, decodes it and then interprets it. “Phonetics and Phonology are concerned with speech – with the ways in which humans produce and hear speech.” (Clark & Yallop, 1990: 1) These two disciplines study the production and reception of speech sounds in all their complexity.

That is, they study the whole process and all the mechanisms involved in the production and reception of speech. On the one hand, Phonetics is concerned with the anatomy and physiology of speech, with the production organs (Articulatory Phonetics), with the sound wave (Acoustic Phonetics), and with the perception of that sound wave (Auditory or Perceptual Phonetics).

Phonetics and Phonology

On the other hand, Phonology is concerned with the “systems and patterns of sounds that occur in particular languages.” (Ibid.: 2). So, research on the way articulatory organs work to produce a given sound or a sound feature is normally considered a phonetic investigation, whereas research on the total number of vowel and consonant sounds in English is considered phonological in nature.

Thus, “phoneticians are likely to draw on methods and techniques used in the natural sciences (while) phonologists may profess to be more concerned with the mental organization of language.” (ibid.: 3)

However, both disciplines must be studied together. Phonetics and phonology only study two different aspects of the same reality. According to Peter Roach (1991), phonology is “the study of the abstract side of the sounds of language” whereas phonetics studies “the actual realizations”. Speech is a complex human phenomenon that involves mental and physical components and both phonetics and phonology aim at accounting for this complexity.

The Production of Speech Sounds

How can we produce speech? In this section, we will study the production of speech sounds from an articulatory point of view in order to understand better subsequent sections about vowel and consonant sounds.

It must be said that speech does not start in the lungs. It starts in the brain and is, then, studied by Psycholinguistics. After the creation of the message and the lexico-grammatical structure in our mind, we need a representation of the sound sequence and a number of commands which will be executed by our speech organs to produce the utterance.

So, we need a phonetic plan and a motor plan (Belinchón, Igoa y Rivière, 1994: 590). After these mental operations we come to the physical production of sounds. Speech, then, is produced by an air stream from the lungs, which goes through the trachea and the oral and nasal cavities. It involves four processes: initiation, phonation, the oro-nasal process, and articulation.

The initiation process is the moment when the air is expelled from the lungs. In English, speech sounds are the result of “a pulmonic egressive airstream” (Giegerich, 1992) although that is not the case in all languages (ingressive sounds). The phonation process occurs at the larynx. The larynx has two horizontal folds of tissue in the passage of air; they are the vocal folds. The gap between these folds is called the glottis.

The glottis can be closed, so that no air can pass. It can have a narrow opening, which makes the vocal folds vibrate, producing “voiced” sounds. Finally, it can be wide open, as in normal breathing, so that the vocal folds do not vibrate, producing “voiceless” sounds. After it has gone through the larynx and the pharynx, the air can go into the nasal or the oral cavity. The velum is the part responsible for that selection. Through the oro-nasal process we can differentiate between the nasal consonants (/m/, /n/, /ŋ/) and other sounds.

Finally, the articulation process is the most obvious one: it takes place in the mouth and it is the process through which we can differentiate most speech sounds. In the mouth, we can distinguish between the oral cavity, which acts as a resonator, and the articulators, which can be active or passive: upper and lower lips, upper and lower teeth, tongue (tip, blade, front, back), and roof of the mouth (alveolar ridge, palate, and velum). So, speech sounds are distinguished from one another in terms of the place and the manner in which they are articulated.

Stress in pronunciation

As a sound phenomenon, stress can be studied from two points of view: production and perception. The production of stressed syllables is said to imply greater muscular energy than the production of unstressed syllables. From the point of view of perception, stressed syllables are prominent. Prominence is the sum of different factors such as loudness, length, pitch and quality.

There are three levels of stress in a word: primary stress, characterized by prominence and, basically, by a rise-fall tone; secondary stress, weaker than the primary stress but stronger than that of the unstressed syllables (ˌphotoˈgraphic); and unstressed syllables, defined by the absence of any prominence, which thus become the background against which the prominent stressed syllables appear. Unstressed syllables normally have the short close vowels /i/ or /u/ or the schwa.

From the teaching perspective, there are two points to note. First, “incorrect stress placement is the major cause of intelligibility problems for foreign learners, and is, therefore, a subject that needs to be treated very seriously.” (Roach, 1991: 91). Second, the rules for word stress from the phonological point of view are too complex to use in the language classroom, so implicit ways of teaching are required.

Dictionaries with phonemic transcriptions can be helpful. Another important aspect related to stress is that of the “weak forms”. There are a number of words in English (almost all of which belong to the category called function or grammatical words) that can be pronounced in two different ways, a strong and a weak form.

There are about forty such words and it is important to be aware of their existence, as they can cause misunderstandings. English-speaking people find the constant use of strong forms unnatural, and learners of English can misunderstand English speakers, who will surely use weak forms. There are some rules to learn.

The strong form is used when:

  • a) A weak-form word occurs at the end of a sentence, as in “Chips are what I’m fond of.”
  • b) A weak-form word is being contrasted with another word, as in “The letter is from him, not to him.”
  • c) A weak-form word is given stress for the purpose of emphasis, as in “You must give me more money.”
  • d) A weak-form word is being “cited” or “quoted”, as in “You shouldn’t put ‘and’ at the end of a sentence.”
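
The four conditions above can be read as a simple decision rule. The following is a minimal sketch, not a complete model of English weak forms: the function name `choose_form`, its keyword flags, and the small `FORMS` table are hypothetical, and the weak transcriptions are commonly cited values that have been assumed here for illustration (the strong ones follow this article's list).

```python
# Strong forms follow this article's list; the weak forms are assumed,
# commonly cited values added only for illustration.
FORMS = {
    "of": ("ʌv", "əv"),
    "from": ("frʌm", "frəm"),
    "to": ("tu", "tə"),
    "and": ("ænd", "ənd"),
    "must": ("mʌst", "məst"),
}


def choose_form(word, *, sentence_final=False, contrasted=False,
                emphasized=False, cited=False):
    """Return the strong form if any of the four conditions holds."""
    strong, weak = FORMS[word.lower()]
    if sentence_final or contrasted or emphasized or cited:
        return strong
    return weak


print(choose_form("of", sentence_final=True))  # strong: ʌv
print(choose_form("from", contrasted=True))    # strong: frʌm
print(choose_form("to"))                       # weak: tə
```

In practice the flags would have to come from an analysis of the sentence; here they are supplied by hand to keep the rule itself visible.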

The most common weak-form words (shown here in their strong forms)

  • THE ði
  • A eɪ
  • AND ænd
  • BUT bʌt
  • THAT ðæt
  • THAN ðæn
  • AT æt
  • FOR fɔr
  • FROM frʌm
  • OF ʌv
  • TO tu
  • AS æz
  • SOME sʌm
  • CAN, COULD kæn, kʊd
  • HAVE, HAS, HAD hæv, hæz, hæd
  • SHALL, SHOULD ʃæl, ʃʊd
  • MUST mʌst
  • DO, DOES du, dʌz
  • AM, IS, ARE, WAS, WERE æm, ɪz, ɑr, wʌz, wɜr

Intonation in English - why it is important

It is not easy to define Intonation. We know that the basic feature of intonation is pitch, being high or low. The overall behavior of the pitch is called tone. Tones can be static, level tones, or moving tones, either rising or falling. For the purpose of analyzing intonation, a unit is normally used called the tone-unit.

Tone-units consist of at least a tonic syllable (a tonic syllable being a syllable with tone and prominence). Tone-units may also have a “head”, which is that part of the tone-unit that extends from the first stressed syllable up to (but not including) the tonic syllable. Before the head, there may be a “pre-head”, which includes all the unstressed syllables in a tone-unit preceding the first stressed syllable. Sometimes there is even a “tail”, that is, some syllables following the tonic syllable up to the end of the tone-unit.
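
The components just described can be sketched as a small segmentation routine. This is only an illustration under stated assumptions: the function name `split_tone_unit`, the `(text, stressed)` input format, and the choice to pass the tonic syllable's position by hand are all hypothetical, since locating the tonic syllable requires listening to the utterance.

```python
def split_tone_unit(syllables, tonic_index):
    """Split a tone-unit into pre-head, head, tonic syllable and tail.

    syllables: list of (text, stressed) pairs, in order.
    tonic_index: position of the tonic syllable (supplied by the analyst).
    """
    if not 0 <= tonic_index < len(syllables):
        raise ValueError("tonic_index must point at a syllable")
    texts = [text for text, _ in syllables]
    # The head starts at the first stressed syllable before the tonic;
    # if there is none, the head is empty and everything before the
    # tonic syllable belongs to the pre-head.
    first_stressed = next(
        (i for i, (_, stressed) in enumerate(syllables[:tonic_index]) if stressed),
        tonic_index,
    )
    return {
        "pre_head": texts[:first_stressed],
        "head": texts[first_stressed:tonic_index],
        "tonic": texts[tonic_index],
        "tail": texts[tonic_index + 1:],
    }


# "in a LITtle LESS than an HOUR", with the tonic syllable on "hour"
unit = [("in", False), ("a", False), ("lit", True), ("tle", False),
        ("less", True), ("than", False), ("an", False), ("hour", True)]
print(split_tone_unit(unit, tonic_index=7))
```

For this utterance the pre-head is "in a", the head runs from "lit" to "an", and there is no tail.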

So, the structure of a tone-unit is (pre-head) (head) tonic syllable (tail). Intonation is very important for communication, as it helps the addressee interpret the message. There have been different proposals to explain how intonation can help communication, some of which are:

  • 1. Intonation enables us to express emotions and attitudes as we speak: the attitudinal function of intonation.
  • 2. Intonation helps to produce the effect of prominence on stressed syllables: the accentual function of intonation.
  • 3. Intonation helps to recognize the grammar and syntactic structure of the utterance: the grammatical function of intonation.
  • 4. Intonation conveys the given-new information or provides information for turn-taking: the discourse function of intonation.

There are three simple possibilities for intonation: level, fall, and rise. However, more complex tones are also used, such as fall-rise or rise-fall. Each of these tones is functionally distinct, that is, they convey different attitudes, intentions, and meanings to the hearer, as has been stated above.

Thus, the fall tone is regarded as quite “neutral” and it conveys a certain sense of “finality” (so, it is normally used to yield the floor in turn-taking). The rising tone, on the other hand, conveys an impression that something more is to follow (so, it is frequently used to keep the floor in turn-taking).

The fall-rise tone is quite frequent and it conveys, among many other possibilities, “limited agreement” or “response with reservations”. The rise-fall tone is normally used to convey strong feelings of approval, disapproval, or surprise.

Syllables in English

The Sonority Hierarchy

In any utterance some sounds stand out as more prominent or sonorous than others... A sonority scale or hierarchy can be set up which represents the relative sonority of various classes of sound:

  • Open vowels
  • Close vowels
  • Laterals, nasals
  • Approximants
  • Trills, fricatives, affricates
  • Plosives and flaps

Using the sonority hierarchy we can draw a contour representing the varying prominences of an utterance. The number of syllables in an utterance equates with the number of peaks of sonority [being aware that sound] classes from fricatives downwards cannot constitute peaks in English. There are three elements in a syllable.

The onset, peak, and coda of a syllable form a hierarchy of constituents, in which the coda is more closely associated with the peak than with the onset [thus conforming to the rhyme]. Onsets generally involve increasing sonority up to the peak (...) whereas codas generally involve decreasing sonority.
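
The idea of counting syllables as peaks of sonority can be sketched in a few lines. The numeric scale below is a simplified assumption (descriptions differ on how the middle classes are ordered), and the names `SONORITY` and `count_syllables` are hypothetical. The input is a list of sound classes rather than phonemes, so that no phoneme inventory has to be assumed.

```python
# Assumed, simplified sonority scale: a higher number means more sonorous.
SONORITY = {
    "plosive": 1,
    "affricate": 2,
    "fricative": 3,
    "nasal": 4,
    "lateral": 5,
    "approximant": 6,
    "close_vowel": 7,
    "open_vowel": 8,
}

# Classes from fricatives downwards cannot constitute peaks in English.
MIN_PEAK = SONORITY["fricative"]


def count_syllables(classes):
    """Count the sonority peaks in a sequence of sound classes."""
    ranks = [SONORITY[c] for c in classes]
    peaks = 0
    for i, rank in enumerate(ranks):
        before = ranks[i - 1] if i > 0 else 0
        after = ranks[i + 1] if i + 1 < len(ranks) else 0
        # A peak rises above the preceding sound and does not fall
        # below the following one.
        if rank > MIN_PEAK and rank > before and rank >= after:
            peaks += 1
    return peaks


# "plant": plosive, lateral, open vowel, nasal, plosive
print(count_syllables(["plosive", "lateral", "open_vowel", "nasal", "plosive"]))  # 1
# "button" with a syllabic nasal: plosive, open vowel, plosive, nasal
print(count_syllables(["plosive", "open_vowel", "plosive", "nasal"]))  # 2
# "stop": the initial fricative is more sonorous than the plosive but is not a peak
print(count_syllables(["fricative", "plosive", "open_vowel", "plosive"]))  # 1
```

The last example shows why the restriction on fricatives matters: without it, the /s/ of "stop" would be counted as a separate syllable.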

How to divide syllables in medial position?

Various principles can be applied to decide between alternatives: align syllable boundaries with morpheme boundaries where present (the morphemic principle); align syllable boundaries to parallel syllable codas and onsets at the ends and beginnings of words (the phonotactic principle); align syllable boundaries to best predict allophonic variation, e.g. the devoicing of /r/ following /t/ (the allophonic principle).

Unfortunately, such principles often conflict with one another. A further principle is often invoked in such cases, the maximal onset principle, which assigns consonants to onsets wherever possible and is said to be universal in languages; but this itself often conflicts with one or more of the principles above.
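
As an illustration of the maximal onset principle, the sketch below splits a medial consonant cluster by giving the following syllable the longest stretch of consonants that is a permitted onset. The set `LEGAL_ONSETS` is a tiny hypothetical sample, not a full statement of English phonotactics, and the function name `split_cluster` is likewise an assumption.

```python
# A small, assumed sample of permitted onsets (not a complete inventory).
LEGAL_ONSETS = {
    (), ("t",), ("r",), ("s",), ("k",), ("p",), ("l",), ("m",), ("n",),
    ("t", "r"), ("s", "t"), ("p", "l"), ("s", "t", "r"),
}


def split_cluster(cluster, legal_onsets=LEGAL_ONSETS):
    """Return (coda, onset), making the onset as long as possible."""
    cluster = tuple(cluster)
    # Try the longest candidate onset first, then progressively shorter ones.
    for i in range(len(cluster) + 1):
        if cluster[i:] in legal_onsets:
            return cluster[:i], cluster[i:]
    # Fallback if even the empty onset is not listed: everything is coda.
    return cluster, ()


print(split_cluster(["k", "s", "t", "r"]))  # "extra":  (('k',), ('s', 't', 'r'))
print(split_cluster(["t", "r"]))            # "petrol": ((), ('t', 'r'))
print(split_cluster(["t", "l"]))            # "atlas":  (('t',), ('l',))
```

A division produced this way may still disagree with the morphemic or allophonic principles, which is the conflict the paragraph above describes.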

"Words" definition according to phonetics

The word, composed of one or more phonemes, has a separate linguistic identity, in that it is a commutable entity, higher than the phoneme, which may either constitute a complete utterance or may be substituted in a longer utterance for other words of its same class.

The syllable or syllables of a word that stand out from the remainder are said to be accented, that is, to receive the accent. The accentual pattern of English words is fixed, in that the main accent always falls on a particular syllable of any given word, but free, in that the main accent is not tied to any one position in the word.

Accent and prominence

Any of four factors (pitch, loudness, quality, and quantity) may help to render a syllable more prominent than its neighbors.

But it is principally pitch change that marks an accented syllable. The final pitch accent in a word or in a group of words is usually the most prominent (and hence referred to as the primary accent) while a pitch accent on an earlier syllable is usually somewhat less prominent (and referred to as secondary accent). Accented syllables are often assumed to be louder than unaccented syllables and in many cases, this may be so.

However, loudness is not by itself an efficient device for signaling the location of the accent in English. While the accent is primarily achieved by pitch change, sometimes assisted by extra loudness, among unaccented syllables some will be more prominent than others due to the quality and quantity of the vowels at their center.

Long vowels and diphthongs are generally more prominent than short vowels, while among the short vowels themselves /ɪ/, /ʊ/, and /ə/ (when unaccented) are the least prominent and are often referred to as reduced vowels, as opposed to the other, full vowels.

There are four degrees of prominence in English:

  • a) primary accent, marked by the last major pitch change in a word (or longer utterance);
  • b) secondary accent, marked by a non-final pitch change in a word (or longer utterance);
  • c) a minor prominence produced by the occurrence of a full vowel but containing no pitch change;
  • d) a non-prominent syllable containing no pitch change and one of the reduced vowels.
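
The four degrees listed above depend on only three yes/no properties of a syllable, so they can be sketched as a short classification. The function name `prominence_degree` and its parameters are hypothetical labels chosen for this illustration.

```python
def prominence_degree(has_pitch_change, is_last_pitch_change, has_full_vowel):
    """Classify a syllable into one of the four degrees of prominence."""
    if has_pitch_change:
        # The last major pitch change marks the primary accent;
        # any earlier pitch change marks a secondary accent.
        return "primary accent" if is_last_pitch_change else "secondary accent"
    # Without a pitch change, only vowel quality distinguishes syllables.
    return "minor prominence" if has_full_vowel else "non-prominent"


print(prominence_degree(True, True, True))     # primary accent
print(prominence_degree(True, False, True))    # secondary accent
print(prominence_degree(False, False, True))   # minor prominence
print(prominence_degree(False, False, False))  # non-prominent
```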

Distinctive Word Accentual Patterns

The accentual pattern of a word establishes the relationship of its parts; it may also have a distinctive function in that it opposes words of comparable sound structure (and identical spelling). Such word oppositions (for the most part disyllables of French origin) may or may not involve phonemic changes of quality.

A relatively small number of pairs of nouns and verbs may differ only in the location of the primary accent, this falling on the first syllable in the nouns and on the second in the verbs (accent, digest, torment, transfer, transport).

In a somewhat larger number of pairs the occurrence of a reduced vowel in the first syllable of the verb is more regular (combine, compress, concert, conduct, contract, contrast, convict, desert, export, object, present, proceeds, produce, progress).


It has always been a feature of the structure of English words that the weakly accented syllables have undergone a process of reduction, including loss of phonemes or of vowels. The same process of reduction, with resultant contraction, may be observed in operation in present-day English.

Vowel Elision

Established: initially in state, scholar or sample; medially in Gloucester, marriage, evening, chimney, ...; finally in time, name, loved, hands, eaten, written, cousin.

Present colloquial:

  • consonant + /ə/ + /r/ + weak vowel = elision of /ə/ as in preferable, repertory, temporary, murderer, etc.
  • /r/ + weak vowel + consonant = elision of weak vowel as in Dorothy (recent development)
  • consonant + weak vowel + /l/ = elision of weak vowel (or reduction of dark [ɫ] into clear [l]) as in fatalist, bachelor or insolent.
  • Elision of a post-primary weak vowel as in university, probably, difficult, national, fashionable or government.
  • Loss of syllabicity in present participles of verbs such as flavour, lighten or thicken where the /ə/ may be elided or the syllabic consonant replaced by a non-syllabic consonant.
  • In pre-primary positions, the /ə/ or /ɪ/ of the weak syllable preceding the primary accent is apt to be lost in very rapid speech, especially when the syllable with primary accent has initial /l/ or /r/ as in police, parade, correct, balloon or gorilla, but also in photography, suppose, perhaps, geometry or geography.

Consonant elision

Established: the reduction of many consonant clusters has long been established, as with initial /wr, kn, gn, hl, hr, hn/, medial /tn, tl/ or final /mb, mn/: write, know, gnaw, loaf, ring, nut, fasten, castle, lamb, hymn, ...

Present colloquial:

  • loss of alveolars /t, d/ when medial in a cluster of three consonants as in exactly, facts, handsome, lastly, Westminster, dustman,...
  • /θ/ is normally elided from asthma and isthmus, and a dental fricative is sometimes elided from months, twelfths, fifths, clothes.
  • Elision of /k/ and /l/ in asked and only may occur.
  • /l/ is elided when preceded by /ɔː/ as in always.
  • /p/ may be lost in clusters where its position is homorganic with that of an epenthetic plosive as in glimpse.
  • Whole syllables may be elided in rapid speech in the vicinity of /r/ or in a sequence of /r/ sounds: library.
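
The first rule in the list above, the loss of /t, d/ when medial in a cluster of three consonants, can be sketched as follows. This is a deliberately rough illustration: the function name `elide_medial_alveolar` is hypothetical, the `VOWELS` set is a small assumed sample, and the rule as coded applies to every such cluster, whereas real speech is more selective.

```python
# A small assumed sample of vowel symbols, enough for the examples below.
VOWELS = {"æ", "ɪ", "ə", "ɑ", "ʌ", "ɛ", "i", "u", "ʊ"}


def elide_medial_alveolar(phonemes):
    """Drop /t/ or /d/ when it stands between two consonants."""
    result = []
    for i, phoneme in enumerate(phonemes):
        is_medial = 0 < i < len(phonemes) - 1
        if (phoneme in ("t", "d") and is_medial
                and phonemes[i - 1] not in VOWELS
                and phonemes[i + 1] not in VOWELS):
            continue  # the alveolar plosive is elided
        result.append(phoneme)
    return result


print(elide_medial_alveolar(["f", "æ", "k", "t", "s"]))            # facts
print(elide_medial_alveolar(["h", "æ", "n", "d", "s", "ə", "m"]))  # handsome
```

With these inputs the /t/ of facts and the /d/ of handsome are removed, while plosives next to a vowel are left untouched.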


The elision of /t/ in words like vents sometimes leads to the opposite tendency to insert an epenthetic /t/ in words like dance. Just as an epenthetic /t/ may occur between an /n/ and a following fricative, an epenthetic /p/ or /k/ may occur between a nasal and a following fricative, as in triumphs or confuse.