ENGLISH INTONATION

by Charles-James N. Bailey

[This is adapted and expanded from "Intonation as a system," published as Working Paper No. 4 by the Faculty of Arts and Social Sciences of the Universiti Brunei Darussalam;
although many of the materials included here have been previously copyrighted
by the author, the present arrangement for the Internet is wholly new.]

© 1998 by Orchid Land Publications

[Rev.  9-19-98]

          Prosodic pitch can be segmental (as in native-American languages in Mexico) or suprasegmental, as in English. (Segmental pitch is organized in tonits or tone units, which are relative to the sounds of the whole tune; suprasegmental pitch is organized in tunits or tune units.) Many incorrect notions prevail on the subject of intonation--at least English intonation--as will be noted in due course. To be noted here is the way misguided studies of stress confuse the tones of tunit cores or heads (these coincide with a stressed syllable) and conclude that stress is partly tonal in English--which it is not: All cores or heads fall on stressed syllables, but not all stressed syllables are cores or heads. Since such studies have confused stressed syllables in tunit cores or heads with stressed syllables in tunit tails, they have naturally produced inconsistent results. Note in what follows that the tessatura is the pitch range; it may be widened or expanded, the upper limit being raised, for prominence, just as it may be narrowed or contracted, the upper limit being lowered, for parenthesis. The tempo may be contracted (speeded up) or protracted (slowed down) for various purposes. Spaced letters indicate slower pronunciation. Loudness is indicated by replacing small letters with capital letters. Emphasis may be conveyed by pitch with or without expanding the tessatura or increasing the loudness; but it sometimes involves a de-focused low tone to distance the speaker from what is said.
     Let us begin with the intonational tunit (tune unit), which is suprasegmental and expresses a volitional attitude, intention, or force--as noted below. (A tonit, or tone unit, is segmental, distinguishing one vowel or diphthong, or syllable, from another with the same sound.) A tune consists of a string of one or more tunits plus a cadence--the two building blocks of English intonation on the perceptual side; the faster the tempo, the fewer tunits such a string of words will contain. A tune extends over an intonational phrase--which is bounded by the triple-cross boundary: #. This boundary is immediately preceded by a cadence unless it begins an utterance.
     The building blocks of English intonation involve three tones--high, mid, and low--as will be shown.  A tunit consists of (in sequence) a head (also called a core) plus a tail; the former is always a stressed syllable of a focused ([+ focus]) word.   (Stressed syllables of [- focus] words are not tunit heads.  The head may be an ordinarily unstressed syllable that is contrastively stressed, as in "I said DEceive, NOT REceive.")  A tail is more often unstressed but may include stressed syllables too.   A cadence is the final tail of the tune, so to speak; it consists of unstressed syllables (rarely of stressed syllables).    If no syllable is present to serve as a tail or cadence, the tone that the tail or cadence would have is added to the core as a glide from the core tone to what would be the tail tone.  Except for such cases, glides exaggerate tone drops or rises.  Everything that precedes the cadence is called the precadence.   The first tunit may be preceded by an introductory unstressed syllable or syllables called an anacrusis.  Except in special cases where an anacrusis is high or low to highlight the lowness or highness of the following head, the anacrutic tone is more or less neutral--i.e. mid-toned; but it may be high or low to agree with  what follows by signaling prominence (when high-toned) or (when low-toned) to destress and distance the speaker from what is said. 
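The structure just described can be sketched as a small data model. This is only an illustration, not part of the analysis itself: the class names Tunit and Tune and the method names are hypothetical, syllables are plain strings, and the sketch assumes that a head is glided whenever no tail syllable follows it (and, for the last tunit, no cadence syllable either). It ignores the case, taken up later, in which a following head serves as the second arm of the tail.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tunit:
    """A head (core) plus a tail. Names are illustrative assumptions."""
    head: str                                      # stressed syllable of a focused word
    tail: List[str] = field(default_factory=list)  # following syllables, usually unstressed


@dataclass
class Tune:
    """A string of tunits plus a cadence, optionally preceded by an anacrusis."""
    tunits: List[Tunit]
    anacrusis: List[str] = field(default_factory=list)  # unstressed lead-in syllables
    cadence: List[str] = field(default_factory=list)    # the final tail of the tune

    def precadence(self) -> List[str]:
        """Every syllable that precedes the cadence, in order."""
        syllables = list(self.anacrusis)
        for tunit in self.tunits:
            syllables.append(tunit.head)
            syllables.extend(tunit.tail)
        return syllables

    def glided_heads(self) -> List[str]:
        """Heads that must carry the missing tail or cadence tone as a glide."""
        glided = []
        last = len(self.tunits) - 1
        for i, tunit in enumerate(self.tunits):
            followed = bool(tunit.tail) or (i == last and bool(self.cadence))
            if not followed:
                glided.append(tunit.head)
        return glided


# "Hurry up": the head "up" has no tail syllable and no cadence follows it.
hurry_up = Tune(tunits=[Tunit("Hur", ["ry"]), Tunit("up")])
print(hurry_up.precadence())
print(hurry_up.glided_heads())
```

In this model the second head of "Hurry up" is reported as glided, which agrees with the later remark that "up" is glided to include the tone of the missing tail syllable.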
      From what has been said, one can infer that there are three tones--high, mid, and low or rising, level, and falling. But they are gradient. This means that the pitch may be higher or lower within the range of any tone to emphasize or attenuate the force signaled by the tone itself: The greater the force (of a static tone or of a rising or falling tone), the greater the distance it falls or rises. We speak of mid-tone as [x tone] or, gradiently, as [>x tone] or [<x tone]. High tone is gradient in a small degree. Low tone is not gradient.
     Before characterizing the forces (volitional attitudes or intentions) of the building blocks, let it be said that except for abstract lines connecting successive heads--the slope or tangent--or tails--the envelope--the building blocks all have three values. An envelope is usually jagged or spiky; insistence is created when the tune rises or falls smoothly--or, as we say, scandently. A smooth tail is called scandent when the tail syllables move up or down (or stay level) unidirectionally and more or less gradually. In the following, the tails on the left exhibit a non-scandent drop; those in the middle fall scandently; and the tune on the right is a scandent tune--indicating insistence (when rising) or matter-of-factness (when falling):

                we
        
rea         
               

     dy to       ar
                w
      
read           e
        y    
            to                       a

                        r
Both
          of
               'em
                        are  

                                 rea
                                       dy
                                             now.

Further examples will be provided  later.
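
The stepped layout used in these examples, and the notion of a scandent contour, can be sketched in a few lines of code. What follows is a minimal illustration under stated assumptions: the function names render_tune and is_scandent are hypothetical, pitch heights are arbitrary integers chosen for the example rather than measured values, and "more or less gradually" is approximated by a max_step threshold that is not part of the analysis.

```python
from typing import List, Tuple


def render_tune(syllables: List[Tuple[str, int]]) -> str:
    """Lay syllables out left to right, one text row per pitch height.

    Each item is (syllable, height); a larger height means a higher pitch.
    """
    top = max(height for _, height in syllables)
    bottom = min(height for _, height in syllables)
    rows = [""] * (top - bottom + 1)
    column = 0
    for text, height in syllables:
        row = top - height  # the highest pitch goes on the first row
        rows[row] = rows[row].ljust(column) + text
        column += len(text) + 1  # leave one space between syllables
    return "\n".join(rows)


def is_scandent(heights: List[int], max_step: int = 1) -> bool:
    """True if the contour moves in one direction only, in small steps.

    max_step is an assumed stand-in for "more or less gradually".
    """
    steps = [b - a for a, b in zip(heights, heights[1:])]
    one_way = all(s >= 0 for s in steps) or all(s <= 0 for s in steps)
    gradual = all(abs(s) <= max_step for s in steps)
    return one_way and gradual


# The scandently falling tune "Both of 'em are ready now."
falling = [("Both", 6), ("of", 5), ("'em", 4), ("are", 3),
           ("rea", 2), ("dy", 1), ("now.", 0)]
print(render_tune(falling))
print(is_scandent([height for _, height in falling]))
```

A jagged envelope, such as heights 3, 1, 1, 2, fails the check both because it changes direction and because its first drop exceeds the assumed step size.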
     Though a tunit is a matter of tone or pitch, this is not true of lexical tonits (tonal units), which can include glottalization and other voice features (as in the Chinese minority languages studied by Jerrold Edmondson).    English tails are essentially subterflex (falling-rising) or circumflex (the more-marked rising-falling contour).   If there are not enough syllables for the contour, the head can be glided.   Otherwise, gliding stands out, as will be noted later.  The latter arm of a tail--the final rise of a subterflex tunit or the final fall of the less common circumflex tunit--may be truncated--or rather, as is the usual case,  the following head serves as the second arm of the tunit:

                                  pec
       not
               all  
                     that
                             ex       ted

                                ex
                       that  
               all                  pec
        not                  
                                          ted

In the foregoing, the two heads are "not" and "-pec-". The head "-pec-" rises higher than the head "not" in both examples; "-pec-" would be lower than "not" if "not" were more salient than the word "expected."
      A cadence is blended together with the preceding tail unless it moves in a different direction--up or down--from that of the tail, in which event it is simply added to the rise or fall of a tail. Head tones are relatively higher than, level with, or lower than other heads; a falling tangent is usual, whereas a rising head indicates greater salience ([focus] clearly being gradient, i.e. [>focus] or [<focus], or even the same focus: [x focus]). The tone of a tail is relative to that of its own head: Falling is assertive and usual; rising is counterassertive or protesting and marked. The basic system involves forces that are volitional or intentional--not emotional or syntactic:

 

               falling tone      level tone   rising tone

core (head)    old information   neutral      salience
tail           assertivity       neutral      counterassertivity
cadence        finality          neutral      non-finality
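
The table above can be read as a simple lookup from a building block and a tone to a force. The sketch below merely restates the table in code; the names FORCES and force are hypothetical, and "core" is accepted as a synonym for "head" because the text uses the two terms interchangeably. Gradience is deliberately left out: the sketch covers only the three basic values.

```python
# Forces signaled by each building block under each tone (from the table).
FORCES = {
    "head": {
        "falling": "old information",
        "level": "neutral",
        "rising": "salience",
    },
    "tail": {
        "falling": "assertivity",
        "level": "neutral",
        "rising": "counterassertivity",
    },
    "cadence": {
        "falling": "finality",
        "level": "neutral",
        "rising": "non-finality",
    },
}


def force(block: str, tone: str) -> str:
    """Return the force that `tone` signals on `block`."""
    if block == "core":  # "core" and "head" name the same building block
        block = "head"
    return FORCES[block][tone]


# A protesting tail followed by a softening cadence.
print(force("tail", "rising"))
print(force("cadence", "rising"))
```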

Only in rare cases--one is mentioned later--are intonational contours more or less correlated with syntax. A morphological use of the level tone in word-compounding is dealt with near the end. Otherwise, it is rarer than the other tonal types; it will be seen that it conveys a certain disinvolvement or performs a linking function.
       Transposition refers to the expansion of the upper limit of the tessatura upward; each tone will be higher than it would be with a lower such limit. De-focusing a tunit head is the opposite.  It changes a head to a low tone as the result of contracting the upper limit of the tessatura downward.   This distances the speaker in some way from what is said.    It can give any contour  perfunctory connotations.  Since the tempo can also be contracted (speeded up) or expanded (slowed down), a parenthesis is made by contracting both tessatura and tempo.   However, the two parameters can be expanded or contracted independently.  
     A final introductory word will advise the reader that, although many of the most insightful examples that follow are D. Bolinger's, the system is not his; for better or worse, it is the present writer's.

      One can begin by illustrating the assertiveness of a falling tail in a rising tangent--the unmarked or usual tune--and also the marked, counterassertive (protesting) tone, whose cadence  rises or falls.  We'll place an (unstressed) anacrusis at the beginning of each; and the latter example will end in a cadential rise.  Both examples are somewhat exclamatory and hence have an expanded tessatura:

                                     di
                   nev


       We'll       er say       e!
                             th
                       do
                 n't
       would                    t!
He                             a

These examples are Bolinger's. Since "never," "die," "wouldn't," and "that" are focused, their stressed syllables are the heads of the tunits in question. The greater the vertical distance separating the tones, the more vivid the force of the assertion or protest in these examples--and conversely. The result is to emphasize the force of the contour. The pattern on the left is, without the expanded tessatura, the usual tune. Its cadence signals finality, while the rise at the end of the example on the right softens the protest with the opposed force of non-finality. Truncating (shortening) a drop reduces the finality of the drop to convey a note of reluctance:

       Here follows a tune with three tunits, the second higher (more focused) than the other two:

       A smoothly descending tune is matter-of-fact; it can be used for a command whose fulfilment is more or less expected; note the finality of the cadence:

   Hand
               me
                      that
                               lit
                                   tle

                                         pen
                                               knife of yours.

[Because of the limitations of the present program, the syllables of the tail, " me that little," should drop only slightly from the head--Hand.]   The example is again Bolinger's.  The vertical break between little and pen indicates that the items on either side of the break are separate tunits.  The tune heard in this example is, as already observed, scandent.  Bolinger shows that the same example with subterflex tails is less insistent and therefore less peremptory--more polite--more of a request than an order.   

       Hand

                                          pen

                  me that little         knife of yours.

       There are two further ways of turning imperatives into  requests.  One is to truncate a final drop; the other is to add a rising cadence to the last tail.   (A longer rise will iconically signal greater non-finality than a shorter one.)  The two intonational devices can be combined, as in the second example following.  The use of a level tail in an imperative will be taken up next.

                     u
                       p.

       Hur
             ry
                u   p.
                  u

Hur
        ry

H
    u
       r
         r       p.
           y  u     
              u
Hur

                  p.
        ry    u

The first of these four examples (all of which have expanded tessaturas that intensify or make more vivid or emphatic what is requested) sounds timid--the result of the reluctance or tentativeness signaled by the truncated drop. While both heads are rising and both tails are falling, as is appropriate for imperatives, the result is more of an invitation than a command. Note how "up" is glided down to include the tone of the missing tail syllable. The second example indicates impatient pleading. (The drop should be shown as less than this program permits.) The rising head of "up," the softened finality of the truncated drop, and the non-finality of the rising cadence create a pleading tone. The peremptoriness of a command is altered into a request by the small drops of the tails of both tunits. Instead of pleading, in the third example the steep drop (lacking the reluctance or timidity signaled by truncation in the second example) conveys peremptory force; its scandent form adds insistence; and de-focused "up" distances the speaker from the command. The overall effect is to signal something like disgust. The scandent tune has already been seen in the earlier imperative, "Hand me that little penknife of yours"--which, however, lacked a rising cadence. The low head of the second tunit in our third example keeps the insistent note of this utterance from being menacing. It is otherwise in the fourth example. The expanded tessatura, two falling tails, and rising cadence express an exaggerated, almost (but, because of the rising tail, not quite) threatening tone of impatience. The impatience is due to the same contours as in the second and third examples, but the long drops on both tails plus the "something left unsaid" of the rising cadence combine to hint at consequences if the request is unheeded. More can be said, but this will suffice.
      Perhaps the reader would like to interpret for herself the way the protesting rise is used on "right" in the following example; the finality of falling "back" is very assuring:

                                                          ht
                                                        g
                                                     ri
                     prove                                 ba
      Home Im          ment will be                 ck!

      We can now turn to a consideration of the effects of level tails on an imperative; note how disinvolved the speaker seems to be in the next example, where the falling tune indicates matter-of-factness:

This would be uttered more slowly than a normal imperative and often with a bit of breathiness to signal mental fatigue.

      An example of a protest was given early on. A few more contrasting ways of executing a protest will give the reader a better understanding of what is going on. Here are three ways of saying a forced "It's beautiful"--evoked by a previous assertion that "Such and such is beautiful, isn't it?"--in a situation where one doesn't agree but can't afford to openly dissent:

"It's" is an anacrusis. The tessatura can be expanded, to make the foregoing more vivid--or reduced, to make it sound more perfunctory. The first example has a protesting rise followed by a finality cadence, not overly polite. The second is similar, but it's politer in that its assertiveness is softened or made more conditional by the rising cadence that implies "but there's more to be said." [The limitations of this program make the cadence look as though it rises higher than would actually be appropriate here.] The third version involves what is called metatony: The speaker de-focuses beaut- to a low tone in order to distance oneself from the statement; metatony transfers the expected high tone of this head to the preceding anacrusis. Since high anacrusis is protesting, the net effect is to sound like a very unwilling concession. The statement could also be uttered with de-focus but without the rising tail; it would then be less polite-sounding.
        The same words can be made to sound very unenthusiastic by conveying the speaker's disinvolvement, or lack of interest in contesting the idea, with level-toned tails:

The sentiment conveyed is "So what?"   A level tail signals greater reluctance than a truncated drop--but also less protest than a rising tail.  The second example is so defocused as to sound almost rudely perfunctory.
       Bolinger has given (e.g. in his own article in his Penguin volume on intonation) several examples of chained "inverse accents"--i.e. tunits with marked rising(-falling) tails.  The following example has a rising cadence signaling non-finality and-- since the cadence is probably combined with a rising tail of the final tunit--protest.   With that de-focused the way it is, the speaker is distancing oneself from the thought.  The high-toned anacrusis simply completes the tune's overall contour of implied protest or disagreement with respect to whatever that refers to.

       More needs to be said now about the implicative tune mentioned earlier.  The tessatura is often wide; a high head is followed by a low-toned tail and a rising cadence.  (Some heads in the low-toned part may be slightly raised.)  The effect can be exaggerated by gliding the head--a bit risingly to convey something like "and I was right" or a bit fallingly to convey something like "and I was wrong," as Bolinger cleverly points out.  The greater the pitch range of either glide, the stronger its connotation will be--a vivid case of the gradient nature of intonation.

The head, thought, is high because of its salient role in the whole sentence; what follows is low so as to de-focus and downplay the idea implied to be unlikely; the rising cadence implies that there's more that could be said, more that is being implied.
        The implicative tune is used for a typical question in Ireland.  With a significant modification, it is used for typical questions in England.  Aside from a more normal tessatura, the modification is that the tail is scandently falling rather than being simply de-focused.  The interrogative tunes in each variety of English simply pre-empt for interrogative use a tune that has got certain question-compatible overtones.  This can sometimes prove perplexing to speakers of another variety of English--who, e.g., might interpret an Irish English question as conveying some unwelcome implication.  The typical question in Southern States English simply transposes (this is the opposite of de-focusing) the England English question to high-toned syllables throughout.  The first of the following examples is from England; the second, from the Southern States.

[The defects of this program make the tessatura too wide; there should be a gradual fall throughout the precadence, and the cadence should  rise more scandently than portrayed here.]   This tune may sound too insistent or indeed matter-of-fact to speakers of other varieties of English.  While it lacks the connotations of disinvolvement signaled by the level tails in our earlier example, "Hand me that little pen-knife of yours,"   it can be made to sound very involved by expanding the tessatura and gliding the heads a bit.

[The final tail should not be as high as shown here.]  This Southern States question may sound like excitement to speakers of other varieties.  The Northern States interrogative typical question tune simply rises in a scandent pattern or in a jagged pattern; these patterns are the most unmarked interrogative pattern of all, indicating, as they do, inconclusivity.  The rising tune with the smooth envelope or the jagged envelope can be used for echo or reclamatory questions by speakers of other varieties.  Welsh English and Yiddish-accented English typically use what sounds to others like a protesting question tune--as though the questioner didn't believe what had been previously said:

Typical questions in Hawai'ian English employ something almost like the implicative pattern used for questions, though the cadence doesn't have to rise; the final tunit has a de-focused tail.  (The word order is not inverted in questions in Hawai'ian Creole.)  The effect of this tune on speakers of other varieties is that of a very strange way to ask a question.
       Bolinger gives another interrogative tune in the following, where reluctance to receive a confirmative reply is conveyed:

The subterflex tail of John--high-toned because of its being focused and salient--signals a truncated assertiveness that expresses some probability that it was John; the non-finality of the rising cadence expresses hope that such is not the case.  Note the distancing effect of de-focused, low-toned wasn't.   The high anacrusis resulting from metatony conveys protest.
         Unmarked or normal tag questions are rhetorical questions; cf. "She won, didn't she?" and "He didn't get hurt too badly, did he?"  They generally end in a falling cadence, since they don't pose genuine interrogations.  Marked tag questions (negative in neither part) have a rising cadence; if uttered with a contracted tessatura, they can have such an inappropriate de-focusing as to suggest, e.g., a minatory intent in the following example: 

[The defects of this program show higher rises than would actually be appropriate.]
        Alternative questions have rising heads for the first part; and, since one assumes that the second part will be true if the first isn't, the second part is uttered with falling heads and a falling cadence. Echo and reclamatory questions like "They gave it to who?" and "They left it where?" (note that the WH-word is not moved to the clause-initial position) focus on the interrogative word at the end. It will therefore glide upwards (this combines a protesting rise and a non-final cadence, since such questions are true questions); what precedes will be low-toned or scandently falling in order to exclude the rest of the sentence from the interrogation. Embedded or indirect questions are treated like ordinary sentences--both syntactically (in most kinds of English, though with occasional word-order reversals in most varieties) and intonationally.

       While examples of transposition have been given already, a few further examples will provide additional illustrations of its effect.   Consider three of Bolinger's exclamatory examples:

It's unclear whether "I just" in the third example is a head (as is probable) or a high-toned anacrusis. The tessatura is high in another of Bolinger's examples, whose import is to deny some prior statement:

[The limitations of this program prevent showing that "any" is only slightly lower than "wasn't."] Anacrutic "there" and the focused word, "wasn't," are high-toned for obvious reasons. The overall contour of the tune expresses frustration with de-focused "trouble," which the speaker is distancing oneself from, and with the non-finality of the cadence--which perhaps implies "You've got it all wrong."
       The following examples have metatony at the beginning; everything that follows is de-focused to distance the speaker from that (in the first example) or from the whole idea involved in it (in the second and third examples).

[A more adequate program would have the cadence gradually gliding up from ma-.]  Two further questions of this sort (the first replaces Bolinger's threaten with frighten) used by Bolinger are these:

The transposition of the first tunits in the first of these two examples emphasizes the utterance and makes it vivid; frighten is de-focused to distance the speaker from the idea; and the non-final rising cadence is interrogative.   [The cadence in both examples should rise a bit less dramatically and gradually--in the second, more  glidingly than in the foregoing portrayal.]  From what has already been said, the analysis of the second of the preceding examples should be more than obvious.


       Commas involve cadences and a preceding tunit.  They are indicated--depending on their force--by a drop, with a subterflex contour (appropriate, say, for the hypothesis clause of a normal conditional sentence), or with a de-focused core plus a rising tail.  Subordinate commas within a passage already set off by commas can be indicated by their having a shorter change in pitch (say, a shorter drop) than the superordinate commas.   Though this seems like a syntactic use of intonation, it at bottom signals an intentional force.

       It has already been mentioned that superfluous gliding can be added to the head of a tunit to exaggerate its effect.   Notice could in the following example of Bolinger's. 

[A more adequate program would show could gradually gliding down a bit and then up again.]   Could is high-toned because of its salience and indeed emphasis in the wide tessatura of this example; a falling cadence indicating finality signals that it is a rhetorical question.   Superfluous gliding sounds precious when overused.

       The level tail is used morphologically to link items in word-compounding;  where a word is compounded of more than two elements, the level tail fulfills the role of a hyphen, viz. to show where two subordinate elements go together.  Bolinger gives the dramatic examples that follow; the first head is the stressed syllable of American, viz. -mer-:

The lack of a drop on the unstressed tail -ican signals a hyphenation or link between American and history that is absent in the example on the left, where the tail drops to indicate a sort of break. Thus, the item on the left refers to an American who teaches history, while the example on the right refers to a teacher of American history--an American-history teacher. Though most compounds have a stress pattern in which the first item is full-stressed and later items are mid-stressed, a special group of marked compounds (including names, compounds containing cardinal numbers or other quantifiers, and others--like the type illustrated in God-man and bitter-sweet--discussed elsewhere by the present writer) reverse this stress pattern. Even more dramatic is the reversal of the intonation pattern. The writer has shown how a level tail in marked compounds plays a role that reverses its use in Bolinger's unmarked compound examples just portrayed: The level tail fails to link, and a drop links, in these compounds.

Non-falling -gle on the left is insistent; the compound means "only one roll." But falling -gle in the example on the right gives the compound the sense of "a single-roll," contrasting with a double roll.

       Any analysis of intonation that is incompatible with two developmental facts is wanting--viz. the fact that infants learn basic  intonation before syntax, etc., and the fact that adolescents perfect intonation last of all.  (Yet, tone-deaf people can have a reasonably good intonation.)
       The worst possible approaches to intonation (1) treat it as holistic tunes not built up of tunits, (2) teach that these tunes refer to emotions or syntax, and--worse yet--(3) offer lists of such tunes to be learned and memorized--in place of a true analysis. Language doesn't work that way, as modern linguists know; listing constitutes a childishly unscientific approach to intonation or anything else except a dictionary--which is a list. A fourth problem with most presentations is their non-iconic notation. The notation employed here, which is Bolinger's, portrays gradience, i.e. degrees of rises and falls that iconize a similar gradience in the sound waves. Fifthly and finally, binary analyses that do not show gradience or even neutral values (see the system table at the beginning of this writing) are truly bogus. Yet analyses with all five defects mentioned here prevail.

       AAAA
                  RRRR
                           GGGG
                                      HHHH!!!!

_________________________________________________________

NOTE

          1. Gradient means that analytical/descriptive features like [low] need not be limited to three values [- low, x low, + low]. Rather [> low] means "still lower," [< low] means "less low" or "higher." In the case of some features, it may be the middle value that is gradient; instead of [x feature], the analysis would use [>x feature] or [<x feature] or simply leave off the "x" when the other features, + and -, are non-gradient.