Prosody (linguistics)
In linguistics, prosody is the study of elements of speech, including intonation, stress, rhythm and loudness, that occur simultaneously with individual phonetic segments: vowels and consonants. Often, prosody specifically refers to such elements, known as suprasegmentals, when they extend across more than one phonetic segment.
Prosody reflects the nuanced emotional features of the speaker or of their utterances: their obvious or underlying emotional state, the form of utterance, the presence of irony or sarcasm, emphasis on certain words or morphemes, contrast, focus, and so on. Prosody displays elements of language that are not encoded by grammar, punctuation or choice of vocabulary.
Attributes of prosody
In the study of prosodic aspects of speech, it is usual to distinguish between auditory measures and objective measures. Auditory and objective measures of prosody do not correspond in a linear way. Most studies of prosody have been based on auditory analysis using auditory scales.
Auditorily, the major prosodic variables are:
- pitch of the voice
- length of sounds
- loudness, or prominence
- timbre or phonatory quality
The corresponding objective (acoustic) measures are:
- fundamental frequency
- duration
- intensity, or sound pressure level
- spectral characteristics
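The non-linear relationship between objective and auditory measures can be illustrated with two commonly used logarithmic scales. The following is a minimal sketch, not a model of perception: it assumes that pitch intervals are expressed in semitones and that intensity is expressed as sound pressure level in decibels relative to 20 micropascals. The function names are hypothetical and chosen only for this illustration, and both scales approximate perceived pitch and loudness only roughly.

```python
import math

# Standard reference pressure for sound pressure level in air (20 micropascals).
REFERENCE_PRESSURE_PA = 20e-6


def hz_to_semitones(freq_hz, ref_hz):
    """Interval between freq_hz and ref_hz on a logarithmic (semitone) scale."""
    return 12.0 * math.log2(freq_hz / ref_hz)


def sound_pressure_level_db(pressure_pa):
    """Sound pressure level in decibels for an RMS pressure given in pascals."""
    return 20.0 * math.log10(pressure_pa / REFERENCE_PRESSURE_PA)


# Two equal steps of 100 Hz in fundamental frequency.
low_step = hz_to_semitones(200.0, 100.0)   # 100 Hz -> 200 Hz, one octave
high_step = hz_to_semitones(300.0, 200.0)  # 200 Hz -> 300 Hz, less than an octave

print(f"100 -> 200 Hz: {low_step:.2f} semitones")
print(f"200 -> 300 Hz: {high_step:.2f} semitones")
print(f"0.2 Pa: {sound_pressure_level_db(0.2):.1f} dB SPL")
```

The same 100 Hz change in fundamental frequency spans a full octave at the low end and a noticeably smaller interval higher up, which is one reason auditory scales are not simple rescalings of the acoustic measurements.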
Phonology
Prosodic features are suprasegmental, since they are properties of units of speech that are defined over groups of sounds rather than single segments. When talking about prosodic features, it is important to distinguish between the personal characteristics that belong to an individual's voice and the independently variable prosodic features that are used contrastively to communicate meaning. Personal characteristics that belong to an individual are not linguistically significant while prosodic features are. Prosody has been found across all languages and is described as a natural component of language. The defining features of prosody that display the nuanced emotions of an individual differ across languages and cultures.
Intonation
Some writers have described intonation entirely in terms of pitch, while others propose that "intonation" is a combination of several prosodic variables. English intonation is often said to be based on three aspects:
- The division of speech into units
- The highlighting of particular words and syllables
- The choice of pitch movement
The exchange above is an example of using intonation to highlight particular words and to employ rising and falling of pitch to change meaning. If read out loud, the pitch of the voice moves in different directions on the word "cat." In the first line, pitch goes up, indicating a question. In the second line, pitch falls, indicating a statement, in this case a confirmation of the first line. Finally, in the third line, a complicated fall-rise pattern indicates incredulity. Each pitch/intonation pattern communicates a different meaning.
An additional pitch-related variation is pitch range; speakers sometimes speak with a wide range of pitch and at other times with a narrow range. English makes use of changes in key; shifting one's intonation into the higher or lower part of one's pitch range is believed to be meaningful in certain contexts.
Stress
Stress functions as the means of making a syllable prominent. Stress may be studied in relation to individual words or in relation to larger units of speech. Stressed syllables are made prominent by several variables. Stress is typically associated with the following:
- pitch prominence
- increased length
- increased loudness
- differences in timbre: in English and some other languages, stress is associated with aspects of vowel quality. Unstressed vowels tend to be centralized relative to stressed vowels, which are normally more peripheral in quality
When pitch prominence is the major factor, the resulting prominence is often called accent rather than stress.
There is considerable variation from language to language concerning the role of stress in identifying words or in interpreting grammar and syntax.
Tempo
Rhythm
Although rhythm is not a prosodic variable in the way that pitch or loudness are, it is usual to treat a language's characteristic rhythm as a part of its prosodic phonology. It has often been asserted that languages exhibit regularity in the timing of successive units of speech, a regularity referred to as isochrony, and that every language may be assigned one of three rhythmical types: stress-timed, syllable-timed and mora-timed. As explained in the isochrony article, this claim has not been supported by scientific evidence.
Pause
Voiced or unvoiced, the pause is a form of interruption to articulatory continuity such as an open or terminal juncture. Conversation analysis commonly notes pause length. Distinguishing auditory hesitation from silent pauses is one challenge. Contrasting junctures within and without word chunks can aid in identifying pauses.
There are a variety of "filled" pause types. Formulaic language pause fillers include "Like", "Er" and "Um", and paralinguistic expressive respiratory pauses include the sigh and gasp.
Although related to breathing, pauses may contain contrastive linguistic content, as in the periods between individual words in English advertising voice-over copy sometimes placed to denote high information content, e.g. "Quality. Service. Value".
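Measuring the length of silent pauses can be sketched as follows. This is an illustrative outline only: it assumes the recording has already been reduced to one intensity value per fixed-length frame, and the function name, the 10 ms frame length, the -40 dB silence threshold and the 200 ms minimum duration are hypothetical choices for the example rather than values taken from the literature. It does not attempt to separate hesitation from other silent pauses or to detect filled pauses.

```python
def find_silent_pauses(frame_levels_db, frame_ms=10, silence_db=-40.0, min_pause_ms=200):
    """Return (start_ms, duration_ms) for each run of quiet frames of at least min_pause_ms."""
    pauses = []
    run_start = None  # index of the first frame in the current quiet run, if any

    # A final sentinel frame above the threshold closes a quiet run at the end of the signal.
    levels = list(frame_levels_db) + [silence_db + 1.0]

    for i, level in enumerate(levels):
        if level < silence_db:
            if run_start is None:
                run_start = i
        elif run_start is not None:
            duration_ms = (i - run_start) * frame_ms
            if duration_ms >= min_pause_ms:
                pauses.append((run_start * frame_ms, duration_ms))
            run_start = None
    return pauses


# Made-up frame levels: speech, a 250 ms silence, speech, a 50 ms dip, speech.
levels = [-20.0] * 30 + [-60.0] * 25 + [-20.0] * 10 + [-60.0] * 5 + [-20.0] * 10
print(find_silent_pauses(levels))
```

Only the longer silence is reported; the 50 ms dip falls below the assumed minimum duration and is treated as part of continuous speech.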
Chunking
Pausing or its lack contributes to the perception of word groups, or chunks. Examples include the phrase, phraseme, constituent or interjection. Chunks commonly highlight lexical items or fixed expression idioms. Chunking prosody is present on any complete utterance and may correspond to a syntactic category, but not necessarily. The well-known English chunk "Know what I mean?" in common usage sounds like a single word due to blurring or rushing the articulation of adjacent word syllables, thereby changing the potential open junctures between words into closed junctures.
Functions
Prosody has a number of perceptually significant functions in English and other languages, contributing to the recognition and comprehension of speech.
Grammar
It is believed that prosody assists listeners in parsing continuous speech and in the recognition of words, providing cues to syntactic structure, grammatical boundaries and sentence type. Boundaries between intonation units are often associated with grammatical or syntactic boundaries; these are marked by such prosodic features as pauses and slowing of tempo, as well as "pitch reset" where the speaker's pitch level returns to the level typical of the onset of a new intonation unit. In this way potential ambiguities may be resolved. For example, the sentence "They invited Bob and Bill and Al got rejected" is ambiguous when written, although addition of a written comma after either "Bob" or "Bill" will remove the sentence's ambiguity. But when the sentence is read aloud, prosodic cues like pauses and changes in intonation will reduce or remove the ambiguity. Moving the intonational boundary in cases such as the above example will tend to change the interpretation of the sentence. This result has been found in studies performed in both English and Bulgarian. Research in English word recognition has demonstrated an important role for prosody.
Focus
Intonation and stress work together to highlight important words or syllables for contrast and focus. This is sometimes referred to as the accentual function of prosody. A well-known example is the ambiguous sentence "I never said she stole my money", where there are seven meaning changes depending on which of the seven words is vocally highlighted.
Discourse and pragmatic functions
Prosody helps convey many other pragmatic functions, including expressing attitudes, flagging turn-taking intentions, and marking topic structure, among others. For example, David Brazil and his associates studied how intonation can indicate whether information is new or already established; whether a speaker is dominant or not in a conversation; and when a speaker is inviting the listener to make a contribution to the conversation.
Emotion
Prosody is also important in signalling emotions and attitudes. When this is involuntary, the prosodic information is not linguistically significant. However, when the speaker varies their speech intentionally, for example to indicate sarcasm, this usually involves the use of prosodic features. The most useful prosodic feature in detecting sarcasm is a reduction in the mean fundamental frequency relative to other speech for humor, neutrality, or sincerity. While prosodic cues are important in indicating sarcasm, context clues and shared knowledge are also important.
Emotional prosody was considered by Charles Darwin in The Descent of Man to predate the evolution of human language: "Even monkeys express strong feelings in different tones – anger and impatience by low, – fear and pain by high notes." Native speakers listening to actors reading emotionally neutral text while projecting emotions correctly recognized happiness 62% of the time, anger 95%, surprise 91%, sadness 81%, and neutral tone 76%. When a database of this speech was processed by computer, segmental features allowed better than 90% recognition of happiness and anger, while suprasegmental prosodic features allowed only 44%–49% recognition. The reverse was true for surprise, which was recognized only 69% of the time by segmental features and 96% of the time by suprasegmental prosody. In typical conversation, the recognition of emotion may be quite low, of the order of 50%, hampering the complex interrelationship function of speech advocated by some authors. However, even if emotional expression through prosody cannot always be consciously recognized, tone of voice may continue to have subconscious effects in conversation. This sort of expression stems not from linguistic or semantic effects, and can thus be isolated from traditional linguistic content.
The average person has been found to be slightly less accurate at decoding the conversational implicature of emotional prosody than at traditional facial expression discrimination; however, the ability to decode varies by emotion. These emotions have been determined to be ubiquitous, as they are used and understood across cultures. Various emotions, and their general experimental identification rates, are as follows:
- Anger and sadness: High rate of accurate identification
- Fear and happiness: Medium rate of accurate identification
- Disgust: Poor rate of accurate identification