Prosodic bootstrapping

Prosodic bootstrapping in linguistics refers to the hypothesis that learners of a primary language use prosodic features of the speech signal, such as pitch, tempo, rhythm, and amplitude, as cues to other properties of grammar, such as syntactic structure. Acoustically signaled prosodic units in the stream of speech may provide critical perceptual cues by which infants initially discover syntactic phrases in their language. Although these features alone are not enough for infants to learn the entire syntax of their native language, they provide cues to several grammatical properties: the ordering of heads and complements (signaled by stress prominence), the location of phrase boundaries, and the location of word boundaries. It is argued that prosody plays an initial role in first-language acquisition, helping children uncover the syntax of the language, mainly because children are sensitive to prosodic cues at a very young age.

Argument for

The argument for prosodic bootstrapping was first introduced by Gleitman and Wanner, who observed that infants might use prosodic cues to discover underlying grammatical information about their native language. These cues could aid infants in dividing the speech input into different lexical units, and furthermore aid in placing these units into syntactic phrases appropriate to the language.
Prosodic bootstrapping may also help explain how infants segment continuous input. Like adult speakers, children are exposed to continuous speech. This poses a problem for children learning their native language because pauses in speech do not align with word boundaries. As a result, children have to construct word representations from the speech that they hear.
A study conducted by Christophe et al. showed that three-day-old infants are sensitive to acoustic properties of a language. Three-day-olds were able to discriminate bisyllabic stimuli made up of the same segments according to whether the stimuli were extracted from within a word or from across a word boundary. The durations of the word-initial consonant and the word-final vowel cue the existence of a word boundary, and infants may use these cues to learn about syntactic structure.
Further support for the prosodic bootstrapping hypothesis comes from how early prosodic cues are used to segment speech: infants as young as three days old can differentiate languages based on phonological characteristics alone, and the use of prosodic cues precedes the use of lexical or syntactic information. This has led to the hypothesis of "bootstrapping from the signal", or "prosodic bootstrapping", which has three main elements:

The syntax of language is correlated with acoustic properties.
Infants can detect and are sensitive to these acoustic properties.
These acoustic properties can be used by infants when processing speech.

Phonological phrases

A phonological phrase boundary indicates how the continuous speech stream is broken up into smaller units, which infants use to pick out and more closely identify individual parts of the sentence. A phonological phrase can contain between four and seven syllables, and infants can detect it because the edges of phrases are either strengthened or lengthened. Various studies have tested whether prosody helps with the acquisition of syntax, morphology, and phonology.
Another acoustic cue that indicates a prosodic boundary is the duration of a pause. Pauses at word boundaries are usually longer when the word boundary coincides with a clause boundary. For example, the two sentences below, although similar on the surface, have different prosodic structures, which correlate with their different syntactic structures:

"The boy met the girl at the teach in" → _NP... _VP... _PP
"The boy met the girl and the teacher" → _NP... _VP

By attending to differences in pause duration, the listener can better distinguish the underlying syntactic structure.

Acquiring lexicon

For infants who are learning their native language, it is difficult to extract words from the speech signal because spoken words are not separated by silence. There are several proposals for how lexical acquisition proceeds. The first is that children learn some words by hearing them in isolation; then, if an unfamiliar piece of speech falls between two known words, that piece must be a new word. The second proposal is that cues in the speech itself signal the presence of a word boundary: duration, pitch, and energy.
The fact that speech arrives as a continuous stream without pauses makes the task of acquiring a language more difficult for infants. It has been proposed that prosodic features, such as the strength of certain sounds relative to their position in the word, can be used to break the speech stream into fragments and to differentiate between potentially ambiguous sentences. In English, for example, the final /d/ in the word "bold" tends to be "weak", in that it is not fully released. An initial /d/ in a word such as "dime", by contrast, is more clearly released than its word-final counterpart. This difference between strong and weak sounds may help identify where a sound occurs in the word, whether at the beginning or at the end.
Studies have shown that phonological phrase boundaries can be interpreted as word boundaries, which further aids the child in developing a lexicon. For example, Millotte et al. tested 16-month-olds, observing how children use phonological phrase boundaries to constrain lexical access. When infants heard a prosodic boundary, they were able to detect the existence of a word boundary. The authors used the conditioned head-turn procedure: infants trained to turn their heads for a bisyllabic word responded more often to sentences that contained this word than to sentences that contained both syllables of the word separated by a phonological phrase boundary.
Because prosodic boundaries never occur inside a word, infants can use them to narrow down where words begin and end in the speech signal. For example, children can differentiate between "dice" and the sequence "red ice", even though the two are phonologically similar. This is because a prosodic boundary will not appear in the middle of a word, only around it.
Children thus use phonological phrase boundaries to constrain lexical access: given a prosodic boundary, they infer the existence of a word boundary. If two sequences are made up of identical segments but differ in prosody, children treat them as different sequences. Studies measuring prosodic cues to phonological phrases have been conducted in a variety of languages that differ from one another, supporting the idea that phonological phrases could aid lexical acquisition universally.
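The constraint described above can be illustrated with a toy sketch. It assumes an utterance has already been divided into phonological phrases, each a list of syllables, and it counts a candidate word only when all of its syllables fall inside a single phrase. The function names, the orthographic "syllables", and the two example utterances are made up for illustration and are not taken from any of the studies cited here.

```python
def occurrences(syllables, target):
    """Return the start indices at which `target` appears in `syllables`."""
    n = len(target)
    return [i for i in range(len(syllables) - n + 1)
            if syllables[i:i + n] == target]


def constrained_matches(phrases, target):
    """Count matches of `target` that lie wholly inside one phrase."""
    return sum(len(occurrences(phrase, target)) for phrase in phrases)


def unconstrained_matches(phrases, target):
    """Count matches of `target` when phrase boundaries are ignored."""
    flat = [syllable for phrase in phrases for syllable in phrase]
    return len(occurrences(flat, target))


# Hypothetical target word and utterances (rough orthographic syllables).
target = ["pa", "per"]
within_phrase = [["the", "pa", "per"], ["was", "torn"]]
across_boundary = [["the", "grand", "pa"], ["per", "suades", "them"]]

print(constrained_matches(within_phrase, target))      # 1
print(constrained_matches(across_boundary, target))    # 0
print(unconstrained_matches(across_boundary, target))  # 1
```

In the second utterance the two syllables are adjacent in the speech stream, so a learner ignoring prosody would find a spurious match; requiring the match to stay inside one phrase rules it out.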

Acquiring syntax

In addition to helping identify lexical items, a key element of prosodic bootstrapping is the use of prosodic cues to gain syntactic knowledge about the language. Because prosodic phrase boundaries are correlated with syntactic boundaries, listeners can determine the syntactic category of a word using only prosodic boundary information. Christophe et al. demonstrated that adults could use prosodic phrases to determine the syntactic category of ambiguous words. Listeners were presented with two sentences containing an ambiguous word, which could belong either to the verb category or to the noun category.

Category	Sentence	Translation
Verb
Noun

The table above depicts the two sentences heard by French-speaking adults in Christophe et al., where the word in bold is the phonetically ambiguous word and the brackets represent phonological phrase boundaries. Using the position of the prosodic boundaries, adults were able to determine which category the ambiguous word belonged to, since the word is assigned to a different phonological phrase depending on its syntactic category and its meaning in the sentence.
An important tool for acquiring syntax is the use of function words to mark syntactic constituent boundaries. Function words occur frequently and generally appear at the borders of prosodic units. Because of their high frequency in the input, and because they tend to have only one or two syllables, infants are able to pick them out when they occur at the edges of a prosodic unit. In turn, function words can help learners determine the syntactic category of neighboring words. For example, in the sentence "The turtle is eating a pigeon", function words such as "the" and the auxiliary verb "is" give children a better sense of where prosodic boundaries fall. As a result, infants tend to look out for these words to better identify the beginnings and ends of prosodic units. In English, for example, articles like "the" or "a" can only be followed by a noun; one would never hear a sentence such as "The *destroy was widespread". Likewise, a verb morpheme indicates that the word preceding it must be a verb, since no other category can fill that position.
In a study by Carvalho et al., experimenters tested preschool children and showed that, by the age of 4, children use phrasal prosody in real time to determine what kind of syntactic structure a sentence could have. The children were able to identify the target word as a noun when it appeared in a sentence with a prosodic structure typical for a noun, and as a verb when it appeared in a sentence with a prosodic structure typical for a verb.
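The way a function word can hint at the category of its neighbor can be sketched as a toy lookup. The word lists and the function name below are hypothetical, and the sketch deliberately oversimplifies: it adopts the passage's claim that an article is followed by a noun, and it adds the assumption (not stated in the text) that the word after the auxiliary "is" is a verb.

```python
# Hypothetical, deliberately tiny function-word lists.
ARTICLES = {"the", "a"}
AUXILIARIES = {"is"}


def guess_categories(words):
    """Guess a category for each content word from the function word before it."""
    guesses = {}
    for previous, word in zip(words, words[1:]):
        if word in ARTICLES or word in AUXILIARIES:
            continue  # function words themselves are not categorized
        if previous in ARTICLES:
            guesses[word] = "noun"  # article -> noun, as the passage claims
        elif previous in AUXILIARIES:
            guesses[word] = "verb"  # assumption: auxiliary -> verb
    return guesses


sentence = "the turtle is eating a pigeon".split()
print(guess_categories(sentence))
# {'turtle': 'noun', 'eating': 'verb', 'pigeon': 'noun'}
```

A word that does not follow a listed function word is simply left uncategorized, which reflects how limited this kind of cue is when used alone.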
