Compound (linguistics)


In linguistics, a compound is a lexeme (less precisely, a word or sign) that consists of more than one stem. Compounding, composition or nominal composition is the process of word formation that creates compound lexemes: two or more words or signs are joined to make a longer word or sign. If the joining of the words or signs is orthographically represented with a hyphen, the result is a hyphenated compound. If they are joined without an intervening space, it is a closed compound. If they are joined with a space, then the result (at least in English) may be an open compound.
The meaning of the compound may be similar to or different from the meaning of its components in isolation. The component stems of a compound may be of the same part of speech, as in the case of the English word footpath, composed of the two nouns foot and path, or they may belong to different parts of speech, as in the case of the English word blackbird, composed of the adjective black and the noun bird. With very few exceptions, English compound words are stressed on their first component stem.
As a member of the Germanic family of languages, English is unusual in that even simple compounds made since the 18th century tend to be written in separate parts. This would be an error in other Germanic languages such as Norwegian, Swedish, Danish, German, and Dutch. However, this is merely an orthographic convention: as in other Germanic languages, arbitrary noun phrases, for example "girl scout troop", "city council member", and "cellar door", can be made up on the spot and used as compound nouns in English too.
For example, German Donau­dampfschifffahrts­gesellschafts­kapitän would be written in English as "Danube steamship transport company captain" and not as "Danube­steamship­transportcompany­captain".
The meaning of compounds may not always be transparent from their components, necessitating familiarity with usage and context. The addition of affix morphemes to words should not be confused with nominal composition, as this is actually morphological derivation.
Some languages easily form compounds from what in other languages would be a multi-word expression. This can result in unusually long words, a phenomenon known in German as Bandwurmwörter ("tapeworm words").
Compounding is not limited to spoken languages: sign languages also create compounds by combining two or more sign stems.
So-called "classical compounds" are compounds derived from classical Latin or ancient Greek roots.
In many languages, including English, Spanish, Latin and German, all numbers greater than twenty that have more than one non-zero digit are written and spoken as compounds.
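The numeral rule above can be checked mechanically. The following is a minimal sketch in Python, not a linguistic tool: the function names `is_compound_numeral` and `spell_two_digit` are hypothetical, and the spelling helper only covers English numerals from 20 to 99.

```python
# Hypothetical helpers illustrating the rule stated in the text:
# a number is written as a compound if it is greater than twenty
# and has more than one non-zero digit.

ONES = ["", "one", "two", "three", "four",
        "five", "six", "seven", "eight", "nine"]
TENS = {2: "twenty", 3: "thirty", 4: "forty", 5: "fifty",
        6: "sixty", 7: "seventy", 8: "eighty", 9: "ninety"}


def is_compound_numeral(n: int) -> bool:
    """Return True if n meets the rule: n > 20 with several non-zero digits."""
    if n <= 20:
        return False
    nonzero_digits = sum(1 for d in str(n) if d in "123456789")
    return nonzero_digits > 1


def spell_two_digit(n: int) -> str:
    """Spell an English numeral from 20 to 99, hyphenating compounds."""
    if not 20 <= n <= 99:
        raise ValueError("this sketch only covers 20 to 99")
    tens, ones = divmod(n, 10)
    if ones == 0:
        return TENS[tens]          # single stem, e.g. "thirty"
    return f"{TENS[tens]}-{ONES[ones]}"  # compound, e.g. "forty-two"
```

Under this rule 21 and 105 count as compounds, while 15 and 30 do not; `spell_two_digit(42)` yields the hyphenated compound "forty-two".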

Formation of compounds

Compound formation rules vary widely across language types.
In a synthetic language, the relationship between the elements of a compound may be marked with a case or other morpheme. For example, the German compound Kapitänspatent consists of the lexemes Kapitän and Patent joined by an -s-; similarly, the Latin lexeme paterfamilias contains the archaic genitive form familias of the lexeme familia; and in the English word daisy, the Saxon genitive is etymologically fossilized from the Old English compound dæġes ēage. Conversely, in the Hebrew compound בֵּית סֵפֶר bet sefer 'school', it is the head that is modified: the compound literally means "house-of book", with בַּיִת bayit having entered the construct state to become בֵּית bet. This latter pattern is common throughout the Semitic languages, though in some it is combined with an explicit genitive case, so that both parts of the compound are marked, e.g.
Agglutinative languages tend to create very long words with derivational morphemes. Compounds in these languages may or may not also require derivational morphemes.
In German, extremely extendable compound words can be found in many different domains. In the language of chemistry, for example, compounds can be practically unlimited in length, mostly because the German rule suggests combining all noun adjuncts with the noun as the last stem. German examples include Farbfernsehgerät, Funkfernbedienung, and the often quoted jocular word Donaudampfschifffahrtsgesellschaftskapitänsmütze, which can of course be made even longer and even more absurd, e.g. Donaudampfschifffahrtsgesellschaftskapitänsmützenreinigungsausschreibungsverordnungsdiskussionsanfang. According to several editions of the Guinness Book of World Records, the longest published German word has 79 letters and is Donaudampfschiffahrtselektrizitätenhauptbetriebswerkbauunterbeamtengesellschaft, but there is no evidence that this association ever actually existed.
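The way German strings stems together, with the head noun last and an optional linking element such as -s- between stems, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `join_stems` is a hypothetical helper, and the linking elements must be supplied by the caller, because the sketch makes no attempt to predict which element a given stem takes.

```python
from typing import Optional, Sequence


def join_stems(stems: Sequence[str],
               linkers: Optional[Sequence[str]] = None) -> str:
    """Join noun stems into one solid compound, head noun last.

    linkers[i] is the linking element placed between stems[i] and
    stems[i + 1]; an empty string means the stems are joined directly.
    """
    if not stems:
        raise ValueError("at least one stem is required")
    if linkers is None:
        linkers = [""] * (len(stems) - 1)
    if len(linkers) != len(stems) - 1:
        raise ValueError("need exactly one linker between each pair of stems")

    compound = stems[0]  # the first stem keeps its capital letter
    for linker, stem in zip(linkers, stems[1:]):
        # Non-initial stems lose their capital inside the compound.
        compound += linker + stem[:1].lower() + stem[1:]
    return compound


# Kapitän + -s- + Patent, the example discussed earlier in this section
kapitaenspatent = join_stems(["Kapitän", "Patent"], ["s"])

# A longer chain: every additional stem simply extends the word
kapitaen = join_stems(
    ["Donau", "Dampf", "Schiff", "Fahrt", "Gesellschaft", "Kapitän"],
    ["", "", "", "s", "s"],
)
```

Because the function accepts any number of stems, it mirrors the point that such compounds have no principled upper length limit.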
In Finnish, although there is theoretically no limit to the length of compound words, words consisting of more than three components are rare. Internet folklore sometimes suggests that lentokone­suihkuturbiinimoottori­apumekaanikko­aliupseerioppilas is the longest word in Finnish, but evidence of its actual use is scant and anecdotal at best.
Compounds can be rather long when translating technical documents from English to some other language, since the lengths of the words are theoretically unlimited, especially in chemical terminology. For example, when translating an English technical document to Swedish, the term "Motion estimation search range settings" can be directly translated to rörelse­uppskattnings­sökintervalls­inställningar, though in reality, the word would most likely be divided in two: sökintervalls­inställningar för rörelse­uppskattning – "search range settings for motion estimation".

Subclasses

Semantic classification

A common semantic classification of compounds yields four types:
  • endocentric
  • exocentric
  • copulative
  • appositional
An endocentric compound consists of a head, i.e. the categorical part that contains the basic meaning of the whole compound, and modifiers, which restrict this meaning. The compound word is a hyponym of the head. For example, the English compound doghouse, where house is the head and dog is the modifier, is understood as a house intended for a dog. Endocentric compounds tend to be of the same part of speech as their head, as in the case of doghouse.
An exocentric compound is a compound whose semantic category is not stated. Neither of its components is a head, and its meaning often cannot be transparently guessed from its constituent parts. For example, the English compound white-collar is neither a kind of collar nor a white thing. In an exocentric compound, the word class is determined lexically, disregarding the class of the constituents. For example, a must-have is not a verb but a noun. The meaning of many compounds of this type can be glossed as "(one) whose B is A", where B is the second element of the compound and A the first. Other English examples include barefoot.
Copulative compounds are compounds with two semantic heads. These are commonly used to describe points on a gradual scale, such as yellow-green.
Appositional compounds are lexemes that have two attributes that classify the compound.
Type | Description | Examples
endocentric | A+B denotes a special kind of B | darkroom, smalltalk
exocentric | A+B denotes a special kind of an unexpressed different semantic meaning C | redhead, scarecrow
copulative | A+B denotes 'the sum' of what A and B denote | bittersweet, sleepwalk
appositional | A and B provide different descriptions for the same referent | hunter-gatherer, maidservant

Syntactic classification

Noun–noun compounds

All natural languages have compound nouns. The positioning of the words varies according to the language. While Germanic languages, for example, are left-branching when it comes to noun phrases, the Romance languages are usually right-branching.
English compound nouns can be spaced, hyphenated, or solid, and they sometimes change orthographically in that direction over time, reflecting a semantic identity that evolves from a mere collocation to something stronger in its solidification. This theme has been summarized in usage guides under the aphorism that "compound nouns tend to solidify as they age"; thus a compound noun such as place name begins as spaced in most attestations and then becomes hyphenated as place-name and eventually solid as placename, or the spaced compound noun file name directly becomes solid as filename without being hyphenated.
Type | Description | Examples
Spaced | The words are not visibly connected in writing. | place name, ice cream
Hyphenated | A hyphen is used to join the words. | place-name, hunter-gatherer
Solid | When written, there is no space or intervening punctuation. | placename, scarecrow
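The three orthographic types in the table can be told apart by inspecting the written form alone. The sketch below assumes that the input is already known to be a compound; `classify_orthography` is a hypothetical name, and the function cannot decide whether a solid string such as "scarecrow" is a compound in the first place.

```python
def classify_orthography(compound: str) -> str:
    """Label a written English compound as spaced, hyphenated, or solid."""
    text = compound.strip()
    if not text:
        raise ValueError("empty input")
    # Any internal whitespace means the parts are written separately.
    if any(ch.isspace() for ch in text):
        return "spaced"
    # No whitespace, but a hyphen joins the parts.
    if "-" in text:
        return "hyphenated"
    # Neither space nor hyphen: the parts are written as one word.
    return "solid"


# The three stages of the "place name" example from the text
stages = [classify_orthography(form)
          for form in ("place name", "place-name", "placename")]
```

Here `stages` holds the labels "spaced", "hyphenated", and "solid", in that order, which traces the solidification path described for place name.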

German, a fellow West Germanic language, has a somewhat different orthography, whereby compound nouns are virtually always required to be solid or at least hyphenated; even the hyphenated styling is used less now than it was in centuries past.
In French, compound nouns are often formed by left-hand heads with prepositional components inserted before the modifier, as in chemin-de-fer 'railway', lit. 'road of iron', and moulin à vent 'windmill', lit. 'mill by-means-of wind'.
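In Turkish, one way of forming compound nouns is to place the modifier first and mark the head with a possessive suffix: yeldeğirmeni 'windmill' (yel 'wind', değirmen-i 'mill-possessive'); demiryolu 'railway' (demir 'iron', yol-u 'road-possessive').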
Occasionally, two synonymous nouns can form a compound noun, resulting in a pleonasm. One example is the English word pathway.
In Arabic, two criteria apply that are specific to Arabic, or potentially to Semitic languages in general. The first concerns whether the possessive marker li-/la 'for/of' is present or absent when the first element is definite. The second concerns whether the same marker is present or absent when the first element is preceded by a cardinal number.