The Challenges of Complex Morphology to Morphological Theory
Description
Morphology is undergoing a dramatic reconceptualization concerning objects
of inquiry, methodologies, and consequent hypotheses about theory
construction. The past few years have witnessed a marked increase in
morphological research from word-based and construction-theoretic
perspectives. This is based on detailed empirical investigation of numerous
lesser known languages with complex morphological systems. The papers in
this workshop showcase these new trends, including information-theoretic,
statistical, and simulational techniques applied to complex,
typologically diverse morphological and lexical systems.
This workshop will be open to anyone who wishes to participate; there is no registration process or fee. We hope to organize a group dinner for anyone interested; each person will pay his or her own costs. Information on the dinner will be available at the Workshop.
Organizers
- Farrell Ackerman, University of California, San Diego; fackerman AT ucsd DOT edu
- Mark Aronoff, Stony Brook University; maronoff AT stonybrook DOT edu
- James P. Blevins, Cambridge University; jpb39 AT cam DOT ac DOT uk
- Gabriela Caballero, University of California, San Diego; gcaballero AT ucsd DOT edu
- Alice C. Harris, University of Massachusetts Amherst; acharris AT linguist DOT umass DOT edu
- Robert Malouf, San Diego State University; rmalouf AT mail DOT sdsu DOT edu
Schedule
The program for the workshop can be downloaded here
Abstracts
The Low Entropy Conjecture and Modern Irish Nominal Declension
Farrell Ackerman (University of California, San Diego) and
Rob Malouf (San Diego State University)
Following traditional research on analogy and implicational relations in morphology (Paul 1890;
Anttila 1977; Kurylowicz 1995; Morpurgo-Davies 1988; Wurzel 1989; Itkonen 2005, among others),
recent research on the nature of paradigm organization in languages with complex morphology has
fruitfully employed conditional entropy measures from Information Theory (Moscoso del Prado Martin
2003; Moscoso del Prado Martin et al. 2004; Ackerman et al. 2009, among others). This has led to the
postulation of the Low Entropy Conjecture (Malouf and Ackerman 2010): morphological systems
display low conditional entropy in their conjugational and declensional patterns and this facilitates the
accurate prediction of novel forms of words from known forms of words.
The empirical basis for this conjecture has found support in the analysis of significant
subportions of verbal and nominal paradigms in languages with varying degrees of complexity (as
measured via the number of morphosyntactic properties associated with words and the types of
phonological exponence used to express them).
The present talk consolidates and explores our previous results by providing a comprehensive
analysis of Modern Irish noun inflection. In this talk we argue that the reconceptualization of
morphological theory can only advance by integrating detailed descriptive empirical research with
quantitative methodologies (see Blevins 2006; Bonami et al. 2010; Sims 2010, for excellent studies).
In this research we utilize the Irish noun inflections and their patterns of declension as presented in
Carnie 2008. We begin by modelling and analyzing a core set of exemplary paradigms for
approximately 1,200 nouns. We then extend these results to the rest of the nominal vocabulary as a
measure of the accuracy of prediction associated with the factors identified as diagnostic of class
membership or lexeme relatedness. Beyond the size and comprehensiveness of the data set, this
research departs in two additional respects from work we have already done. It focuses on more
veridical aspects of paradigm organization than we have previously entertained: in particular it
addresses the complex phonological factors associated with declension classes and it attempts to
address the impact of (type and token) frequency on entropy results. This is important since it provides
empirical evidence that bears on the logical deduction that the more real the modelling of the data
gets, the lower the entropies associated with morphological organization become.
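The conditional entropy measure referred to above can be illustrated with a minimal sketch. The data below are invented placeholders, not Irish forms, and the sketch assumes that every inflection class is equally likely (that is, it ignores the type and token frequency effects the talk addresses). The function names are hypothetical.

```python
from collections import Counter
from math import log2

# Hypothetical toy data (not Irish): each tuple is one inflection class,
# giving the exponent it uses in paradigm cell A and in paradigm cell B.
CLASSES = [
    ("-a", "-i"),
    ("-a", "-i"),
    ("-a", "-e"),
    ("-o", "-u"),
]

def entropy(values):
    """H(X) in bits over a list of outcomes, each item weighted equally."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())

def conditional_entropy(pairs):
    """H(B | A) in bits: remaining uncertainty about cell B once cell A is known."""
    n = len(pairs)
    joint = Counter(pairs)
    marginal_a = Counter(a for a, _ in pairs)
    h = 0.0
    for (a, _b), count in joint.items():
        p_joint = count / n                 # p(a, b)
        p_b_given_a = count / marginal_a[a]  # p(b | a)
        h -= p_joint * log2(p_b_given_a)
    return h

if __name__ == "__main__":
    print(entropy([b for _, b in CLASSES]))  # uncertainty about cell B alone
    print(conditional_entropy(CLASSES))      # uncertainty about B given A
```

In this toy system, knowing the form in cell A lowers the uncertainty about cell B, which is the kind of effect the conjecture describes.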
An Information-Theoretic Analysis of the Estonian Inflectional System
James P. Blevins (University of Cambridge) and
Fermin Moscoso del Prado Martin (Centre National de la Recherche Scientifique)
The Estonian declensional and (to a somewhat lesser extent) conjugational systems are well known for their large form inventories and relatively complex patterns of inflection. Yet, as long recognized in traditional and pedagogical descriptions, the apparent complexity of these systems is offset by the extremely high degree of interdependence and interpredictability between forms. A detailed information-theoretical analysis that measures the mutual information between forms and cells brings out clearly how this structure reduces the entropy ('uncertainty') of the system to a level roughly comparable to that of ostensibly simpler inflectional systems.
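As a rough illustration of what measuring mutual information between paradigm cells involves, the following sketch computes I(X; Y) for two cells. The labels are invented placeholders rather than Estonian forms, each class is assumed to be equally likely, and the function name is hypothetical; it is not the authors' procedure.

```python
from collections import Counter
from math import log2

# Hypothetical toy data: (form type in cell X, form type in cell Y) per class.
INTERPREDICTABLE = [("x1", "y1"), ("x1", "y1"), ("x2", "y2"), ("x2", "y2")]
INDEPENDENT = [("x1", "y1"), ("x1", "y2"), ("x2", "y1"), ("x2", "y2")]

def mutual_information(pairs):
    """I(X; Y) in bits, with every pair weighted equally."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    total = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        total += p_xy * log2(p_xy / ((px[x] / n) * (py[y] / n)))
    return total

if __name__ == "__main__":
    print(mutual_information(INTERPREDICTABLE))  # cells fully determine each other
    print(mutual_information(INDEPENDENT))       # cells tell nothing about each other
```

High mutual information between cells means that each form carries information about the others, so the uncertainty of the system as a whole is lower than the size of its form inventory suggests.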
Morphology and Phonology in Choguita Rarámuri Morphological Domains
Gabriela Caballero (University of California, San Diego)
The status of unconstrained affix order and multiple exponence, among other typologically relevant phenomena, is debated in morphological theory, and these phenomena form ideal testing grounds for determining the nature of the interface between different components of grammar. But to what extent do individual morphological phenomena bear any relation to other components of the grammar of individual languages? And are these phenomena characteristic of specific types of morphological systems? In this talk, I address these questions by analyzing specific morphological patterns and their interaction in Choguita Rarámuri (Tarahumara; Uto-Aztecan) within the context of a larger study focusing on this language’s morphology and phonology. Choguita Rarámuri is a suffixing language with a complex morphological system characteristic of agglutinating languages (Plank 1999) including: i) mostly concatenative exponence, ii) potentially long strings of suffixes, iii) virtually no flexivity, iv) zero exponence, v) little homonymous exponence, vi) large derivational paradigms, vii) multiple exponence, and viii) abundant optional marking (Caballero 2008). The structure of this language, however, departs from the canonical agglutinating type in that it has less transparent morpheme boundaries, due to a fair amount of phonological cohesion of exponents closer to the stem. This talk presents phonological and morphological evidence for morphological domains in Choguita Rarámuri and shows that: i) the constraints on variable suffix order, juncture effects of multiple exponence, outwardly conditioned allomorph distribution and morphological conditions on stress assignment in this language are intimately related to a hierarchical morphological structure; ii) words are built in a stepwise, inside-out fashion; and iii) different phonological constraint rankings are associated with different morphophonological subconstituents of the word.
Paradigm Shapes: Representation and Reality
Greville G. Corbett (Surrey Morphology Group)
The earlier chaos in the glossing of examples is being gradually reduced, as a result of linguists’ collective conscience and the availability of the Leipzig Glossing Rules. Perhaps it is now time to consider how we represent paradigms. We conventionally represent different features in different dimensions (with difficulties when there are more than two), and split or combine cells according to unspoken conventions about majorities of distinct forms within the lexeme and across lexemes. In one sense such representations are of secondary importance, compared with understanding the phenomenon; however, they can prove a real hindrance, and so deserve care. Turning to the reality of paradigms, there are instances which are truly challenging in their complexity. We look at one of these, found in Archi (Daghestanian). The difficulty is not in the vast scale of the paradigms, for which Archi is famous, but in a small part of the paradigm, where person interacts with gender and number. I shall lay out the issue in detail, showing the several ways in which the paradigm exhibits non-canonical behaviour, which is both hard to grasp and hard to represent.
Multiple Plural Exponence in Maay: An Optimality Theoretic Account
Minta Elsman (University of Massachusetts Amherst)
The Lower Jubba dialect of Maay (Paster 2006, 2010) presents an interesting case of plural marking on nouns that involves not only allomorphy of the plural suffix (realized as either –o or –yal), but also apparently free variation of the plural allomorphs as well as multiple exponence of the plural morpheme on consonant-final nouns:
A complete analysis of the above patterns must address the following questions:
- Distribution: Why are vowel-final nouns restricted to the plural allomorph –yal, while consonant-final nouns can be pluralized by –o, –yal, or –o-yal?
- Ordering: What governs the order of the plural allomorphs in (1c)? Why is yahas-o-yal a possibility while *yahas-yal-o is not?
- Multiple Exponence: Given that either –o or –yal is sufficient to mark plurality on consonant-final nouns, what motivates the double marking in (1c)?
This paper proposes an Optimality Theoretic (Prince and Smolensky 1993/2004) analysis of the Lower Jubba Maay data, in which the distribution and ordering of plural allomorphs are governed by phonological markedness constraints. The restriction of vowel-final roots to the –yal allomorph is attributed to a highly ranked ONSET constraint that requires all syllables to have an onset. Affixing a vowel-initial plural allomorph to a vowel-final root (2a, c) violates this constraint; therefore, vowel-final roots only combine with consonant-initial –yal. In contrast, consonant-final roots provide a potential onset for vowel-initial suffixes, allowing them to combine with any plural allomorph. The ordering of the plural allomorphs in doubly-marked forms is accounted for by alignment constraints (McCarthy and Prince 1993), which require the right and left edges of an affix to align with the right and left edges of a syllable, respectively. It is posited that in this dialect of Maay, right-edge alignment outranks left-edge alignment. This favors the order –o-yal, in which the right edge of each suffix aligns with the right edge of a syllable, over –yal-o, which, given the high ranking of ONSET, is syllabified as [ya.lo], where the right edge of –yal does not align with the right edge of a syllable.
The existence of multiply marked forms such as yahas-o-yal emerges from the interaction of the above markedness constraints with constraints that require uniformity between related nominal forms (McCarthy 2005). Although doubling the plural suffix in a root+plural form does not improve phonological well-formedness, the double marking can be selected as optimal when the root+plural base is followed by another nominal affix (e.g. definite, possessive). It is argued here that through paradigm uniformity constraints, the phonologically motivated preference for double plural marking in suffixed plural bases influences a preference for multiple plural exponence in unsuffixed root+plural forms.
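The ordering argument can be made concrete with a small sketch of ranked constraint evaluation. It is a simplification under stated assumptions, not the analysis itself: the syllabifications are supplied by hand (the syllabification of the root portion of yahas is assumed), the three constraints are reduced to simple counts, the paradigm uniformity constraints are left out, and all function names are hypothetical. Strict ranking is modeled by comparing violation tuples lexicographically.

```python
VOWELS = set("aeiou")

def spans(parts):
    """(start, end) character offsets of each part within the concatenated word."""
    out, pos = [], 0
    for part in parts:
        out.append((pos, pos + len(part)))
        pos += len(part)
    return out

def onset(cand):
    """One violation per syllable that begins with a vowel."""
    return sum(1 for syl in cand["syllables"] if syl[0] in VOWELS)

def align(cand, edge):
    """One violation per affix whose edge (0 = left, 1 = right) is not a syllable edge."""
    syllable_edges = {s[edge] for s in spans(cand["syllables"])}
    affix_spans = spans(cand["morphs"])[1:]  # the first morph is the root
    return sum(1 for s in affix_spans if s[edge] not in syllable_edges)

def align_right(cand):
    return align(cand, 1)

def align_left(cand):
    return align(cand, 0)

# Ranking assumed from the abstract: ONSET >> ALIGN-R >> ALIGN-L.
RANKING = [onset, align_right, align_left]

def violation_profile(cand):
    return tuple(constraint(cand) for constraint in RANKING)

def optimal(candidates):
    """The candidate whose violation profile is lexicographically smallest."""
    return min(candidates, key=violation_profile)

CANDIDATES = [
    {"morphs": ["yahas", "o", "yal"], "syllables": ["ya", "ha", "so", "yal"]},
    {"morphs": ["yahas", "yal", "o"], "syllables": ["ya", "has", "ya", "lo"]},
]

if __name__ == "__main__":
    for cand in CANDIDATES:
        print("-".join(cand["morphs"]), violation_profile(cand))
    print("winner:", "-".join(optimal(CANDIDATES)["morphs"]))
```

Both candidates satisfy ONSET and each incurs one left-alignment violation, so the decision falls to right-edge alignment, which only yahas-o-yal satisfies fully.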
The Marginal Detraction Hypothesis: Evidence from French and Icelandic
Raphael Finkel (University of Kentucky) and
Gregory Stump (University of Kentucky)
In this paper, we discuss the Marginal Detraction Hypothesis, according to which marginal
inflection classes (those with few member lexemes) tend to detract most strongly from the predictability
of other inflection classes. We elucidate this idea with reference to an inflection-class system’s
plat and its distillations.
A plat is a table such as (1), where V, W, X, Y and Z are distinct morphosyntactic
property sets and I, II, III, IV and V are distinct inflection classes, so that for any morphosyntactic
property set s in {V, W, X, Y, Z} and any inflection class c in {I, II, III, IV, V}, the exponence of s in c is
the morphological marking occupying cell 〈s, c〉 in (1). (An exponence, in our terms, may subsume
more than one exponent; for instance, -o-d is the exponence of past tense in the paradigm of TELL.) A
distillation is a set of morphosyntactic property sets whose patterns of exponence are isomorphic; thus,
in the hypothetical plat in (1), the morphosyntactic property sets V and Z belong to a single distillation,
since their patterns of exponence are isomorphic. The exponences of morphosyntactic property sets
belonging to the same distillation are interpredictable; thus, the predictability of the paradigms in a
plat is enhanced to the extent that its morphosyntactic property sets are grouped into a small number
of large distillations. We demonstrate that in the verb systems of French and Icelandic, marginal
inflection classes have the effect of pushing morphosyntactic property sets into a larger number of
smaller distillations; the effect of this property is to lower the general predictability of verb paradigms
in these languages. We conclude with a discussion of possible explanations for this striking empirical
finding.
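The notion of a distillation can be sketched in a few lines. The plat below is an invented placeholder (it is not the plat in (1), nor French or Icelandic data), and the sketch assumes that two patterns of exponence count as isomorphic when they partition the inflection classes in the same way. The function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical plat: rows are morphosyntactic property sets, columns are
# inflection classes I-V. The exponences are arbitrary placeholder labels.
PLAT = {
    "V": ["a", "a", "b", "c", "c"],
    "W": ["d", "e", "e", "e", "f"],
    "X": ["g", "g", "g", "h", "h"],
    "Y": ["i", "j", "k", "l", "m"],
    "Z": ["n", "n", "o", "p", "p"],
}

def pattern(row):
    """Canonical shape of a row: each exponence replaced by its order of first occurrence."""
    first_seen = {}
    return tuple(first_seen.setdefault(exp, len(first_seen)) for exp in row)

def distillations(plat):
    """Group property sets whose rows partition the inflection classes identically."""
    groups = defaultdict(list)
    for property_set, row in plat.items():
        groups[pattern(row)].append(property_set)
    return list(groups.values())

if __name__ == "__main__":
    for group in distillations(PLAT):
        print(group)
```

Here V and Z fall into one distillation because their rows distinguish the same classes, so the exponence of either predicts the exponence of the other. Adding a marginal class that splits one of these rows differently would break the group apart, which is the effect the hypothesis describes.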
Agreement Mismatches in Halkomelem
Donna Gerdts (Simon Fraser University)
Halkomelem Salish has a two-way gender system: masculine and feminine. For human NPs, singular feminine nouns take feminine determiners, while masculine nouns and all plurals take masculine determiners.
Thus, masculine is the default gender. Inanimate NPs fall into two classes, those that take only masculine and those that take either masculine or feminine. I've previously explored the semantics of gender based on texts and elicitations, showing that gender on inanimates is a semantic-based system: objects that are small, round, flexible, fluid; containers; forces of nature; and diseases can take feminine determiners. I have also discussed factors that influence the choice between masculine versus feminine determiners. In this paper, I discuss gender agreement on sentence initial auxiliaries, which can optionally take a determiner element.
Some aspects of this phenomenon are accounted for by syntactic generalizations: auxiliaries agree with the gender of a human NP (or the first conjunct of a coordinated NP) in subject position. However, if the subject is a feminine inanimate, then mismatches between the gender on the auxiliary and the gender on the determiner of the subject NP are possible. Each instantiation of gender can independently default to masculine.
The Halkomelem data thus parallel results of my previous research on case agreement in several constructions in Korean. Agreement mismatches are important for morphological theory because they show that secondary agreement cannot always be considered a process of syntactically-mediated feature copying or checking.
Exponent Adjacency in Multiple Exponence
Alice C. Harris (University of Massachusetts Amherst) and
Kevin Ryan (University of California, Los Angeles)
It has been observed that in multiple exponence (ME) two identical exponents are almost never
adjacent, and adjacent exponents are almost never identical (Inkelas & Caballero 2008; Caballero &
Harris, in press). It appears that these facts are epiphenomenal – they accompany and are caused by
the various origins of the multiple exponence. (i) In some patterns (for example, in Batsbi
(Nakh-Daghestanian), Hualapai (Yuman), Akebe (Kwa), Kinande (Bantu), and others), ME results from
grammaticalization of an element (for example, an auxiliary or determiner) bearing the same marker
borne by the historical head; as a natural consequence, the exponents are separated by their historical
hosts. (ii) The most common reason for two exponents occurring in adjacent positions seems to be
that the first (closer to the root) is unproductive or irregular, while the second is productive and regular.
These will necessarily have different forms. Thus, the generalization fits examples that are products of
their own history.
From approximately 100 distinct patterns of multiple exponence (Caballero & Harris, in press),
the only clear exceptions we have identified to these generalizations are found in Dumi, as in (1).
In (1a, c), the portmanteau morph, -n, ‘1 singular acting on 2’ is repeated in adjacent positions, and in
(1b) the ‘non-first person dual’ morph -si is repeated (van Driem 1993: 129, 146). While Dumi presents
the only clear case of adjacent identical exponents, related phenomena from Svan (Kartvelian) and
Khinaliq (Nakh-Daghestanian) shed light on this problem. A solution is proposed in the framework of
adjacency bigrams, as proposed in Ryan 2010.
Affixes as Constraints? Evidence from Haplology
Sharon Inkelas (University of California, Berkeley)
Realizational approaches to morphology find a natural home in Optimality Theory, which makes
it possible to model affixation via phonological constraints on output form. The constraints take the
form XY, where "X" represents a morphological category and "Y" represents a required phonological
property of that category. For example, plural suffixation in English is modeled by the XY constraint
Noun.Plural:z#, stating that plural nouns must end in /z/.
The alternative approach in item-based theory would be to posit a lexical entry /-z/ which is
associated with the feature [Plural].
Item-based approaches to affixation have the virtue of treating affixation and compounding in
similar fashion, but leave process morphology out in the cold. Realizational approaches to affixation
can handle affixation and process morphology, though they have little to say about compounding. The
debate as to which approach is better has raged unresolved for a long time.
This paper offers support for the item-based view by exploring phonological and morphological
predictions of XY constraints, focusing on haplology and effects related to the Repeated Morph
Constraint of Menn & MacWhinney (1984). XY constraints initially seem to offer a superior means of
accomplishing haplology effects, directly modeling Menn & MacWhinney's insight that a single affix can
do double duty. For example, both the English noun plural and the possessive are realized as /-z/; a
form which is both plural and possessed exhibits only one /z/, e.g. tigers' (not *tigers's). If both Plural
and Possessive are associated with XY constraints (Noun.Plural:z# and Possessive:z#), then a form
like tigers' simultaneously satisfies both.
However, XY constraints turn out both to overgenerate and undergenerate in the domain of
haplology; the erroneous predictions are in the category of the 'too many repairs' problem. It is argued
that a superior approach casts affixes as items and treats haplology effects as just another type of
morphologically conditioned phonology.
Competing for Productivity: Self-Organization in the English Suffix System
Mark Lindsay (Stony Brook University)
The affixes in a language may compete with each other for productivity because of three central aspects of the language system: the introduction of random elements, the propagation of an affix via productive derivation, and the intolerance of synonymy that can lead to the productive death of a less robust affix. In this talk, I investigate this emergent phenomenon, which parallels natural selection, in borrowed suffixes of English. These productive suffixes emerged out of whole-word borrowings from French, Latin, and Greek (Marchand 1969).
I first look diachronically at the emergence of productivity of -ity, -ment, and -ation, following an earlier study by Anshen & Aronoff (1999). While -ment and -ity had similar trajectories before the 17th century, their paths diverged from that point on, with -ment declining steadily; -ment is no longer productive today. The start of this decline coincides with a critical period when fewer new verbs were available; during this same century, words containing -ation (a direct competitor to -ment) were being borrowed into English in far greater numbers.
Next, I examine the rival suffixes -ic and -ical using data from Google Estimated Total Matches (ETM, commonly known as “hits”) to show that, while -ic is more productive overall (8:1 ratio), -ical is far more productive with stems ending in olog- (by a ratio of 7:1); this morphologically defined niche was able to form and sustain itself because the olog- subset is not only sufficiently large (475 members) but also has remarkably few neighbors.
Finally, I explore the domains of -ize and -ify in a manner similar to -ic and -ical. Although -ize is preferred in the vast majority of words, -ify is dominant in the phonologically defined domain of monosyllabic stems. However, this is not a pure dichotomy: most, but not all, monosyllabic stems take -ify, and as stem length increases, the likelihood of taking the -ify suffix drops off logarithmically.
Operational Exponence: Process Morphology in Harmonic Serialism
Robert Staubs (University of Massachusetts Amherst)
Morphological exponence is sometimes most naturally described by means of processes. Perhaps most striking are instances of subtractive truncation—for example, in Lardil the uninflected form of the noun is formed by the truncation of a final vowel (1; Wilkinson 1988). Such phenomena are easily described in process-based terms but are more problematic on an item basis. As another example, Spokane repetitives are formed by the infixation of e accompanied by glottalization of all the sonorant consonants in the word (2; Carlson 1980). Such arbitrary, global feature changes are quite unlike phonological processes but are found in these morphological contexts. Systems like Spokane are awkward to incorporate into, for example, an item-based account with phonological spreading, but have a natural account as process-based multiple exponence.
In this work I propose a new formalization of process morphology situated in Harmonic Serialism (e.g. McCarthy 2000) and thus tightly integrated with phonological theory. As a serial optimization framework Harmonic Serialism offers both a coherent analytic status for morphological processes (viz. as operations) and also a constraint-based optimization related directly to an Optimality Theory-like (Prince & Smolensky 1993) phonology. In the proposed theory all morphological exponence is taken to be operational (= processual) in nature. The consideration of two levels of morphological correspondence (Wolf 2008, Kimper 2009) with operations allows processes to participate in allomorphy, phonologically-conditioned blocking, and multiple exponence.
Lardil has a minimal word restriction—words must have at least two moras (= vowels), with shorter words augmented with a (3a). In cases where truncation would result in a sub-minimal word, augmentation is not used—truncation is simply blocked (3b). This theory offers an analysis of this type of blocking parallel to existing accounts for concatenative morphology.
The theory of operational exponence yields natural analyses of phenomena which are otherwise difficult to account for—both in their segmental effects (e.g. truncation and Salishan glottalization) and in their morphological structure (e.g. multiple exponence). In addition, the proposed grammatical model gives a coherent theoretical divide between phonological and morphological operations. I suggest that, given this divide, the distinct power of morphological and phonological operations is elucidated. Morphological operations are language-specific (and potentially quite strong) while phonological operations are (more) universal and comparatively weak.