The communicative function of ambiguity in language

Authors: Steven T. Piantadosi
Harry Tily
Edward Gibson
Date: October 8 2011
Journal: `Cognition`_ 122
Pages: 280-291 (12)
Keywords: |a|_, `language processing`_, `information theory`_, rational design in language

Abstract

The goal of the present paper is to develop an explanation for |a|_ which makes fewer assumptions than previous work, and is more generally applicable.

The authors argue that:

  1. If context_ provides |i|_ about meaning, then an efficient `communication system`_ will not redundantly specify information_ provided by the context_ and will be ambiguous_ out of context_.
  2. |A|_ allows for greater ease of `language processing`_ by permitting re-use of easy-to-process linguistic units. If these units are reused such that they are always well-disambiguated by context, and if disambiguation is cheap, using the system will require less overall effort.

The authors test predictions of (2) (that words and syllables which are most efficient will be preferentially re-used) in English, German, and Dutch, and confirm their hypotheses.

The authors interpret their results as strong evidence for the view that |a|_ results from a pressure for efficient communication.

The authors conclude that their results explain the pervasiveness of |a|_ in language and show how |a|_ likely results from ubiquitous pressure for efficient communication.

.. _cognitive processes: `cognitive process`_
.. _communication protocols: `communication protocol`_
.. _information retrieval: `information science`_
.. _redundant: redundancy_

Contents

1   Introduction

|A|_ is a pervasive phenomenon in language which occurs at all levels of linguistic analysis. The existence of |a|_ provides a puzzle for functionalist theories. Indeed, the existence of |a|_ has been argued to show that the key structures and properties of language have not evolved for purposes of `communication`_ or use. Here we argue that this perspective on |a|_ is exactly backward.

In Zipf's view, |a|_ fits within the framework of his unifying principle of least effort, and could be understood by considering the competing desires of the speaker and the listener. Zipf suggests that `natural language`_ would strike a balance between these two opposing forces of unification and diversification, arriving at a middle ground with some but not total |a|_. Zipf's hypothesis about the way |a|_ might arise from a trade-off between speaker and hearer pressures has certain shortcomings. One example that illustrates this trade-off is the NATO phonetic alphabet.

Beyond Zipf, several authors have previously discussed the possibility that |a|_ is a useful feature for a `communication system`_.

2   Two benefits of ambiguity

In this section, we present two similar arguments that efficient `communication systems`_ will be ambiguous when context_ is informative about meaning.

Both assume that:

  1. |I|_ is typically present to resolve ambiguities
  2. Disambiguation is not prohibitively costly; using information from the context_ to infer which meaning was intended does not substantially impede comprehension.

2.1   Ambiguity in general communication

In this section we motivate an information-theoretic view of |a|_. We argue that when context_ is informative, any efficient `communication system`_ will leave out |i|_ already in the context_ and therefore necessarily appear ambiguous_ when examined out of context_.

No assumptions are made about the linguistic system, the distribution of contexts or meanings, or what the contexts or meanings actually are, and therefore the argument applies at all levels of linguistic analysis.

A key assumption that is required is that speakers and listeners have the same or very similar coding schemes and also the same ability to use contextual information to constrain the possible meanings.

It is unclear how one might test this argument, since it is a mathematical demonstration that |a|_ should exist; it does not make predictions about language other than the presence of |a|_.
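The information-theoretic argument of this section can be illustrated with a toy calculation. The sketch below is not taken from the paper: the joint distribution, the meaning labels, and the function names are all hypothetical. It compares the entropy of meanings without context, H(M), with the entropy of meanings once the context is known, H(M | C).

```python
import math
from collections import defaultdict


def entropy(probs):
    """Shannon entropy in bits of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def meaning_entropy(joint):
    """H(M): entropy over meanings when the context is ignored."""
    marginal = defaultdict(float)
    for (_context, meaning), p in joint.items():
        marginal[meaning] += p
    return entropy(marginal.values())


def conditional_entropy(joint):
    """H(M | C): average entropy over meanings once the context is known."""
    by_context = defaultdict(dict)
    for (context, meaning), p in joint.items():
        by_context[context][meaning] = p
    total = 0.0
    for meanings in by_context.values():
        p_context = sum(meanings.values())
        total += p_context * entropy(p / p_context for p in meanings.values())
    return total


# Hypothetical toy distribution (an assumption for illustration only):
# four meanings, two contexts, and each context narrows the choice down
# to two equally likely meanings.
JOINT = {
    ("finance", "bank:institution"): 0.25,
    ("finance", "bill:invoice"): 0.25,
    ("nature", "bank:riverside"): 0.25,
    ("nature", "bill:beak"): 0.25,
}

h_m = meaning_entropy(JOINT)
h_m_given_c = conditional_entropy(JOINT)
print(f"H(M) = {h_m:.2f} bits, H(M | C) = {h_m_given_c:.2f} bits")
```

In this toy case a code that exploits context needs only one bit per message, so it has just two distinct signals for four meanings. Each signal is therefore shared by two meanings and looks ambiguous when examined out of context.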

2.2   Ambiguity and minimum effort

In this section, the authors argue that ambiguity is a desirable property of a linguistic system because it potentially allows for ease of processing.

Ease of processing may be improved if linguistic forms vary in ease of processing, there are at least two meanings which are unlikely to occur in the same context, and disambiguation is cheap. In this case, an unambiguous system could be made more efficient by mapping the meanings onto whichever of their corresponding linguistic forms was easiest to process.
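A small numerical sketch of this argument follows. It is only an illustration under assumed numbers: the forms, their processing costs, the meaning probabilities, and the fixed disambiguation cost are hypothetical and do not come from the paper.

```python
def expected_effort(mapping, meaning_prob, form_cost, disambiguation_cost=0.0):
    """Expected per-use effort of a lexicon.

    mapping: meaning -> form
    meaning_prob: meaning -> probability of being expressed
    form_cost: form -> processing cost
    disambiguation_cost: extra cost paid whenever the form used is shared
        by more than one meaning (a simplifying assumption)
    """
    users = {}
    for meaning, form in mapping.items():
        users.setdefault(form, []).append(meaning)
    total = 0.0
    for meaning, form in mapping.items():
        cost = form_cost[form]
        if len(users[form]) > 1:
            cost += disambiguation_cost
        total += meaning_prob[meaning] * cost
    return total


# Hypothetical values: one easy form, one hard form, two meanings that
# never occur in the same context.
MEANING_PROB = {"m1": 0.5, "m2": 0.5}
FORM_COST = {"ba": 1.0, "strelkt": 3.0}

unambiguous = {"m1": "ba", "m2": "strelkt"}
ambiguous = {"m1": "ba", "m2": "ba"}  # both meanings re-use the easy form

effort_unambiguous = expected_effort(unambiguous, MEANING_PROB, FORM_COST)
effort_cheap = expected_effort(ambiguous, MEANING_PROB, FORM_COST, 0.25)
effort_costly = expected_effort(ambiguous, MEANING_PROB, FORM_COST, 2.5)
print(effort_unambiguous, effort_cheap, effort_costly)
```

With cheap disambiguation the ambiguous lexicon requires less expected effort than the unambiguous one. With costly disambiguation the advantage disappears, which is why the argument depends on the assumption that disambiguation is cheap.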

2.2.1   Empirical evaluation of ambiguity and effort

In this section we empirically evaluate the prediction of the second argument, that |a|_ allows for re-use of efficient linguistic units, by looking at homophony_, polysemy, and the |a|_ about meaning of different syllables, in English, German, and Dutch.

Our basic approach is to measure the difficulty of words and syllables and see if easier linguistic units are preferentially re-used in language.

We use three measures of linguistic difficulty for words and syllables: length, frequency, and phonotactic surprisal. [*]

We measure re-use by measuring the number of possible meanings a word or syllable has. [†]
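Footnote † explains that entropy over meanings would be the ideal measure but is hard to estimate without bias. The sketch below illustrates one well-known form of this problem, the downward bias of the plug-in (maximum-likelihood) estimator on small samples. The source does not say which estimator the authors had in mind, so the estimator, the uniform distribution, and the sample sizes here are assumptions chosen for illustration.

```python
import math
from collections import Counter
from itertools import product


def plugin_entropy(sample):
    """Plug-in (maximum-likelihood) entropy estimate in bits."""
    n = len(sample)
    return -sum((c / n) * math.log2(c / n) for c in Counter(sample).values())


def expected_plugin_entropy(num_meanings, sample_size):
    """Exact mean of the plug-in estimate under a uniform distribution,
    obtained by enumerating every possible sample of the given size."""
    samples = list(product(range(num_meanings), repeat=sample_size))
    return sum(plugin_entropy(s) for s in samples) / len(samples)


# A form with 4 equally likely meanings has a true entropy of 2 bits.
true_entropy = math.log2(4)
small = expected_plugin_entropy(4, 3)   # averages over 4**3 = 64 samples
larger = expected_plugin_entropy(4, 6)  # averages over 4**6 = 4096 samples
print(f"true={true_entropy:.3f}  n=3: {small:.3f}  n=6: {larger:.3f}")
```

On average the estimate falls below the true value, and the shortfall depends on how many observations are available. Low-frequency forms would therefore be penalized more than high-frequency ones, which would make results harder to interpret. Counting meanings avoids this particular problem.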

We then use several different techniques to analyze the influence of these factors on |a|_.

2.2.1.1   Homophony

Question:

Are phonological forms reused as a function of difficulty?

Prediction:

Easier phonological forms should be reused more often than harder phonological forms, across languages.

Experiment:
  • Word length was measured by syllables.
  • Word frequencies were taken from CELEX and were transformed to negative log probabilities.
  • Phonotactic surprisal was computed using a simple triphone language model. This measure was divided by word length to prevent it being collinear with length, and therefore can be interpreted as surprisal per phoneme, averaged over the entire word.
  • Number of homophones was taken from CELEX.
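The frequency and phonotactic measures in the list above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the boundary symbol, the add-alpha smoothing, the base-2 logarithm, and the toy lexicon are all assumptions, and real input would come from a lexical database such as CELEX.

```python
import math
from collections import Counter

PAD = "#"  # hypothetical word-boundary symbol


def negative_log_probability(count, total):
    """Frequency measure: -log2 of a word's relative frequency."""
    return -math.log2(count / total)


def train_triphone_model(words):
    """Count triphones and their two-phone histories.

    Each word is a sequence of phone symbols, padded with two boundary
    symbols on the left and one on the right.
    """
    triphones, histories = Counter(), Counter()
    for phones in words:
        padded = [PAD, PAD] + list(phones) + [PAD]
        for i in range(2, len(padded)):
            history = (padded[i - 2], padded[i - 1])
            triphones[history + (padded[i],)] += 1
            histories[history] += 1
    return triphones, histories


def surprisal_per_phone(phones, triphones, histories, vocab_size, alpha=1.0):
    """Summed triphone surprisal of a non-empty word, divided by its length.

    Add-alpha smoothing (an assumption) keeps unseen triphones finite.
    """
    padded = [PAD, PAD] + list(phones) + [PAD]
    total = 0.0
    for i in range(2, len(padded)):
        history = (padded[i - 2], padded[i - 1])
        count = triphones[history + (padded[i],)]
        prob = (count + alpha) / (histories[history] + alpha * vocab_size)
        total -= math.log2(prob)
    return total / len(phones)


# Toy lexicon in which every character stands for one phone.
LEXICON = ["kat", "kab", "tak", "bat"]
VOCAB_SIZE = len(set("".join(LEXICON)) | {PAD})
TRIPHONES, HISTORIES = train_triphone_model(LEXICON)

attested = surprisal_per_phone("kat", TRIPHONES, HISTORIES, VOCAB_SIZE)
unattested = surprisal_per_phone("tkb", TRIPHONES, HISTORIES, VOCAB_SIZE)
print(f"kat: {attested:.2f} bits/phone, tkb: {unattested:.2f} bits/phone")
```

Dividing by the number of phones gives a per-phoneme average, so a long word does not receive high surprisal merely because it is long. This mirrors the stated reason for normalizing by word length.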
Observations:
/static/images/piantadosi_tily_gibson_2012_fig_1.png
Results:

Prediction confirmed. [‡] [§]
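The reasoning in footnote ‡ can be written out explicitly. The notation is assumed here rather than taken from the source: f(k) is the usage frequency of a phonological form with k meanings, and a and b > 0 are the intercept and slope of a linear fit to log frequency.

```latex
% f(k): usage frequency of a form with k meanings (assumed notation)
\begin{align*}
\text{more situations only:}\quad
  & f(k) = k\, f(1)
  \;\Rightarrow\; \log f(k) = \log f(1) + \log k \\
\text{linear in log frequency:}\quad
  & \log f(k) = a + b\,k
  \;\Rightarrow\; f(k) = e^{a}\, e^{b k}, \qquad b > 0
\end{align*}
```

Under the first account log frequency grows only logarithmically in k. A linear trend in log frequency instead implies exponential growth in raw frequency, which is faster than the linear growth that re-use across more situations would produce on its own.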

2.2.1.2   Polysemy

Question:

Are word forms reused as a function of difficulty?

Prediction:

Easier word forms should be reused more often than harder word forms, across parts of speech. [¶]

Experiment:
  • The length of a word form was measured as phonological length.
  • Frequency of a word form was computed as in the homophony analysis.
  • Phonotactic surprisal was computed as in the homophony analysis.
  • The number of word senses_ was taken from WordNet.
Observations:
../static/images/piantadosi_tily_gibson_2012_fig_2.png
Results:

Prediction confirmed.

2.2.1.3   Syllables

Question:

Are syllables reused as a function of difficulty?

Prediction:

Easier syllables should be reused more often than harder syllables, across languages.

Experiment:
  • The length of a syllable was measured as the number of phones in its phonological transcription.
  • Syllable frequencies and phonotactic log probabilities were computed using the same procedures as in the previous two sections.
  • Phonotactic surprisal?
  • Reuse was measured as the number of words a syllable appears in.
Observations:
../static/images/piantadosi_tily_gibson_2012_fig_3.png
Results:

Syllables pattern similarly to words, except in the case of phonotactic predictability. [#]

Conclusion:

Predictors of ease extend to syllable units, although not in the case of German syllable length.

3   General discussion

4   Conclusion

5   Footnotes

[*]

Both frequency and length are known to influence on-line language processing, with longer and lower-frequency words taking longer to process.

Intuitively, words that are re-used through ambiguity should have very low phonotactic surprisal in order to decrease cognitive and articulatory difficulty.

While we only examined these three predictors, our theory predicts that any other measure which increases processing ease should also increase ambiguity.

[†]Ideally, one would measure |a|_ using the `entropy`_ over meanings for a given linguistic form. Unfortunately, `entropy`_ is difficult to estimate without statistical bias, and biased estimates lead to results that are difficult to interpret.
[‡]This is somewhat difficult to interpret because phonological forms with more meanings should be seen more often simply because they can be used in more situations. However, that interpretation predicts a linear relationship between number of meanings and frequency: a word with k meanings should be used k times as often as a word with 1 meaning. The figure demonstrates a linear relationship between number of homophones and log frequency, corresponding to a super-linear relationship between number of homophones and frequency. We therefore argue that such a relationship likely results from the ease of processing more frequent word-forms, rather than merely the fact that phonological forms with more meanings can be used in more situations.
[§]This effect of phonotactic surprisal tends to level out, showing no differences among the highest-surprisal words, or slight increases. These effects may result from poorer estimation in the highest phonotactic surprisal words, which have the lowest-frequency phonotactic trigrams.
[¶]We chose to look at part of speech categories separately to ensure the findings are not driven by a single part of speech category and also to check that these effects go beyond effects of homophony.
[#]

The syllables with the lowest phonotactic surprisal do appear in the most words; however, very high phonotactic surprisal syllables also tend to appear in many words.

We believe this trend is an artifact of our phonotactic surprisal model, which has increased estimation error for syllables with high phonotactic surprisal. This interpretation is supported by the absence of a quadratic trend when using a two-phone model.

Alternatively, it may be the case that other articulatory effects are present at the syllable level and that this trend results from other kinds of articulatory constraints (which exert a stronger influence).

[♠]We note, however, that the languages tested are historically-related, meaning that further work will be needed to establish stronger typological generalizations.

6   Glossary

Context
discourse context, world context, world knowledge, syntactic information, etc.
Functionalist theory

A theory which attempts to explain properties of linguistic systems in terms of `communicative pressures`_.

See: functionalism

Communicative pressure
Any cause that reduces communicative success in a proportion of a population potentially exerts communicative pressure.
CELEX
A particular lexical database.
Phonotactic surprisal

Phonotactic surprisal refers to how phonotactically improbable a word is, given all other words in the language (measurable using a Markov model).

A word with low phonotactic surprisal may be called "phonotactically well-formed".