It is often posited that more predictable parts
of a speaker’s meaning tend to be made less explicit, for instance using shorter, less informative words. Studying these dynamics in the domain of referring expressions has proven difficult, with existing studies, both psycholinguistic and corpus-based, providing contradictory
results. We test the hypothesis that speakers
produce less informative referring expressions
(e.g., pronouns vs. full noun phrases) when the
context is more informative about the referent,
using novel computational estimates of referent predictability. We obtain these estimates
by training an existing coreference resolution system for English on a new task, masked coreference resolution, which gives us a probability distribution over referents that is conditioned on the
context but not on the referring expression. The
resulting system retains standard coreference
resolution performance while yielding a better
estimate of human-derived referent predictability than previous attempts. A statistical analysis of the relationship between model output
and mention form supports the hypothesis that
predictability affects the form of a mention,
both its morphosyntactic type and its length.