Since there are no systematic pauses delimiting words in speech, the problem of word
segmentation is formidable even for monolingual infants. We use computational
modeling to assess whether word segmentation is substantially harder in a bilingual
than a monolingual setting. Seven algorithms representing different cognitive
approaches to segmentation are applied to transcriptions of naturalistic input to young
children, carefully processed to generate perfectly matched monolingual and bilingual
corpora. We vary the overlap in phonology and lexicon experienced by modeling
exposure to languages that are more similar (Catalan and Spanish) or more different
(English and Spanish). We find that performance varies most across segmentation
algorithms and second most across languages, with bilingualism having smaller
effects than either algorithm or language.
Implications of these computational results for experimental and modeling approaches
to language acquisition are discussed.