Agreement Groups Analysis of Mother-child Discourse

We propose a distributional framework for analysing linguistic corpora. The analysis is based on groups of minimally contrasting utterances, which can be regarded as representing agreement relations. Agreement groups can be related to the notion of 'frame' as used in its various senses in the research literature: item-based phrases (Cameron-Faulkner et al. 2003, Stoll et al. 2009), frequent frames (Mintz 2003, Chemla et al. 2009, Wang and Mintz 2010), and flexible frames (St. Clair et al. 2010). Since agreement groups provide a means of representing novel sentences on the basis of sentences already encountered, we tested to what extent they can account for novel utterances in a database. We used the Anne files from the Manchester corpus (Theakston et al. 2001) of the CHILDES database (MacWhinney 2000). We examined to what extent the agreement groups available at a given stage of development can account for the utterances of the immediately following 30-minute session; the groups were extracted from the body of utterances encountered up to the test stage. Examining approximately one year of data, we found that at each developmental stage some 19%-41% of the utterances of the new session were compatible with the agreement groups extracted from the previous sessions. This corresponds to a proportion of 6%-10.3% for novel utterances compatible with some group. The results improved slightly when a "guessing" mechanism was added. Qualitatively, we also found that the formation of groups may support categorisation and the actual emergence of grammatical agreement.
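The evaluation procedure described above can be illustrated with a minimal Python sketch. It is not the implementation used in the study, and the abstract does not spell out the exact definitions, so the following are assumptions made purely for illustration: a group consists of a base utterance plus every previously encountered utterance of the same length that differs from it in exactly one word, and a new utterance counts as compatible with a group if each of its words occurs in the same position in some member of that group. All function names and the toy utterances are hypothetical.

```python
def tokenize(utterance):
    """Lower-case an utterance and split it into a tuple of words."""
    return tuple(utterance.lower().split())


def differ_in_one_word(a, b):
    """True if a and b have equal length and differ in exactly one position."""
    return len(a) == len(b) and sum(x != y for x, y in zip(a, b)) == 1


def build_agreement_groups(utterances):
    """Assumed grouping: each distinct utterance serves as a base and is
    grouped with all encountered utterances that differ from it in one word."""
    unique = sorted({tokenize(u) for u in utterances})
    return {
        base: [base] + [u for u in unique if differ_in_one_word(base, u)]
        for base in unique
    }


def is_compatible(utterance, group):
    """Assumed criterion: every word of the utterance occurs in the same
    position in at least one member of the group."""
    if len(group[0]) != len(utterance):
        return False
    return all(
        any(member[i] == word for member in group)
        for i, word in enumerate(utterance)
    )


def coverage(seen, new):
    """Return (compatible, novel_and_compatible, total) counts for a new session."""
    groups = build_agreement_groups(seen)
    seen_set = {tokenize(u) for u in seen}
    compatible = novel_compatible = 0
    for raw in new:
        utt = tokenize(raw)
        if any(is_compatible(utt, g) for g in groups.values()):
            compatible += 1
            if utt not in seen_set:
                novel_compatible += 1
    return compatible, novel_compatible, len(new)


# Toy data standing in for earlier sessions and the next session.
seen = ["the dog runs", "the cat runs", "the dog sleeps", "a dog runs"]
new = ["a cat sleeps", "the bird runs", "the dog runs"]
print(coverage(seen, new))  # (2, 1, 3)
```

In this toy example "a cat sleeps" was never encountered, yet it is compatible with the group built around "the dog runs", whereas "the bird runs" is not, because "bird" never appears in second position. Dividing the first two counts by the total gives figures analogous to the overall and novel-utterance proportions reported above, under the stated assumptions.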

Keywords: agreement, categorisation, group formation, distributional analysis, language acquisition


Bannard, C., Matthews, D. (2008). Stored Word Sequences in Language Learning. Psychological Science, 19(3), 241-248.

Cameron-Faulkner, Th., Lieven, E., Tomasello, M. (2003). A construction based analysis of child directed speech. Cognitive Science, 27, 843-873.

Drienkó, L. (2004a). Agreement Mapping System Approach to Language. Journal of Language and Linguistics. Vol. 3. No. 1. 38-61.

Drienkó, L. (2004b). Outlines of Agreement Syntax. Journal of Language and Linguistics. Vol. 3. No. 2. 154-181.

Drienkó, L. (2009). A linguistic agreement mapping-system. Unpublished PhD dissertation, ELTE University, Budapest.

Finch, S., Chater, N., Redington, M. (1995). Acquiring syntactic information from distributional statistics. In: Levy, J. P., Bairaktaris, D., Bullinaria, J. A., Cairns, P. (eds.) Connectionist models of memory and language (pp. 229-242). London: UCL Press.

Harris, Z. S. (1951). Methods in structural linguistics. Chicago, IL, US: University of Chicago Press.

Kiss, G. R. (1973). Grammatical word classes: A learning process and its simulation. Psychology of Learning and Motivation, 7, 1-41.

MacWhinney, B. (2000). The CHILDES Project: Tools for analyzing talk. 3rd Edition. Vol. 2: The Database. Mahwah, NJ: Lawrence Erlbaum Associates.

Mintz, T. H. (2003). Frequent frames as a cue for grammatical categories in child directed speech. Cognition, 90(1), 91-117. doi:10.1016/S0010-0277(03)00140-9

Pinker, S. (1979). Formal models of language learning. Cognition, 7, 217-283.

Redington, M., Chater, N., Finch, S. (1998). Distributional Information: A Powerful Cue for Acquiring Syntactic Categories. Cognitive Science Vol. 22 (4) pp. 425-469.

St. Clair, M. C., Monaghan, P., Christiansen, M. H. (2010). Learning grammatical categories from distributional cues: Flexible frames for language acquisition. Cognition, 116(3), 341-360.

Stoll, S., Abbot-Smith, K., Lieven, E. (2009). Lexically Restricted Utterances in Russian, German, and English Child-Directed Speech. Cognitive Science 33, 75-103.

Theakston, A. L., Lieven, E. V., Pine, J. M., Rowland, C. F. (2001). The role of performance limitations in the acquisition of verb-argument structure: an alternative account. J. Child Lang. 28(1):127-52.

Wang, H., Mintz, T. H. (2010). From Linear Sequences to Abstract Structures: Distributional Information in Infant-directed Speech. In Jane Chandlee, Katie Franich, Kate Iserman, Lauren Keil (eds.) Proceedings Supplement of the 34th Boston University Conference on Language Development.

Weisleder, A., Waxman, S. R. (2010). What's in the input? Frequent frames in child-directed speech offer distributional cues to grammatical categories in Spanish and English. J. Child Lang. 37(5):1089-108. Epub 2009 Aug 24.

