Entropy Indicators for Investigating Early Language Processes

Lyon, C., Nehaniv, C.L. and Dickerson, B. (2005) Entropy Indicators for Investigating Early Language Processes. The Society for the Study of Artificial Intelligence and the Simulation of Behaviour (AISB).
We examine evidence for the hypothesis that language could have passed through a stage in which words were combined into structured linear segments, and that these linear segments could later have become the building blocks of a full hierarchical grammar. Experiments were carried out on the British National Corpus, consisting of about 100 million words of text from different domains, together with transcribed speech. This work extends and supports the results we previously reported on a smaller corpus. Measuring the entropy of the texts, we find that entropy declines as words are taken in groups of 2, 3 and 4, indicating that it is easier to decode words taken in short sequences than individually. Entropy declines further when punctuation is represented, showing that appropriate segmentation captures some of the language structure. Further support for the hypothesis that local sequential processing underlies the production and perception of speech comes from neurobiological evidence. The observation that homophones are apparently ubiquitous, yet used without confusion, also suggests that language processing may be largely based on local context.
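The per-word entropy measurement described in the abstract can be sketched as follows. This is a minimal illustration of the general technique (empirical entropy of word n-grams, normalised by n so values are comparable across group sizes), not the authors' exact procedure; the toy corpus and function name are assumptions for the example.

```python
from collections import Counter
from math import log2

def ngram_entropy_per_word(words, n):
    """Empirical entropy (bits) of word n-grams, divided by n,
    so that values are comparable across n-gram sizes."""
    grams = [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
    counts = Counter(grams)
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values()) / n

# Tiny illustrative corpus (any tokenised text would do; the paper
# uses the ~100-million-word British National Corpus).
corpus = ("the cat sat on the mat . the dog sat on the mat . "
          "the cat lay on the rug .").split()

for n in (1, 2, 3, 4):
    print(n, round(ngram_entropy_per_word(corpus, n), 3))
```

On a large natural-language corpus the per-word figure declines as n grows, reflecting the predictive value of local context; on a corpus this small the trend is only suggestive.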
