Evolution of Language

3rd Conference
The Evolution of Language
April 3rd - 6th, 2000

Abstracts

Evolution of symbolisation in chimpanzees
and neural nets

Angelo Cangelosi

Centre for Neural and Adaptive Systems
University of Plymouth (UK)
a.cangelosi@plymouth.ac.uk

Introduction

Animal communication systems and human languages can be characterised by the type of cognitive abilities that they require. If we consider the main semiotic distinction between communication using icons, signals, or symbols (Peirce, 1955; Harnad, 1990; Deacon, 1997), we can identify a different cognitive load for each type of reference. The use and understanding of icons require instinctive behaviour (e.g. emotions) or simple perceptual processes (e.g. visual similarities between an icon and its meaning). Communication systems that use signals are characterised by referential associations between objects and visual or auditory signals. They require the cognitive ability to learn stimulus associations, as in conditional learning. Symbols involve two kinds of association. First, symbolic systems require the establishment of associations between signals and objects. Second, further relationships are learned between the signals themselves. The use of rules for the logical combination of symbols is an example of a symbolic relationship. Symbolisation is the ability to acquire and handle symbols and symbolic relationships.

Symbolisation in chimpanzees

A great deal of research exists regarding the study of symbolisation in humans. Language is considered to be a prototypical example of the human ability to learn and use symbols. However, when we look at the evolutionary roots of symbolisation and language, e.g. with animal experiments, many studies have investigated the general ability of different animal species to acquire human-like languages, as opposed to focusing on symbolisation. Some experiments on language acquisition in chimpanzees have specifically investigated the evolution of symbolisation in apes (Savage-Rumbaugh, 1986). In these studies, researchers made a clear and operational distinction between non-symbolic and real symbolic language learning strategies. Non-symbolic linguistic strategies use simple conditional associations to link signals and objects. By contrast, real symbolic languages are based on the acquisition of symbolic relationships for communication, and the decontextualisation of language from the restricted learning stimulus set. In Savage-Rumbaugh & Rumbaugh (1978) chimpanzees were trained to learn a set of lexigrams (pictures on a keypad) to communicate about foods and drinks. The animals first learned the lexigrams for the individual foods and drinks, such as banana and orange, milk and coke. Subsequently, they were taught the lexigrams for two actions ("pour" for the drinks only, "give" for the foods only), together with the individual food/drink lexigram (e.g. "pour-milk", "give-banana"). The animals successfully learned these lexigrams after a systematic training cycle. Savage-Rumbaugh & Rumbaugh also devised a test for symbolisation. They wanted to ascertain whether the linguistic stimuli learned by the animals were used in a real symbolic way (e.g. identifying the logical rule to associate the lexigram "pour" with all drinks, but not with any of the solid foods) or whether the animals were simply associating the whole pair "pour-milk" with the event of pouring milk.
They taught the chimpanzees the lexigrams for the names of new foods and drinks and checked whether the animals were able to generalise the rule and associate the correct action lexigrams with the new name lexigrams. The test results showed that only some of the chimpanzees were able to make a correct rule generalisation. Other chimpanzees had to be retrained to learn the new pairs of action-name lexigrams. Savage-Rumbaugh et al. (1980) presented similar results for a test on the use of lexigrams for classifying tools and foods. Other studies (Greenfield & Savage-Rumbaugh, 1990) have shown that during the spontaneous learning of lexigram use in baby chimpanzees, some animals invented symbolic structures, such as one resembling the "action-object" syntactic rule.
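The difference between the two strategies that the test is designed to separate can be sketched in a few lines of Python. This is only an illustration of the logic of the test, not a model of the animals: the lexigram names follow the examples in the text, while the new item "juice" and the function names are hypothetical.

```python
# Items and categories follow the examples in the text.
CATEGORY = {"banana": "food", "orange": "food", "milk": "drink", "coke": "drink"}
ACTION_FOR = {"food": "give", "drink": "pour"}

# Non-symbolic strategy: each action-name pair is stored as a whole,
# so only pairs seen during training can be produced.
TRAINED_PAIRS = {item: ACTION_FOR[cat] for item, cat in CATEGORY.items()}


def rote_action(item):
    """Return the action memorised for a trained item, or None if untrained."""
    return TRAINED_PAIRS.get(item)


def rule_action(item, category):
    """Symbolic strategy: the action follows from the item's category."""
    return ACTION_FOR[category[item]]


# Hypothetical new item: its name is taught, but not its action pair.
extended = dict(CATEGORY, juice="drink")
print(rote_action("juice"))            # no stored pair for the new name
print(rule_action("juice", extended))  # the rule generalises to the new name
```

An animal relying on whole-pair associations behaves like rote_action and needs retraining on each new pair, whereas one that has acquired the rule behaves like rule_action.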

These experimental data suggest that apes can successfully learn symbolic relationships. However, this learning is only obtained under certain experimental conditions that, for example, stress the pragmatic aspects of communication during language acquisition. These experiments are lengthy and complex, but they are useful for investigating the acquisition of symbolisation abilities in apes. They also indicate that animals can use symbols in ways that emulate human language without comprehending their representational function (Savage-Rumbaugh et al., 1980). For Deacon (1997) this evidence contributes to the explanation of the gap between animal communication systems and human language. Deacon also suggests that animals, even apes, have great difficulties in learning symbolic relationships because of differences in the structure and function of their brains, in particular in the prefrontal cortex areas.

Computational models for symbolisation: Evolving neural networks

Artificial neural nets are computational models that are inspired by the function and structure of biological neural systems. They are currently used for modelling cognition. However, using neural net models to study symbol acquisition is still a controversial subject. Some researchers are very sceptical (e.g., Fodor & Pylyshyn, 1988; Marcus, 1998), whilst others support the use of neural nets for cognitive tasks requiring symbolisation (e.g., Rumelhart & McClelland, 1986). Recently, the integration of genetic algorithms, as a model of evolution, with neural nets for cognitive modelling has been proposed for the study of the evolution of communication in populations of artificial organisms (Cangelosi & Harnad, in press). This method is part of the synthetic approach to the modelling of language evolution (Steels, 1997; Kirby, 1999). This paper uses a model of the evolution of communication (Cangelosi, 1999) to study symbolisation and symbol acquisition in neural nets. Such nets represent the organisms' cognitive systems that control behaviour and communication. The simulated tasks resemble those of Savage-Rumbaugh & Rumbaugh's (1978) chimpanzee experiments. This work proposes a complementary approach for the study of the evolution of symbolisation through computational modelling. Computer models will allow us to simulate and expand the experimental settings used in lengthy animal experiments. For example, neural nets can be used to test some of Deacon's (1997) hypotheses on the co-evolution of the brain, language and symbolisation.

Model setup

The model setup is directly inspired by the ape language experiments. A population of 80 artificial organisms performs a foraging task by collecting edible mushrooms, whilst avoiding poisonous mushrooms (toadstools). The organisation of the foraging task stimuli into a hierarchy of functional categories was derived from Savage-Rumbaugh & Rumbaugh's (1978) experiments. Our hierarchy consists of 2 high-level categories (edible and poisonous mushrooms) and 3 low-level categories (large, medium, and small mushrooms). Organisms will learn to name each of the three edible subcategories ("large edible", "medium edible", and "small edible") and a common verb for the high-order edible category, i.e. "approach". Each of the three toadstool subcategories ("large poisonous", "medium poisonous", and "small poisonous") requires the use of the same verb, i.e. "avoid". The organisms' fitness and reproduction depend upon the number of edible mushrooms correctly collected minus the number of toadstools collected. At each generation the 20 organisms with the highest fitness are selected and asexually produce 4 offspring each. The organism's genotype is the connection weight matrix of its neural net. New offspring are subject to a 10% random mutation of their weights. During the first 300 generations, organisms evolve the ability to discriminate between the 6 types of mushrooms (3 edible and 3 poisonous). From generation 301, organisms are able to communicate using 8 linguistic input/output units to describe mushrooms. Organisms learn to label mushrooms using the backpropagation algorithm. The teaching input is provided by their parents.
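The selection and reproduction cycle described above can be sketched as follows. This is a minimal illustration, not the original implementation. The population size, number of parents, and number of offspring are taken from the text. The reading of "10% random mutation" as perturbing 10% of the weights, the perturbation range, and all function names are assumptions made for the example.

```python
import random

POP_SIZE = 80          # organisms per generation (from the text)
N_PARENTS = 20         # fittest organisms selected for reproduction
N_OFFSPRING = 4        # offspring per selected parent (20 * 4 = 80)
MUTATION_RATE = 0.10   # assumed: fraction of weights perturbed per offspring
MUTATION_RANGE = 1.0   # assumed: perturbation drawn uniformly from [-1, +1]


def fitness(edible_collected, toadstools_collected):
    """Fitness as described in the text: edible mushrooms minus toadstools."""
    return edible_collected - toadstools_collected


def mutate(weights, rng):
    """Copy a genotype (flat list of weights) and perturb 10% of its entries."""
    child = list(weights)
    n_mutations = max(1, round(MUTATION_RATE * len(child)))
    for i in rng.sample(range(len(child)), n_mutations):
        child[i] += rng.uniform(-MUTATION_RANGE, MUTATION_RANGE)
    return child


def next_generation(population, scores, rng):
    """Select the 20 fittest genotypes; each produces 4 mutated offspring."""
    ranked = sorted(range(len(population)), key=lambda i: scores[i], reverse=True)
    parents = [population[i] for i in ranked[:N_PARENTS]]
    return [mutate(p, rng) for p in parents for _ in range(N_OFFSPRING)]
```

Because reproduction is asexual, each offspring differs from its single parent only through mutation, and the population size stays constant at 80 from one generation to the next.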

A 3-layer feedforward neural net controls the behaviour of the organism. In the input layer 18 units encode the perceptual features of the closest mushroom, 3 units encode its location, and 8 units encode the 8 symbols available for communication. The hidden layer has 5 units. In the output layer 3 units are used to control the organism’s behaviour (movement and action depending on mushroom size), and 8 units are used to produce the communication symbols. Symbolic output units are organised in two winner-takes-all clusters of competitive units (one cluster of 6 units, one of 2).
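The architecture can be made concrete with a short forward-pass sketch. The layer sizes and the two winner-takes-all clusters come from the description above. The sigmoid activation, the absence of bias units, the ordering of the output units, and all names are assumptions made for the example.

```python
import math
import random

N_PERCEPTUAL, N_LOCATION, N_SYMBOL = 18, 3, 8    # input groups (from the text)
N_INPUT = N_PERCEPTUAL + N_LOCATION + N_SYMBOL    # 29 input units
N_HIDDEN = 5
N_ACTION = 3
N_OUTPUT = N_ACTION + N_SYMBOL                    # 11 output units
CLUSTERS = [(0, 6), (6, 8)]   # symbolic outputs: one cluster of 6, one of 2


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def random_weights(n_units, n_inputs, rng):
    """One row of incoming weights per unit, drawn from [-1, 1] (assumed)."""
    return [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_units)]


def layer(inputs, weights):
    """Fully connected layer with sigmoid units and no biases (assumed)."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs))) for row in weights]


def winner_takes_all(activations, clusters):
    """Within each cluster, the most active unit is set to 1, the rest to 0."""
    out = [0.0] * len(activations)
    for start, stop in clusters:
        winner = max(range(start, stop), key=lambda i: activations[i])
        out[winner] = 1.0
    return out


def forward(inputs, w_hidden, w_output):
    """Return the 3 behavioural outputs and the 8 competitive symbol outputs."""
    hidden = layer(inputs, w_hidden)
    output = layer(hidden, w_output)
    return output[:N_ACTION], winner_takes_all(output[N_ACTION:], CLUSTERS)
```

Under these assumptions the genotype holds 29 x 5 + 5 x 11 = 200 connection weights, and every utterance consists of exactly one active unit in each of the two symbol clusters.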

Results

The simulation of the model was repeated 10 times, starting from different random populations. By generation 300, fitness had increased to an optimal level in 9 out of 10 replications. These 9 successful populations were used to evolve communication from generation 301 to 400. In approximately half of the replications, organisms evolved an optimal lexicon, i.e. the use of at least 4 symbols or symbol combinations to distinguish 4 types of mushrooms (the toadstools plus the three types of edible mushroom); a detailed description of the model's results can be found in Cangelosi (1999). In the remaining populations, some mushrooms were incorrectly labelled and classified due to the lack of a specific symbol. Note that the majority of successful simulations evolved languages that used combinations of symbols, and in particular some evolved the "verb-noun" structure. Two different "verb" symbols were used for toadstools and edible mushrooms, respectively. The other symbol was used to distinguish between the three subcategories of mushrooms. Figure 1 shows the charts of an evolved "verb-noun" language. Note that the two "verb" symbols ("Y" and "Z") emerge in the early stages of language evolution and then stabilise. The names for the mushroom subcategories are subject to continuous change and reach a stable and optimal point only at the last generation.
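The criterion for an optimal lexicon can be stated as a small check. In this sketch a lexicon maps each of the six mushroom types to the tuple of symbols used for it. The representation, the function name, and the "noun" letters A, B, C are hypothetical. The "verb" symbols Y and Z are the ones mentioned in the text, but which of them marks edible mushrooms is an assumption.

```python
EDIBLE = ["small_edible", "medium_edible", "large_edible"]
TOADSTOOLS = ["small_toadstool", "medium_toadstool", "large_toadstool"]


def is_optimal(lexicon):
    """True if the lexicon separates the 4 behaviourally relevant types:
    each edible subcategory has its own label, and no edible label is
    also used for a toadstool (toadstools may share a single label)."""
    edible_labels = [lexicon[m] for m in EDIBLE]
    toadstool_labels = {lexicon[m] for m in TOADSTOOLS}
    if len(set(edible_labels)) != len(EDIBLE):
        return False
    return toadstool_labels.isdisjoint(edible_labels)


# A "verb-noun" language: Z for edible, Y for toadstools (assumed mapping).
verb_noun = {
    "small_edible": ("Z", "A"), "medium_edible": ("Z", "B"), "large_edible": ("Z", "C"),
    "small_toadstool": ("Y", "A"), "medium_toadstool": ("Y", "B"), "large_toadstool": ("Y", "C"),
}

# A deficient language: two edible subcategories share one label.
deficient = dict(verb_noun, medium_edible=("Z", "A"))

print(is_optimal(verb_noun), is_optimal(deficient))
```

The deficient case corresponds to the populations in which some mushrooms were mislabelled because a specific symbol was missing.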

[Figure 1: two charts, one for generation 300 and one for generation 400]

Figure 1: Structure of evolved language at generation 300 and generation 400. (Letters A-Z for the 8 available symbols. SE, ME, BE, respectively for Small Edible, Medium Edible, and Big Edible mushrooms; ST, MT, BT, for Small Toadstool, Medium Toadstool, and Big Toadstool)

The symbol acquisition test

To study the evolution of symbolisation it is important to establish whether these apparently symbolic "verb-noun" structures are based on real symbolic relationships, and whether the organism is able to choose the correct verb for the name of each new edible or poisonous mushroom. In order to analyse the type of referencing systems that organisms evolved, a symbol acquisition test was used, similar to that in Savage-Rumbaugh & Rumbaugh's (1978) chimpanzee experiments. The test was performed off-line, separately from the simulation of the evolution of language by self-organisation. The goal was to teach the organisms' neural nets a perfect "verb-noun" language. This language was imposed by providing the teaching input for the backpropagation cycle directly, as opposed to receiving it from the parent. The test consisted of three learning stages. In the first stage, organisms learned to label only four types of objects (large and medium toadstools, large and medium edible mushrooms). During this stage verbs were not used, and no names were taught for the remaining two categories (small edible and small poisonous mushrooms). In the second stage, organisms learned to associate the two verbs "approach" and "avoid" with the categories large/medium edible and large/medium poisonous mushrooms, respectively. At this point, it was expected that organisms would have learned the logical relationship between the names of the two edible mushrooms and the verb "approach", and the logical relationship between the names of the two toadstools and the verb "avoid". In the final stage the names of the small poisonous and small edible categories were introduced. The association of the two verbs with these new names was not taught. It was expected that only organisms that had learned true symbolic relationships between verbs and names would be able to generalise this rule to the new mushroom names.
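The three-stage protocol can be summarised as a teaching schedule plus a pass criterion. This sketch encodes only what is taught at each stage and how generalisation is scored; it does not train a network. The category identifiers, the dictionary layout, and the function names are hypothetical. Whether names continue to be rehearsed during the second stage is not stated in the text, so the sketch marks only the verbs for that stage.

```python
TRAINED = ["large_edible", "medium_edible", "large_toadstool", "medium_toadstool"]
HELD_OUT = ["small_edible", "small_toadstool"]


def verb_for(category):
    """The verb of the imposed language: approach edible, avoid toadstools."""
    return "approach" if category.endswith("_edible") else "avoid"


def teaching_targets(stage):
    """Return what the teaching input specifies for each category at a stage."""
    if stage == 1:   # names only, for four of the six categories
        return {c: {"name": c, "verb": None} for c in TRAINED}
    if stage == 2:   # verbs for the same four categories
        return {c: {"name": None, "verb": verb_for(c)} for c in TRAINED}
    if stage == 3:   # names of the small categories; their verbs are never taught
        return {c: {"name": c, "verb": None} for c in HELD_OUT}
    raise ValueError("stage must be 1, 2 or 3")


def passes_test(produce_verb):
    """A net passes if it produces the correct, untaught verb for both
    small categories. produce_verb maps a category to the verb emitted."""
    return all(produce_verb(c) == verb_for(c) for c in HELD_OUT)
```

The key design point is that the verb entries for the two small categories are absent from every stage, so a correct verb at test time can only come from a learned relationship between names and verbs.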

The symbol acquisition test was repeated in ten different replications. After the three learning stages, seven populations produced the correct associations ‘small_edible’-"approach" and ‘small_toadstool’-"avoid". In three populations the learning of the names for small mushrooms did not produce the activation of the proper verb. This means that these organisms did not learn any symbolic association. In the seven successful populations, by contrast, the language is based on logical relationships between the mushrooms’ names and the two verbs. The relationships between words and real objects, and between verbs and objects’ names, allow neural nets to generalise the association of new names with the correct verb category. These results show that neural networks can learn simple languages that use symbolic associations.

Conclusion

The model simulation of the evolution of self-organising languages and the symbol acquisition test show that neural nets, like chimpanzees, can be used as "models" for the study of the evolution of language and of symbolisation. Some nets, like some chimpanzees, were not able to learn a real symbolic language, even though they were apparently using languages with "verb-noun" rules. Further analyses of the nets' internal representations, and of the nets' training history, will permit us to understand the conditions that lead to the acquisition of true symbolic languages. Moreover, computational models such as neural nets allow us to manipulate some of their features (e.g. the neural net architecture) to better understand the neural mechanisms for symbolisation and language acquisition.

References

Cangelosi A. (1999). Modeling the evolution of communication: From stimulus associations to grounded symbolic associations. In D. Floreano, J. Nicoud, F. Mondada (Eds.), Advances in Artificial Life, Berlin: Springer-Verlag, 654-663.

Cangelosi A. & Harnad S. (in press). The adaptive advantage of symbolic theft over sensorimotor toil: Grounding language in perceptual categories. Evolution of Communication. Presented at the 2nd International Conference on the Evolution of Language, London, April 1998.

Deacon T.W. (1997). The symbolic species: The coevolution of language and the human brain. London: Penguin.

Fodor J. & Pylyshyn Z. (1988). Connectionism and cognitive architecture: A critical analysis. Cognition, 28, 3-71.

Greenfield P.M. & Savage-Rumbaugh S. (1990). Grammatical combination in Pan paniscus: Processes of learning and invention in the evolution and development of language. In S.T. Parker & K.R. Gibson (Eds.), Language and intelligence in monkeys and apes, Cambridge University Press, 540-579.

Harnad S. (1990). The symbol grounding problem. Physica D, 42, 335-346.

Kirby S. (1999). Function, selection and innateness: The emergence of language universals, Oxford University Press.

Marcus G.F. (1998). Rethinking eliminative connectionism. Cognitive Psychology, 37(3), 243-282.

Peirce C.S. (1955). Logic as semiotic: The theory of signs. In J. Buchler (Ed.), Philosophical writings of Peirce. New York: Dover Books.

Rumelhart D.E. & McClelland J.L. (Eds.) (1986). Parallel Distributed Processing: Explorations in the microstructure of cognition. Cambridge, MA: MIT Press.

Savage-Rumbaugh S. (1986). Ape languages: From conditioned response to symbol. New York: Columbia University Press.

Savage-Rumbaugh S. & Rumbaugh D.M. (1978). Symbolization, language, and chimpanzees: A theoretical reevaluation based on initial language acquisition processes in four young Pan troglodytes. Brain and Language, 6, 265-300.

Savage-Rumbaugh S., Rumbaugh D.M., Smith S.T., & Lawson J. (1980). Reference: The linguistic essential. Science, 210, 922-925.

Steels L. (1997). The synthetic modeling of language origins. Evolution of Communication, 1(1).

Conference site: http://www.infres.enst.fr/confs/evolang/
