Evolution of Language

3rd Conference
The Evolution of Language
April 3rd - 6th, 2000

Abstracts

Correlation between genetic and linguistic
differentiation of human populations:
The specific action of linguistic boundaries on gene flow?

Isabelle Dupanloup de Ceuninck1, André Langaney1,2, Laurent Excoffier1

1 Laboratoire de Génétique et Biométrie, Département d'Anthropologie, Université de Genève, Suisse.
isabelle.dupanloup@anthro.unige.ch
2 Laboratoire d'Anthropologie Biologique, Musée de l'Homme,
Paris, France

Introduction

Several linguists have recently proposed classifying the 5,000 languages currently spoken in the world into only slightly more than 20 families and a few isolates. Among these linguists, several authors support the hypothesis of a common origin of these linguistic families and isolates. But this proposal, like the proposed existence of intra- and inter-continental families of languages, is far from gaining wide support in the field of genetic linguistics.

Human population geneticists have recently tried to compare their findings with those of linguists. In the last decades, several studies have revealed a strong correlation between the genetic differentiation and the high-level linguistic differentiation of human populations, at both the world-wide and the continental scale. These results can be interpreted as the trace of two alternative or concomitant processes: the coevolution of genetic and linguistic structures, and/or a drastic reduction of gene exchange between human populations caused by linguistic boundaries.

The study of human genetic polymorphisms seems to indicate a relatively recent coalescence of the genealogies of all human populations. Present-day human populations would have diverged recently from a small ancestral population in which linguistic diversity, in the sense of the number of independent linguistic lineages, was probably low. Thus, even though the validity of large families of languages cannot be settled within the field of genetic linguistics, human population genetics seems favorable to the hypothesis of a common origin of present-day human languages. Moreover, the correlation observed between the genetic and linguistic differentiation of human populations can be interpreted as the result of a common divergence of genes and languages in the history of these populations.

This correlation could nevertheless have been induced by the action of linguistic boundaries, i.e. the transition zones between the distribution areas of languages and families of languages, on the exchange of genes between populations. If these frontiers effectively reduced the exchange of migrants, and thus the gene flow between populations, the genetic differentiation between populations speaking different languages would be greater than that between populations speaking the same language, because the latter exchanged more genes than the former. We would thus observe a strong correlation between linguistic and genetic diversity.

We have developed an original methodology to test whether linguistic frontiers correspond to barriers to genetic contact between populations. We do not reject the hypothesis of a synchronisation of the genetic and linguistic differentiation of human populations. This hypothesis, although very difficult to test, is not incompatible with the proposal that certain frontiers between languages are not permeable to free gene flow.

We present here our new method and the results of its application to the linguistic boundary between Afro-Asiatic and Indo-European populations tested for classical and molecular markers (RH system, Y chromosome-specific p49a,f/TaqI restriction polymorphisms).

Linguistic boundaries: Segmentation and evaluation

The aim of our method is to evaluate the "permeability" of linguistic boundaries, that is to say, to estimate the reduction of gene flow between populations exerted by these cultural boundaries. We determine whether populations speaking different languages are more differentiated genetically than populations belonging to the same linguistic group. The question tested here is whether linguistic boundaries correspond to genetic boundaries.

Our method is based principally on comparing the genetic distances between populations of the same linguistic group with those between populations located on either side of the linguistic boundary under evaluation. We use an isolation-by-distance model to estimate the value that the linguistic frontier adds to the genetic distance expected, given their geographical distribution, between populations located on either side of it. This added genetic distance, called the da statistic, is tested by permuting the populations across the two sides of the frontier and recomputing the statistic after each permutation round to obtain its null distribution.
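The abstract does not give the exact form of the isolation-by-distance model or of the da statistic. The following is therefore only a minimal sketch under stated assumptions: a linear regression of genetic on geographic distance, fitted on same-side pairs, stands in for the isolation-by-distance expectation, and the added distance is taken as the mean excess of across-boundary distances over that expectation. All function names and the toy data are hypothetical.

```python
import random
from itertools import combinations


def fit_line(xs, ys):
    """Ordinary least-squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:  # all geographic distances equal: fall back to the mean
        return my, 0.0
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def added_distance(gen, geo, side):
    """Assumed form of the statistic: mean excess of across-boundary genetic
    distances over the expectation fitted on same-side pairs."""
    pairs = list(combinations(range(len(side)), 2))
    within = [(i, j) for i, j in pairs if side[i] == side[j]]
    across = [(i, j) for i, j in pairs if side[i] != side[j]]
    a, b = fit_line([geo[i][j] for i, j in within],
                    [gen[i][j] for i, j in within])
    excess = [gen[i][j] - (a + b * geo[i][j]) for i, j in across]
    return sum(excess) / len(excess)


def permutation_test(gen, geo, side, rounds=999, seed=0):
    """Shuffle side labels among populations to build a null distribution;
    returns the observed statistic and a one-sided p-value."""
    rng = random.Random(seed)
    observed = added_distance(gen, geo, side)
    shuffled = list(side)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(shuffled)
        if added_distance(gen, geo, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (rounds + 1)


# Toy data (invented): six populations on a line, three on each side.
positions = [0, 1, 2, 3, 4, 5]
side = ["A", "A", "A", "B", "B", "B"]
geo = [[abs(p - q) for q in positions] for p in positions]
# Genetic distance grows with geography, plus 0.05 across the boundary.
gen = [[0.01 * geo[i][j] + (0.05 if side[i] != side[j] else 0.0)
        for j in range(6)] for i in range(6)]

observed, p_value = permutation_test(gen, geo, side)
```

In this toy data set the boundary adds 0.05 by construction, so the observed statistic recovers that value. With only three populations per side few distinct permutations exist, so the p-value cannot be very small; real data sets would involve many more populations.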

We also test whether the distributions of gene frequencies in the groups separated by the frontier under study differ from each other, using the analysis-of-molecular-variance approach (AMOVA). The FCT index, which reflects the variation between the groups of populations separated by the linguistic frontier, is estimated, and its significance is tested with a nonparametric permutation procedure.
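For reference, the among-group index in the standard AMOVA framework can be written as below. The variance-component symbols are not used in this abstract and are introduced here only for illustration: \sigma^2_a is the component among groups (here, the two sides of the frontier), \sigma^2_b the component among populations within groups, and \sigma^2_c the component within populations.

```latex
\sigma^2_T = \sigma^2_a + \sigma^2_b + \sigma^2_c ,
\qquad
F_{CT} = \frac{\sigma^2_a}{\sigma^2_T}
```

In that framework the null distribution of this index is usually obtained by permuting whole populations among groups, which is consistent with the permutation procedure mentioned above.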

As the processes acting on different portions of the frontier may be heterogeneous, we divide the frontier into segments of arbitrary size and then evaluate the "permeability" of each segment independently. Some portions of the linguistic boundary may indeed act as strong genetic barriers, whether or not they correspond to an ecological frontier; others may not enhance genetic differentiation at all. The goal of segmenting the linguistic boundary under evaluation is to understand, at a finer scale, the genetic processes at work along it. As for the whole frontier, we associate two values with each segment analyzed: an FCT value, which estimates the genetic variation between the groups of populations on either side of the segment, and the genetic distance that the segment adds to the distance expected between populations located on either side of it.
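The segmentation step can be sketched as follows. The abstract does not say how populations are assigned to segments, so this sketch assumes that each population has a position measured along the boundary and that segments are defined by a sorted list of breakpoints. The population names, positions, side labels and breakpoints are all invented for illustration.

```python
from bisect import bisect_right
from collections import defaultdict


def group_by_segment(populations, breakpoints):
    """Assign populations to boundary segments.

    populations: iterable of (name, position_along_boundary, side)
    breakpoints: sorted positions where one segment ends and the next begins
    Returns {segment_index: {side: [names]}}.
    """
    segments = defaultdict(lambda: defaultdict(list))
    for name, position, side in populations:
        # bisect_right gives the number of breakpoints <= position,
        # which is the index of the segment containing the population.
        index = bisect_right(breakpoints, position)
        segments[index][side].append(name)
    return {index: dict(sides) for index, sides in segments.items()}


# Hypothetical populations: (name, position along the boundary, side)
populations = [
    ("pop1", 5.0, "north"),
    ("pop2", 12.0, "south"),
    ("pop3", 14.0, "north"),
    ("pop4", 30.0, "south"),
]
segments = group_by_segment(populations, breakpoints=[10.0, 20.0])
```

Each segment can then be evaluated with the same statistics as the whole frontier. A segment that has populations on only one side, as segments 0 and 2 do in this toy example, cannot be evaluated, which is one practical constraint on the choice of segment sizes.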

Application of our methodology to the Afro-Asiatic/Indo-European case

We have chosen to study the impact on gene flow of the linguistic frontier separating the Afro-Asiatic and Indo-European populations, which are well characterized for classical as well as molecular markers. According to several linguists, the Afro-Asiatic and Indo-European language families belong together in the Nostratic super-phylum. These two groups are separated in the western part of their distribution area by the Mediterranean Sea and meet in the eastern zone, in the Middle East. According to a study of the Rhesus and GM polymorphisms in the language families defined by , these two families show similar genetic characteristics, with genetic proximity between the Afro-Asiatic populations of North Africa and the Indo-European populations of Europe, a differentiation of the Afro-Asiatic populations of East Africa towards populations of other African linguistic groups (Khoisan, Nilo-Saharan and Niger-Kordofanian language families), and a large diversity among the Indo-European populations of western Asia and India.

Taken as a whole, the Afro-Asiatic/Indo-European linguistic boundary seems to have reduced gene flow between populations (RH system, Y chromosome-specific p49a,f/TaqI restriction polymorphisms), but the impact of this frontier on population differentiation seems quite heterogeneous, especially in its middle section (RH system). The Mediterranean Sea does not constitute a real barrier to gene flow, as is documented for historical times by the development of commercial routes between the northern and southern shores of the Mediterranean. This result is also compatible with an independent colonization of the northern and southern coasts from the Middle East, followed by subsequent contacts between populations of the two immigration waves.

References