Study Clarification III

March 23, 2005

Here's an old paper from the 80s that "estimates" between 10% and 34% sub-Saharan African gene flow into Sicily based on what essentially amounts to 4 carriers of an African mtDNA marker found in a single sample of 90 people.


Mitochondrial DNA polymorphisms in Italy. III. Population data from Sicily: a possible quantitation of maternal African ancestry

Semino et al. (1989)
Ann Hum Genet


Misused Quotes:

Of particular interest is that the HpaI-3/AvaII-3 complex, which is unique to groups of African ancestry, was found in Sicily at a frequency of 4.4%. For the first time an estimate of the amount of gene flow from Blacks to the Sicilian gene pool could be obtained. […] Using the weighted mean of the frequencies of the HpaI-3/AvaII-3 marker in Senegalese and in Bantu as representative of the parental African population, the total amount of gene migration (M) from Blacks into the Sicilians was estimated according to the method of Bernstein (1931) and a value of 0.108 ± 0.053 was obtained.


The Negroid component could have been transmitted directly through the introduction of groups of Negro slaves into the island by Phoenicians and Romans and/or indirectly through Arabic migrations. […] The only Arabs for which data on mtDNA polymorphisms are available, are from Israel (Bonné-Tamir et al. 1986). These show an incidence of the combination of interest of 12.8%, a frequency which is compatible with a Negro contribution in the Arabic gene pool of about 30%. […] If one assumes that Negro genes arrived in Sicily only through Arabs, a 0.344 ± 0.049 value would be obtained for the amount of Arab gene migration (M).

Although the actual genetic contribution from African populations could only be estimated as lying in between the two M values we calculated, this work shows that, whichever way these genes arrived, a substantial Negro component is present in the Sicilian gene pool.


This is an old study, and the marker being used is not an actual haplogroup but a restriction-site pattern (a combination of HpaI and AvaII cut sites). The way that admixture is being "estimated" is also not common practice in population genetics. Not only is it imprecise and unreliable, as the authors freely admit, but in this case it's based on an unusually high frequency (4.4%) of African mtDNA in Sicily, as we'll see below.

But even disregarding all that, the lower estimate has a margin of error of ± 0.053 (5.3 percentage points), so it could be as little as 5.5% maternal gene flow (i.e. 2.75% total admixture), and the upper estimate assumes huge levels of Arab admixture that are not borne out by any research. Moreover, the Arab reference sample is composed of Palestinians, whom the authors "estimate" to have received 30% African gene flow. Yet when Palestinians were tested with the STRUCTURE admixture program using 377 autosomal microsatellite loci, they had only 2% total African admixture (Rosenberg et al. 2002; Table 2 in the Supplementary Information), and when the same sample was typed at 642,690 SNPs, they showed no African admixture at all (Li et al. 2008; Table S1). That's a powerful example of the limitations of this study's outdated estimates.
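To see where the two M values come from, here is a rough Python sketch of a Bernstein-style estimate, M = (p_mixed - p_recipient) / (p_source - p_recipient). It is only an approximation under stated assumptions: the marker frequency in the pre-admixture Sicilian population is taken as 0 (the authors call the marker unique to groups of African ancestry), and the African source frequency is taken as the roughly 40% that Vona et al. (2001) cite for Senegalese and Bantu, not the exact weighted mean used in the paper. The function and variable names are made up for illustration.

```python
def bernstein_m(p_mixed, p_source, p_recipient=0.0):
    """Admixture proportion M from marker frequencies.

    p_recipient defaults to 0.0, an assumption: the marker is treated as
    absent from the population receiving the gene flow.
    """
    return (p_mixed - p_recipient) / (p_source - p_recipient)


P_SICILY = 0.044   # marker frequency in the Sicilian sample (from the quote)
P_AFRICAN = 0.40   # approximate Senegalese/Bantu frequency (assumed, per Vona et al.)
P_ARAB = 0.128     # frequency in the Arab sample from Israel (from the quote)

m_direct = bernstein_m(P_SICILY, P_AFRICAN)    # direct African gene flow
m_via_arabs = bernstein_m(P_SICILY, P_ARAB)    # gene flow only through Arabs

# Published lower estimate minus its stated error, and the same value halved
# to express it as total (maternal + paternal) admixture.
lower_bound = 0.108 - 0.053

print(round(m_direct, 3))         # close to the published 0.108
print(round(m_via_arabs, 3))      # matches the published 0.344
print(round(lower_bound, 3), round(lower_bound / 2, 4))
```

Note how sensitive the result is to the choice of source population: the same 4.4% yields about 11% or about 34% depending only on which parental frequency goes in the denominator.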

Another study from more than a decade later, Vona et al. (2001), addressed some of these issues and the old Semino paper directly, showing that with a different sample and method, different results are obtained:

In work carried out with restriction enzymes on mtDNA in a sample of Sicilians, Semino et al. (1989) indicated the presence (4.4%) of the African complex HpaI-3/AvaII-3 (40% in Senegal and in the Bantu of South Africa). The authors hypothesized a migration of genes from Africa to Sicily, estimated at about 10%, which was introduced into the Sicilian gene pool by Black slaves brought by the Phoenicians and the Romans and/or by Arab migrations. Results at the mtDNA sequencing level, however, show no Black African influence in the Sicilian population.

Two years after that, a far more comprehensive analysis of Sicilian mtDNA was conducted by Romano et al. (2003), using a much larger sample drawn from seven different locations. Unlike Vona et al., this study did find some sub-Saharan African maternal lineages in Sicily, but at a rate of only 0.65% (3 sequences in a sample of 465).

If we pool all of the above data, which is always a good idea, we get 7 African sequences (4 from Semino et al., 0 from Vona et al. and 3 from Romano et al.) in a combined sample of 604, which is 1.16% maternal admixture (or 0.58% total admixture), a figure that's extremely low and comparable to admixture levels elsewhere in Europe.
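The pooling arithmetic can be checked with a few lines of Python. This is just a sketch: the Vona et al. sample size is not stated in this post, so it is inferred here as whatever is left of the 604 total after subtracting the other two samples, and its African count is taken as 0 because that study reports no Black African influence. Halving the maternal figure assumes no contribution at all on the paternal side.

```python
# (African sequences, sample size) per study
samples = {
    "Semino et al. (1989)": (4, 90),
    # Size inferred from the pooled total of 604 (an assumption, not a
    # figure quoted in this post); count taken as 0.
    "Vona et al. (2001)": (0, 604 - 90 - 465),
    "Romano et al. (2003)": (3, 465),
}

african = sum(hits for hits, _ in samples.values())
total = sum(size for _, size in samples.values())
maternal = african / total

# Per-study frequencies, to show how far the 1989 sample sits from the others
for study, (hits, size) in samples.items():
    print(f"{study}: {hits}/{size} = {hits / size:.2%}")

print(f"pooled: {african}/{total} = {maternal:.2%}")
print(f"total admixture (maternal / 2): {maternal / 2:.2%}")
```

The pooled lines should reproduce the 1.16% and 0.58% figures given above.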

This result, together with the absence of Negroid Y-chromosomes in Sicily, discredits claims that Sicilians have a Black African racial component.

Updated 11/13/2009

North African Y-chromosomes

March 7, 2005

A Predominantly Neolithic Origin for Y-Chromosomal DNA Variation in North Africa

Arredi et al. (2004)
Am J Hum Genet

ABSTRACT: We have typed 275 men from five populations in Algeria, Tunisia, and Egypt with a set of 119 binary markers and 15 microsatellites from the Y chromosome, and we have analyzed the results together with published data from Moroccan populations. North African Y-chromosomal diversity is geographically structured and fits the pattern expected under an isolation-by-distance model. Autocorrelation analyses reveal an east-west cline of genetic variation that extends into the Middle East and is compatible with a hypothesis of demic expansion. This expansion must have involved relatively small numbers of Y chromosomes to account for the reduction in gene diversity towards the West that accompanied the frequency increase of Y haplogroup E3b2, but gene flow must have been maintained to explain the observed pattern of isolation-by-distance. Since the estimates of the times to the most recent common ancestor (TMRCAs) of the most common haplogroups are quite recent, we suggest that the North African pattern of Y-chromosomal variation is largely of Neolithic origin. Thus, we propose that the Neolithic transition in this part of the world was accompanied by demic diffusion of Afro-Asiatic-speaking pastoralists from the Middle East.


Y-chromosomal studies are potentially highly informative about the origin of male-specific lineages, because of the detailed haplotypes that can be obtained and their high geographical specificity (Jobling and Tyler-Smith 2003), but previous studies have been restricted to limited regions of North Africa (Bosch et al. 1999, 2001; Flores et al. 2001; Manni et al. 2002; Luis et al. 2004). Together, these genetic analyses highlighted the similarity between northeastern Africa and the Middle East and the clear genetic differentiation between northwestern Africa and both sub-Saharan Africa and Europe, including Iberia. The Sahara and Mediterranean, despite the narrow width of the Strait of Gibraltar, seem to have acted as effective long-term barriers to Y-chromosomal gene flow.


First, as shown in fig. 1B, the lineages that are most prevalent in North Africa are distinct from those in the regions to the immediate north and south: Europe and sub-Saharan Africa. [...] Such a finding is not surprising, in the light of the earlier genetic studies, but has an important implication: despite haplogroups shared at low frequency, suggesting limited gene flow, North African populations have a genetic history largely distinct from both Europe and sub-Saharan Africa over the timescales needed for the Y-chromosomal differentiation to develop.


Indeed, the positions of the samples in the MDS plot describe a latitudinal axis, from North Africa and the Middle East in the upper part to Central and southern Africa in the lower part. Furthermore, the pattern of genetic affinities among the North African samples parallels the west-east orientation quite precisely, from Morocco on the left-hand side to Egypt and the Middle East on the right.


In conclusion, we propose that the Y-chromosomal genetic structure observed in North Africa is mainly the result of an expansion of early food-producing societies. [...] Since most of the languages spoken in North Africa and in nearby parts of Asia belong to the Afro-Asiatic family (Ruhlen 1991), this expansion could have involved people speaking a proto-Afro-Asiatic language. These people could have carried, among others, the E3b and J lineages, after which the M81 mutation arose within North Africa and expanded along with the Neolithic population into an environment containing few humans.