the outbreeding project continues apace…

…in the united states! (just as you all suspected.) amongst white folks anyway (that’s who was included in the study below).

from a study published in 2009, Measures of Autozygosity in Decline: Globalization, Urbanization, and Its Implications for Medical Genetics:

“This research has definitively shown the existence of a trend for decreasing autozygosity with younger chronological age in the North American population of European ancestry. The ROHs we identified, larger than 1 Mb, are clearly representative of autozygosity due to distant consanguinity in our outbred populations, and not chromosomal abnormalities or common copy number variants. Using our predictive models of decreasing Fld, we show a quantifiable decrease in consanguinity over the twentieth century. Based on data provided in Carothers et al, this decrease in Fld found in our discovery population is on the order of individuals transitioning from a single inbreeding loop 4–5 generations prior, to no inbreeding loops within <6 generations. We postulate that the increased mobility, urbanization and outbreeding in North America in the last century has led to less consanguinity (and thus less homozygosity and homogeneity) in younger individuals.”

the researchers looked at two different sets of genomes — one from the ninds repository @the coriell institute, the other from the baltimore longitudinal study of aging (blsa). the blsa is, obviously, biased towards people on the east coast of the u.s. (in and around baltimore). glancing through the list of submitters to ninds, there’s also something of an east coast bias there, although many samples do come from other areas of the country (see the list of locations at the end of this post).

amongst the findings in this study are that 1) the number of runs of homozygosity (roh) has decreased in white americans over the last one hundred years or so, and 2) the lengths of the roh have shrunk as well. both of these are good indicators of outbreeding.

here are a couple of tables/charts from the paper:

measures of autozygosity in decline - table 02

measures of autozygosity in decline - percent of genome in roh

what’s interesting to contemplate, i think, is what this might mean wrt selection pressures on americans going forward? especially, what might it mean in light of european-americans encountering other, newer groups within american society that are not outbreeding so much (at least not at the moment) — newly arrived immigrants from many muslim countries, for example — or even, perhaps, latin americans (although i’m not 100% sure about how much they’ve been inbreeding over the past few hundred years or so — stay tuned!). how is that all going to play out? interesting times.

possibly related footnote — here is an abstract from the 2013 ashg conference:

“Reconstructing the Genetic Demography of the United States”

“The United States (U.S) is a complex, multiethnic society shaped by immigration and admixture, but the extent to which these forces influence the overall population genetic structure of the U.S is unknown. We utilized self-reported ancestry data collected from the decennial U.S Census 2010 and allele frequency data from over 2000 SNPs for over 40 of the most common ancestries in the U.S. that were available from the Pan Asian Single Nucleotide Polymorphism (PASNP), Population Reference Sample (POPRES), 1000 Genomes, and Human Genome Diversity Panel (HGDP) databases. We utilized the relative proportions of individuals of each ancestry within each county, state, region and nation and calculate the weighted average allele frequency in these areas. We reconstructed the genetic demography of the U.S by examining the geographic distribution of Wright’s Fst. Shannon’s diversity index, H was calculated to assess the apportionment of genetic diversity at the county, state, regional and national level. This analysis was repeated stratifying by race/ethnicity. We analyzed households with spouses, using the phi-coefficient as a measure of assortative mating for ancestry. This analysis was repeated stratifying by age of the spouses (older or younger than 50). Most of the genetic diversity is between ancestries within county, but this varies by race/ethnicity, and ranges from 95% for Whites to 43% for Hispanics illustrating that the White ancestries are relatively homogeneously scattered throughout the U.S whereas the Hispanic ancestries show significant clustering by geography. Analysis of the mating patterns show strong within ethnicity assortative mating for American Indian/Alaska Natives, Asians, Blacks, Hispanic, Native Hawaiians/Pacific Islanders, and Whites, with φ = 0.30, 0.864, 0.92, 0.863, 0.478 and 0.832 respectively (P<1×10⁻³²⁴ for each) and significantly less correlation in the younger cohort. These results show demographic patterns of social homogamy which are slowly decreasing over time. One major implication is that data collected from different locations around the U.S are susceptible to both within- and between-location population genetic substructure, leading to potential biases in population-based association studies.”
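
for anyone wondering what that φ is: the phi coefficient is just the correlation between two yes/no variables. here is a minimal python sketch, assuming the spouse pairs for one ancestry group have been tallied into a 2×2 table. the function name and the table layout are assumptions for illustration, not anything taken from the abstract.

```python
import math


def phi_coefficient(n11, n10, n01, n00):
    """Phi coefficient for a 2x2 table of spouse pairs.

    n11: both spouses report the ancestry
    n10: only the first spouse does
    n01: only the second spouse does
    n00: neither spouse does
    (This layout is a hypothetical one chosen for the example.)
    """
    row1, row0 = n11 + n10, n01 + n00
    col1, col0 = n11 + n01, n10 + n00
    denom = math.sqrt(row1 * row0 * col1 * col0)
    if denom == 0:
        raise ValueError("phi is undefined when a row or column total is zero")
    return (n11 * n00 - n10 * n01) / denom
```

a value of 1 means spouses always match on the ancestry, 0 means pairing is independent of it, and negative values mean spouses match less often than chance would predict.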
_____

origin cities of the ninds samples (from a quick-ish glance):

Burlington, VT
Lebanon, NH
Boston x 10
New York x 7
Albany
Rochester, NY x 4
New Haven x 3

Bethesda x 7
Baltimore x 5
Philadelphia
Washington, D.C.

Winston-Salem, NC
Charleston, SC

Atlanta
Birmingham x 3
Augusta

Jacksonville x 4
Tampa
Gainesville

Cincinnati x 5
Cleveland
Lexington
Louisville
Memphis
Indianapolis, IN
Ann Arbor

Chicago x 3
Springfield, IL
Rochester, MN
Minneapolis
Englewood, CO
Kansas City

Houston x 4

Phoenix
Salt Lake City

Los Angeles
Irvine, CA
Fountain Valley, CA
San Diego x 2
San Francisco x 3
_____

previously: runs of homozygosity and inbreeding (and outbreeding) and runs of homozygosity in the irish population and western europeans, runs of homozygosity (roh), and outbreeding and russians, eastern europeans, runs of homozygosity (roh), and inbreeding

(note: comments do not require an email. funky penguin!)

inbreeding and cognitive ability among whites in the u.k.

via dienekes via jayman:

Genome-wide estimates of inbreeding in unrelated individuals and their association with cognitive ability

“INTRODUCTION

“Research on consanguineous marriages, and other forms of inbreeding, has long shown a reduction in cognitive abilities in the offspring of such unions. The presumed mechanism is that detrimental recessive mutations are more likely to be identical by descent in the offspring of such unions and so have a greater chance of being expressed. To date, research on the relationship between inbreeding and cognitive ability has largely been restricted to recent inbreeding events as determined by pedigree…. It has been suggested that intellectual disability is under negative selection, and that recent deleterious mutations have an important role in the underlying aetiology. The wealth of molecular genetic data currently available allows estimates of inbreeding on a genome-wide level and to examine the effects of long-term ancestral levels of inbreeding. Such an association with inbreeding, as measured by runs of homozygous polymorphisms (ROH), has previously been identified with several behavioural traits, such as schizophrenia, Parkinson’s disease and personality measures, as well as non-behavioural traits such as height.

“The relationship between inbreeding on a population level and cognitive ability is particularly interesting due to assortative mating, non-random mating, which is greater for cognitive ability than for other behavioural traits, as well as physical traits such as height and weight. Positive assortative mating has been reported for cognitive ability, particularly for verbal traits, with spousal correlations generally around 0.5. Assortative mating should lead to greater genetic similarity between mates at causal loci for cognitive ability and to a lesser extent across the genome, which in turn reduces heterozygosity at these loci. In other words, in contrast to the genome-wide reduction of heterozygosity caused by inbreeding, the reduction of heterozygosity due to assortative mating for a trait is limited to loci associated with the trait…. Another difference between inbreeding and assortative mating is that the effects of inbreeding are expected to be negative, lowering cognitive ability, whereas the effects of assortative mating affect the high, as well as the low end of the ability distribution, thus increasing genetic variance, that is, when high-ability parents mate assortatively, their children are more likely to be homozygous for variants for high ability, just as offspring of low-ability parents are more likely to be homozygous for variants for low ability….

“MATERIALS AND METHODS

“Participants

“The Twins Early Development Study (TEDS) recruited over 11 000 families of twins born within England and Wales between 1994 and 1996…. In this analysis, individuals were excluded if they reported severe current medical problems, as well as children who had suffered severe problems at birth or whose mothers had suffered severe problems during pregnancy. Twins whose zygosity was unknown or uncertain or whose first language was not English were also excluded. Finally, analysis was restricted to twins whose parents reported their ethnicity as ‘white’….

“Cognitive measures

“Verbal and non-verbal tests were administered using web-based testing. The verbal tests consisted of the Similarities subtest and the Vocabulary subtests from the Wechsler Intelligence Scale for children (WISC-III-UK). The non-verbal tests were the Picture Completion subtest from the WISC-III-UK and Conceptual Grouping from the McCarthy Scales of Children’s Abilities. A general score was derived from the test battery as the standardized sum of the standardized subtest scores, which correlates 0.99 with a score derived as the first principal component of the test battery score.

“Runs of homozygosity

“FROH was defined as the percentage of an individual’s genome consisted of runs of homozygosity (ROH)…. [O]nly ROH with a minimum of 65 consecutive SNPs covering 2.3Mb were used when calculating the total proportion of the genome covered by ROH. In addition, the required minimum density in a ROH was set at 200kb per SNP, and the maximum gap between two consecutive homozygous SNPs was set at 500kb….
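
to make that definition concrete, here is a minimal python sketch of an froh calculation using the thresholds quoted above (65 snps, 2.3 Mb, 200 kb per snp, 500 kb gap). the `Segment` class, the `froh_percent` function, and the way the genome length is passed in are all assumptions made for illustration. the paper's actual pipeline isn't described in this excerpt.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """One candidate run of homozygosity (hypothetical structure)."""
    start_bp: int
    end_bp: int
    n_snps: int
    max_gap_bp: int  # largest distance between neighbouring SNPs in the run

    @property
    def length_bp(self):
        return self.end_bp - self.start_bp


def froh_percent(segments, genome_bp, min_snps=65, min_len_bp=2_300_000,
                 max_bp_per_snp=200_000, max_gap_bp=500_000):
    """Percentage of the genome covered by segments that pass every filter."""
    kept = [
        s for s in segments
        if s.n_snps >= min_snps                          # enough consecutive SNPs
        and s.length_bp >= min_len_bp                    # long enough
        and s.length_bp / s.n_snps <= max_bp_per_snp     # dense enough
        and s.max_gap_bp <= max_gap_bp                   # no oversized gap
    ]
    return 100.0 * sum(s.length_bp for s in kept) / genome_bp
```

`genome_bp` has no default on purpose: the covered autosomal length depends on the snp array, so any built-in number would be a guess.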

“RESULTS

“Table 1 includes descriptive statistics for FROH and the three measures of cognitive ability (general, verbal, and non-verbal). FROH is slightly positively skewed, as it represents the total percentage of the genome that includes runs of homozygosity (ROH). The average percentage of genome covered by ROH was 0.7% (95% CI 0.65-0.72%). Verbal and non-verbal abilities correlate 0.49; because general cognitive ability is the sum of the standardized verbal and non-verbal subtests, they correlate much more highly with general ability (0.87 and 0.86, respectively).

inbreeding and iq - table 01

“Table 2 presents the results of the linear regression analyses. No significant regression was found between FROH and the cognitive measures after correction for multiple testing, although the association with non-verbal cognitive ability was nominally significant (P=0.03). Although this association was not statistically significant, it is noteworthy that every regression in Table 2 is *positive*, indicating that increased homozygosity tends to be associated with *higher* cognitive scores across different measures of cognitive ability (general, verbal and non-verbal).

inbreeding and iq - table 02

“Our analysis identified 87 loci where ROH overlapped in 10 or more individuals. For these overlapping regions we tested for association with each of the cognitive measures and again showed no significant associations after correction for multiple testing (P-values of less than 5.7 × 10⁻⁴). A sign test of the direction of effect across all ROH showed a disproportionately large number of *positive* associations, indicating that ROH are associated with higher cognitive ability (P=0.002). The sign test was non-significant for verbal ability but highly significant for non-verbal ability (P<10⁻⁶). The sign test for non-verbal ability alone remained significant after correcting for an individual’s genome-wide FROH score (P<10⁻⁶).
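
a sign test of this kind can be sketched in a few lines of python. this is an exact two-sided binomial version, assuming each locus counts as an independent coin flip (something overlapping roh may well violate). whether the authors used exactly this variant isn't stated in the excerpt, and the function name is hypothetical.

```python
from math import comb


def sign_test_p(n_positive, n_negative):
    """Exact two-sided sign test: are positive and negative effects equally likely?

    Under the null hypothesis each effect direction has probability 0.5,
    so the count of the more common direction follows Binomial(n, 0.5).
    """
    n = n_positive + n_negative
    if n == 0:
        raise ValueError("need at least one non-zero effect")
    k = max(n_positive, n_negative)
    # Probability of seeing k or more of the commoner sign, in one tail.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double for the two-sided test, capped at 1.
    return min(1.0, 2 * tail)
```

for instance, nine positive effects out of ten gives `sign_test_p(9, 1)` ≈ 0.021, while an even five/five split gives 1.0.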

“As explained earlier, positive assortative mating can also lead to genome-wide homozygosity for trait-specific loci, and, unlike inbreeding, assortative mating can affect the high as well as the low end of the ability distribution. One possible explanation for the trend suggesting a positive correlation between homozygosity and cognitive scores in our data is that positive assortative mating on intelligence might be greater for high cognitive ability individuals….

“DISCUSSION

“Our results show that within a representative UK population sample there was a weak nominally significant association between burden of autosomal runs of homozygosity and higher non-verbal cognitive ability. This nominal association with *increased* cognitive ability is counterintuitive when compared with the results from more extreme inbreeding based on pedigree information. A potential explanation for this direction of effect is that individuals with higher cognitive ability might show greater positive assortative mating, which would lead to increased homozygosity at loci for higher cognitive ability in their offspring. However, in a separate sample we showed that greater positive assortative mating was not associated with higher cognitive ability. While these findings seem to provide clear evidence against this hypothesis, it is possible that the genome-wide genetic finding reflect historical mating habits that no longer exist today. It should also be noted that there was a reduction in the standard deviations for spousal correlations in the increased cognitive ability groups by an average of 6% compared with the decreased cognitive ability group (see Table 3), which could reflect less genetic variability in the high ability couples or a ceiling effect on the cognitive tests. This lesser phenotypic variability at the high ability end would have a small effect in reducing the spouse correlations and potentially confound our analysis….

“Overall, these results highlight the importance of understanding mating habits, such as inbreeding and assortative mating, when investigating the genetic architecture of complex traits such as cognitive ability. The results certainly suggest that there is no large effect of FROH on reduced cognitive ability, the expected direction of effect. The nominally significant associations found in this study may even suggest that in the case of non-verbal cognitive ability, beneficial associations with homozygosity at specific loci might outweigh the negative effects of genome-wide inbreeding and that the relationship between inbreeding and cognitive ability may be more complicated than previously thought.”
_____

so, although obviously Further Research is Required™, these researchers have concluded that both the absence of reduced cognitive ability and the slight increase in cognitive ability which they found in individuals who had runs of homozygosity (roh) in their genomes (evidence of matings between genetically similar individuals) were probably NOT due to assortative mating (i.e. smart people mating with smart people).

furthermore, they suggest that the inbreeding-causes-reduced-cognitive-ability meme is incorrect — or at least that the situation is more complicated than the idea that it’s the accumulation of recent deleterious mutations which haven’t been selected away that is the (whole) problem. in fact, a little inbreeding seems to have a positive effect on some cognitive abilities!

i’ve suggested a couple of times one way in which inbreeding might result in a low average iq in a population, and that is if the inbreeding leads to clannish, altruistic behaviors between extended family members which then result in the deleterious mutations NOT being weeded out.

one real world example i’ve offered is how life works in egyptian villages and how the more successful and affluent (and, presumably, more intelligent) members of a clan are obliged to help out their less successful and poorer (and, presumably, less intelligent) clan members. so, apart from mentally retarded individuals not reproducing, where is the negative selection for deleterious mutations here? there is none. or it’s a lot weaker than in more individualistic societies (like gregory clark’s medieval england) where it’s more every man for himself — in clannish societies, deleterious mutations might be able to hang around for a long time, riding on the coattails of those with fewer deleterious mutations.


runs of homozygosity in the irish population

so, after all my rambling about the historic mating patterns amongst the native irish, how inbred are the irish really?

from Population structure and genome-wide patterns of variation in Ireland and Britain:

“[O]ur results suggest that the Irish population has the largest proportion of the genome in ROH (as measured by FROH1), relative to the British and HapMap CEU populations examined here (Figure 3).”

the members of the ceu population are mormons in utah. here is figure 3:

ireland - roh01 - o'dushlaine et al

“Figure 3 – FROH1 patterning in Irish, British and Swedish populations. Box plots represent (a) the number and (b) the summed size of segments of the autosomal genome that exists in ROH of 1 Mb or greater in length (ie, FROH1). The bars represent mean and confidence intervals, as per a standard box plot (box indicating the 25th–75th percentile of the FROH1 distribution, line within box representing the median and ends of the whiskers representing the 5th–95th percentiles). Outliers are represented by diamonds.”

so the irish: more AND longer roh or runs of homozygosity (1 Mb in length or greater) than the english, the utah mormons, scots in aberdeen, or the swedes — in that order (if i’m not mistaken). so the english here are the most outbred (what have i been saying?), while the irish are the most inbred.

more from the paper:

“Overall, the Irish and Swedish populations seem slightly different from the others in the context of ROH. Both the Irish and Swedish populations showed, on an average, a greater number of ROH, an increased maximum ROH length, as well as an increased proportion of the genome in homozygous runs, compared with that of the Scottish, southern English and Utah populations. Similarly, the mean level of individual autozygosity per population as measured by FROH [22] was highest for the Irish group (Figure 4). Together, these results suggest slightly increased autozygosity in the Irish cohort compared with the British and Swedish cohorts.”

here’s figure 4:

ireland - roh02 - o'dushlaine et al

Figure 4 – Mean FROH1 and FROH5 patterning in Irish, British and Swedish populations. See Figure 1 legend for population identifiers. Y-axis indicates the average proportion of the autosomal genome covered by FROH1 or FROH5 (see Materials and Methods for definition of FROH).

“Autozygosity is generated by increased levels of kinship, which in turn reflects the population history of Ireland. Although relatively undisturbed by secondary migrations, the population of Ireland has undergone expansions and contractions at numerous points in recent history (eg, two major famines since 1600, disease epidemics, expansion in the first half of the 19th century). Aside from these features, the increased autozygosity may also reflect legacies of Gaelic family structures and comparatively low levels of migration that are in part due to a lack of industrial revolution in Ireland.

“To test a hypothesis of increased autozygosity due to features of relatively recent population history, we examined the patterning of homozygosity looking for signals of parental relatedness over the last four or five generations. Previous work has illustrated that parental relatedness arising within four to six generations predominantly affects ROH over 5 Mb in length [22]. We therefore compared this statistic across populations. Results show that the Irish and Swedish populations have around 10 times as much of their genomes in ROH over 5 Mb in length than the southern English, and 1.5–3 times as much as Scotland and Utah (Figure 4)….

“Analysis of ROH is a powerful method to gauge the extent of ancient kinship and recent parental relationship within a population. This is because ROH arise from shared parental ancestry in an individual’s pedigree. The offspring of cousins have very long ROH, commonly over 10 Mb, whereas at the other end of the spectrum, almost all Europeans have ROH of ∼2 Mb in length, reflecting shared ancestry from hundreds to thousands of years ago. By focussing on ROH of different lengths, it is therefore possible to infer aspects of demographic history at different time depths in the past [22]. We used FROH measures to compare and contrast patterning across populations. These measures are genomic equivalents of the pedigree inbreeding coefficient, but do not suffer from problems of pedigree reconstruction. By varying the lengths of ROH that are counted, they may be tuned to assess parental kinship at different points in the past. We used two different measures, FROH1, which includes all ROH over 1 Mb and hence includes information on recent and background parental relatedness, and FROH5, which sums ROH over 5 Mb in length, more typical of a parental relationship in the last four to six generations [22]. Our FROH1 results indicate slightly elevated levels in the Irish and Swedish populations (compared with southern England, Scotland and HapMap CEU) of both the overall number of ROH and the proportion of genome in ROH (see Figure 3). This pattern was exaggerated when we restricted analysis to ROH greater than 5 Mb in length (ie, FROH5, see Figure 4), indicating increased levels of parental relatedness in the last six generations in the Irish and Swedish populations compared with other populations tested in this study. When we remove individuals with ROH over 5 Mb from the FROH1 analysis (Supplementary Figure S5), Ireland remains as the population with the most homozygous runs and the longest sum length of homozygosity. This provides further evidence that the elevated proportion of shorter ROH, and hence the number of ancient pedigree loops in Ireland, is indeed real and not driven by a limited number of offspring of cousins.”
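
the froh1 / froh5 comparison, and the check where individuals with long roh are dropped, can be sketched like this in python. everything here is a hypothetical illustration: the function names, the dict-of-lists input, and the choice to treat "over n Mb" as "n Mb or more" are assumptions, not the authors' code.

```python
MB = 1_000_000


def froh(roh_lengths_bp, genome_bp, min_mb):
    """Fraction of the genome in ROH of at least `min_mb` megabases."""
    return sum(l for l in roh_lengths_bp if l >= min_mb * MB) / genome_bp


def mean_froh(population, genome_bp, min_mb, drop_if_any_over_mb=None):
    """Mean FROH across a population.

    population: dict mapping a person id to a list of ROH lengths in bp.
    drop_if_any_over_mb: if given, people with any ROH at or above this
    length are excluded first (mimicking the removal of likely offspring
    of recent cousin matings).
    """
    people = list(population.values())
    if drop_if_any_over_mb is not None:
        limit = drop_if_any_over_mb * MB
        people = [rohs for rohs in people if all(l < limit for l in rohs)]
    if not people:
        raise ValueError("no individuals left after filtering")
    return sum(froh(rohs, genome_bp, min_mb) for rohs in people) / len(people)
```

calling `mean_froh(pop, genome_bp, 1)` and `mean_froh(pop, genome_bp, 5)` gives the two measures, and `mean_froh(pop, genome_bp, 1, drop_if_any_over_mb=5)` gives the froh1 figure with the long-roh individuals taken out.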

recent cousin matings, they mean.

so, if you look at figure 4, both the irish and the swedes have way more roh of over 5 Mb in length than the english (who have a really minuscule amount), the scots in aberdeen, or the mormons in utah (ceu) — in that order. in this instance, the swedes appear to have the most roh over 5 Mb, but as the authors say, when they removed the over 5 Mb individuals from the samples (i.e. the individuals most likely to be the offspring of recent cousin marriages), the irish wind up having the most and the longest roh over 1 Mb in length, so they win the overall inbreeding prize for these groups.

what the authors overlook, i think, is the longer term mating patterns of these populations. i think that the english in this study (and, it should be noted, that these are described as individuals from the south and southeast of england) have minuscule amounts of roh in their genomes because, out of all these groups, they have been outbreeding the longest (see the “mating patterns in europe” series) — since the early part of the middle ages, in fact. the irish and the swedes, on the other hand, have more roh because they started outbreeding much later (and, probably, too, because, like other northern populations, they’re somewhat remote and small in size) — the swedes sometime after they converted to christianity in — when was it? — ca. 1000 a.d.? and the irish, as i’ve shown in the last few posts on irish mating patterns, not until sometime towards the late medieval period — as late as the 1500s possibly.

the implication of all this is, because the irish and the swedes (and other groups in europe) inbred for longer than the english (and some of the french and dutch and germans), their societies would’ve remained clan- or extended-family based for longer than those of the english et al., and so would’ve been under different sorts of selection pressures from their social environment.
_____

update: Supplementary Figure S5 – when the researchers removed the individuals with roh over 5Mb, i.e. those individuals who were most likely to be the offspring of cousins (see comments):

ireland - roh03 - o'dushlaine et al

previously: runs of homozygosity and inbreeding (and outbreeding) and western europeans, runs of homozygosity (roh), and outbreeding and russians, eastern europeans, runs of homozygosity (roh), and inbreeding and early and late medieval irish mating practices and clannish medieval ireland and inbreeding in europe’s periphery and early modern and modern clannish ireland and meanwhile, in ireland… and drinkin’ and fightin’ songs and mating patterns, family types, and clannishness in twentieth century ireland and inbreeding in ireland in modern times


western europeans, runs of homozygosity (roh), and outbreeding

i know, i know — it’s easier to spot inbreeding (or outbreeding) from the presence (or absence) of a lot of long runs of homozygosity (roh) in the genomes of individuals in a population rather than short roh (see for example the central/south and west asians in this post, populations which everyone knows are regular inbreeders), but i haven’t got any data on long roh for separate, sub-populations (like italians vs. europeans), so we’re gonna have to make do with short roh (for now). and anyway, even the amount of short roh is reduced via outbreeding (and increased via inbreeding), so you can use it as a tool to try to work out a population’s mating history. it’s just not as easy/obvious as with longer roh.

so … the map below is taken from Genomic and geographic distribution of SNP-defined runs of homozygosity in Europeans.

the samples come from:

– the rotterdam study – the netherlands
– popgen – northern germany – specifically the schleswig-holstein region (in deutsch if you like)
– the monica augsburg surveys – southern germany – from the city of augsburg and two neighboring counties
– and popres, which, since this is a study of europeans, i presume must mean that the samples came from both the lolipop study in london and the colaus study, lausanne, switzerland — i discussed those two studies in this previous post (scroll down).

again, the problem with taking samples from people living in big cities is that, even if they may be natives of whatever country they happen to live in, they, or some of their recent ancestors, may have migrated to the city — so, who knows, for instance, if the samples from rotterdam tell us anything about rotterdam or even the region of the country in which rotterdam is located. probably tells us something about the dutch, but even then….

these researchers — nothnagel et al. — chose to look at roh that were 1Mb in length. that’s shorter than the 1.5Mb roh as delineated by the researchers who looked at the roh in russian populations. also, nothnagel et al. weighted the average roh in each population according to how much linkage disequilibrium was (estimated to be) present in each population. don’t ask! no, really — don’t ask, because i don’t really understand why they did this. here’s the wikipedia page for linkage disequilibrium. i know that you can have more ld in an inbreeding population and — you guessed it! — less in an outbreeding one. and, of course, other things like bottlenecks can affect how much ld is present in a population. nothnagel et al. found different amounts of ld in the populations in this study and compensated for that, but again i’m not exactly sure why.
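
for reference, the two most common ld statistics for a pair of snps are D and r². here is a minimal python sketch, assuming phased haplotypes coded 0/1 at each snp. it only shows what ld measures. it does not reproduce how nothnagel et al. weighted their roh counts, and the function name is hypothetical.

```python
def ld_stats(haplotypes):
    """Return (D, r_squared) for two biallelic SNPs.

    haplotypes: list of (allele_at_snp1, allele_at_snp2) tuples, each 0 or 1.
    D is how far the observed frequency of the 1-1 haplotype departs from
    what independence between the two SNPs would predict.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d, d * d / denom
```

r² runs from 0 (the two snps are independent) to 1 (knowing one allele tells you the other).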

anyway … here’s what they found. this map shows the subpopulation averages of the weighted number of roh per individual (the contour lines are guesstimates — educated guesstimates, but still guesstimates):

europe roh - average weighted ROH number per individual

if you look closely, you’ll see that there’s a sort-of central band of a relatively low average number of roh (between 37-39) that runs from southern england down through belgium/the netherlands (rotterdam) and northeast france, southern germany and switzerland. and, as the researchers observed, and as we saw in the previous post on russia, the numbers of roh increase going northwards and decrease going south. until you get to southern spain and southern italy, southern greece, and (probably) a central spot in the balkans there, all regions where the average number of roh increases again. the researchers suggest that, perhaps, migration from northern africa to the iberian peninsula (that’s the only region for which they offer a possible explanation for this anomaly) explains the longer roh there — presumably they’re thinking of a bottleneck. maybe. but perhaps it’s due to greater historic inbreeding in southern spain — and southern italy and greece and the balkans. some data showing longer roh would help us tell one way or the other.

the researchers, btw, acknowledge that the areas indicated as having very low amounts of roh — colored in the lightest shades of yellow — i.e. northwest spain and eastern europe — are probably artifacts of the interpolation method that they used. also, for all you scots out there (you know who you are! (^_^) ), while i do predict that the average numbers of roh in scotland ought to be higher there than in england, note that there was no data for scotland included in this study, so the shades of the contours up there are wild guesses as well.

i’m quite surprised by the very low levels of roh in romania, but remember that one has to read this map with the underlying north-south differences in numbers of roh in mind, so perhaps the roh in romania really indicates an inbreeding/outbreeding rate in romania that is more like that found in, say, france/germany. dunno. in any event, it’s very interesting.

now i want to compare the average number of roh in eastern europe with western europe. that’s going to be kinda hard to do since 1) the two studies used different roh lengths (1Mb vs. 1.5Mb), and 2) the numbers from this study have been weighted. still, i think we can get at something of a (very!) rough picture by taking the numbers from germany as our starting point and using them to calibrate the results from the two studies. we can do this, i think, since the samples from germany came from the same sources in both studies — the popgen study for northern germany and the monica study for southern germany.
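
the calibration idea amounts to dividing every population's number by the germany number from the same study, so that both studies are expressed as "multiples of germany". here is a minimal python sketch of that. the function name is hypothetical, and averaging northern and southern germany into one reference is an assumption made for the example, not something either paper does.

```python
def rescale_to_reference(averages, reference_keys):
    """Express each population's average ROH count relative to a reference.

    averages: dict mapping population name -> average number of ROH.
    reference_keys: populations whose simple mean defines the reference
    (e.g. one combined german sample, or the northern and southern
    german samples averaged together).
    """
    if not reference_keys:
        raise ValueError("need at least one reference population")
    ref = sum(averages[k] for k in reference_keys) / len(reference_keys)
    return {pop: value / ref for pop, value in averages.items()}
```

a rescaled value of 2.0 would then read as "twice as many roh as the germans in the same study", which is only a rough comparison given the different roh lengths and the weighting.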

in the russian study, the samples from northern and southern germany were combined, so we only have one number for germany — which was lower than all the results from eastern europe, typically much lower (see map from previous post). the number of roh in the polish sample, for instance, was more than twice that found for the germans. the average number of roh in russia (Rus_HGDP) was also twice that of the germans. czechs, latvians, estonians — all higher than the germans.

now if we work westwards from germany using the results from the study in this post — the english, the dutch (rotterdam), and the swiss are all in the same range as the southern germans, while the southern french have an even lower average number of roh — and the irish (in dublin) and the czechs are in the same range as the northern germans. so all of these populations — and even the spanish and italians — have fewer roh on average than eastern europeans. which is what i would’ve guessed given what we know about the historic mating patterns of europeans beginning in the early medieval period (see the mating patterns in europe series).

maybe there’s another explanation for this difference between western and eastern europe — and for the apparent differences between central and southern europe. like i said above, a study or two looking at longer roh would help to clear up the picture one way or the other.

previously: “russians, eastern europeans, runs of homozygosity (roh), and inbreeding” and “ibd and historic mating patterns in europe” and “ibd rates for europe and the hajnal line” and “runs of homozygosity and inbreeding (and outbreeding)” and “runs of homozygosity again”

(note: comments do not require an email. ruh roh!)

russians, eastern europeans, runs of homozygosity (roh), and inbreeding

greying wanderer (thanks, grey!) pointed out to me (via) a very interesting study of russian/eastern european genetics which includes some runs of homozygosity (roh) data (which can provide clues of inbreeding/close matings among other things): A Genome-Wide Analysis of Populations from European Russia Reveals a New Pole of Genetic Diversity in Northern Europe. (dienekes has a really good explanation of roh here.)

in this latest study, khrunin et al. took a look at a handful of different ethnic russian sub-populations (from different locations in russia) as well as some other eastern european groups. most of the samples from russia they collected themselves — the rest came from other studies. here’s a list of which groups were included and where they came from:

– russians (n=384) from the archangelsk (mezen district, n = 96), vladimir (murom district, n = 96), kursk (kursk and oktyabrsky districts, n = 96), and tver (andreapol district, n = 96) regions
– veps (n=81) from the babaevo district of the vologodsky region
– komi (n=150) from the izhemski (izhemski komi, n = 79) and priluzski (priluzski komi, n = 71) districts of the komi republic.

all of these samples were collected by the authors — except for those from tver — and the researchers ensured that the subjects AND their parents were originally from whatever region in which they happened to find them (i like that!).

the data from other studies which they used are described in this paper and include:

– finns – samples from helsinki (n = 100) and kuusamo (n = 84) – kuusamo is really remote
– estonians (n = 100) – samples collected across the entire country
– latvians (n = 95) – samples collected in riga – parents had to be latvians
– poles (n = 48) – from the west-pomeranian region, so just on the border with germany
– czechs (n = 94) – from prague, moravia, and silesia
– germans (n = 100) – from schleswig-holstein in the north and the augsburg region in the south
– italians (n = 88) from tuscany – hapmap
– russians (n = 25) from the human genome diversity panel (hgdp) – i believe from the vologda oblast.

the data collected by khrunin et al. are really good, imho, since 1) they went to all the trouble of collecting samples from different regions of russia, and 2) the researchers tried to control for ethnic/regional origin. the quality of the data from all the other studies is kinda mixed, for my interests anyway. for instance, taking in samples in large, capital cities — meh — not so great. the residents of those cities could’ve come from all over the country. the northern versus southern sampling in germany is better; unfortunately, those data sets were combined together in this study (they’re kept separate in another really cool study which i will post about soon!). the estonian data set is interesting because the samples came from across the country. otoh, the polish data set is also interesting because it’s from such a specific region (and right on the border with germany).

ok. one last thing before i show you the results (i made a map!). different researchers define roh differently (*sigh*) — while there do seem to be some standards, there’s also quite a bit of variation, and different researchers choose to look for roh of varying lengths. in this study, the researchers looked for roh that were 1.5Mb in length (i’ve seen other researchers look for 1Mb in length). 1.5Mb is pretty short as far as roh go. if you recall, when a population has a lot of longer roh (like 4-8Mb or more), that’s a pretty good indicator of inbreeding. 1.5Mb — not so much. lots of short roh are a better indicator of something like a population bottleneck in the distant-ish past. but, what’s a girl to do? gotta work with what’s available, and if it’s short roh, so be it.
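here’s a small python sketch of the short-versus-long distinction. big caveats: the segment lengths are made up for illustration (they are NOT data from the study), i’m assuming 1.5Mb works as a minimum length, the 4Mb cutoff is just the lower end of the “4-8Mb or more” range mentioned above, and the function name is my own, not something from an roh tool.

```python
MIN_ROH_MB = 1.5    # assumed minimum length for a segment to count as an roh
LONG_ROH_MB = 4.0   # assumed cutoff for "long" roh (lower end of the 4-8Mb range)

def summarize_roh(lengths_mb):
    """count short and long roh among one individual's segment lengths (in Mb)."""
    called = [x for x in lengths_mb if x >= MIN_ROH_MB]   # drop segments below the threshold
    long_roh = [x for x in called if x >= LONG_ROH_MB]
    return {
        "n_roh": len(called),
        "n_short": len(called) - len(long_roh),
        "n_long": len(long_roh),
        "total_mb": round(sum(called), 2),
    }

# hypothetical individuals, NOT data from the study
bottleneck_like = [1.6, 1.8, 2.1, 1.7, 2.4, 0.9]   # lots of short segments
inbred_like = [1.9, 5.2, 8.7, 12.3]                # a few long segments

bottleneck_summary = summarize_roh(bottleneck_like)
inbred_summary = summarize_roh(inbred_like)
print(bottleneck_summary)
print(inbred_summary)
```

a study that only reports the total count above 1.5Mb (like this one) can’t tell these two kinds of individuals apart very well, which is why longer roh would be nice to have.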

here (finally!) is the map. i took the data from this table. the map shows the first column of data, i.e. the average number of roh (of 1.5Mb) found in individuals in the different populations (nROH):

[map: russia nroh]

the most obvious thing to note is that the small, endogamous groups (the veps and the komi) have more roh than any of the other populations, except for the finns up in kuusamo (and i think that that’s probably due to a bottleneck — ethnic finns really only migrated to, and began to settle in, the area seriously in the 1600s, and i imagine it wasn’t very many of them — and being so far away from anybody else!). the veps and the komi are small populations and, historically, they didn’t marry out much (that’s why we have veps and komi people today), so they are somewhat inbred. definitely more so than the surrounding population.

another curious thing is the pretty high number of rohs in the baltic populations: latvians=0.58, estonians=0.61, and finns in helsinki=1.13. wow! what happened there? that’s something like three to five times the number of roh we see in italians (from tuscany) or germans.

the most interesting point for me, though, is that there is an east-west divide. it’s kinda vague, maybe, but i think it’s there: italians (tuscans) and germans at ca. 0.20, and then the czechs and poles right next door at 0.35 and 0.51 respectively. and everyone to the east, except the russians in kursk (0.28), within or above that range. i think these results hint at what i’ve found in the history books on medieval europe, i.e. that western europeans began outbreeding earlier than eastern europeans and as a result wound up being more outbred. (see, for example, here and here, and the “mating patterns in europe series” below ↓ in left-hand column.)

finally, the authors of the study point out how it appears that the average number of roh in individuals in a population increases with latitude — and they mention that this has also been shown elsewhere (i’ll be posting on that paper — very soon!). if you look at the various ethnic russian populations, for instance, the russians down in kursk (Rus_Ku=0.28) and murom (Rus_Mu=0.39) have fewer roh than the russians further to the north in tver (Rus_Tv=0.49) and way up in mezen (Rus_Me=1.63!). however, the hgdp russian samples, apparently from the vologda oblast which is pretty far north, have relatively low numbers of roh (Rus_HGDP=0.44), so that doesn’t seem to fit. still, it does look like a real pattern to me. the authors suggest that this is due to the general pattern of how europe was settled (from the south to the north), as well as the fact that the farther north you go, the fewer people there are to mate with (so the more inbred you wind up being).
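a quick way to eyeball the latitude pattern in the ethnic russian samples is a rank correlation. here’s a rough python sketch: the nROH values are the ones quoted above, but the latitudes are my own approximate guesses for the sampling locations (not values from the paper), and with only four or five points this is an illustration, not a statistical test.

```python
# (population, approximate latitude in degrees north, nROH quoted in the post)
# the latitudes are rough assumptions for the sampling districts, not from the paper
samples = [
    ("Rus_Ku", 51.7, 0.28),    # kursk
    ("Rus_Mu", 55.6, 0.39),    # murom
    ("Rus_Tv", 56.7, 0.49),    # andreapol, tver region
    ("Rus_HGDP", 59.2, 0.44),  # vologda (if that is really where they are from)
    ("Rus_Me", 65.8, 1.63),    # mezen
]

def ranks(values):
    """rank values from 1 (smallest) upward; assumes no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman(xs, ys):
    """spearman rank correlation for untied data."""
    n = len(xs)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n * (n ** 2 - 1))

def rho_for(rows):
    """rank correlation between latitude and nROH for a list of sample rows."""
    return spearman([lat for _, lat, _ in rows], [n for _, _, n in rows])

rho_all = rho_for(samples)
rho_without_hgdp = rho_for([s for s in samples if s[0] != "Rus_HGDP"])
print(rho_all, rho_without_hgdp)
```

the four samples collected or used by region line up perfectly by latitude, and adding the hgdp sample only swaps one pair of ranks (it comes in below tver despite being further north), so the pattern mostly holds even with the odd one out.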

as i’ll show in my next post, though, while there does seem to be a north-south pattern to roh frequency in europe with more roh in populations to the north than the south, curiously the numbers seem to increase in southern europe as well (as compared to places in central europe like germany and france) — and strangely in the balkan region as well. i can’t imagine why! (^_^)

previously: “ibd and historic mating patterns in europe” and “ibd rates for europe and the hajnal line” and “runs of homozygosity and inbreeding (and outbreeding)” and “runs of homozygosity again”

(note: comments do not require an email. kuusamo traffic jam!)

ibd and historic mating patterns in europe

**update 08/03: post fixed to remove references to roh which i got wrong (roh≠blocks of ibd!) — see comments below (thanks, citrus!)**

princenuadha points me to this awesome pdf which i guess was a presentation given at a society for molecular biology and evolution (smbe) conference last weekend (thanks, prince!).

here is an interesting graphic from the presentation (pg. 21):

what this map shows are the means of blocks of identity by descent (ibd) that are greater than 1cM for each of these european populations. the longer the ibd blocks, the greater the identity by descent, and vice versa. small circles=fewer long blocks of ibd; large circles=more long blocks of ibd.

if a population has lots of short blocks of ibd, then its genetics are all mixed up, possibly due to outbreeding or because of a fairly recent mixing with another population. if a population has lots of long blocks of ibd, then its genetics are not so mixed up and the individuals within it share a lot of identity by descent. this can be an indicator of having been squeezed through a bottleneck or close inbreeding over time.

here are the mean numbers of long blocks of ibd for some of the countries on the map:

as you can see, my “core europeans” (english, french, germans, dutch, prolly some others) all have low means of blocks of ibd. the smallest circles are found right in the center of nw europe: england, france, belgium, germany. also italy (more about that below). in the immediate periphery around core europe, the circles are a bit larger, i.e. there are more long blocks of ibd: scotland, ireland, spain, portugal, switzerland, greece, scandinavians. eastern europeans have even larger circles/even more long blocks of ibd: poles, russians. and populations in the balkans, like the albanians, have enormous circles, i.e. LOTS of long blocks of ibd.

all of that fits the pattern i’ve been talking about here on the ol’ blog (see the mating patterns series below in the left-hand column): that the core europeans have been outbreeding the most and for the longest, with peripheral europeans lagging behind that trend, and eastern europeans really lagging behind the trend. i haven’t actually discussed the balkan populations (yet), but i do know that cousin/endogamous marriage rates are pretty high in the balkans.

i wonder if the numbers for italy may be unrepresentatively low, but it’s difficult to know. the data used are from popres and, like so much genetic data out there, have no provenance info attached to them. so, are the italian data from northern italy (which has a long history of outbreeding) or southern italy (which has a lot of inbreeding) or a combination of both? dunno.

this is a very cool study! i like it a lot. (^_^)

polish gen also has an interesting post about the presentation, btw.
