more on the hgdp samples

first, see my previous post on this if you want to follow along.

in that post, i expressed some concerns over the french human genome diversity project (hgdp) samples since the ceph folks describe them as: French (various regions) relatives. i wondered two things: 1) how many and which “various regions,” since different regions of france have historically had different rates of inbreeding (i haven’t managed to find that out), and 2) how many and what sorts of relatives? that second one i did find out.

via some genetic wizardry, noah rosenberg tried to work out if any of the individuals in any of the hgdp samples were, in fact, relatives [see here]. to cut a long story short, rosenberg found it likely that two individuals in the french sample were siblings [see pg. 7 here – opens pdf], thus the “relatives” indicator on the ceph website. so, the entire french sample is NOT full of family members like i wondered in my last post: only two of the individuals sampled are likely to have been relatives.

i still think it would be useful to know from which regions the samples were drawn, but i guess i just have to live with not knowing for the time being. (~_^) but now i feel more secure about professor harpending’s conclusion regarding the french: “from the viewpoint of kinship, one person is not very different from another person.”

however, now i feel unsure about the japanese samples! the hgdp samples for the japanese are described on ALFRED as:

“Collected by L.L. Cavalli-Sforza from Japanese-born individuals living in the San Francisco Bay area, and by K.K. Kidd and J. R. Kidd from Japanese-born individuals living in Connecticut.”

ack! well, how representative of japanese people in japan are these people? where did they come from? urban areas? rural areas? different areas? mostly the same areas? how old were they?

i ask all these questions because, historically, urban japanese have had lower inbreeding rates than rural japanese … and the inbreeding rates overall for japan dropped pretty sharply after wwii [see pgs. 4-5 here – opens pdf]. so if the samples include mostly young, urban japanese who recently moved to the u.s., well i wouldn’t be surprised if they look quite outbred. but if the samples include mostly older, rural japanese, i would be surprised if they looked outbred.

now i don’t have any confidence in the japanese hgdp samples — not for looking at kinship within the japanese population anyway. btw, rosenberg didn’t find any likely relatives in the japanese samples.
_____

i went through the ceph table of the hgdp samples and ALFRED and compiled a list of all the hgdp samples, noting 1) whether they likely include any family members (“relatives” – based on rosenberg), and 2) where the samples were collected and from whom, if known. many of the samples don’t have any useful information on their provenance, and what information there is isn’t always reliable: many of the ALFRED entries say that the samples were drawn from unrelated individuals, but rosenberg found that they, in fact, likely included relatives.
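for anyone who wants to do the same cross-check mechanically, here is a minimal python sketch. it only covers a handful of rows transcribed from the list in this post, not the full panel, and the names in it (`Sample`, `conflicting`, the two boolean fields) are made up for illustration. they don't come from ceph, ALFRED, or rosenberg.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    region: str
    population: str
    rosenberg_relatives: bool    # tagged "(relatives)" in the list in this post
    alfred_says_unrelated: bool  # quoted description says "unrelated individuals"


# a few rows copied by hand from the list in this post (illustrative subset only)
SAMPLES = [
    Sample("Senegal", "Mandenka", True, True),
    Sample("Namibia", "San", True, True),
    Sample("Israel (Carmel)", "Druze", True, False),  # described as related + unrelated
    Sample("France", "French", True, True),
    Sample("France", "Basque", False, True),
    Sample("Japan", "Japanese", False, False),        # description is silent on relatedness
]


def conflicting(samples):
    """Return samples described as unrelated but flagged as likely containing relatives."""
    return [s for s in samples if s.rosenberg_relatives and s.alfred_says_unrelated]


for s in conflicting(SAMPLES):
    print(f"{s.region} - {s.population}")
```

the filter just keeps rows where both flags are true, i.e. where the description and rosenberg's result disagree.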

why do i care about any of this? i’ll explain that in another post. right now … coffee! (^_^)

**update: see why i care about the hgdp samples**
_____

the list:

– Central African Republic – Biaka Pygmy (relatives)
This sample is comprised of Biaka, living in the village of Bagandu, in the southwest corner of the Central African Republic (3.42N; 18E altitude approximately 500m). This group is probably an admixture of 3/4 “non-pygmy” African ancestry and 1/4 Mbuti ancestry. The transformed cell lines were established by Judith R. Kidd. The sources of this sample are L. Cavalli-Sforza (Stanford University) and K.K. Kidd, J.R. Kidd (Yale University).

– Democratic Rep of Congo – Mbuti Pygmy (relatives)
The sample is composed of Nilosaharan and Niger Kordofanian speaking Mbuti pygmies from the northeastern part of the Ituri Forest (northeastern Democratic Republic of the Congo). It was collected by L.L. Cavalli-Sforza in 1986.

– Senegal – Mandenka (relatives)
This sample from Senegal is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of unrelated individuals and was collected with proper informed consent.

– Nigeria – Yoruba (relatives)
Most of the Yoruba individuals in this sample are urban health care workers from Benin City, Nigeria, collected by Prof. Friday E. Okonofua and collaborators; cell lines established by Dr. J.R. Kidd.

– Namibia – San (relatives)
This sample from Namibia is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of unrelated individuals and was collected with proper informed consent.

– Kenya – Bantu NE (relatives)
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of unrelated individuals and was collected with proper informed consent.

– S. Africa – Bantu SE Pedi
– S. Africa – Bantu SE Sotho
– S. Africa – Bantu SE Tswana
– S. Africa – Bantu SE Zulu
– S. Africa – Bantu SW Herero
– S. Africa – Bantu SW Ovambo

These samples are part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). They include the following individuals: #993, 994, 1028, 1030, 1031, 1033, 1034, and 1035. These samples consist of unrelated Bantu speakers from southern Africa and were collected with proper informed consent.

– Algeria – Mozabite (relatives)
This sample from Algeria is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of unrelated individuals and was collected with proper informed consent.

– Israel (Negev) – Bedouin (relatives)
This sample from the Negev region of Israel is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of unrelated individuals and was collected with proper informed consent.

– Israel (Carmel) – Druze (relatives)
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). The Druze, a Moslem community from Northern Israel. Collected by B. Bonne-Tamir (Tel Aviv University) as part of the repository of samples of Israeli populations. This sample contains both related and unrelated individuals.

– Israel (Central) – Palestinian (relatives)
This sample from the central region of Israel is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of unrelated individuals and was collected with proper informed consent.

– Pakistan – Brahui
– Pakistan – Balochi (relatives)
– Pakistan – Hazara (relatives)
– Pakistan – Sindhi (relatives)
– Pakistan – Pathan
– Pakistan – Kalash (relatives)
– Pakistan – Burusho

These samples from Pakistan are part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). These samples consist of unrelated individuals and were collected with proper informed consent.

– Pakistan – Makrani
*no info found.*

– China – Han
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This is a sample of Han Chinese living in San Francisco, California. Collected by L. Cavalli-Sforza (Stanford University), K.K. Kidd, and J.R. Kidd.

– China – Tujia
– China – Yizu/Yi
– China – Miaozu/Miao
– China – Oroqen (relatives)
– China – Daur
– China – Mongola
– China – Hezhen
– China – Xibo
– China – Uygur
– China – Dai
– China – She
– China – Lahu (relatives)
– China – Naxi (relatives)
– China – Tu

These samples from China are part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). These samples consist of unrelated individuals and were collected with proper informed consent.

– Siberia – Yakut
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). Yakut-speaking individuals in the Yakut Autonomous Republic. Individuals sampled were living or were born along the river Lena in the area of Yakutsk and northward, roughly 129-130E, 62-64N. This sample was collected by E.L. Grigorenko.

– Japan – Japanese
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). Collected by L.L. Cavalli-Sforza from Japanese-born individuals living in the San Francisco Bay area, and by K.K. Kidd and J. R. Kidd from Japanese-born individuals living in Connecticut.

– Cambodia – Cambodian (relatives)
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). Collected by K. Dumars from individuals born in Cambodia who are now living in Santa Ana, California.

– France – French/various regions (relatives)
This sample from various regions of France is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of unrelated individuals and was collected with proper informed consent.

– France – Basque
This sample from France is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of unrelated individuals and was collected with proper informed consent.

– Italy – Sardinian
This sample from Italy is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of unrelated individuals and was collected with proper informed consent.

– Italy – from Bergamo
– Italy – Tuscany

*no info found.*

– Orkney Islands – Orcadian (relatives)
This sample from the Orkney Islands is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of unrelated individuals and was collected with proper informed consent.

– Russia Caucasus – Adygei
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). Adygei-speaking people near Krasnodar in the Russian republic of Adygei, which is in the southeastern section of the country (north of the Caucasus mountains). They are culturally and linguistically distinct from neighboring Russians. This sample was collected by E. Grigorenko (Yale University), V. Galkina, and M. Kadoshnikova (Bristol company, Russia).

– Russia – Russian
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). Sample collected by E. Grigorenko from rural communities of ethnic Russians living in the Vologda Administrative Region, about 400 km north of Moscow, roughly 59-61N, 39-41E.

– Mexico – Pima (relatives)
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). Collected from Pima living near the eastern border of the state of Sonora, Mexico. Collected by L.O. Schulz.

– Mexico – Maya (relatives)
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). This sample consists of Mayans who are Yucatec speakers from the Xmaben village located in the Mexican state of Campeche in the central Yucatan peninsula. Blood and serum markers indicate European admixture to be about 10% (K. Weiss, personal communication). Some evidence suggests that the area from which this sample was drawn served as a refuge for Maya people from across southern Mexico who fled to this more remote region during a series of revolts against the Spanish in the 19th and early 20th centuries. There are 53 transformed cell lines (106 chromosomes) established by Judith R. Kidd. The sources of this sample are K.K. Kidd and J.R. Kidd (Yale University).

– Colombia – Piapoco and Curripaco (relatives)
*no info found.*

– Brazil – Karitiana (relatives)
This sample is part of the Human Genome Diversity Cell Line Panel collected by the Human Genome Diversity Project (HGDP) and the Foundation Jean Dausset (CEPH). The sample was collected in the Karitiana village (Rondonia Province, Brazil) by F. Black. HLA haplotypes indicate that the Karitiana have no non-Amerindian admixture and are genetically distinct from other sampled populations in relative geographical proximity, such as the Surui.

– Brazil – Surui (relatives)
*no info found.*

previously: hgdp samples and relatedness

(note: comments do not require an email. remember: you’re better off just skipping it!)
