Evaluation of sample size effect on the identification of haplotype blocks

Osabe, Dai; Tanahashi, Toshihito; Nomura, Kyoko; Shinohara, Shuichi; Nakamura, Naoto; Yoshikawa, Toshikazu; Shiota, Hiroshi; Keshavarz, Parvaneh; Yamaguchi, Yuka; Kunika, Kiyoshi; Moritani, Maki; Inoue, Hiroshi; Itakura, Mitsuo

doi:10.1186/1471-2105-8-200

Use this link to cite this item: https://repo.lib.tokushima-u.ac.jp/118432

ID	118432
Author	Osabe, Dai (Fujitsu Limited | Japan Biological Information Consortium); Tanahashi, Toshihito (The University of Tokushima); Nomura, Kyoko (Fujitsu Limited | Japan Biological Information Consortium); Shinohara, Shuichi (Fujitsu Nagano Systems Engineering Limited); Nakamura, Naoto (Kyoto Prefectural University of Medicine); Yoshikawa, Toshikazu (Kyoto Prefectural University of Medicine); Shiota, Hiroshi (The University of Tokushima); Keshavarz, Parvaneh (The University of Tokushima); Yamaguchi, Yuka (The University of Tokushima); Kunika, Kiyoshi (The University of Tokushima); Moritani, Maki (The University of Tokushima); Inoue, Hiroshi (The University of Tokushima); Itakura, Mitsuo (Japan Biological Information Consortium | The University of Tokushima)
Content Type	Journal Article
Description	Background: Genome-wide maps of linkage disequilibrium (LD) and haplotypes have been created for different populations. Substantial sharing of block boundaries and haplotypes among populations has been observed, but haplotype variation across populations has also been reported. These conflicting observations on the extent and distribution of haplotypes require careful examination. The mechanisms that shape haplotypes have not been fully explored, although an effect of sample size has been implicated. We present a close examination of the effect of sample size on haplotype blocks using an original computational simulation. Results: A region spanning 19.31 Mb on chromosome 20q was genotyped for 1,147 SNPs in 725 Japanese subjects. One region of 445 kb exhibiting a single strong LD value (average |D'| = 0.94) was selected for the analysis of the effect of sample size on haplotype structure. Three different block definitions (recombination-based, LD-based, and diversity-based) were used to create simulations for block identification with θ values from real genotyping data. It proved quite difficult to estimate a haplotype block from data with fewer than 200 samples. A reliable haplotype structure could not be obtained with 50 samples, even though the simulation was repeated 10,000 times. Conclusion: These analyses underscore the difficulty of estimating haplotype blocks. To obtain a reliable result, it would be necessary to increase the sample size beyond 725 and to repeat the simulation 3,000 times. Even in a genomic region showing a high LD value, the haplotype block might be fragile. We emphasize the importance of applying careful confidence measures when using the estimated haplotype structure in biomedical research.
Journal Title	BMC Bioinformatics
ISSN	1471-2105
NCID	AA12034719
Publisher	BioMed Central | Springer Nature
Volume	8
Start Page	200
Published Date	2007-06-14
Rights	This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
EDB ID	398874
DOI (Published Version)	10.1186/1471-2105-8-200
URL (Publisher's Version)	https://doi.org/10.1186/1471-2105-8-200
FullText File	bmcb_8_200.pdf 1.46 MB
language	eng
TextVersion	Publisher
departments	Medical Sciences | AWA Support Center | Institute of Advanced Medical Sciences
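
The Description above reports an average |D'| and asks how stable such LD estimates are at smaller sample sizes. As a rough illustration only, the following Python sketch computes the standard pairwise |D'| for two biallelic SNPs and repeats the calculation on random subsamples. It is not the authors' simulation: it assumes phased haplotypes coded as (0/1, 0/1) pairs and subsampling without replacement, neither of which is stated in the record, and the function names `abs_d_prime` and `resample_abs_d_prime` are hypothetical.

```python
import random


def abs_d_prime(haplotypes):
    """Return |D'| for two biallelic SNPs.

    `haplotypes` is a list of (a, b) pairs, each allele coded 0 or 1.
    Assumes phased data (an assumption of this sketch).
    Returns None when either SNP is monomorphic, since D' is undefined.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n          # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n          # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                             # raw disequilibrium coefficient D
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return None
    return abs(d) / d_max


def resample_abs_d_prime(haplotypes, sample_size, repeats, seed=0):
    """Recompute |D'| on `repeats` random subsamples of `sample_size` haplotypes.

    Subsampling without replacement is an assumption of this sketch.
    Subsamples in which a SNP is monomorphic are skipped.
    """
    rng = random.Random(seed)
    values = []
    for _ in range(repeats):
        value = abs_d_prime(rng.sample(haplotypes, sample_size))
        if value is not None:
            values.append(value)
    return values
```

Comparing the spread of `resample_abs_d_prime(data, 50, 1000)` with that of `resample_abs_d_prime(data, 200, 1000)` on one's own data gives a quick feel for how much a pairwise LD estimate can vary with sample size, before any block definition is applied.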