Summary: | Variant calling in large population datasets is a major challenge because it is difficult to produce a consistent set of calls at all possible sites, particularly when the data are of low coverage. A further challenge is the computational cost of variant calling, which increases exponentially with the number of samples. The 1000 Genomes Project provides data for 26 ethnic groups spread across the globe, with the aim of capturing genetic variants with frequencies of at least 1% in the population. The sequenced samples vary in coverage from low (2-4X) to high (50X).
The present work performs variant calling for a South Asian population, GIH (Gujarati Indian from Houston, Texas). The main objective is to call genetic variants in the GIH population using three strategies: joint calling, multi-sample pooled calling, and single-sample calling. The predicted variants promise to provide clues for finding biological markers of complex multi-gene diseases.
|