Institutional Repository
Technical University of Crete

Accelerated inference of positive selection on whole genomes

Alachiotis Nikolaos, Vatsolakis Charalabos, Chrysos Grigorios, Pnevmatikatos Dionysios

URI: http://purl.tuc.gr/dl/dias/982F7B77-2332-4EED-BB1B-B9171932DBD0
Year: 2018
Type of Item: Conference Full Paper
Bibliographic Citation: N. Alachiotis, Ch. Vatsolakis, G. Chrysos and D. Pnevmatikatos, "Accelerated inference of positive selection on whole genomes," in 28th International Conference on Field-Programmable Logic and Applications, 2018, pp. 202-209. doi: 10.1109/FPL.2018.00041 (https://doi.org/10.1109/FPL.2018.00041)

Summary

Positive selection is the tendency of beneficial traits to increase in prevalence in a population. Its detection carries theoretical significance and has practical applications, from shedding light on the forces that drive adaptive evolution to identifying drug-resistant mutations in pathogens. Next-generation sequencing now produces vast amounts of genomic data for population genetic analyses, but the high computational complexity and/or inefficient memory management of existing methods hinder the efficient analysis of large-scale datasets. To this end, we devise a system-level solution that couples a generic out-of-core algorithm for parsing genomic data with a decoupled access/execute accelerator architecture, thereby providing a method-independent infrastructure for the rapid and scalable inference of positive selection. We employ a novel detection mechanism that relies mostly on integer arithmetic operations, which map well onto FPGA fabric, while yielding qualitatively better results than current state-of-the-art methods. We deploy a high-end system that pairs a Hybrid Memory Cube with a mid-range FPGA, forming a high-throughput streaming accelerator that analyzes simulated genomes 751x, 62x, and 20x faster than the widely used software tools SweepFinder2 (1 thread), OmegaPlus (40 threads), and SweeD (40 threads), respectively. Importantly, our solution can scan thousands of human genomes and millions of genetic polymorphisms (1000 Genomes dataset, 5,008 samples) in a matter of hours, requiring between 4 and 22 minutes per autosome, depending on the chromosomal length.
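
The summary does not spell out the out-of-core parsing algorithm or the detection mechanism, so the following is only a minimal, generic sketch of the out-of-core idea: input is consumed in fixed-size chunks so that only one chunk of SNP positions is held in memory at a time, and a per-window statistic is accumulated using integer arithmetic only. The function names (`read_chunks`, `snp_counts_per_window`), the chunk size, and the choice of a simple SNP count per window are all illustrative assumptions, not the authors' method.

```python
from collections import Counter
from typing import Iterable, Iterator, List


def read_chunks(positions: Iterable[int], chunk_size: int) -> Iterator[List[int]]:
    """Yield successive chunks of at most chunk_size positions.

    Hypothetical helper: stands in for reading a large genomic file piece by
    piece instead of loading it whole.
    """
    chunk: List[int] = []
    for pos in positions:
        chunk.append(pos)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:  # emit the final, possibly shorter, chunk
        yield chunk


def snp_counts_per_window(
    positions: Iterable[int], window: int, chunk_size: int = 1024
) -> Counter:
    """Count SNPs per fixed-size genomic window, one chunk at a time.

    Illustrative statistic only (an assumption): the window index is obtained
    with integer division, so no floating-point arithmetic is involved.
    """
    counts: Counter = Counter()
    for chunk in read_chunks(positions, chunk_size):
        for pos in chunk:
            counts[pos // window] += 1
    return counts


if __name__ == "__main__":
    example_positions = [5, 12, 18, 101, 250, 255]
    print(snp_counts_per_window(example_positions, window=100, chunk_size=2))
```

In this sketch the memory held for input is bounded by `chunk_size` regardless of how many positions are streamed; only the per-window counters grow with the length of the scanned region.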
