Variant Graph Support to BioD
Njagi Mwaniki
null+dlang at njagi.me
Tue May 28 09:41:18 UTC 2019
Hello, I’m Njagi Mwaniki.
I am part of the 2019 Google Summer of Code under the Open
Bioinformatics Foundation, with a project that aims to add
variation graph support to BioD under mentors George Githinji and
Pjotr Prins.
What are variation graphs? A variation graph is a sequence graph
that is used to represent variation in a genome. Let me explain.
A sequence graph, also called an alignment graph, breakpoint
graph, or adjacency graph, is a bidirected graph in which the
vertices represent segments of DNA and the edges represent
adjacency between segments in a genome (from Wikipedia).
Sequence graphs have long been proposed as a replacement for
reference genomes, which are linear sequences of bases.
A variation graph is a sequence graph together with a set of
paths representing possible sequences from a population[1].
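To make the two definitions above concrete, here is a minimal
Python sketch. It is not BioD or VG code: the class
VariationGraph, its methods, and the toy segments are all
hypothetical names chosen for illustration. It assumes each node
can be visited in forward or reverse orientation, which is one
common way to model a bidirected graph.

```python
from dataclasses import dataclass, field

# Complement table for DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def _flip(step):
    """Same node, opposite orientation."""
    node_id, forward = step
    return (node_id, not forward)


@dataclass
class VariationGraph:
    """Hypothetical toy model: a sequence graph plus named paths."""
    segments: dict = field(default_factory=dict)  # node id -> DNA segment
    edges: set = field(default_factory=set)       # pairs of (node id, forward?)
    paths: dict = field(default_factory=dict)     # path name -> list of steps

    def add_segment(self, node_id, seq):
        self.segments[node_id] = seq

    def add_edge(self, src, dst):
        self.edges.add((src, dst))

    def has_edge(self, a, b):
        # Bidirected: walking a -> b is the same adjacency as walking
        # flipped b -> flipped a on the opposite strand.
        return (a, b) in self.edges or (_flip(b), _flip(a)) in self.edges

    def add_path(self, name, steps):
        # Every consecutive pair of steps must be joined by an edge.
        for a, b in zip(steps, steps[1:]):
            if not self.has_edge(a, b):
                raise ValueError(f"no edge between {a} and {b}")
        self.paths[name] = list(steps)

    def sequence(self, name):
        """Spell out the sequence that a path represents."""
        return "".join(
            self.segments[n] if fwd else reverse_complement(self.segments[n])
            for n, fwd in self.paths[name]
        )


# Toy example: a single-base variant (T or C) between two shared segments.
g = VariationGraph()
for node_id, seq in [(1, "ACG"), (2, "T"), (3, "C"), (4, "GGA")]:
    g.add_segment(node_id, seq)
for src, dst in [(1, 2), (1, 3), (2, 4), (3, 4)]:
    g.add_edge((src, True), (dst, True))

g.add_path("ref", [(1, True), (2, True), (4, True)])
g.add_path("alt", [(1, True), (3, True), (4, True)])
# The same "ref" walk read from the opposite strand.
g.add_path("ref_rev", [(4, False), (2, False), (1, False)])

print(g.sequence("ref"))      # ACGTGGA
print(g.sequence("alt"))      # ACGCGGA
print(g.sequence("ref_rev"))  # TCCACGT
```

The two forward paths share nodes 1 and 4 and differ only in the
middle node, which is how a graph like this can hold several
sequences from a population without storing the shared segments
twice.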
Although these ideas have been around for a long time, we haven’t
yet been able to use sequence graphs in real-life bioinformatics
applications such as sequence alignment or determining homology.
That adoption is what we hope to speed up.
VG is a set of tools that already implements variation graphs,
but it is a bit broad in its focus. In this project we are
building upon the existing tools and knowledge from VG and
looking for ways to improve its lookup performance and its
applicability to small genomes, specifically those of viruses
and smaller mammals such as mice.
[1] Garrison et al., "Variation graph toolkit improves read
mapping by representing genetic variation in the reference",
Nature Biotechnology, 2018.
More information about the Digitalmars-d-announce
mailing list