University of Milano-Bicocca

Location: Milan, Italy

Supervisor: Prof. Paolo Bonizzoni

Co-supervisor: Gianluca Della Vedova

ESR: Jorge Avila

Objectives

The main goal is to study new representations of pan-genomes that support fast, space-efficient queries over multiple pan-genomes, enabling their comparison and exploiting any ancestral relationships among them. We want to overcome the limitations of the usual indexing of a single genome based on the Burrows-Wheeler transform (BWT) by extending known approaches, and by introducing new ones based on colored de Bruijn graphs (dBG), to sets of pan-genomes. Moreover, we plan to investigate how to compare a set of reads, possibly a mixture of short and long reads, with a set of pan-genomes as well as with other graph-based representations of gene structures. For this purpose, colored succinct de Bruijn graphs based on BWT representations, in which several million colors encode the information carried by the reads, will be developed and applied to pan-genome comparison.

Expected Results

New algorithms and tools for comparing sets of pan-genomes and for comparing a set of pan-genomes with a set of reads.

Representations for the comparative and hierarchical analysis of pan-genomes.