Measures and algorithms for alignment-free comparative pan-genomics

Bielefeld University

Location: Bielefeld, Germany

ESR: Luca Parmigiani

Objectives

A pan-genome stored as a graph data structure can be characterized by various attributes, such as the graph’s structural properties (average node degree, diameter, density), its sequence attributes (k-mer diversity or distribution), or its functional content (relative size of the core or accessory genome). Such attributes will be used to define alignment-free measures of pan-genome similarity or distance. These measures can then be used to estimate relationships between the input pan-genomes, for example with distance-based phylogenetic tree reconstruction methods. Algorithms to compute these measures will be developed, implemented, tested, and applied to real data. The result will be a software tool for quantitative pan-genome comparison.
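To make the idea of an alignment-free distance concrete, here is a minimal Python sketch of one possible measure: the Jaccard distance between the canonical k-mer sets of two pan-genomes. It is an illustration only, not the method developed in this project. The function names (canonical, kmer_set, jaccard_distance, distance_matrix) are hypothetical, and each pan-genome is assumed to be given simply as a list of DNA strings over A, C, G, T rather than as a graph.

```python
from itertools import combinations

# Translation table for complementing DNA bases (assumes only A, C, G, T occur).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_set(genomes, k):
    """Collect the canonical k-mers of all sequences in one pan-genome."""
    kmers = set()
    for seq in genomes:
        for i in range(len(seq) - k + 1):
            kmers.add(canonical(seq[i:i + k]))
    return kmers


def jaccard_distance(a, b):
    """Jaccard distance 1 - |a & b| / |a | b| between two k-mer sets."""
    union = a | b
    if not union:
        return 0.0  # two empty sets are treated as identical
    return 1.0 - len(a & b) / len(union)


def distance_matrix(pangenomes, k):
    """Pairwise distances for a dict mapping a name to a list of sequences."""
    sets = {name: kmer_set(seqs, k) for name, seqs in pangenomes.items()}
    return {
        (x, y): jaccard_distance(sets[x], sets[y])
        for x, y in combinations(sorted(sets), 2)
    }
```

The resulting pairwise distances could serve as input to a distance-based tree reconstruction method such as neighbor joining. Measures built on graph structure or on core and accessory genome sizes would replace the k-mer set with the corresponding attribute.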

Expected Results

New distance or similarity measures for pan-genomes and efficient software for their computation.
