Application process

ALPACA will train 14 PhD students at 13 different institutions (university/research institute/company). Before you apply for any position, please read the following information.

  • You apply directly at an institution (university/research institute/company)
  • You can apply for at most five different positions
  • Click on the logo/name of an institution listed under Projects to find more information
  • Application deadlines vary individually between January and March
  • Because applications will be discussed at a network-wide level, it is preferable to apply for as many positions as possible (up to the specified maximum of five positions).

Offer Description

Your profile

  • Do you have a background in computer science, mathematics, bioinformatics or artificial intelligence / data science?
  • Are you interested in exploring the individual variation of entire evolutionary ensembles, such as particular species (humans, plants), pathogens (viruses, bacteria), or certain types of cancer?
  • Would you like to become part of the next generation of experts in the design of data structures that support the safe arrangement and the efficient and biomedically useful exploitation of exabytes of individual genetic data?
  • Have you just graduated or are you in the first four years of your research career?
  • Are you interested in doing your PhD in a joint network of high-profile research institutes across Europe under the guidance of renowned supervisors?


  • You qualify as an Early Stage Researcher, meaning that – on the starting date of your employment with the host institute – you are in the first four years of your research career and have not (yet) been awarded a doctoral degree.
  • You have not resided or carried out your main activity (study, work, etc.) in the country where the position is announced for more than 12 months during the 3 years prior to the starting date of your employment with the respective host institute.
  • You are proficient in English (academic level).


  • You will get the chance to participate in specially developed lectures and courses (e.g. on specific techniques, academic soft skills, etc.);
  • Already at an early stage in your career, you can start building your personal professional network, because your PhD project is embedded in a high-profile consortium of renowned universities and innovative companies.
  • You will be exposed to research in a non-academic environment, by spending one month (or more) in the non-academic sector. This will sharpen your understanding of strategies, requirements and skills for research in business environments.
  • You will have the opportunity to spend time and perform research with other members of the consortium. This will widen your horizon with respect to related scientific disciplines, techniques and alternative philosophies for pursuing scientific goals.


The main goal is to apply and extend techniques from haplotyping literature to the construction of pan-genome graphs. Given a multiple alignment of pan-genomic references, the founder reconstruction problem is to find a small set...
The role of viruses is key for understanding the environment (e.g., in the sea, the soil or the air) or the functioning of humans, animals and plants’ microbiomes. Despite their comparatively small genome sizes, viruses pose specific challenges for ...
Compacted de Bruijn graphs are natural candidates for representing pan-genome graphs. The problem of constructing compacted de Bruijn graphs has been studied extensively, in both cases where the input is a (set of) genomes or raw ...
We propose to explore distinct approaches when creating or adding information to a pan-genome graph. The simplest approach is to map new sequences, indicating newly discovered variants and annotating existing ones. However, when the graph ...
A pan-genome stored as a graph data structure can be characterized by various attributes, such as the graph’s structural properties (average node degree, diameter, density), sequence attributes (k-mer diversity or distribution), or its functional ...
The amount of sequenced genomes, and in many areas of application also the amount of annotations, have reached a mass – hundreds of thousands of sequenced genomes – that is critical for successful application of deep learning pipelines. However, ...
Despite tremendous progress in genome assembly, recalcitrant genomic loci remain whose sequences cannot be resolved. Such regions are often variable in copy number and such copy number variants (CNVs) have been linked to various disorders, including neuropsychiatric conditions and autism ...
The main goal is to study new representations of pan-genomes that allow fast and space-efficient queries of multiple pan-genomes, allowing their comparison and exploiting the eventual ancestral relationships. We want to overcome the limitations ...
In the specific case of a set of very closely related genomes, a pan-genome can be represented by a so-called (elastic)-degenerate text that actually corresponds to the .vcf file format or, alternatively for slightly less closely related ...
Comparing pan-genomes amounts to comparing two graphs, generalizing the idea to align two genomes. We aim at developing algorithms and software for 'whole-pan-genome alignment'. Though for aligning two networks approaches already ...
Growing insights in biomedical research indicate a substantial role of copy number variants (CNVs) in various diseases. CNVs are represented by duplicated or deleted parts of various lengths and can affect multiple genes or change gene dosage, lead to ...
Identifying significant differences between individual samples is a key problem in many areas, including diagnostics (tumor development), metagenomic analysis (differences in sample composition), transcription analysis ...
The Positional Burrows Wheeler Transform (PBWT) is a data structure that enables efficient storage and local haplotype matching over large collections of aligned linear genome sequences with genetic variation. Recently, we ...
Antibiotic resistance (ABR) is a global threat to public health, and is a property primarily conferred to bacteria either by horizontal transfer of a gene or by mutational evolution. Predictions suggest ABR will kill more people than cancer by 2050, as it ...

Still got Questions?

    This form collects only your email address so that we can get back to you.