Bachelor and master thesis projects

Bachelor theses

The practical course "300224 PP Genome analysis of prokaryots - applied bioinformatics for the analysis of a genome sequence" is well suited for a bachelor project in the Biology curriculum.
For bachelor theses in other curricula please contact Thomas Rattei.

Master theses

Our research projects continually generate new topics for master theses. We are happy to adapt the topic of a thesis to your experience and interests. Please contact Thomas Rattei for more information.

Thesis project example 1: Extraction of prokaryotic phenotypes from literature

The availability of almost complete genome sequences of uncultivable microbial species from metagenomes and from direct sequencing of clinical isolates necessitates computational methods that predict microbial phenotypes solely from genomic data. Phenotypic traits of microbes can be very diverse. They range from morphological and physiological traits to specific molecular or metabolic capabilities. This broad evolutionary diversity of microbial traits is a substantial challenge for computational methods. Our group has developed a machine learning approach (PICA; Feldbauer et al., 2015) based on comparative genomics, which has so far been trained successfully to predict about 20 traits.

The task of this thesis is to provide more and better training data for machine learning of microbial phenotypes. We need to automatically extract links between microbial species names and phenotype descriptions from open-access scientific literature. This problem is challenging due to the ambiguity of scientific terms and species names in free text. We will therefore utilize and adapt established approaches for term disambiguation and entity recognition in text.

Solving this problem in a master thesis will allow you to make a major contribution to one of the hottest research areas in microbial computational genomics. You should have a good background in computational science, bioinformatics and life science. You should be interested in programming, text mining and machine learning, as well as in microbiology and microbial ecology. The thesis project will provide you with substantial training in these fields and allow you to develop your own ideas and concepts within the frame of the project.

Thesis project example 2: Supervised machine learning for phenotype prediction

About 100 years after the first use of antibiotic drugs, infectious disease has not yet been eradicated. Bacterial infections still rank high in the recent statistics of the World Health Organization on the global burden of disease and on mortality. Although most bacterial infections are easily cured in modern healthcare, epidemic outbreaks are still possible, and multi-resistant pathogens are an increasing problem, especially in hospitals. Besides developing novel options for treatment and prevention, we also need to improve the monitoring and diagnostics of human pathogens. Due to rapidly improving techniques and decreasing costs for sequencing DNA, genome-based bacterial diagnostics is currently changing from science fiction to a real option. Rapid and precise sequencing of bacterial genomes would not only allow for better diagnostics and risk assessment, but would also enable the development of personalized treatment.

Genome-based bacterial diagnostics fundamentally challenges the current generation of bioinformatic methods for genome analysis. New concepts need to be established for the prediction of complex phenotypic traits, such as virulence. For our web-based tool EffectiveDB we have developed genome-based models for the prediction of intact and functional virulence factors: the Type III, Type IV and Type VI protein secretion systems. These models work well for most bacteria, but make unreasonable predictions for several species. In this project we will extend genome-based models for the prediction of virulence factors. We will incorporate expert knowledge into the models to improve their predictive performance.

Approaching this problem in a master thesis will give you practical insight and experience in comparative genomics of important human and plant pathogens. You should have a good background in computational science, bioinformatics and life science. You should be interested in programming and machine learning, as well as in microbiology, molecular biology and microbial ecology. The thesis project will provide you with substantial training in these fields and allow you to develop your own ideas and concepts within the frame of the project.

Thesis project example 3: Rapid phenotype prediction for metagenomes

The investigation of microbial communities, organismal communities inhabiting all ecological niches on Earth, has in recent years been strongly facilitated by the rapid development of experimental, sequencing and data analysis methods. Novel experimental approaches and binning methods in metagenomics make the semi-automatic reconstruction of near-complete genomes of uncultivable bacteria possible. Such genome-centric metagenomics approaches are now used in different areas of life science, e.g. in medicine, microbiology and microbial ecology. User-friendly, efficient and powerful computational tools are needed for the analysis of metagenomic data.

In this project we will implement a novel, web-based platform for the automatic analysis of draft genomes from metagenomes. It should allow users to quickly analyze thousands of genomes, including the prediction of phenotypic traits. Besides the analysis of user-defined data, we will also provide pre-calculated predictions for all publicly available microbial genomes. Components of the web platform already exist in our group, such as phenotype models (PICA) and an internal database of publicly available complete genome sequences. The project will therefore focus on the conceptual design of the web platform, a prototype implementation and performance testing.

Implementing the web platform in this master thesis will allow you to create a highly important and so far missing tool for microbial computational genomics. You should have a good background in computational science, bioinformatics and life science. You should be interested in programming, web frameworks and databases, as well as in microbiology and microbial ecology. The thesis project will provide you with substantial training in these fields and allow you to develop your own ideas and concepts within the frame of the project.