PICA: Microbial phenotype prediction based on comparative genomics


The availability of nearly complete genome sequences of uncultivable microbial species from metagenomes calls for computational methods that predict microbial phenotypes from genomic data alone. Roman Feldbauer, a master's student at CUBE, has investigated how comparative genomics can be used to predict microbial phenotypes. He has improved and extended the PICA framework, which uses machine learning for phenotypic trait prediction, and has demonstrated its applicability to large-scale genome databases and incomplete genome sequences. Most of the traits can be predicted reliably from genomes that are only 60-70% complete. In collaboration with colleagues from DOME, Roman has established a new phenotypic model that identifies intracellular microorganisms. Roman's results suggest that the extended PICA framework can be used to automatically annotate phenotypes in near-complete microbial genome sequences, such as those generated in large numbers by current metagenomics studies.

The results of Roman's master's thesis were presented in a talk at RECOMB Comparative Genomics 2015 and have been published in the journal BMC Bioinformatics.