He has improved and extended the PICA framework, which uses machine learning to predict phenotypic traits. Roman has demonstrated that it can be applied to large-scale genome databases and to incomplete genome sequences: most traits can be reliably predicted from genomes that are only 60-70% complete. In collaboration with colleagues from DOME, Roman has established a new phenotypic model that identifies intracellular microorganisms. Roman's results suggest that the extended PICA framework can be used to automatically annotate phenotypes in near-complete microbial genome sequences, which current metagenomics studies generate in large numbers.
The results of Roman's master's thesis were presented in a talk at the RECOMB Comparative Genomics 2015 conference and published in the journal BMC Bioinformatics.
Links: