Our projects

Over millions of years of evolution, land plants have developed an astonishing variety of bioactive specialized metabolites (also referred to as ‘secondary metabolites’ or ‘natural products’) to support their defense and ecological adaptation. Such molecules often interact with human molecular receptors, making them an essential source of chemical scaffolds for the development of new medicines. About 25% of prescription drugs currently in use originated from plants.

The structural and stereochemical complexity of plant metabolites often renders their chemical synthesis unfeasible, but recent progress in sequencing and metabolomics technologies has opened a new avenue towards heterologous production of plant metabolites using their native biosynthetic enzymes. However, the discovery and engineering of metabolic pathways from plants remain very difficult. Our lab combines novel computational (e.g., machine learning) and experimental approaches to develop rapid, generally applicable workflows for the discovery and utilization of bioactive molecules derived from plants.

Exploring Biosynthesis & Chemodiversity of Natural Products

Above are examples of specific molecules that we are interested in. Kavalactones from kava are well known for their anti-anxiety effects. We have recently elucidated their biosynthetic network and are currently developing new types of kavalactones using metabolic engineering (see video below).
Resiniferatoxin is the most potent activator of the human pain receptor TRPV1 and is under development as a new type of analgesic for severe pain.
Aconitine is a sodium channel toxin that is used in China as an analgesic and a blood coagulant. Due to its intricate interlocking hexacyclic ring system and the elaborate collection of oxygenated functional groups, aconitine presents a rare example of a small molecule that no chemist has ever been able to synthesize. In our lab, we are working to elucidate the biosynthetic pathway of aconitine, uncovering the enzymatic steps that nature employs to construct this complex molecule. 
Additionally, we are investigating the biosynthetic origins of Piperaceae alkaloids. This plant family is widely recognized as a rich source of bioactive natural products and has been used worldwide as a spice and in traditional medicine, including Ayurveda. For every plant family we study, we also focus on mapping its chemodiversity using LC-MS and various computational metabolomics approaches. This allows us to better understand the chemical complexity and potential bioactivity of the natural products we discover.
In collaboration with the Botanical Garden of the City of Prague, we are running large LC-MS screening campaigns on their plant collection (13,800+ unique plant species) to discover structurally novel phytochemicals and molecular scaffolds. This project is run in coordination with the Digital Botanical Garden Initiative, which aims to digitize the chemo- and biodiversity of plants at a global scale.

Computational metabolomics

In our lab, we have developed advanced computational tools to enhance the identification and interpretation of mass spectrometry (MS) data. We focus on machine-learning approaches and scalable data analysis platforms to improve metabolite annotation and molecular structure elucidation.

MassSpecGym – The first comprehensive benchmark for the discovery and identification of molecules from MS/MS data.
MZmine 3 – A scalable MS data analysis platform supporting hybrid datasets, including LC–MS, GC–MS, IMS–MS, and MS imaging.
DreaMS – A transformer-based foundation model for tandem mass spectrometry that achieves state-of-the-art results across mass spectrum interpretation tasks (e.g., spectral library search or identification of fluorinated molecules).
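
To illustrate what a spectral library search involves, here is a minimal sketch of a classical baseline: binning MS/MS peaks by m/z and ranking library spectra by cosine similarity to a query. This is not how DreaMS, MZmine 3, or MassSpecGym work internally; the function names, the bin width, and the toy peak lists are all assumptions made up for this example.

```python
import math
from collections import defaultdict


def bin_spectrum(peaks, bin_width=0.1):
    """Sum peak intensities into fixed-width m/z bins.

    `peaks` is a list of (m/z, intensity) pairs. The bin width is an
    illustrative assumption, not a recommended setting.
    """
    binned = defaultdict(float)
    for mz, intensity in peaks:
        binned[round(mz / bin_width)] += intensity
    return binned


def cosine_similarity(a, b):
    """Cosine similarity between two binned spectra (dicts of bin -> intensity)."""
    dot = sum(value * b.get(key, 0.0) for key, value in a.items())
    norm_a = math.sqrt(sum(value * value for value in a.values()))
    norm_b = math.sqrt(sum(value * value for value in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def library_search(query_peaks, library, top_k=3, bin_width=0.1):
    """Return the top_k (name, score) library entries, best match first."""
    query = bin_spectrum(query_peaks, bin_width)
    scored = [
        (name, cosine_similarity(query, bin_spectrum(peaks, bin_width)))
        for name, peaks in library.items()
    ]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Toy data: invented peak lists, not real reference spectra.
library = {
    "compound_A": [(100.0, 1.0), (150.0, 0.5)],
    "compound_B": [(200.0, 1.0), (250.0, 0.3)],
}
query = [(100.0, 0.9), (150.0, 0.6)]

hits = library_search(query, library)
for name, score in hits:
    print(f"{name}: {score:.3f}")
```

In this toy case the query shares both of its peaks with compound_A and none with compound_B, so compound_A ranks first. A baseline like this only rewards peaks that fall into the same bins, which is one reason learned spectrum representations are attractive for this task.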


Machine Learning Powered Enzyme Development

Currently, it is impossible to accurately predict the functions of newly discovered enzymes directly from their amino acid sequence or to generate enzymes for specific reactions. We are capable of creating new versions of known enzymes, but no one has yet managed to develop entirely new enzymes for reactions we’ve never seen before.

We are working to overcome this biotechnology bottleneck by developing machine learning models for function prediction and de novo generation of a single, well-defined class of enzymes: terpene synthases (TPSs). TPSs are ubiquitous enzymes that produce the core hydrocarbon scaffolds of terpenoids, the largest and most diverse class of natural products. Our lab curates a freely available database of terpene synthases and their reaction mechanisms.

To further this goal, we’ve developed TerpeneMiner, a state-of-the-art machine-learning pipeline for automated TPS detection and substrate prediction. From the protein sequence alone, our method can detect TPSs and predict their substrates with high accuracy, even for rare classes of TPSs. Using this predictive pipeline, we were the first to report three experimentally confirmed active TPSs in Archaea.



Bioinformatics projects

Our lab maintains the following bioinformatics projects:

We regularly participate in the Google Summer of Code program!

Funding

Our work is possible thanks to grants provided by these funding sources: