Through their metabolism, chassis organisms such as E. coli can natively synthesize numerous chemical compounds. A proper mapping of these compounds to their associated enzymes is essential to metabolic engineering, as it can inform us about which heterologous pathways can be inserted most effectively. We define the metabolic space as the network of all metabolites that can natively be produced (or consumed) using a given set of reactions. The term “space” stresses the idea that any compound that lies within a metabolic space is related to other compounds through (i) structural similarity and (ii) the ability to be transformed by available biochemical reactions. With advances in whole-cell modeling, the information used to build the metabolic space of one (or several) organism(s) is increasingly thorough and easier to access. Nevertheless, effectively exploiting those data to build a metabolic space for specific applications still requires expert knowledge and heavy preprocessing.
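As a rough illustration of the definition above, the metabolic space can be sketched as a directed network built from a reaction list. The reaction names, metabolite labels, and the function `build_metabolic_space` are hypothetical and chosen only for this example; they do not come from any specific model or database.

```python
from collections import defaultdict

# Hypothetical toy reactions: (name, substrates, products).
REACTIONS = [
    ("R1", {"glucose"}, {"g6p"}),
    ("R2", {"g6p"}, {"f6p"}),
    ("R3", {"f6p", "atp"}, {"fbp", "adp"}),
]


def build_metabolic_space(reactions):
    """Return (metabolites, graph) for a list of reactions.

    metabolites: every compound produced or consumed by some reaction.
    graph: substrate -> set of (product, reaction name) edges.
    """
    metabolites = set()
    graph = defaultdict(set)
    for name, substrates, products in reactions:
        metabolites |= substrates | products
        for substrate in substrates:
            for product in products:
                graph[substrate].add((product, name))
    return metabolites, dict(graph)


metabolites, graph = build_metabolic_space(REACTIONS)
```

This sketch covers only relation (ii), transformation by available reactions. Structural similarity would need a chemical representation of each compound, which is left out here.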
We share our recent experience with the application of retrosynthesis to highlight some points worth considering when browsing the metabolic space. In particular, we discuss the integration and modeling of data sources, the comparison of metabolic spaces, and the use of KNIME workflows for user-friendly and reproducible research. Using reaction rules, our KNIME workflows enable one to extend the metabolic space of chassis strains used in biotechnology and to search for pathways linking any molecule of the chemical space to the metabolic space.
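The pathway search mentioned above can be pictured as a backward (retrosynthetic) breadth-first search from a target compound toward the metabolic space. The sketch below is a simplified illustration and not the KNIME workflow itself. It assumes that reaction rules have been reduced to a hypothetical lookup table mapping a product to its possible precursor sets; the compound labels, the rule names, and the function `find_pathway` are all invented for the example.

```python
from collections import deque


def find_pathway(target, native, rules, max_steps=5):
    """Search backward from target until every precursor is native.

    native: set of compounds already in the metabolic space.
    rules: product -> list of (rule name, tuple of precursors).
    Returns rule names in retrosynthetic order (target first),
    or None if no pathway is found within max_steps rules.
    """
    start = frozenset() if target in native else frozenset({target})
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        unresolved, path = queue.popleft()
        if not unresolved:
            return path  # every required compound is native
        if len(path) >= max_steps:
            continue
        compound = min(unresolved)  # deterministic pick of what to expand
        for rule_name, precursors in rules.get(compound, []):
            remaining = (unresolved - {compound}) | (set(precursors) - native)
            state = frozenset(remaining)
            if state not in seen:
                seen.add(state)
                queue.append((state, path + [rule_name]))
    return None


# Hypothetical data: A and B are native; T is the target molecule.
native = {"A", "B"}
rules = {
    "T": [("rule1", ("X", "B"))],
    "X": [("rule2", ("A",))],
}
pathway = find_pathway("T", native, rules)
```

Reversing the returned list gives the order in which the steps would run in the forward, biosynthetic direction. The `max_steps` bound keeps the search finite when rules can be chained indefinitely.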