Modélisation Formelle de Réseaux de Régulation Biologique 2019

Posters presented at the School


Thomas Denecker
Multi-Omics Data Integration to Model Iron homeostasis in pathogenic yeast Candida glabrata


Candida glabrata is a pathogenic yeast responsible for human fungal infections. Although these infections are usually transient, they can become chronic, mainly in people with severe immune deficiencies. During infections, yeast cells must adapt their metabolism to very different environments. The case of metals such as iron is a perfect illustration. As a commensal organism in the intestinal flora, Candida glabrata is adapted to an environment where iron is available, whereas during infection the same yeast must survive in the bloodstream and epithelial tissues, where iron resources are extremely limited. Access to iron resources is thus a critical element in the relationships between hosts and pathogens.
Iron is essential for multiple cellular processes. A constant balance between iron utilization, iron storage, iron transport and iron uptake is required to maintain iron homeostasis. The molecular mechanisms are well described in the model yeasts Saccharomyces cerevisiae and Candida albicans, and these two species are generally considered as paradigms for non-pathogenic and pathogenic species, respectively. Still, in recent studies, Candida glabrata has been presented as an interesting hybrid model combining features from pathogenic and non-pathogenic yeasts. Our aim is therefore to perform in-depth explorations of transcriptomics data to decipher iron homeostasis in Candida glabrata.
In collaboration with two experimental research teams, we accessed transcriptomics data in which Candida glabrata gene expression was monitored in conditions inducing iron starvation or iron overload in yeast cells. These data were (i) collected and organized in a local database, (ii) carefully inspected to evaluate their consistency in the light of the biological knowledge available for model yeast species, and (iii) used to perform statistical analyses to define lists of candidate genes involved in iron homeostasis in Candida glabrata. Further explorations are in progress to better understand the functional roles of the genes in these lists.


Marine Louarn
Increasing life science resources re-usability using Semantic Web technologies


In life sciences, current standardization and integration efforts are directed towards reference data and knowledge bases. However, the results of original studies are generally provided in non-standardized and specific formats. In addition, the formalization of analysis pipelines is often limited to textual descriptions in the methods sections. Both situations impair the reproducibility of results, their maintenance and their reuse for advancing other studies. Semantic Web technologies have proven their efficiency for facilitating the integration and reuse of reference data and knowledge bases.
We thus hypothesize that Semantic Web technologies also facilitate reproducibility and reuse of life sciences studies involving pipelines that compute associations between entities according to intermediary relations and dependencies.
To assess this hypothesis, we considered a case study in systems biology (http://regulatorycircuits.org), which provides tissue-specific regulatory interaction networks to elucidate perturbations across complex diseases. Our approach consisted of surveying the complete set of provided supplementary data to reveal the underlying structure between the biological entities described in the data. We used this structure to integrate the data with Semantic Web technologies and formalized the Regulatory Circuits analysis pipeline as SPARQL queries.
The result was a dataset of 335,429,988 triples, on which two SPARQL queries were sufficient to extract each tissue-specific regulatory network.


Léo Gerlin
Metabolic Modeling of a Plant-Pathogen Interaction


Plant pathogens are responsible for major agricultural losses. To find agricultural practices able to restrain the spread of these organisms, the first step is to gain a better understanding of the pathogens' biology. For this purpose, major advances have been made in the field of molecular biology, such as the discovery of sophisticated regulatory and virulence systems. Nevertheless, the metabolic behavior of these organisms has been poorly studied. Their trophic preferences are poorly understood, as is the link between pathogenicity and metabolism.
Metabolic modeling is an approach developed to explore the metabolic capabilities of an organism. It relies on genetic, metabolic and physiological data (acquired by sequencing, metabolomics/fluxomics and physiology). It predicts phenotypic parameters such as growth and the secretion of extracellular compounds, and estimates the metabolic routes used inside the cell. The main objective of my thesis is to use metabolic modeling to unravel the trophic preferences of plant pathogenic bacteria and their impact on pathogenicity and interactions with the host.
As only a few metabolic networks of plant pathogens exist, the first part of my PhD consisted of reconstructing the metabolic network of the bacterium Xylella fastidiosa. This made it possible to understand the relation between metabolism and the pathogen's lifestyle. It also revealed metabolic specificities that explain some traits of the pathogen, notably its remarkably slow growth.
A second step, currently ongoing, consists of generating a multi-organ metabolic model of the tomato plant and calibrating it against experimental data. This model represents a whole plant and simulates the exchange compartments between its organs, where the core of the virulence process takes place for our model pathogens (Ralstonia solanacearum and Xylella fastidiosa).
The next step of this work is to integrate the pathogen metabolic network into the whole-plant model. This multi-organism system is challenging from a modeling point of view, since the parasitic relationship between the bacteria and the plant does not satisfy the common metabolic modeling approximations (e.g. no dynamics of metabolites inside the system). To tackle this issue, different methodological approaches are under consideration.


Jules Gilet
Single-cell RNAseq enables modeling the thymic development of Mucosal-Associated Invariant T cells.


Mucosal-Associated Invariant T (MAIT) cells have a unique specificity for microbial metabolites presented by the MHC-1b molecule, MR1. They display antimicrobial activity, and can release cytotoxic mediators upon activation by TCR signaling and by external cytokines. As a subset of T lymphocytes, MAIT cells develop in the thymus, where they acquire an effector-memory phenotype under the control of the key transcription factor ZBTB16. This particular maturation process is in contrast with mainstream T cells, which egress from the thymus with a naive phenotype before populating the secondary lymphoid organs.
While an increasing body of knowledge is available on the mechanisms driving the differentiation and development of NKT cells, much less is known about MAIT cells, notably due to the rarity of these cells in conventional laboratory mouse strains. We make use of a clean wild-derived B6-CAST/MAIT strain (characterized by a 20-fold higher frequency of MAIT cells in the thymus) in conjunction with fluorescently labeled MR1 tetramers loaded with a MAIT ligand (5-OP-RU) to isolate murine thymic MAIT cells. Using a droplet-based single-cell technology (10x), we captured the transcriptomic profiles of individual MAIT cells undergoing positive selection and thymic differentiation.
A graph-based clustering method (Louvain) alongside dimension reduction techniques (tSNE, UMAP) allows the identification of different subsets of thymic MAIT cells according to their differentiation process, from an immature state (ZBTB16-) to a late/mature MAIT1 (ZBTB16+TBX21+) and MAIT17 (ZBTB16+RORC+) phenotype. More importantly, as the gene expression profiles of the captured cells form a continuum, a continuous rather than a discrete representation helps to better understand how the fate of MAIT subsets is programmed. Using nonlinear dimension reduction techniques (diffusion map) or semi-unsupervised machine-learning algorithms (DDRTree), we have been able to represent the MAIT differentiation process on a pseudo-time scale and to reconstitute the order of the transcriptional events during their development. Finally, using network inference techniques, we identified gene regulatory relationships, including the key transcription factor controlling the differentiation and maturation of murine MAIT subsets in the thymus.
Altogether, these approaches make it possible to decipher the molecular mechanisms and the genetic events occurring during the development of MAIT cells in the thymus.


Yvan Sraka
Kappa site-graph patterns equations resolution


Kappa is a rule language for modeling dynamic systems, mainly in molecular biology.
I present here the work of my Master 2 internship, supervised by Jérôme Feret, which consisted of designing an OCaml library in the Kappa static analyzer to reason about the potential contexts in which certain mechanistic interactions can be applied in a given Kappa model.
Consider a mechanistic interaction that can be executed in certain specific contexts, each one encoded as a pattern in the precondition of a rewrite rule. We may then ask what the overall context is (i.e. the union of all the elementary contexts) under which this interaction may be applied, which contexts are not covered, and whether there are contexts for which several rules apply the same interaction.
An algebraic structure of Boolean lattices emerges (union, conjunction, set complement, ...), which makes it possible to reason about contexts and sets of contexts. Note that not all context sets can be expressed as single Kappa patterns; some can only be expressed as a set of Kappa patterns. This algebra of contexts is encoded in a first-order logic.
Closure operators are used to enforce constraints coming from the Kappa structure itself and from structural properties that can be inferred by static analysis. The implementation achieves its efficiency through a choice of data structure close to binary decision diagrams.
The outcome is a set of features that may assist the writing and refinement of models: they help the modeler detect modeling errors and missing cases of rule application, and better understand the causal structures between the different rules of a model.
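
The kind of reasoning described above (union of contexts, uncovered contexts, overlapping rules) can be illustrated with a minimal Python sketch. It is not the OCaml library from the poster: the agent signature, the site names `x` and `y`, the states `u` and `p`, and both rules are hypothetical, and sets of contexts are enumerated explicitly instead of being stored in a structure close to binary decision diagrams.

```python
from itertools import product

# Hypothetical agent signature: each site has a finite set of internal states.
SITES = {"x": ("u", "p"), "y": ("u", "p")}
SITE_NAMES = sorted(SITES)


def all_states():
    """Every complete assignment of one state to each site, as a tuple."""
    return {tuple(combo) for combo in product(*(SITES[s] for s in SITE_NAMES))}


def denote(pattern):
    """Set of complete states matched by a partial pattern (dict site -> state)."""
    return {
        state
        for state in all_states()
        if all(state[SITE_NAMES.index(site)] == value
               for site, value in pattern.items())
    }


def union(*context_sets):
    """Union of several sets of contexts."""
    result = set()
    for context_set in context_sets:
        result |= context_set
    return result


def complement(context_set):
    """Contexts of the signature that are not in the given set."""
    return all_states() - context_set


# Two hypothetical rules applying the same interaction in different contexts.
rule_a = denote({"x": "p"})            # x in state p, y unconstrained
rule_b = denote({"x": "u", "y": "p"})  # x in state u, y in state p

covered = union(rule_a, rule_b)   # overall context of the interaction
uncovered = complement(covered)   # contexts in which no rule applies
overlap = rule_a & rule_b         # contexts in which both rules apply
```

In this toy case `uncovered` contains the single state where both sites are `u`, and `covered` holds three states that no single partial pattern can describe, which mirrors the remark that some context sets need a set of patterns.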


Usha D. Appadu
Formal modelling of the impact of Pleurotus mushroom on energy metabolism in liver cancer


Liver cancer is the second leading cause of cancer-related death worldwide. Despite the latest medical advances, most liver cancer cases are diagnosed at a very late stage due to the disease's asymptomatic nature. Moreover, conventional chemotherapies are limited by the development of drug resistance and various side effects. Because of their non-toxic nature and bio-pharmacological potential, metabolites derived from mushrooms are being studied as an alternative in cancer therapy. Several studies have demonstrated the anticancer, antioxidant, anti-inflammatory and hepatoprotective effects of polysaccharide-protein complexes derived from Pleurotus mushrooms. One such metabolite from Pleurotus mushrooms is ergothioneine (EGT), which has demonstrated in vivo anticancer traits in liver cancer. These findings may hence suggest the use of mushrooms as potential dietary prophylactics in cancer chemoprevention. The aim of this study is to understand the preventive mechanism of Pleurotus mushrooms in liver cancer. To this end, a systems biology approach to energy metabolism in liver cancer cells can be envisaged by abstraction of EGT-mediated pathways. The methods to be used are based on discrete models of the regulation of energy metabolism in cancer and normal cells. Formal methods such as CTL model checking will also be applied in order to confront traces of the models with observations.


Déborah Boyenval
A Discrete Cell Cycle Model: From Phases Characterization toward Observable Properties Verification


The cell cycle is a series of events that lead to the correct duplication of a cell's DNA (S-phase) and its equal distribution into two daughter cells (M-phase). Progression through the cell cycle is driven by a regulatory network of cyclin-dependent kinases (CDKs) and phosphatases. Recent studies highlight non-canonical functions of CDKs and phosphatases, notably in the regulation of carbon and energy metabolism according to cell cycle phases (G1, S, G2 and M phases).
Based on an extension of René Thomas' modeling framework, a discrete model of the regulation of the cell cycle has been designed. Its parameterization has then been constrained using formal methods such as model checking and an ad hoc discrete Hoare logic. Model checking tests whether a model (an interaction graph associated with a parameterization) satisfies CTL formulas expressing biological behavioural properties. Hoare logic constrains parameter values so that the dynamics of the regulatory network is compatible with a biological trace.
In this study, the cell cycle has been considered as a biological trace, determined from experimental observations of the sequence of regulatory events across cell cycle phases. This model will be used to elucidate the causal relations between the cell cycle coupled with other biological systems (e.g. metabolism or the circadian clock) on the one hand, and experimentally observed phase-dependent phenotypes on the other hand. One prospect is the understanding of metabolic reprogramming in healthy and cancer cells.
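
As a rough illustration of what checking a discrete model against a CTL property and against a trace involves, here is a minimal Python sketch. It is not the cell cycle model from the poster: the two-variable negative feedback loop, its parameterization in `target`, and the observed trace are hypothetical. The check is a plain explicit-state reachability search for the CTL operator EF, and trace compatibility is tested directly on transitions, without any Hoare logic.

```python
def target(state):
    """Level toward which each variable is attracted (a toy parameterization).

    Hypothetical network: x activates y, and y inhibits x.
    """
    x, y = state
    return (
        1 if y == 0 else 0,  # x is on only while its inhibitor y is off
        1 if x == 1 else 0,  # y is on only while its activator x is on
    )


def successors(state):
    """Asynchronous dynamics: one variable moves one level toward its target."""
    result = []
    for i, (level, goal) in enumerate(zip(state, target(state))):
        if level != goal:
            step = 1 if goal > level else -1
            result.append(state[:i] + (level + step,) + state[i + 1:])
    return result


def exists_finally(start, goal_predicate):
    """CTL 'EF goal' from `start`: some path eventually reaches a goal state."""
    seen, frontier = {start}, [start]
    while frontier:
        state = frontier.pop()
        if goal_predicate(state):
            return True
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


def admits_trace(trace):
    """True if every consecutive pair of the trace is a transition of the model."""
    return all(b in successors(a) for a, b in zip(trace, trace[1:]))


# Hypothetical observed sequence of (x, y) levels forming one full cycle.
observed = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
reaches_both_on = exists_finally((0, 0), lambda s: s == (1, 1))
trace_is_compatible = admits_trace(observed)
```

A parameterization for which `admits_trace(observed)` is false would be rejected. Filtering parameterizations this way is a naive stand-in for the constraint-based approach named in the abstract.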


Laetitia Gibart
Pancreas cancer modeling: a metabolic approach


Pancreatic ductal adenocarcinoma (PDAC) is the most common type of pancreatic cancer. During the initial stages of the cancer, PDAC remains asymptomatic. Hence, late diagnosis prevents surgery in most cases. This explains why PDAC has a very low survival rate of 5%. In the tumor, epithelial PDAC cells are surrounded by cancer-associated fibroblasts (CAFs), which results in limited nutrient and oxygen resources. To survive this poor supply, epithelial PDAC cells have an adapted metabolism. Moreover, some cancer cells can undergo epithelial-mesenchymal transition (EMT) and invade distant organs to form malignant metastases.
It is hypothesized that the dialogue between CAFs and epithelial cancer cells promotes EMT through the exchange of messengers. To date, no biological experiment has succeeded in identifying the nature and direction of those exchanges: biologists only have knowledge of cell supernatants. These contain many molecules that could be involved in cellular communication, in either direction between epithelial cancer cells and CAFs. This leads to a huge number of hypotheses to test. Modelling those communications by coupling might reduce the number of hypotheses and help us select biological experiments.
The first purpose of the work is to build a discrete model of the regulatory network of energy metabolism for the three cell types involved in PDAC. The second aim is to test several coupling assumptions and to retain only those leading to consistent predicted phenotypes. This coupling step will be the most difficult task because of the combinatorics of communications to consider between the three cell types. Therefore, it is crucial to abstract the metabolism regulatory models at the appropriate level to limit the search space.


Julien Martinelli
A Statistical Learning Algorithm for Inferring Reaction Networks from Time Series Data


With the automation of biological experiments and the increasing quality of single-cell data that can now be obtained by phosphoproteomics and time-lapse videomicroscopy, automating the building of mechanistic models from these time series data becomes both conceivable and necessary for many new applications. While learning numerical parameters to fit a given model structure to observed data is now a fairly well understood subject, learning the structure of the model is a more challenging problem that previous attempts failed to solve without relying quite heavily on prior knowledge about that structure.
In this paper, we consider mechanistic models based on chemical reaction networks (CRNs) with continuous dynamics based on ordinary differential equations, and finite time series describing the evolution of the concentrations of molecular species over a given time horizon, for a finite set of perturbed initial conditions.
We present a statistical learning algorithm to learn CRNs, with a time complexity for inferring one reaction in $\mathcal{O}(t \cdot n^2)$, where $n$ is the number of species and $t$ the number of observed transitions in the traces. We learn both the structure and the reaction rates of the CRN. We evaluate this algorithm and its sensitivity to its statistical threshold parameters, first on simulated data from a hidden CRN, considering successively reactant-parallel CRNs, product-parallel CRNs and general CRNs, and second on real videomicroscopy single-cell data on the circadian clock and cell cycle progression of NIH3T3 embryonic fibroblasts.
In all cases, our algorithm is able to reconstruct meaningful CRNs. We discuss some limitations related to the existence of multiple time scales and highly variable traces.
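
To give an intuition of where a cost proportional to $t \cdot n^2$ can come from, here is a toy Python sketch. It is not the algorithm from the abstract: the function `score_candidates`, the threshold `min_support` and the data are all hypothetical, and the sketch only scores single-reactant, single-product candidates without estimating reaction rates. It scans each of the $t$ observed transitions once and, for each, considers every ordered pair of the $n$ species.

```python
def score_candidates(traces, species, min_support=0.5):
    """Score candidate reactions 'A => B' by how often A falls while B rises.

    `traces` is a list of time series; each time series is a list of tuples
    holding one concentration per species, in the order given by `species`.
    Returns the candidates whose support (fraction of transitions in which
    the reactant decreases and the product increases) reaches `min_support`.
    """
    n = len(species)
    counts = [[0] * n for _ in range(n)]
    transitions = 0
    for trace in traces:
        for before, after in zip(trace, trace[1:]):  # t transitions in total
            transitions += 1
            delta = [b - a for a, b in zip(before, after)]
            for i in range(n):          # candidate reactant: must decrease
                if delta[i] >= 0:
                    continue
                for j in range(n):      # candidate product: must increase
                    if delta[j] > 0:
                        counts[i][j] += 1
    if transitions == 0:
        return {}
    return {
        (species[i], species[j]): counts[i][j] / transitions
        for i in range(n)
        for j in range(n)
        if counts[i][j] / transitions >= min_support
    }


# Toy data: A is converted into B while C stays constant.
trace = [(1.0, 0.0, 0.5), (0.8, 0.2, 0.5), (0.6, 0.4, 0.5), (0.5, 0.5, 0.5)]
candidates = score_candidates([trace], ("A", "B", "C"))
```

On this toy trace the only retained candidate is the pair `("A", "B")`. Noisy traces and multiple time scales, which the abstract mentions as limits, would require a proper statistical test instead of this sign-based count.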