We have structured the IRTG around the three core areas of high-throughput genomics (HG), computational biology (CB) and developmental systems biology (DS).
Integrating projects across these three distinct core areas, we have defined competitive PhD topics that address current questions of general interest in the field, e.g.
Thesis topics from different areas complement one another, with several projects often attacking the same problem from different perspectives; students will need to work together to answer their research questions.
With the advent of genome-wide sequencing approaches, it has become possible to map the linear genomic information of a variety of organisms. The development of chromatin immunoprecipitation helped to identify molecular signatures of biological function across the genome and, importantly, how these signatures vary across cell types and organs. As technologies have improved, we have obtained an increasingly fine-scaled understanding of genomic function with the inclusion of new features such as nucleosome positioning and ‘open’ chromatin states permissive to regulation through the binding of specific transcription factors. Deep transcriptome sequencing has expanded our view of RNA from coding messenger RNA to include nascent transcripts and non-coding RNAs, including enhancer RNAs (eRNAs) and long non-coding RNA molecules that have structural and regulatory functions which we are only beginning to understand. How these different transcription events are regulated remains largely unexplored. Lastly, technologies that map chromatin 3D topology have shown the importance of 3D interactions for gene regulation at multiple scales.
While high-throughput methods allow for extensive characterization of regulatory landscapes, inferring causal relationships from high-throughput data requires advanced computational approaches that can account for and infer the inherent heterogeneity of cells and regulatory mechanisms. Improved technologies for perturbing and dissecting complex regulatory networks are also essential and are the focus of several PhD projects, which are aimed at fine-scale perturbations of regulatory processes as well as at single-cell approaches for tissues and organisms composed of cells defined by distinct regulatory states. These projects will provide students with skills that go beyond descriptive regulatory profiling to the dissection and manipulation of gene expression regulation in complex tissues.
Data arising from high-throughput technologies cannot be analyzed without sophisticated bioinformatics. Genomics, as a key technology for studying regulatory networks in the cell, is tightly coupled to modern data analysis techniques collectively referred to as machine learning. Bioinformatics students who work in the labs of computational experts, such as the co-PIs from Duke and Berlin, are continuously developing new approaches; biology students need to become familiar with these tools. In the context of the joint Berlin-Duke interdisciplinary projects, students will be exposed to a broad spectrum of computational techniques and to how they are applied to experimental data for studying gene regulation.
Several game-changing technologies are behind the data deluge. Modern sequencing technologies, used in ChIP-seq and related protocols, have provided new means to access regulatory information in the cell. This yields genome-wide knowledge of, for example, open chromatin regions or regions bound by transcription factors. Furthermore, reporter assays are being carried out at genome-wide scales, providing extensive information on sequences encoding regulatory functions. In all these situations, computational methods need to extract subtle signals from large amounts of sequence data that have been coupled with a functional read-out. Machine learning methods in general, and recent deep-learning approaches in particular, are the tools of choice for this task. The PhD projects are devoted to this interface, leading students directly to the cutting-edge challenges of understanding gene regulation.
A key feature of early multicellular embryogenesis is that it consists of a branching process of essentially binary cell-fate decisions. In some cases, this involves the division of a cell into daughter cells with two separate, differentiated states. In other cases, one daughter cell retains pluripotency while the other becomes specified, as is the case for maintained stem-cell populations. These cell-fate decisions are governed by regulatory interactions that underlie cell-type-specific patterns of gene expression. Identifying the proteins and DNA regulatory elements involved in this process constitutes a major aim of modern developmental genetics, but is complicated by the fact that developmental interactions occur within heterogeneous embryos and often involve cooperative, non-linear interactions between diverse proteins and DNA regulatory elements. These challenges are now being met by an increasing diversity of technical and computational approaches, including single-cell sequencing, CRISPR-based genetic manipulations, and improved computational modelling.
Within this aim, we hope to leverage the technological and computational advances that are the subject of HG and CB projects to annotate, explore, and perturb the regulatory interactions involved in developmental cell-fate specification. In the process, we aim to train a set of graduate students who are well grounded in the principles of classical developmental genetics and prepared to tackle developmental questions using a wide variety of modern genetic and computational techniques.