3DAROC18: 3C-based data analysis and 3D reconstruction of chromatin folding


IMPORTANT DATES for this Course

Deadline for applications: Aug 31st 2018
Course date: Sep 17th to Sep 21st 2018

Candidates with an adequate profile will be accepted within 72 hours of applying, until we reach 20 participants.


Instructors:

Marc A. Martí-Renom obtained a Ph.D. in Biophysics from the Universidad Autonoma de Barcelona, where he worked on protein folding under the supervision of B. Oliva, F.X. Aviles and M. Karplus. After that, he went to the US for postdoctoral training in protein structure modeling at the Sali Lab (Rockefeller University) as the recipient of the Burroughs Wellcome Fund fellowship. Later on, Marc was appointed Assistant Adjunct Professor at UCSF. Between 2006 and 2011, he headed the Structural Genomics Group at the CIPF in Valencia (Spain). Currently, Marc is an ICREA research professor and leads the Structural Genomics Group at the National Center for Genomic Analysis - Centre for Genomic Regulation (CNAG-CRG) in Barcelona. His group is broadly interested in how RNA, proteins and genomes organize and regulate cell fate. Marc is also an Associate Editor of the PLoS Computational Biology journal and has published over 90 articles in international peer-reviewed journals.

Affiliation: Centro Nacional de Análisis Genómico (CNAG) and Center for Genomic Regulation (CRG), Barcelona, ES

François Serra obtained his Degree in Biology, specialized in Physiology and Neurophysiology, his Master's Degree in Structural genomics and bioinformatics (Strasbourg I University, France) and his PhD in Evolutionary Genomics in the Department of Bioinformatics at the CIPF (Valencia). He is now part of the Structural Genomics team of Marc A. Martí-Renom at CNAG-CRG (Barcelona). His main research interests are grounded in comparative genomics and evolution, with a special focus on the effect of evolution on the structural arrangement of genomes. He has taught MEPA and 3DMOG for GTPB, as well as similar courses at CIPF (Valencia, ES) and the Department of Genetics of the University of Cambridge (UK).

Affiliation: Centro Nacional de Análisis Genómico (CNAG) and Center for Genomic Regulation (CRG), Barcelona, ES

David Castillo obtained his MSc in Photonics from the Universitat Politècnica de Catalunya in Barcelona (Spain), where he worked on super-resolution microscopy. He has a background in Physics and Engineering. He works as a technician in the Structural Genomics team of Marc A. Martí-Renom at CNAG-CRG (Barcelona), developing tools for the analysis, modelling and visualization of Hi-C data. He is also interested in integrating microscopy into the modelling of genomic 3D structures.

Affiliation: Centro Nacional de Análisis Genómico (CNAG) and Center for Genomic Regulation (CRG), Barcelona, ES

Course description

3C-based methods, such as Hi-C, produce a huge amount of raw data as pairs of DNA reads that are in close spatial proximity in the cell nucleus. The interaction matrices built from these read pairs have been used to study how the genome folds within the nucleus, which is one of the most fascinating problems in modern biology. Rigorous computational analysis of these paired reads has been essential to fully exploit the experimental technique. The amount of available data on genome structure is growing rapidly, with many Hi-C datasets now available at resolutions down to 1 kb (see for example: http://hic.umassmed.edu/welcome/welcome.php ; http://promoter.bx.psu.edu/hi-c/view.php or http://www.aidenlab.org/data.html). In this course, participants will learn to use TADbit, a software package designed and developed to handle all dimensionalities of Hi-C data:
  1. 1D - Map paired-end sequences to generate Hi-C interaction matrices
  2. 2D - Normalize matrices and identify constitutive domains (TADs, compartments)
  3. 3D - Generate populations of structures which satisfy the Hi-C interaction matrices
  4. 4D - Compare samples at different time points
Participants can bring specific biological questions and/or their own 3C-based data to analyze during the course. At the end of the course, participants will be familiar with the TADbit software and will be able to fully analyze Hi-C data.

Note: Although the TADbit software is central to this course, alternative software will be discussed for each part of the analysis.
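As a rough illustration of what the 1D and 2D steps listed above involve, the sketch below bins the two ends of mapped read pairs into a symmetric interaction matrix and then applies a simple square-root coverage normalization. This is not TADbit code and does not use the TADbit API; the function names, the toy read positions and the bin size are all hypothetical, and the normalization shown is only one of several approaches in use.

```python
from math import sqrt


def bin_read_pairs(pairs, bin_size, n_bins):
    """Count read pairs per pair of genomic bins (symmetric matrix)."""
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in pairs:
        i, j = pos1 // bin_size, pos2 // bin_size
        matrix[i][j] += 1
        if i != j:
            # Mirror off-diagonal contacts so the matrix stays symmetric
            matrix[j][i] += 1
    return matrix


def sqrt_coverage_normalize(matrix):
    """Divide each cell by sqrt(row_sum_i * row_sum_j); empty bins stay at 0."""
    sums = [sum(row) for row in matrix]
    n = len(matrix)
    return [
        [
            matrix[i][j] / sqrt(sums[i] * sums[j]) if sums[i] and sums[j] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical toy data: positions (in bp) of the two ends of each read pair
pairs = [(100, 1500), (200, 300), (1200, 2500), (2100, 2900), (150, 2700)]
raw = bin_read_pairs(pairs, bin_size=1000, n_bins=3)
norm = sqrt_coverage_normalize(raw)

for row in norm:
    print(" ".join(f"{value:.3f}" for value in row))
```

Real Hi-C data need many additional steps before binning (read mapping, filtering of artefactual pairs), which is what the hands-on sessions cover.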

Target Audience

The course is aimed at experimental researchers and bioinformaticians at the graduate and post-graduate levels. The last edition of this course was attended by people with different backgrounds who shared an interest in genome organization.
Moreover, Hi-C data have recently been used in metagenomics studies to accurately cluster metagenome assembly contigs into groups that contain nearly complete genomes of each species.
Participants are likely either to be planning to generate Hi-C data for chromosome structure determination, or to want to correctly interpret and analyse publicly available data.

Course Pre-requisites

Linux and basic Python programming skills are recommended, along with a graduate-level background in Life Sciences.
All hands-on sessions will be given at 3 levels of computational expertise (web platform, command-line tool and Python scripting).

TADbit API

This tutorial is associated with a specific version of TADbit. If you wish to reproduce the results exactly, you should use the version of TADbit tagged 3DAROC_2018.

To install this version, please issue these commands:

git clone https://github.com/3DGenomes/TADbit

cd TADbit

git checkout tags/3DAROC_2018

sudo python setup.py install

TADbit tools

Most of the tasks of the "core pipeline" can be run directly from the command line (without any Python), using the TADbit tools. Have a look at the commands and at the metadata of the results.

For now, the TADbit tools are not included in the general documentation, as they are still under active development. Use them carefully, and don't hesitate to report any unexpected behaviour you observe.

Virtual research environment

With small datasets, the TADbit core pipeline can be run through a new Virtual Research Environment (VRE), hosted by the MuG project. This might also be the best place to try TADkit for visualizing genomes in 3D together with interaction matrices and any other genomic track.


Applications

Detailed Program

Support and sponsorship

Instituto Gulbenkian de Ciência,

Apartado 14, 2781-901 Oeiras, Portugal

Last updated: July 14th 2018