ARANGS16

Automated and reproducible analysis of NGS data

Timetable (provisional)
Date Automated and reproducible analysis of NGS data
Mon, May 9th
Day #1
09:30 - 11:00 Introduction, Installation
Introduce the course concepts, and the basic pipeline and data while students download the required software if they have not already done so.
11:00 - 11:30 Coffee Break
11:30 - 12:30 Introduce the pipeline software requirements, and discuss its compute environment
Using the pipeline as a launching point, introduce the Linux operating system variants, and discuss their different package management systems. Demonstrate the breadth of internet resources available for students to find out how to install the packages they need.
12:30 - 14:00 Lunch Break
14:00 - 16:00 First run of the pipeline
Participants will attempt to use the documentation provided to get their local machines ready to run the pipeline. We will discuss pain points along the way. Instructors will run the pipeline to demonstrate that it works and to show the expected output.
16:00 - 16:30 Tea Break
16:30 - 18:00 Continue running the pipeline, noting any differences in software versions between machines
Tue, May 10th
Day #2
09:30 - 11:00 Morning Wrap-up (what have we done so far?), Virtualization 101, VirtualBox
Review day 1, then dive into virtualization with VirtualBox.
11:00 - 11:30 Coffee Break
11:30 - 12:30 Adding Vagrant to your virtualization toolbox
Participants will learn how to use Vagrant to automate and manage their research pipeline projects. They will learn the Vagrantfile syntax, and where to find free base boxes from which they can start to build the machine environment for the pipeline. They will also use package management applications to install the software needed to run the pipeline on their virtual machine.
12:30 - 14:00 Lunch Break
14:00 - 16:00 Adding Puppet to your virtualization toolbox
Participants will learn about configuration management, and start to use Puppet to automate the process of provisioning their pipeline virtual machine.
16:00 - 16:30 Tea Break
16:30 - 18:00 Run the pipeline inside a virtual machine image provisioned with Puppet
Participants will finish provisioning their pipeline virtual machine with Puppet, and run the pipeline inside the virtual machine.
Wed, May 11th
Day #3
09:30 - 11:00 Morning Wrap-up (what have we done so far?), Vagrant boxes
Participants will learn how Vagrant can be used to store and share the exact machine image that they created, with specific versions of the software required to run their pipeline, and a standard directory structure.
11:00 - 11:30 Coffee Break
11:30 - 12:30 Host machine directory mounts
Participants will learn how to mount local directories onto their virtual machine. They will learn how data from different projects can be plugged into their virtual machine in place of the directory structure it expects.
12:30 - 14:00 Lunch Break
14:00 - 16:00 Introduce Docker
Participants will learn how Docker builds on virtualization with a different approach. They will learn about the Docker toolset: docker, docker-machine, and docker-compose. The session will end with a discussion of Docker concepts:
  • Container Applications compared to Virtual Machine Images
  • Volume Containers to permanently store and share data
  • Volumes to plug different host directories into the directory structure expected by a container application
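As a rough illustration of the last concept, the sketch below builds a `docker run` command line in which each host directory is mapped, with the `-v host:container` flag, onto the path the containerized application expects. The image name `example/ngs-pipeline`, the host paths, and the helper `docker_run_command` are hypothetical and chosen only for this example; the command is printed, not executed.

```python
import shlex


def docker_run_command(image, mounts, command=()):
    """Build a `docker run` argument list with one -v flag per host mount.

    `mounts` maps a host directory to the path the container expects.
    """
    args = ["docker", "run", "--rm"]
    for host_dir, container_dir in mounts.items():
        args += ["-v", f"{host_dir}:{container_dir}"]
    args.append(image)
    args.extend(command)
    return args


# Hypothetical image name and paths, used only for illustration.
cmd = docker_run_command(
    "example/ngs-pipeline",
    {
        "/home/alice/project_a/reads": "/data/input",
        "/home/alice/project_a/results": "/data/output",
    },
)

# Print the command instead of running it, so no Docker install is needed.
print(shlex.join(cmd))
```

Swapping in the directories of a different project changes only the host side of each mapping; the paths inside the container stay the same.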
16:00 - 16:30 Tea Break
16:30 - 18:00 Docker Machine and the Docker command line
Participants will learn how virtualization is used to produce a standard machine environment on any host machine, within which Docker images can be stored and Docker containers can be run. Participants will create their own Docker machine, and learn how to configure the Docker command-line application to work with it. They will also learn how the Docker command line's preconfigured knowledge of the global Docker Hub registry can speed up the design of their environments.
Thu, May 12th
Day #4
09:30 - 11:00 Morning Wrap-up (what have we done so far?), Dockerfile and the Docker build context
We will learn how to create a Docker build context to automate and document the way we build the machine environment.
11:00 - 11:30 Coffee Break
11:30 - 12:30 Docker Compose
We will learn how to use docker-compose to automate the way we build and launch pipeline components. We will use docker-compose as we continue creating the pipeline image build contexts.
12:30 - 14:00 Lunch Break
14:00 - 16:00 Run the pipeline
We will run our pipeline using docker-compose. We will learn how docker-compose can automate volume mounts for data stored locally on the host, and how its logging system can help us debug our applications.
16:00 - 16:30 Tea Break
16:30 - 18:00 Share the pipeline
We will learn how to use Docker to store pipeline machine images, and share them with each other for reuse.
Fri, May 13th
Day #5
09:30 - 11:00 Morning Wrap-up (what have we done so far?), Sharing your pipeline machine environments with the world
We will begin to learn how to use Vagrant to share machine images with the rest of the world.
11:00 - 11:30 Coffee Break
11:30 - 12:30 Sharing your code and Vagrantfile
We will get a brief introduction to using GitHub to share the code and Vagrantfile, and so share a machine image configuration with the world.
12:30 - 14:00 Lunch Break
14:00 - 16:00 Sharing your Docker images, code, and build contexts with the world
We will learn how we can use the worldwide Docker Hub registry to share Docker images with the world. We will learn how to store code in GitHub alongside the Docker build contexts needed to build the images that run it.
16:00 - 16:30 Tea Break
16:30 - 18:00 Final Wrap-up session
Course Homepage

Instituto Gulbenkian de Ciência,

Apartado 14, 2781-901 Oeiras, Portugal

GTPB Homepage

IGC Homepage

Last updated:   Mar 2nd 2016