Programming in Python for Biologists

Downloadable poster in PDF

   IMPORTANT DATES for this Course
   Deadline for applications: Nov 10th 2017
   Latest notification of acceptance: Nov 12th 2017
   Course date: November 27th - November 30th 2017

Candidates with an adequate profile will be accepted within 72 hours of applying, until we reach 20 participants.


Allegra Via is a physicist and scientific researcher at the Institute of Molecular Biology and Pathology (IBPM) of the National Research Council (CNR, Rome, IT). In 2003, she obtained a PhD at the University of Rome "Tor Vergata", where she also worked for six years as a postdoc. In 2009 she moved to the "Sapienza University" as a researcher, and since January 2014 she has been the Training Coordinator of ELIXIR Italy. She is involved in the design, organisation and delivery of bioinformatics training courses and in Train-the-Trainer activities, where she co-heads the Train-the-Trainer subtask of the EXCELERATE project, and she collaborates with other ELIXIR nodes on many training-related initiatives. She has a long track record of academic teaching (Macromolecular Structures, Python programming, Bioinformatics, Biochemistry, Protein interactions). Her main research interests include protein structural bioinformatics, protein structure and function prediction and analysis, and protein interactions. Additionally, she is strongly interested in what researchers have discovered about how people learn, how best to teach them, and how research findings in the science of learning (educational psychology) can be translated into common teaching practice. She is a member of the Global Organisation of Bioinformatics Learning, Education and Training (GOBLET), and a Software/Data Carpentry Instructor and Instructor Trainer. She has collaborated with GTPB since 2010 as a course designer and instructor, specifically in the provision of face-to-face courses on Programming in Python.

Affiliation: Institute of Molecular Biology and Pathology (IBPM) of the National Research Council (CNR, Rome, IT)

Pedro Fernandes graduated in Electronics and Telecommunications Engineering at IST (U.T. Lisboa). He worked in Biomedical Engineering, Biophysics and Physiology and moved to Bioinformatics in 1990. He established the first user community in Portugal around the national service provided by the Portuguese node of the EMBnet. In 1998 he created the Gulbenkian Training Programme in Bioinformatics, which has provided user skills to more than 5000 course attendees throughout its eighteen years of existence. In 2002, in cooperation with Mario Silva from FCUL, he designed a graduate Programme in Bioinformatics. He currently teaches Bioinformatics in both graduate and undergraduate programmes.

Affiliation: Instituto Gulbenkian de Ciência, Oeiras, PT

David P Judge is a Computer Scientist who has taught Bioinformatics since 1985. He initiated the University of Cambridge Bioinformatics Training Facility, providing the necessary environment for graduate and undergraduate courses. He has also been involved with training programmes at the European Bioinformatics Institute, the Wellcome Trust Sanger Institute and the Instituto Gulbenkian de Ciência (IGC) through GTPB. He currently teaches Bioinformatics in several international training programmes and is regularly invited to teach in many places in Europe, Asia, Africa and America. His course notes and exercises are well known to the international community of Bioinformatics professionals and users, many of whom (difficult to count) have had their first contact with Bioinformatics through him. David helped to pioneer Bioinformatics Training at the IGC in 1989, as part of an MSc course. Since its inception in 1999, he has contributed to the GTPB several times per year.

Affiliation: Freelance independent Bioinformatics instructor.
Former manager of Bioinformatics Training, University of Cambridge, Cambridge, UK

Course Description

Python is a programming language that is ideal for data processing. The course will start from zero knowledge, and will introduce the participants to all the basic concepts of programming such as calculating, repeating things, making choices, reading and writing files, filtering and organising data, program logic and writing larger programs. The examples and practical sessions will mostly focus on managing biological data.

In particular the sessions will cover:
- parsing common file formats (Uniprot, GenBank, PDB, BLAST)
- data retrieval from files and their manipulation
- finding motifs in sequences
- manipulating tables
- plotting data
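
To give a flavour of two of the topics listed above (parsing a common file format and finding motifs in sequences), here is a minimal sketch in plain Python. It is not taken from the course material: the function names `read_fasta` and `find_motif`, the example sequences and the simplified N-glycosylation-like pattern are all assumptions made for illustration.

```python
import re


def read_fasta(lines):
    """Parse FASTA-formatted lines into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = ""
        elif header is not None:
            records[header] += line  # sequences may span several lines
    return records


def find_motif(sequence, pattern):
    """Return the 1-based start positions of non-overlapping regex matches."""
    return [match.start() + 1 for match in re.finditer(pattern, sequence)]


# Made-up example data (hypothetical sequences, for illustration only)
example = """>seq1 hypothetical protein
MKTNGSA
ALNVTQ
>seq2 another made-up entry
MPLLNKSW
""".splitlines()

# Simplified motif: N, then any residue except P, then S or T
MOTIF = r"N[^P][ST]"

records = read_fasta(example)
for name, seq in records.items():
    print(name, find_motif(seq, MOTIF))
```

In practice, a dedicated library such as Biopython would usually be preferable for parsing real database records; the hand-written parser here only shows the underlying idea.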

The course will be highly interactive and the participants will continuously put theory into practice while learning. At the end of the course, the participants will have a good understanding of programming basics and will have acquired the skills to manage bioinformatics database records and data files. Basic Unix/Linux skills will be provided at the beginning of the course. Participants are welcome to bring to the course the "typical" text file that they may need to filter, manipulate or analyse, such as a table, a file in fastq or fasta format, a PDB file, etc.

Target Audience

End-users of bioinformatics databases and tools who need to manage large files and/or a large number of files, and who aim to develop hands-on capabilities for analysing them autonomously by writing their own scripts or adapting somebody else's.

Course Pre-requisites

Basic familiarity with bioinformatics data resources such as Uniprot/Swiss-Prot, BLAST, ENSEMBL, PDB, etc. The course is directed at biologists with little or no programming experience and aims to make them able to use Python to manage and analyse biological data autonomously.

Detailed Program

Instituto Gulbenkian de Ciência,

Apartado 14, 2781-901 Oeiras, Portugal


Last updated:  Oct 15th 2017