ANGUS: Analyzing High Throughput Sequencing Data

July 2 - July 14, 2018

2017 materials:

Applications are open!

Applications will be accepted on a rolling basis until the workshop is full!

Please also see Frequently Asked Questions.

This intensive two-week summer course will introduce attendees with a strong biology background to the practice of analyzing high-throughput sequencing data (Illumina, PacBio, and Nanopore). The first week will introduce students to computational thinking and large-scale data analysis on UNIX platforms. The second week will focus on genome and transcriptome assembly, transcript quantitation, variant calling, and other topics.

No prior programming experience is required, although familiarity with some programming concepts will be helpful, and bravery in the face of the unknown is necessary. A year or more of graduate school in a biological science is strongly suggested. Faculty, postdocs, and research staff are more than welcome, as are researchers from industry.

A draft schedule of hours for this year is available.

We plan to run multiple workshops of 20-30 participants each.

What will I learn if I attend?

Our goal for these two weeks is to get students to the point where they are ready to begin analyzing their own data on a computer cluster, and can work with help forums and online tutorials to advance their own skills.

Students will gain practical experience in:

  • Python and bash shell scripting
  • Cloud computing/Amazon EC2
  • Basic software installation on UNIX
  • Installing and running Trinity, BWA, Salmon, SPAdes, ABySS, Prokka, and other bioinformatics tools
  • Querying mappings and evaluating assemblies

Materials from previous courses are available under a Creative Commons/full use+reuse license.

You can read a blog post about the 2015 course here:

The course fee will be $850 for this workshop.

Computer requirements

You will need to bring a computer that can connect to wifi and has a modern browser (Google Chrome, Safari, or Firefox) installed. No specific operating system is required.

We will use XSEDE Jetstream academic cloud computing to execute data analysis for the workshop; all analysis will be done remotely.


This workshop was run at Michigan State University’s Kellogg Biological Station from 2010 to 2016, with support from the USDA and NIH (see Funders). Dr. Brown is the founding course director and ran the workshop from 2010 to 2015; Dr. Staton (UTK) and Dr. MacManes (UNH) were the 2016 course directors. In 2017 the course moved to UC Davis with Dr. Brown, and we were able to expand to serve over 80 learners.

There are now almost 300 alumni from the first 8 years!

If you have questions, please contact us via e-mail at