Note: we were just awarded this allocation on Jetstream for DIBSI. Huzzah!
Abstract:
Large datasets have become routine in biology. However, performing a computational analysis of a large dataset can be overwhelming, especially for novices. From June 18 to July 21, 2017 (30 days), the Lab for Data Intensive Biology will be running several different computational training events at the University of California, Davis for 100 people and 25 instructors. In addition, there will be a week-long instructor training on how to reuse our materials, and focused workshops, such as: GWAS for veterinary animals, shotgun environmental -omics, binder, non-model RNAseq, introduction to Python, and lesson development for undergraduates. The materials for the workshop were previously developed and tested by approximately 200 students on Amazon Web Services cloud compute services at Michigan State University's Kellogg Biological Station from 2010 to 2016, with support from the USDA and NIH. Materials are, and will continue to be, licensed CC-BY, with scripts and associated code under BSD; the material will be adapted for Jetstream cloud usage and made available for future use.
Keywords: Sequencing, Bioinformatics, Training
Principal investigator: C. Titus Brown
Field of science: Genomics
Resource Justification:
We are requesting 100 m.medium instances (6 cores, 16 GB RAM, and 130 GB VM space each), one for each instructor and student, for 4 weeks. The total request is for 432,000 service units (6 cores * 24 hrs/day * 30 days * 100 people). To accommodate large data files, an additional 100 GB storage volume is requested for each person. Persistent storage beyond the duration of the workshop is not necessary.
These calculations are based on our experience running the course on AWS cloud services from 2010 to 2016, with approximately 200 students in total.
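The service-unit arithmetic in the request can be checked with a short sketch. It assumes that one service unit corresponds to one core-hour, which is what the formula in the request implies; the variable and function names are illustrative and not part of the original request.

```python
# Minimal sketch of the allocation arithmetic from the request.
# Assumption: 1 service unit (SU) = 1 core-hour, as implied by the
# formula "6 cores * 24 hrs/day * 30 days * 100 people".

CORES_PER_INSTANCE = 6             # cores per instance, from the request
HOURS_PER_DAY = 24                 # instances are assumed to run continuously
DAYS = 30                          # duration used in the request's formula
PEOPLE = 100                       # one instance per person
EXTRA_STORAGE_GB_PER_PERSON = 100  # additional storage volume per person


def service_units(cores, hours_per_day, days, people):
    """Return total core-hours for `people` instances running continuously."""
    return cores * hours_per_day * days * people


total_su = service_units(CORES_PER_INSTANCE, HOURS_PER_DAY, DAYS, PEOPLE)
total_extra_storage_gb = EXTRA_STORAGE_GB_PER_PERSON * PEOPLE

print(f"Service units requested: {total_su:,}")
print(f"Additional storage requested: {total_extra_storage_gb:,} GB")
```

Changing DAYS or PEOPLE shows how the request would scale for a shorter or larger workshop.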
Syllabus:
http://angus.readthedocs.io/en/2016/
Resources: IU/TACC (Jetstream)