Note: This is an invited blog post by Dr. Karen Word on our 2017 sequence analysis workshop, ANGUS. Our 2018 schedule has now been posted!

DIBSI Assessment: preliminary ANGUS report

There are two sorts of stories I find myself telling based on my own experience talking to people & reading through things:

1) This workshop catered to learners of many abilities, and no group was categorically dissatisfied. Satisfaction for these different groups was based on different outcomes, e.g. for novices building a mental model & confidence was considered success; for practitioners validating practices and acquiring new tools, tricks, and contacts was more likely to constitute success.

2) From our perspective, a desirable outcome would be the creation of a community within which learners can support each other in solving problems to minimize the strain on experts. Learners largely did not seem to be seeking this kind of community -- even where they valued socialization, it was rarely with an eye towards technical support. The exception seemed to be active practitioners who had already been seeking support within their own community and were able to understand other learners as a resource for help.

Basic demographics:

Note: demographics were collected in the pre-assessment only. Individuals who did not take the pre-assessment are not represented here.

The "other" category above was selected mainly by postdocs, but also includes DVM, MD, BS-holding professional, and one "Assistant professor." This is a similar mixture as those cited under "PhD-holding professional" hence with few exceptions these categories could be combined (yellow & green) indicating similar representation of advanced degree-holders relative to graduate students.

A few response summaries:

People were overwhelmingly satisfied with the workshop:

Curiously, even learners who indicated that they left with "no ability" to install and run genome assembly software still largely felt that their needs had been met:

Other metrics are similarly supportive. 89% of respondants indicated that they learned what they hoped to learn, and 96% say they would recommend the workshop to colleagues.

We asked two open-response content questions on the pre- and post-assessments. One of them was as follows:

Suppose that you are using Illumina to sequence DNA from a mouse sample that should have genetic differences from the mouse reference genome. Discuss one or more approaches you would take to analyze the data, as well as your expected sensitivity to SNPs and indels. Include in your discussion how much you will miss, and how much you find that will be wrong.

This word cloud shows the most common terms from pre-assesment responses to the question above, scrubbed of most terminology present in the question:

vs. Post-assessment responses to same, reflecting common details acquired:

The figure below shows pre vs. post self-evaluation of ability in various relevant subject areas. Substantial changes are evident in all categories except Python programming, a skill that is not directly taught in the workshop.

The figure below shows more specific responses to "I know how to" or "I understand" questions on a Likert-type scale. Progress is evident in all categories, again with python scripting ("Qscriptpy") taking up the rear. The greatest progress is in knowing what relevant tools do.

A few quotes:

In response to "please comment on the extent to which you learned what you expected to learn":

Success as defined by a learner who does not plan to analyze their own data:

1 . I wanted to understand the generic pipeline for analyzing RNA sequencing data- SUCCESS 2. I wanted to learn the fundamental skills needed to use tools in the shell/R- SUCCESS 3. I wanted to feel comfortable talking with bioinformaticists about how they performed analyses, so I can be sure that they did what I need them to do- SUCCESS Additionally, I received a great introduction to the concepts of "open science" and "making stuff actually reproducible," things which are not emphasized in my program!

Success for learner who has data that have previously been analyzed, but wants to become more independent:

I came with zero to none programming skills and had no idea how to get started with analyzing next-gen sequencing data. I was sitting on RNA-Seq datasets that someone else had analyzed for me and felt so powerless and overwhelmed. Since attending the workshop, I was able to get my own Jetstream allocation, move my data into the new server, do genome assembly --> Diff gene expression all on my own, while generating more than decent plots using R. I feel immensely relieved that I have all the basic resources I need to get started on exploring Bioinformatics and all the possible research directions that can open up as a result of this. I also met some freaking awesome people, helping build a network of skilled Bioinformaticians I can reach out to if I ever need advice/help. For someone coming from an obscure univ which doesn't have any bioinformaticians at all, this is huge. Thank you DIBSI organizers!!

Another learner who wants to analyze their own data and came in with some computational skill:

I work with non model organism and this workshop turned out to be great to answer my queries and how to tackle genome and rna seq data analysis of such organisms. Also, the knowledge of tools used would further help me perform assembly and analysis of my own data set. I feel way more confident in running the tools for analysis after participating in this workshop

In response to "Please comment on how your understanding of computational science changed":

The perspective of an advanced learner:

This workshop pushed me from being an adequate programmer who can google things and piece together a decent pipeline to get the necessary data, to understanding what's going on under the hood at each step and analyze data quality much better.

And another:

I had some prior experience of bioinformatics tools used to analysis rna seq and DNA seq data, but this workshop further helped me with understanding new tools such as R studio , markdown and writing own scripts.

Whereas from people reporting lower computational skills, one benefit was gaining focus on what was important or learning what they did not need to know:

My initial thought was that I had to obtain a year long computational course in order to be able to assess genetic data. With this course I learned that all I need is the tools specific to proceed with the process of mapping, and analyzing data. I Was also impressed that with working on the cloud instance there is no need for a huge memory computer. And many people spoke of reducing a barrier of fear for computational processes and/or seeking help: I have a base understanding now of what forms the data comes in and how to appropriately prepare the reads for downstream analyses. There are many steps in the pipeline and many options for programs. I do feel that I have more confidence to try to tackle analysis of my own data and to write scripts. Accessing the cloud or shell used to be very intimidating for me, but I feel more comfortable now. Most importantly, I have gained enough knowledge to actually be able to ask the appropriate questions of my systems admin and colleagues that are more computationally inclined when I get stuck on certain analyses.

Complaints & Suggestions:

A request for consistent embedded pipeline visualization recurs in various responses (this was discussed in Tigers room, possibly explaining this specific trend). Other learners referred to wanting more "big picture" introductions or wanting to know "why" things were being done.

Several people would also like to get more practice with the tools or more opportunity to work with their own data.

One person observed that we target computational novices well but not necessarily genetics novices.

There were two individuals who indicated that they would not recommend this workshop to colleagues. One of them indicated that they do not have colleagues who work with NGS data. The other indicated objections to instructional style and a sense that the workshop was "too casual" and lacked organization. However, elsewhere they seem almost to be responding on behalf of less experienced learners, stating: "I learn some good tricks and got some good tricks. If this was my first workshop, I will felt lost after the first week."

A few pertinent recommendations:

I suggest articulating plans for advance embedding of formative assessessment in the curriculum since:

Formative assessment can functionally provide practice with the tools in small ways -- while not providing exactly what is requested, this may resolve the feeling that learners have not "played" with the tools.
Formative assessment can also raise "why" questions to prompt discussions of broader connectivity as necessary

It would be relatively straightforward to provide a pipeline roadmap and a vocabulary list (or glossary) for computational and genetic terms and abbreviations. (I suggest having these on paper to avoid adding to overcrowded screens)

We had very few people who reported lacking background in genetics, but we also did not survey for this directly. We should consider whether ths is something we plan to address, and consider adding language in the course description if it is not.

Finally, regarding assessment in future years, I recommend that we more directly inquire as to the goals that attendees have coming into the workshop. I also suggest that we ask about ways in which their home community expects to rely upon their training. Given that success appears to take such different forms for different learners, this would help us to more precisely assess the extent to which we are meeting those varying needs. Our plans for retrospective surveys and interviews will also help tease apart the impact that these different kinds of experiences have on careers and communities in the long term.

Living in an Ivory Basement Stochastic thoughts on science, testing, and programming.

Assessment report for ANGUS 2017