This term, I'm once again teaching my upper-division CSE undergrad
course in Web Dev here at MSU. For the second time, I'm requiring students
to use github for their homework; unlike last year, I now understand pull
requests and have integrated them into the process.
How does it work, basically?
The instructions for handing in homework are as follows:
Merge hw3 into your master. Please don't delete the 'hw3' branch :)
Hand in this homework on branch 'hw4' on your github account, and
set up a pull request between hw4 and your master branch. Don't merge
it, just set up the PR.
This sets up a standing pull request for HW 3 that lets me (and other students
-- more on that in a bit) comment on the commit.
When I go to grade homework, I go into a checkout that I keep of each
student's directory; importantly, it's named for their github repo.
Here I can easily do:
git checkout hw4
and get their HW 4.
Next, I run stuff, fix minor bugs, comment on code, commit changes,
and otherwise interact with their homework. I can then push my changes
and comments to a branch on my github repo:
git push firstname.lastname@example.org:ctb/cse491-serverz.git hw4:hw4-$dir-comments
and create a pull request automatically by constructing the right URL:
open -a "/Applications/Google Chrome.app/" https://github.com/ctb/cse491-serverz/compare/$dir:hw4...ctb:hw4-$dir-comments?expand=1
I typically send a private e-mail with their grade in it at this point, along
with any requests or general comments I have (fix your tests, think about
Note that leaving the hw3 branch lying around lets me backtrack and check
old code more easily. Sure, I could do this with 'git log' but it's
nice to have the branch there.
Code sharing policy
The github repos are public, which presents a challenge: how do you prevent
Well, my general policy in this class has always been that "code
should work; I don't care where you got it." My experience has been
that some number of students simply fail to even copy code effectively,
and that level of evaluation is sufficient for me.
That having been said, this year I now require that students have their
own names on the commits -- they can't just completely swap in someone
else's commits. This is antithetical to the whole version control and
credit-where-credit-is-due thing but I found that git made it too easy
to copy functioning code. I'll let you know how this goes.
Grading only a subset of HWs
Grading this stuff is quite a chore; I can do 5-10 HWs in a single
sitting, but ~40-45 is too much. I don't hand this off to TAs -- I
prefer to look at the code myself, and I also generally find that my
TAs don't know the material that well (they're usually my grad students
but that doesn't mean they're web devs!)
So, this year, inspired by a post on Tomorrow's Professor, I am only
grading 5-10 homeworks a week. I am choosing them more or less
randomly, and I expect to grade each person's work a few times during
the term. As the term goes on I will target people who need a bit
more help. I do take requests, too, so if people want my feedback I'm
sure to look at their code.
So how do I assign grades to the ones I don't grade? The rule is that
if I don't grade your HW, you automatically get 100%. However, a key
component of this is that you must have handed in something
approximating the homework, and it must have been handed in on time;
if I find a bunch of empty HW branches (and I _will_ notice), the student
will be in trouble.
Thus far I like the way this is working out. I've been able to do a
much better (less hurried, better comments) job of grading than in
previous years, and of course the students don't mind. Because the
homework is cumulative, I expect this approach to still effectively
sort the students.
I use tests (nose) and code coverage (Ned Batchelder's excellent
coverage module ) to
target my grading. If there are uncovered bits of code, that's the
first place to look for bugs. I also do some exploratory testing (aka
"clicking on things and waiting for stuff to break) and skim the code,
but targeting missing code coverage is an amazingly effective way
to find bugs.
I am planning to set up continuous integration on each repo and pull
request, largely as a tool to help the students. I'll let you know
how that goes.
To find people who may not have handed in HW, I can run a little
shell script, check.sh
for i in $(cat github-list.txt)
echo $i $(git branch -r | grep hw4) >> ../out
that lets me see who does not have a hw4 branch.
I'm thinking about writing a little RSS feed examiner to see who has
turned in stuff, too. Probably redundant at this point.
Enabling code review, and a crazy idea
I'd really like to get the students into code review, but it's remarkably
hard to take people who are not that experienced at software development and
get them to critique each other's code. I don't scale particularly well,
and so I can't really help the students learn to do this 1-1. So... how?
One thing I'm trying out is to have students do code review in class,
on each others' pull requests. This works ... OK, but there's not really
enough time in class to do it.
What else could we do?
Since I have a range of student expertise in the class, why not get
the more expert students to help out? It will give the more expert
students some code review experience, and it will help the less expert
students work through problems.
Here's my idea:
- if a student is having trouble, they can request an extension.
- as part of the extension, I will assign them a mentor from a
pool of people who are generally doing well.
- the mentor will go through their pull request and make comments.
- the mentee will then be able to work through the problems, potentially
with back-and-forth with the mentor.
- the mentee gets an extension and probably a better grade; the mentor
gets project credit towards their final grade.
This can be gamed, and undoubtedly will; in particular, I expect
people who are perfectly capable of getting it to work but who just
want an extension to ask for one. But do the benefits significantly
outweigh the loss from gaming? I'll let you know how I feel about it
Suggestions on modifications welcome, of course. I'm thinking that
there will be a non-zero chance that I will refuse the extension
request, which should keep people on their toes. Other thoughts?
What isn't working?
The students are still fairly new to git, and I haven't been forcing
them to do tricky stuff with it -- as we know from Software Carpentry,
git is very difficult to learn, so I'm taking it slow. (This is the
students' first introduction to distributed version control, and I don't
think many of them are really experienced at version control, period;
we don't introduce it at MSU until 3rd year!)
This leads to one big problem -- the pull requests often contain
significant extraneous code, like entire virtualenvs; and sometimes
the PRs aren't useful for other reasons, like someone swapping in a
whole bunch of new code. But this is still a fairly rare problem.
This is all an experiment, but so far I've been impressed with how
well git and github are working. I'd love to hear about other
experiences; drop me a line or comment below.
In previous years, I've gotten feedback that students really find the
testing and continuous integration stuff useful for their next jobs.
This year, I hope to get the same comments about git and github.
There are comments.