Wed, 30 Dec 2009

My 2nd attempt at teaching Web Development Concepts


I just finished teaching Concepts in Database-Backed Web Development for the second time -- the post-mortem from the first course is here.

In the course, the students implement a reasonably complete HTTP server from the socket library on up, and integrate CSS, JavaScript (jQuery), and a little bit of databases into the mix.

This year I just about doubled the work load, making it a true "senior-level" course. Despite that, most students did pretty well, in part because I let students make up a HW's worth of credit at the end by implementing some additional features for a message board.

I only made a few changes in addition to the increased workload.

The primary change was that rather than having students implement interfaces against an internal API of my own design, I used a subset of the WSGI interface. This meant that they had a "good" interface to work with, could use wsgiref to serve their WSGI apps, and could double-check my test WSGI apps against wsgiref to see how the code should behave.
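For anyone who hasn't seen WSGI, here is a minimal sketch of the idea: an app is just a callable, so it can be served by wsgiref or called directly to check its output. The names `simple_app`, `call_app`, and `serve`, and the port number, are made up for illustration; this is not the actual course code.

```python
from wsgiref.simple_server import make_server
from wsgiref.util import setup_testing_defaults


def simple_app(environ, start_response):
    """A WSGI app: takes the request environ, returns an iterable of bytes."""
    path = environ.get("PATH_INFO", "/")
    body = ("You asked for %s\n" % path).encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Call a WSGI app directly, with no server, and capture what it returns."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


def serve(app, port=8000):
    """Serve the app with the reference server from the standard library."""
    httpd = make_server("", port, app)
    httpd.serve_forever()
```

Calling `serve(simple_app)` starts the reference server, while `call_app(simple_app, "/hello")` exercises the same app without any sockets, which is roughly what makes a shared interface handy for comparing implementations.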

I also outlined a simple message board system (meeplib and meep_example_app) that I had them flesh out with registration, deletion, and persistence. This was so that they didn't have to put it all together themselves but would get to play with some semi-real code.

Successes:

Subversion worked way better this term. Last year it was a FUBARred mess; this year I had students develop in trunk and then 'tag' (using svn copy) their HWs -- you know, a semi-real developmental model...

I used a fast-food service skit (based on Culver's, if you must know) to demonstrate some of the basic issues involved in scalability on the server side. It was at least entertaining and a bit informative... and since it is very hard to actually demonstrate scalability issues in an intro course on Web development, I think I should limit myself to this kind of discussion.

Largely because of meeplib, I think students have a basic idea of how cookies, logins, authentication, persistence, redirects, etc. all work together.

At least two or three students who were marginal at the beginning pulled through quite nicely. Success!

Major failures:

I don't know how to teach databases. At all.

I didn't introduce the students to any real Web dev frameworks.

Students are generally lousy at writing clean code and I did not challenge them in this regard at all. More code reviews!?

I erred on the side of leniency with respect to sloppiness. I often gave students a chance to fix non-parsing code, missing files, etc. It turns out this does not discourage sloppiness. Next time I teach this, students will get a Big Fat Zero when they hand in something that doesn't function at all.

Major challenges:

Many students suck at solving their own problems, even when allowed to work in groups, use the Internet, etc.

Juniors, oddly enough, seem to do better in this course than seniors. I think our CSE program trains students to expect well-spec'ed out problems, and the cognitive dissonance from getting very loose instructions handicaps them.

Testing. Testing testing testing testing. We basically don't teach it, and I think I'm going to refocus the class around testing at the beginning. This leads to...

Reading code. Many students don't seem to be able to read code -- which is one of the most important skills you will need as a programmer.

Coding speed. I cringe at the thought of increasing the homeworks any more, but it's hard to cover the stuff I want to cover without doing so, and the students are (frankly) rather slow at coding. Partly this is because they don't know how to use the available tools well.

Concluding thoughts:

The more I teach, the less I know about teaching and the less capable I seem to be of teaching well.

These kinds of "inductive" classes seem to be popular with a certain subset (60% or more?) of students, who love tackling "real" programming problems. A disjoint subset (~20%) seems to be incapable of dealing with this kind of problem.

Students need this kind of stuff. Mission of CSE.

20%/60%/20%

--titus

posted at: 10:50 | path: /dec-09 | 0 comments



Sat, 26 Dec 2009

Some entertaining issues related to diversity


a.k.a. "why we should care:"

From the Comp Sci dept WIMS mentor:

  • "In terms of design teams and designing products, there's evidence from other industries that if you have just a male team, you could have a flawed product. Let's look at air bags, for example. Only 8 percent of mechanical engineers are female, and most of the teams working on air bags were predominantly male. When air bags were invented based on the male body as the norm, they ended up being potentially deadly to women and children. That's also happened with heart valves and voice-recognition systems; they were geared toward the male." (J. Margolis, in an interview at http://news.cnet.com/2008-1082-833090.html) Of course, a similar concern holds for teams with no minority representation.
  • Is there something in the genetic makeup of women and minorities that explains the lack of interest? Not likely. This was the argument that used to be made for why so few chose science, medicine, law, etc. decades ago. But these fields have made an effort to address some of the barriers to women and minorities and, as a result, their percentages have increased.

Separately, from Gizmodo via the Interesting People mailing list:

This is awkward. It appears that HP's new webcams, which have
facial-tracking software, can't recognize black faces, as evidenced in
the above video. HP has responded:

We are working with our partners to learn more. The technology we
use is built on standard algorithms that measure the difference in
intensity of contrast between the eyes and the upper cheek and nose. We
believe that the camera might have difficulty "seeing" contrast in
conditions where there is insufficient foreground lighting.

HP Face-Tracking Webcams Don't Recognize Black People - Hp - Gizmodo (21 December 2009)

What would you want to bet that that development team had only white people on it?

--titus

posted at: 17:57 | path: /dec-09 | 3 comments



Tue, 22 Dec 2009

Why use buildbots?


I've recently turned my basilisk eye from Web testing and code coverage analysis to continuous integration, as you can see from my PyCon '10 talk and my UCOSP proposal, not to mention everyone wants a pony.

There's some confusion about what "continuous integration" means (see Martin Fowler on CI) so for simplicity's sake I'm just going to talk about "buildbots" that take your code, compile it (if necessary), run all the tests across multiple platforms, and provide some record of the results. (This choice of terms is also confusing because "buildbot" is a widely used Python software package for CI. Sigh.)
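To make that definition concrete, here is a rough sketch of what a "buildbot" in this loose sense does on a single machine: run the build and test commands, and keep a record of the results along with the platform they ran on. The function names and the placeholder steps are invented for illustration; this is not how the buildbot package (or any other CI tool) is implemented.

```python
import platform
import subprocess
import sys
import time


def run_step(name, command):
    """Run one build or test command and record whether it succeeded."""
    start = time.time()
    proc = subprocess.run(command, capture_output=True, text=True)
    return {
        "step": name,
        "command": command,
        "success": proc.returncode == 0,
        "output": proc.stdout + proc.stderr,
        "seconds": time.time() - start,
    }


def run_build(steps):
    """Run steps in order, stopping at the first failure, and report results."""
    results = []
    for name, command in steps:
        result = run_step(name, command)
        results.append(result)
        if not result["success"]:
            break
    return {
        "platform": platform.platform(),
        "python": platform.python_version(),
        "success": all(r["success"] for r in results),
        "results": results,
    }


# Placeholder steps; a real project would put its build and test commands here.
example_steps = [
    ("build", [sys.executable, "-c", "print('compiling...')"]),
    ("test", [sys.executable, "-c", "assert 1 + 1 == 2"]),
]
```

Running `run_build(example_steps)` on several machines and collecting the returned dictionaries in one place is, in miniature, the "multiple platforms plus a record of results" part of the definition.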

why use buildbots?

So, uhh, why use buildbots, anyway?

  1. They build your code and run your tests without your conscious involvement.

Obvious, yes -- that is, after all, ostensibly the point of buildbots. But it has more benefits than you might immediately think.

For this to work, you must have a systematized and automated build process.

You must also have some automated tests.

And your build process and tests are being run on a regular basis, whether or not any particular developer feels like it. And if the build or tests fail, then more likely than not, something changed to make them fail -- and now you'll know.

These are all good and necessary things.

  2. They can build your code and run your tests in multiple environments.

buildbots can build and run your project on whatever operating systems you or your colleagues can access, and report the results to you, with a minimum of setup.

This is the main reason I use buildbots myself: to run tests on other versions of Python, and other operating systems. I'm a UNIX guy, and I develop on Linux; therefore my software usually works on Linux. My pure Python code generally works on Mac OS X, too, although I sometimes run into trouble with compiled code. But I don't ever run my software on Windows systems, because I don't have Windows handy; so my code often doesn't work on Windows. This is where a Windows buildbot comes in really handy, by catching the errors that I otherwise wouldn't even notice.

There's a more subtle point here that many people miss, which is the ability of buildbots to test dependence on a specific full stack of hardware and software. Most developers work with at most one or two build environments, including compiler or interpreter versions, operating system patchlevels, etc. The more different versions you have being tested, the more you can detect sensitivities to specific operating system or compiler or language features; whether or not cross-compiler or cross-version compatibility is important to you is a different question, of course, but it's nice to know.

The most entertaining aspect of this is when buildbots detect when developers -- especially inexperienced ones -- introduce unintended or unauthorized new dependencies. "Hey, Joe, since when does our software depend on FizBuzz!?"

These latter points feed particularly into #3 and #4:

  3. They provide a de facto set of docs on your build & test environment.

buildbots require explicit build instructions, so if you've got one running at least your project has some form of build documentation. Not a good one, maybe not an explicit one, but something.

This is not a concern for most big open source projects, because they usually have fairly straightforward and well-documented build environments (although not all -- OLPC/Sugar was horrific!). Where I think this really helps is for small private projects and especially for academic projects, where the level of software engineering expertise can be, ahem, poor. Having explicit build instructions that graduate student B can use to build & run the code now that graduate student A has left the project is quite helpful.

  4. They are evidence that it is possible to build your code and run your tests on at least some platform.

You might be surprised how much some projects really need this kind of evidence :). As with #3, small private projects and academic projects benefit the most from this.

  5. They can run all the tests, even the slow ones, regularly.

This is the third reason that software professionals like continuous integration and buildbots: many tests (in particular, integration and acceptance tests) may take a loooong time to run, and developers may end up simply not running all of them. With buildbots, you can run them on a daily basis and detect problems, without distracting or defocusing your developers.
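One way to keep slow tests out of developers' way while still letting a buildbot run them is to skip them unless an environment variable is set. This is only a sketch using the standard unittest module; the variable name `RUN_SLOW_TESTS` and the test names are made up for the example.

```python
import os
import unittest

# RUN_SLOW_TESTS is a made-up variable name for this sketch.
RUN_SLOW = os.environ.get("RUN_SLOW_TESTS") == "1"


class ExampleTests(unittest.TestCase):
    def test_fast(self):
        self.assertEqual(sum([1, 2, 3]), 6)

    @unittest.skipUnless(RUN_SLOW, "slow test; set RUN_SLOW_TESTS=1 to run")
    def test_slow_integration(self):
        # Stand-in for a long-running integration or acceptance test.
        self.assertEqual(sum(range(1000000)), 499999500000)


def run_tests():
    """Run the suite and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Developers run the suite as usual and get only the fast tests; the buildbot sets `RUN_SLOW_TESTS=1` in its environment and runs everything.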

Are buildbots overkill for your project?

buildbots require setup and maintenance effort, which (in our zero-sum world) takes that effort away from developing new features, exploratory testing, etc. When does the benefit outweigh the cost?

Almost always, I believe.

For small side projects that you may not be constantly focused on, having the tests alert you when something breaks is really helpful. But even if you're in a mature software engineering setting and you have a good build process, a good set of documentation on how to build your software, and a commitment to running the tests regularly, many of the advantages above still apply. In particular, #1 (building w/o conscious effort), #2 (building across multiple environments), and #5 (running all of the tests, especially the slow ones), are advantageous for all projects.

I think buildbots aren't that useful for projects that are mostly UI (which is hard to develop automated tests for) or that are at a very early stage (where you're accumulating technical debt on a daily basis) or that depend on lots of specialized hardware. What else?

What's next?

I personally think that the technology that's out there in the Python world isn't that simple and hackable, so that's what I'm working on. I'd also like to minimize configuration and maintenance. I have a simple implementation "thought project", pony-build, that I'm hoping will address these issues. The goal is to make buildbots "out of sight, out of mind."

A secondary goal (one of many - watch this space) is to enable simple integration into a pipeline where patches can be tested, and/or automatically accepted or rejected, based on whether or not they pass tests on multiple platforms.
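As a sketch of what that accept/reject decision might look like, assuming each platform reports a simple pass or fail for a given patch: the function below is hypothetical and is not pony-build's API.

```python
def patch_verdict(results, required_platforms):
    """Decide whether to accept a patch from per-platform test results.

    results maps a platform name to True (tests passed) or False (failed).
    A platform with no report yet leaves the patch pending.
    """
    missing = [p for p in required_platforms if p not in results]
    failed = [p for p in required_platforms if results.get(p) is False]
    if failed:
        return "reject", failed
    if missing:
        return "pending", missing
    return "accept", []
```

For example, `patch_verdict({"linux": True, "windows": False}, ["linux", "windows"])` rejects the patch and names Windows as the reason, which is the kind of error a Linux-only developer would otherwise never see.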

--titus

posted at: 12:34 | path: /dec-09 | 0 comments



Fri, 18 Dec 2009

Exhibiting aggressive competence


This last term I facilitated the participation of five MSU students in the Undergraduate Capstone Open Source Projects (UCOSP) program, in which students do distributed open source software development and receive home institution credit. UCOSP was managed out of U Toronto by Greg Wilson, and I was (and am) enthusiastic to participate as it's clearly a good way to bring open source into education.

However, I was less thrilled to see that the majority of the MSU students received, ahem, "less than passing" grades from their project leaders. I knew about the problems in one particular project from having met with the students on a regular basis, but the other results caught me by surprise. I would love to kick and scream and complain that I should have been made more aware of what was going on -- and where I had constructive things to suggest, I did -- but the more important failure may have been a mismatch between the MSU students' approach to these projects, and project expectations.

The students variously had a number of problems, ranging from team miscommunication & poor conduct to an inability to get the software to compile. This meant that for several students, no visible work got done -- for example, in one project, it regularly happened that person X was working on a patch, and person Y committed an overlapping patch first. Or on another project, person Z spent two months trying to get the basic project infrastructure compiled, and was reduced (at the very end) to submitting code fixes without testing them in the full project context. Or several times, person A spent a week working out how to refactor a test into something reliable, and the result was what looked like (and maybe was) a trivial code change.

All of these situations may result (and did result) in low evaluations. This is understandable: no visible work got done, so how is an evaluator supposed to grade them!? Yet, all of the situations are legitimate issues that block progress. What is a student to do?

The answer won't be too hard to guess for anyone who has worked on real-world team projects: make your struggles visible.

Someone steps on your patch? Fine -- submit your patch too, and explain why it's better (or worse) than the first patch. Code review the other patch, while you're at it: who better to do the review than someone who really understands the issues? Then when you get poor marks for not having contributed code, point at your patch. (You are using version control, right?)

Can't compile the software? Fine -- write down what's going wrong, and post it publicly. Document your fix attempts. Ask for help. Bash your head against the wall repeatedly. Either fix the problem, or document the problem thoroughly. Either someone will help you, or you'll figure it out, or you'll leave an audit trail so that others won't have to do all that fail work. Then when you get poor marks for not having contributed any patches, point out that the project has technical issues and either no one could help you (project FAIL) or you spent all your time fixing them.

Trying to debug niggling details that turn out (in the end) not to involve big impressive code changes? Submitting too many unimpressive patches that no one seems to value? Write down why your contributions are valuable. At the end of the day the evaluation may (rightly or wrongly) be "not too smart, but sure did work hard" -- but that's better than "no evidence of any work having been done".

Note how a lot of this seems to involve communication? Right -- that. For team projects, being an effective communicator is more important than being a kick-ass programmer.

At the end of the day, there are things you can control, and things you can't control. You can't control what other people think of you, and you can't control how other people (including project leaders and professors) evaluate you. But you can visibly work hard, and defend yourself based upon that evidence.

I call the general approach of throwing energy at a project "aggressive competence", and I think it's a necessary component of effective team software development. Everyone has days, or weeks, or even months where they look incompetent or ineffective; often that's because outsiders don't understand or appreciate the work that you've done. Tough on you, but I don't think it's reasonable to expect your boss, or colleagues, to look hard at your work to find reasons to praise you. Fundamentally, it's your responsibility to "manage up" and communicate your progress to others effectively.

In open source projects (and elective college courses) the immediate ramifications of a poor evaluation may not be clear -- I'll leave you, dear reader, to figure out the longer term consequences. But I think the ramifications of a poor evaluation are immediately obvious in the context of a capstone course, or a paying job.

Incidentally, this illuminates one of the reasons why I'm such a big fan of UCOSP: it is reality. You're working on an existing project, with other developers, at a distance; and it's not anyone else's responsibility to frame the problem for you. It's your responsibility to make progress.

This is where I think there were mismatched expectations. The students expected that they were going to be managed, helped, and given clear expectations. They weren't. So they got bad evaluations.

What do I plan to do? Well, assuming that UCOSP + MSU goes forward next term, I will be communicating my expectations quite clearly to the students. And I will be asking for regular progress reports, sent to me and CCed to the project leaders. And I'll be sending them this blog post. And I'll be failing the ones that don't listen.

I'll end with a paraphrase of one of my favorite sci-fi authors: "every new developer has problems on a new project. The extent of our sympathy for those problems, however, will be dictated by the efforts made to overcome them."

--titus

p.s. It's also a good way to figure out what projects you don't want to work on: I once got dinged for working too hard in a company; I was told that I was "rowing too fast and the boat was going around in circles." My response (that perhaps others might consider rowing faster) was not received well. That's the kind of job situation you can leave without guilt (as I did).

p.p.s. Code reviews can be an extraordinarily effective passive-aggressive way to correctively interact with jerks on a project, too.

posted at: 11:27 | path: /dec-09 | 9 comments
