Thu, 27 Dec 2007

It's not the lines of code, dummy.


Steve Yegge recently wrote a long article, "Code's Worst Enemy", arguing that sheer size (too many lines of code) is what causes problems in projects.

That's obviously pretty silly. To see why, let's examine a little project I've recently started; conservatively, I estimate that it incorporates well over a million lines of code:

print 'hello, world'

Well, that's one line.

But what's needed to run it? The Python interpreter; the C compiler (to build the Python interpreter); the libraries necessary to run Python and actually make that statement appear on the screen; and the Linux (or Mac O$ X, or Window$) operating system and drivers needed to bind them all together.

There's easily a million lines of C code in there, if not ten million. So have I just started the most bloated, worst project of all time?

Nope. It's not the lines of code that matter. It's the lines of code you need to think about that matter.

When I write Python code, I rarely need to worry about anything other than the code I'm writing. I don't need to think about the std lib all that much, I certainly don't worry about the CPython core code, and I touch on the UNIX kernel very infrequently. Why?

Because all of that other code is nicely encapsulated, behaves in expected ways, and rarely breaks.

And this is why a full-featured language, good libraries, and a reliable OS are all ways to decrease the "brain load" of your software.

I also think it points to a deep truth of software engineering, which is that a good library API is one that you don't need to think about much. A good library should be compact, include the core features, work reliably, and contain functionality orthogonal to your code. All of these things let you focus on the core functionality of your own code and not on anyone else's.
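Here's a small sketch of what I mean, written for a current Python 3 interpreter (so print is a function here, unlike in the one-liner above). The fingerprint function is a name I made up for illustration; hashlib is the standard library module. Behind that one call sits a hashing implementation I never have to read, and the only code I need to think about is the three lines I wrote myself:

```python
import hashlib


def fingerprint(text):
    """Return the SHA-256 hex digest of a string (hypothetical helper)."""
    # All of the hashing work happens inside the library; this code only
    # has to know about one function call and one method on its result.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(fingerprint("hello, world"))
```

The API is compact (one call), reliable, and orthogonal to whatever my own code is doing, which is exactly why it adds almost nothing to the "brain load".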

By almost any measure (excepting that of life itself) our software is unimaginably (and unmanageably) complex already. We manage that complexity as best we can by encapsulating functionality in libraries, APIs, protocols, and "expectations" that are fulfilled, more often than not. And that's the real lesson you should take away from Yegge's post: that writing a 500k Ball of Mud is a bad idea indeed, but more because his design process failed well before he got to 500k LoJC.

--titus

P.S. JavaScript? Really? This reminds me of Phil Greenspun telling me how great Tcl was as a way to develop a large framework -- and that didn't end well...

posted at: 18:03 | path: /dec-07 | 6 comments



Sun, 16 Dec 2007

Beautiful code? Naah, give me *readable* code.


Andrew Binstock hits the nail perfectly on the head with his post, Beautiful Code vs Readable Code.

--titus

posted at: 14:03 | path: /dec-07 | 0 comments



Sat, 29 Sep 2007

Software Licensing


This month the newly minted biology-in-python mailing list erupted into a discussion of licenses. There was some confusion about the goal of the discussion, for which I'm largely responsible: we didn't make it clear that we were talking about licenses for code and content posted on the bio.scipy.org community Web site, so people were worried that we were trying to dictate license choices for all Python/bioinformatics software! Not at all! Anyway, I'm happy with the decision that we've posted, which is to place tutorial/example code under the BSD license, and discussion under Creative Commons/attribution.

A number of really interesting posts came through on this subject: Bruce Southey posted a number of interesting links, including Would Dostoevsky use the GPL? and Maintaining Permissive-Licensed Files in a GPL-Licensed Project. Josh Wilcox posted about a "grace period" hack in which, to quote,

In addition to the terms of the GNU General Public License, this
licence also comes with the added permission that, if you become
obligated to release a derived work under this licence (as per
section 2.b), you may delay the fulfillment of this obligation for
up to 12 months ("grace period").  If you are obligated to release
code under section 2.b of this licence, you are obligated to release
it under these same terms, including the 12-month grace period
clause.

This is an interesting idea but I have no idea if, in this case, companies would care at all: we're talking about tutorial and example code here, not real software.

I also found it extremely interesting to watch the dynamics between the free-as-in-beer and free-as-in-speech people. I'm currently willing to release software under either the GPL or the BSD license -- I relicensed twill from GPL to BSD with the last release, for example -- but I have very little sympathy with the idea that companies should be able to take my code, close it, modify it, and resell it. Nonetheless, I understand that competing ideologies exist and I'm willing to accommodate them as best I can. (Conveniently for my leanings, both of the universities I work for, Caltech and MSU, demand that work-related software be released under the GPL.) Watching people consistently misrepresent their positions -- I assume they did so knowingly -- as "the GPL is free-er than the BSD!" and "the BSD is free-er than the GPL!" -- was very interesting and informative. (I would put it this way: the GPL restricts software use in specific ways, with the ultimate goal of increasing the freedom to use all derivatives of that software.)

Anyway, the list conversation got a bit long, and after I received a number of complaints about how annoying the list was becoming, I ended the discussion by fiat: I declared that we should either stop discussing licenses for 6 months, or we should move ongoing discussion to a new list. Enough people expressed interest that I created a new list, bip-admin, to contain further admin discussion for bio.scipy.org. (I realized later that meta-bip would have been a better name. Alas, renaming mailman lists is not trivial.)

One of the most frustrating things about the license discussion was that we really have no content whatsoever on bio.scipy.org, and here we were discussing how to handle all of this nonexistent content rather than writing some! I am both amused and horrified at the ability of people (including myself!) to talk about procedure and protocol endlessly while failing to actually do useful work. I guess it's the human condition -- heck, Og and Boog probably argued about the proper protocol for deciding whose turn it was to go get more firewood, back when we lived in caves...

--titus

posted at: 13:37 | path: /sep-07 | 2 comments



Thu, 28 Jun 2007

Reject Software Engineering?


Eric Wise asks, and I mostly agree.

--titus

posted at: 08:03 | path: /jun-07 | 0 comments



Fri, 22 Jun 2007

Faculty programming contest?


Does anyone know if there are any faculty programming contests out there?

It'd be fun, and I can't imagine that the competition would be as tough as the student programming contests probably are ;).

--titus

posted at: 16:03 | path: /jun-07 | 8 comments
