| Author: | C Titus Brown |
|---|---|
| Date: | June 18, 2007 |
Welcome! You have stumbled upon the class handouts for a course I taught at Lawrence Livermore National Lab, June 12-June 14, 2007.
These notes are intended to accompany my lecture, which was a demonstration of a variety of "intermediate" Python features and packages. Because the demonstration was interactive, these notes are not complete notes of what went on in the course. (Sorry about that; they have been updated from my actual handouts to be more complete...)
However, all 70 pages are free to view and print, so enjoy.
All errors are, of course, my own. Note that almost all of the examples starting with '>>>' are doctests, so you can take the source and run doctest on it to make sure I'm being honest. But do me a favor and run the doctests with Python 2.5 ;).
Note that Day 1 of the course ran through the end of "Testing Your Software"; Day 2 ran through the end of "Online Resources for Python"; and Day 3 finished it off.
Example code (mostly from the C extension sections) is available here; see the README for more information.
Extracts from The Zen of Python by Tim Peters:
- Beautiful is better than ugly.
- Explicit is better than implicit.
- Simple is better than complex.
- Readability counts.
(The whole Zen is worth reading...)
The first step in programming is getting stuff to work at all.
The next step in programming is getting stuff to work regularly.
The step after that is reusing code and designing for reuse.
Somewhere in there you will start writing idiomatic Python.
Idiomatic Python is what you write when the only thing you're struggling with is the right way to solve your problem, and you're not struggling with the programming language or some weird library error or a nasty data retrieval issue or something else extraneous to your real problem. The idioms you prefer may differ from the idioms I prefer, but with Python there will be a fair amount of overlap, because there is usually at most one obvious way to do every task. (A caveat: "obvious" is unfortunately in the eye of the beholder, to some extent.)
For example, let's consider the right way to keep track of the item number while iterating over a list. So, given a list z,
>>> z = [ 'a', 'b', 'c', 'd' ]
let's try printing out each item along with its index.
You could use a while loop:
>>> i = 0
>>> while i < len(z):
...     print i, z[i]
...     i += 1
0 a
1 b
2 c
3 d
or a for loop:
>>> for i in range(0, len(z)):
...     print i, z[i]
0 a
1 b
2 c
3 d
but I think the clearest option is to use enumerate:
>>> for i, item in enumerate(z):
...     print i, item
0 a
1 b
2 c
3 d
Why is this the clearest option? Well, look at the Zen of Python extract above: it's explicit (we used enumerate); it's simple; it's readable; and I would even argue that it's prettier than the while loop, if not exactly "beautiful".
Python provides this kind of simplicity in as many places as possible, too. Consider file handles; did you know that they were iterable?
>>> for line in file('data/listfile.txt'):
...     print line.rstrip()
a
b
c
d
Where Python really shines is that this kind of simple idiom -- in this case, iterables -- is very very easy not only to use but to construct in your own code. This will make your own code much more reusable, while improving code readability dramatically. And that's the sort of benefit you will get from writing idiomatic Python.
I'm sure you're all familiar with tuples, lists, and dictionaries, right? Let's do a quick tour nonetheless.
'tuples' are all over the place. For example, this code for swapping two numbers implicitly uses tuples:
>>> a = 5
>>> b = 6
>>> a, b = b, a
>>> print a == 6, b == 5
True True
That's about all I have to say about tuples.
I use lists and dictionaries all the time. They're the two greatest inventions of mankind, at least as far as Python goes. With lists, it's just easy to keep track of stuff:
>>> x = []
>>> x.append(5)
>>> x.extend([6, 7, 8])
>>> x
[5, 6, 7, 8]
>>> x.reverse()
>>> x
[8, 7, 6, 5]
It's also easy to sort. Consider this set of data:
>>> y = [ ('IBM', 5), ('Zil', 3), ('DEC', 18) ]
The sort method will run cmp on pairs of tuples. Tuples compare element by element, so this sorts on the first element of each tuple (later elements only matter when the earlier ones are equal):
>>> y.sort()
>>> y
[('DEC', 18), ('IBM', 5), ('Zil', 3)]
Often it's handy to sort tuples on a different tuple element, and there are several ways to do that. I prefer to provide my own comparison function:
>>> def sort_on_second(a, b):
...     return cmp(a[1], b[1])
>>> y.sort(sort_on_second)
>>> y
[('Zil', 3), ('IBM', 5), ('DEC', 18)]
Note that here I'm using the builtin cmp function (which is what sort uses by default: y.sort() is equivalent to y.sort(cmp)) to compare the second elements of the tuples.
This kind of function is really handy for sorting dictionaries by value, as I'll show you below.
(For a more in-depth discussion of sorting options, check out the Sorting HowTo.)
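One of the other ways to sort on a different tuple element is the key argument to sort (available since Python 2.4), which avoids writing a comparison function at all. Here is a minimal sketch using operator.itemgetter from the standard library; it is written so that it also runs under Python 3, where cmp-style sorting is no longer available, and the name by_name is only illustrative:

```python
from operator import itemgetter

y = [('IBM', 5), ('Zil', 3), ('DEC', 18)]

# key= is called once per item; itemgetter(1) pulls out the second element.
y.sort(key=itemgetter(1))
print(y)

# sorted() returns a new list and leaves the original untouched.
by_name = sorted(y, key=itemgetter(0))
print(by_name)
```

Whether you prefer a comparison function or a key function is largely a matter of taste; the key form tends to be shorter when you are just picking out one field.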
On to dictionaries!
Your basic dictionary is just a hash table that takes keys and returns values:
>>> d = {}
>>> d['a'] = 5
>>> d['b'] = 4
>>> d['c'] = 18
>>> d
{'a': 5, 'c': 18, 'b': 4}
>>> d['a']
5
You can also initialize a dictionary using the dict type to create a dict object:
>>> e = dict(a=5, b=4, c=18)
>>> e
{'a': 5, 'c': 18, 'b': 4}
Dictionaries have a few really neat features that I use pretty frequently. For example, let's collect (key, value) pairs where we potentially have multiple values for each key. That is, given a file containing this data,
a 5
b 6
d 7
a 2
c 1
suppose we want to keep all the values? If we just did it the simple way,
>>> d = {}
>>> for line in file('data/keyvalue.txt'):
...     key, value = line.split()
...     d[key] = int(value)
we would lose all but the last value for each key:
>>> d
{'a': 2, 'c': 1, 'b': 6, 'd': 7}
You can collect all the values by using get:
>>> d = {}
>>> for line in file('data/keyvalue.txt'):
...     key, value = line.split()
...     l = d.get(key, [])
...     l.append(int(value))
...     d[key] = l
>>> d
{'a': [5, 2], 'c': [1], 'b': [6], 'd': [7]}
The key point here is that d.get(k, default) is equivalent to d[k] if d[k] already exists; otherwise, it returns default. So, the first time each key is used, l is set to an empty list; the value is appended to this list, and then the value is set for that key.
(There are tons of little tricks like the ones above, but these are the ones I use the most; see the Python Cookbook for an endless supply!)
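Two closely related tricks for the same "collect all the values" problem are dict.setdefault and collections.defaultdict (the latter was added in Python 2.5). The sketch below assumes the file contents shown earlier and uses an in-memory list of lines as a stand-in for data/keyvalue.txt, so that it is self-contained:

```python
from collections import defaultdict

# Stand-in for the lines of data/keyvalue.txt (assumed contents).
lines = ['a 5', 'b 6', 'd 7', 'a 2', 'c 1']

# setdefault returns d[key] if it exists; otherwise it stores the
# default under that key and returns it.
d = {}
for line in lines:
    key, value = line.split()
    d.setdefault(key, []).append(int(value))

# defaultdict(list) creates the empty list automatically on first access.
dd = defaultdict(list)
for line in lines:
    key, value = line.split()
    dd[key].append(int(value))

print(d == dict(dd))
```

All three versions build the same dictionary; get makes the steps most visible, while setdefault and defaultdict trade a little explicitness for brevity.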
Now let's try combining some of the sorting stuff above with dictionaries. This time, our contrived problem is that we'd like to sort the keys in the dictionary d that we just loaded, but rather than sorting by key we want to sort by the sum of the values for each key.
First, let's define a sort function:
>>> def sort_by_sum_value(a, b):
...     sum_a = sum(a[1])
...     sum_b = sum(b[1])
...     return cmp(sum_a, sum_b)
Now apply it to the dictionary items:
>>> items = d.items()
>>> items
[('a', [5, 2]), ('c', [1]), ('b', [6]), ('d', [7])]
>>> items.sort(sort_by_sum_value)
>>> items
[('c', [1]), ('b', [6]), ('a', [5, 2]), ('d', [7])]
and voila, you have your list of keys sorted by summed values!
As I said, there are tons and tons of cute little tricks that you can do with dictionaries. I think they're incredibly powerful.
List comprehensions are neat little constructs that will shorten your lines of code considerably. Here's an example that constructs a list of squares between 0 and 4:
>>> z = [ i**2 for i in range(0, 5) ]
>>> z
[0, 1, 4, 9, 16]
You can also add in conditionals, like requiring only even numbers:
>>> z = [ i**2 for i in range(0, 10) if i % 2 == 0 ]
>>> z
[0, 4, 16, 36, 64]
The general form is
[ expression for var in list if conditional ]
so pretty much anything you want can go in expression and conditional.
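To make the general form concrete, here is a small sketch that writes the even-squares comprehension from above next to the equivalent explicit loop. The names squares and squares_loop are just for illustration:

```python
# List comprehension: squares of the even numbers below 10.
squares = [i**2 for i in range(0, 10) if i % 2 == 0]

# The same computation as an explicit loop; the comments map each
# line back to the pieces of the general form.
squares_loop = []
for i in range(0, 10):               # for var in list
    if i % 2 == 0:                   # if conditional
        squares_loop.append(i**2)    # expression

print(squares == squares_loop)
```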
I find list comprehensions to be very useful for both file parsing and for simple math. Consider a file containing data and comments:
# this is a comment or a header
1
# another comment
2
where you want to read in the numbers only:
>>> data = [ int(x) for x in open('data/commented-data.txt') if x[0] != '#' ]
>>> data
[1, 2]
This is short, simple, and very explicit!
For simple math, suppose you need to calculate the average and stddev of some numbers. Just use a list comprehension:
>>> import math
>>> data = [ 1, 2, 3, 4, 5 ]
>>> average = sum(data) / float(len(data))
>>> stddev = sum([ (x - average)**2 for x in data ]) / float(len(data))
>>> stddev = math.sqrt(stddev)
>>> print average, '+/-', stddev
3.0 +/- 1.41421356237
Oh, and one rule of thumb: if your list comprehension is longer than one line, change it to a for loop; it will be easier to read, and easier to understand.
Most people should be pretty familiar with basic classes.
>>> class A:
...     def __init__(self, item):
...         self.item = item
...     def hello(self):
...         print 'hello,', self.item
>>> x = A('world')
>>> x.hello()
hello, world
There are a bunch of neat things you can do with classes, but one of the neatest is building new types that can be used with standard Python list/dictionary idioms.
For example, let's consider a basic binning class.
>>> class Binner:
...     def __init__(self, binwidth, binmax):
...         self.binwidth, self.binmax = binwidth, binmax
...         nbins = int(binmax / float(binwidth) + 1)
...         self.bins = [0] * nbins
...
...     def add(self, value):
...         bin = value / self.binwidth
...         self.bins[bin] += 1
This behaves as you'd expect:
>>> binner = Binner(5, 20)
>>> for i in range(0,20):
...     binner.add(i)
>>> binner.bins
[5, 5, 5, 5, 0]
...but wouldn't it be nice to be able to write this?
for i in range(0, len(binner)):
    print i, binner[i]
or even this?
for i, bin in enumerate(binner):
    print i, bin
This is actually quite easy, if you make the Binner class look like a list by adding two special functions:
>>> class Binner:
...     def __init__(self, binwidth, binmax):
...         self.binwidth, self.binmax = binwidth, binmax
...         nbins = int(binmax / float(binwidth) + 1)
...         self.bins = [0] * nbins
...
...     def add(self, value):
...         bin = value / self.binwidth
...         self.bins[bin] += 1
...
...     def __getitem__(self, index):
...         return self.bins[index]
...
...     def __len__(self):
...         return len(self.bins)
>>> binner = Binner(5, 20)
>>> for i in range(0,20):
...     binner.add(i)
and now we can treat Binner objects as normal lists:
>>> for i in range(0, len(binner)):
...     print i, binner[i]
0 5
1 5
2 5
3 5
4 0
>>> for n in binner:
...     print n
5
5
5
5
0
In the case of len(binner), Python knows to use the special method __len__, and likewise binner[i] just calls __getitem__(i).
The second case involves a bit more implicit magic. Here, Python figures out that Binner can act like a list and simply calls the right functions to retrieve the information.
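The "implicit magic" is the old sequence protocol: when an object has __getitem__ but no __iter__, a for loop calls __getitem__ with 0, 1, 2, ... and stops at the first IndexError. The sketch below makes that visible with a hypothetical class, Loud, that records every index it is asked for (the class and its calls attribute are inventions for this illustration, not part of the Binner example):

```python
class Loud:
    """Hypothetical sequence-like class that logs each __getitem__ call."""

    def __init__(self, items):
        self.items = items
        self.calls = []

    def __getitem__(self, index):
        self.calls.append(index)
        return self.items[index]    # raises IndexError past the end


loud = Loud(['x', 'y'])
result = [item for item in loud]

print(result)
print(loud.calls)   # one more call than there are items
```

The extra call at the end is the one that raised IndexError, which is how the loop learned that the sequence was exhausted.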
Note that making your own read-only dictionaries is pretty simple, too: just provide the __getitem__ function, which is called for non-integer values as well:
>>> class SillyDict:
...     def __getitem__(self, key):
...         print 'key is', key
...         return key
>>> sd = SillyDict()
>>> x = sd['hello, world']
key is hello, world
>>> x
'hello, world'
You can also write your own mutable types, e.g.
>>> class SillyDict:
...     def __setitem__(self, key, value):
...         print 'setting', key, 'to', value
>>> sd = SillyDict()
>>> sd[5] = 'world'
setting 5 to world
but I have found this to be less useful in my own code, where I'm usually writing special objects like the Binner type above: I prefer to specify my own methods for putting information into the object type, because it reminds me that it is not a generic Python list or dictionary. However, the use of __getitem__ (and some of the iterator and generator features I discuss below) can make code much more readable, and so I use them whenever I think the meaning will be unambiguous. For example, with the Binner type, the purpose of __getitem__ and __len__ is not very ambiguous, while the purpose of a __setitem__ function (to support binner[x] = y) would be unclear.
Overall, the creation of your own custom list and dict types is one way to make reusable code that will fit nicely into Python's natural idioms. In turn, this can make your code look much simpler and feel much cleaner. The risk, of course, is that you will also make your code harder to understand and (if you're not careful) harder to debug. Mediating between these options is mostly a matter of experience.
Iterators are another built-in Python feature; unlike the list and dict types we discussed above, an iterator isn't really a type, but a protocol. This just means that Python agrees to respect anything that supports a particular set of methods as if it were an iterator. (These protocols appear everywhere in Python; we were taking advantage of the mapping and sequence protocols above, when we defined __getitem__ and __len__, respectively.)
Iterators are more general versions of the sequence protocol; here's an example:
>>> class SillyIter:
...     i = 0
...     n = 5
...     def __iter__(self):
...         return self
...     def next(self):
...         self.i += 1
...         if self.i > self.n:
...             raise StopIteration
...         return self.i
>>> si = SillyIter()
>>> for i in si:
...     print i
1
2
3
4
5
Here, __iter__ just returns self, an object that has the function next(), which (when called) either returns a value or raises a StopIteration exception.
We've actually already met several iterators in disguise; in particular, enumerate is an iterator. To drive home the point, here's a simple reimplementation of enumerate:
>>> class my_enumerate:
...     def __init__(self, some_iter):
...         self.some_iter = iter(some_iter)
...         self.count = -1
...
...     def __iter__(self):
...         return self
...
...     def next(self):
...         val = self.some_iter.next()
...         self.count += 1
...         return self.count, val
>>> for n, val in my_enumerate(['a', 'b', 'c']):
...     print n, val
0 a
1 b
2 c
You can also iterate through an iterator the "old-fashioned" way:
>>> some_iter = iter(['a', 'b', 'c'])
>>> while 1:
...     try:
...         print some_iter.next()
...     except StopIteration:
...         break
a
b
c
but that would be silly in most situations! I use this if I just want to get the first value or two from an iterator.
With iterators, one thing to watch out for is the return of self from the __iter__ function. You can all too easily write an iterator that isn't as re-usable as you think it is. For example, suppose you had the following class:
>>> class MyTrickyIter:
...     def __init__(self, thelist):
...         self.thelist = thelist
...         self.index = -1
...
...     def __iter__(self):
...         return self
...
...     def next(self):
...         self.index += 1
...         if self.index < len(self.thelist):
...             return self.thelist[self.index]
...         raise StopIteration
This works just like you'd expect as long as you create a new object each time:
>>> for i in MyTrickyIter(['a', 'b']):
...     for j in MyTrickyIter(['a', 'b']):
...         print i, j
a a
a b
b a
b b
but it will break if you create the object just once:
>>> mi = MyTrickyIter(['a', 'b'])
>>> for i in mi:
...     for j in mi:
...         print i, j
a b
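because both loops share the same object, and therefore the same self.index: the inner loop uses up the remaining items, so the outer loop has nothing left after its first pass.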
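One common way around this, sketched here under the assumption that the underlying data is an ordinary list, is to have __iter__ hand back a fresh iterator on every call rather than returning self. The class name MyReusableList is hypothetical:

```python
class MyReusableList:
    """Hypothetical variant of MyTrickyIter that can be looped over repeatedly."""

    def __init__(self, thelist):
        self.thelist = thelist

    def __iter__(self):
        # Each call returns a new iterator with its own position,
        # so nested loops over the same object do not interfere.
        return iter(self.thelist)


mi = MyReusableList(['a', 'b'])
pairs = [(i, j) for i in mi for j in mi]
print(pairs)
```

The object itself is now an iterable container rather than an iterator, which is usually what you want for anything that will be looped over more than once.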
Generators are a Python implementation of coroutines. Essentially, they're functions that let you suspend execution and return a result:
>>> def g():
...     for i in range(0, 5):
...         yield i**2
>>> for i in g():
...     print i
0
1
4
9
16
You could do this with a list just as easily, of course:
>>> def h():
...     return [ x ** 2 for x in range(0, 5) ]
>>> for i in h():
...     print i
0
1
4
9
16
But you can do things with generators that you couldn't do with finite lists. Consider two full implementations of Eratosthenes' Sieve for finding prime numbers, below.
First, let's define some boilerplate code that can be used by either implementation:
>>> def divides(primes, n):
...     for trial in primes:
...         if n % trial == 0: return True
...     return False
Now, let's write a simple sieve with a generator:
>>> def prime_sieve():
...     p, current = [], 1
...     while 1:
...         current += 1
...         if not divides(p, current): # if any previous primes divide, cancel
...             p.append(current)       # this is prime! save & return
...             yield current
This implementation will find (within the limitations of Python's math functions) all prime numbers; the programmer has to stop it herself:
>>> for i in prime_sieve():
...     print i
...     if i > 10:
...         break
2
3
5
7
11
So, here we're using a generator to implement the generation of an infinite series with a single function definition. To do the equivalent with an iterator would require a class, so that the object instance can hold the variables:
>>> class iterator_sieve:
...     def __init__(self):
...         self.p, self.current = [], 1
...     def __iter__(self):
...         return self
...     def next(self):
...         while 1:
...             self.current = self.current + 1
...             if not divides(self.p, self.current):
...                 self.p.append(self.current)
...                 return self.current
>>> for i in iterator_sieve():
...     print i
...     if i > 10:
...         break
2
3
5
7
11
It is also much easier to write routines like enumerate as a generator than as an iterator:
>>> def gen_enumerate(some_iter):
...     count = 0
...     for val in some_iter:
...         yield count, val
...         count += 1
>>> for n, val in gen_enumerate(['a', 'b', 'c']):
...     print n, val
0 a
1 b
2 c
Abstruse note: we don't even have to catch StopIteration here, because the for loop simply ends when some_iter is done!
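Because a generator like prime_sieve never finishes on its own, the caller decides how much of it to consume. Besides breaking out of a for loop, the standard library's itertools.islice can take a fixed number of items. The sketch below repeats the two functions so that it is self-contained; first_five is just an illustrative name:

```python
from itertools import islice


def divides(primes, n):
    for trial in primes:
        if n % trial == 0:
            return True
    return False


def prime_sieve():
    p, current = [], 1
    while True:
        current += 1
        if not divides(p, current):
            p.append(current)
            yield current


# islice pulls only as many values as requested; the generator is
# suspended (not run to completion) after the fifth prime.
first_five = list(islice(prime_sieve(), 5))
print(first_five)
```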
One of the most underused keywords in Python is assert. Assert is pretty simple: it takes a boolean, and if the boolean evaluates to False, it fails (by raising an AssertionError exception). assert True is a no-op.
>>> assert True
>>> assert False
Traceback (most recent call last):
   ...
AssertionError
You can also put an optional message in:
>>> assert False, "you can't do that here!"
Traceback (most recent call last):
   ...
AssertionError: you can't do that here!
assert is very, very useful for making sure that code is behaving according to your expectations during development. Worried that you're getting an empty list? assert len(x). Want to make sure that a particular return value is not None? assert retval is not None.
Also note that 'assert' statements are removed from optimized code (that is, when Python is run with the -O flag), so only use them for conditions related to actual development, and make sure that the statement you're evaluating has no side effects. For example,
>>> a = 1
>>> def check_something():
...     global a
...     a = 5
...     return True
>>> assert check_something()
will behave differently when run under optimization than when run without optimization, because the assert line will be removed completely from optimized code.
If you need to raise an exception in production code, see below. The quickest and dirtiest way is to just "raise Exception", but that's kind of non-specific ;).
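As a rough illustration of the difference, here is a sketch of a check that should survive optimization, written with a specific built-in exception instead of assert. The function average and its error message are hypothetical, chosen only for this example:

```python
def average(values):
    """Hypothetical helper: validate input with an exception, not an assert."""
    if len(values) == 0:
        # Unlike an assert, this check still runs under "python -O".
        raise ValueError("average() needs at least one value")
    return sum(values) / float(len(values))


try:
    average([])
except ValueError as exc:
    message = str(exc)

print(message)
print(average([1, 2, 3]))
```

Raising ValueError rather than a bare Exception lets callers catch exactly the failure they know how to handle.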
Use of common Python idioms -- both in your Python code and for your new types -- leads to short, sweet programs.
Python is really the first programming language in which I started re-using code significantly. In part, this is because it is rather easy to compartmentalize functions and classes in Python. Something else that Python makes relatively easy is building testing into your program structure. Combined, reusability and testing can have a huge effect on maintenance.
It's difficult to come up with any hard and fast rules for programming for reusability, but my main rules of thumb are: don't plan too much, and don't hesitate to refactor your code [1].
In any project, you will write code that you want to re-use in a slightly different context. It will often be easiest to cut and paste this code rather than to copy the module it's in -- but try to resist this temptation a bit, and see if you can make the code work for both uses, and then use it in both places.
| [1] | If you haven't read Martin Fowler's Refactoring, do so -- it describes how to incrementally make your code better. I'll discuss it some more in the context of testing, below. |
The organization of your code source files can help or hurt you with code re-use.
Most people start their Python programming out by putting everything in a script:
calc-squares.py:
#! /usr/bin/env python
for i in range(0, 10):
    print i**2
This is great for experimenting, but you can't re-use this code at all!
(UNIX folk: note the use of #! /usr/bin/env python, which tells UNIX to execute this script using whatever python program is first in your path. This is more portable than putting #! /usr/local/bin/python or #! /usr/bin/python in your code, because not everyone puts python in the same place.)
Back to reuse. What about this?
calc-squares.py:
#! /usr/bin/env python
def squares(start, stop):
    for i in range(start, stop):
        print i**2

squares(0, 10)
I think that's a bit better for re-use -- you've made squares flexible and re-usable -- but there are two mechanistic problems. First, it's named calc-squares.py, which means it can't readily be imported. (Import filenames have to be valid Python names, of course!) And, second, were it importable, it would execute squares(0, 10) on import - hardly what you want!
To fix the first, just change the name:
calc_squares.py:
#! /usr/bin/env python
def squares(start, stop):
    for i in range(start, stop):
        print i**2

squares(0, 10)
Good, but now if you do import calc_squares, the squares(0, 10) code will still get run! There are a couple of ways to deal with this. The first is to look at the module name: if it's calc_squares, then the module is being imported, while if it's __main__, then the module is being run as a script:
calc_squares.py:
#! /usr/bin/env python
def squares(start, stop):
    for i in range(start, stop):
        print i**2

if __name__ == '__main__':
    squares(0, 10)
Now, if you run calc_squares.py directly, it will run squares(0, 10); if you import it, it will simply define the squares function and leave it at that. This is probably the most standard way of doing it.
I actually prefer a different technique, because of my fondness for testing. (I also think this technique lends itself to reusability, though.) I would actually write two files:
squares.py:
def squares(start, stop):
    for i in range(start, stop):
        print i**2

if __name__ == '__main__':
    # ...run automated tests...
    pass
calc-squares:
#! /usr/bin/env python
import squares
squares.squares(0, 10)
A few notes -- first, this is eminently reusable code, because squares.py is completely separate from the context-specific call. Second, you can look at the directory listing in an instant and see that squares.py is probably a library, while calc-squares must be a script, because the latter cannot be imported. Third, you can add automated tests to squares.py (as described below), and run them simply by running python squares.py. Fourth, you can add script-specific code such as command-line argument handling to the script, and keep it separate from your data handling and algorithm code.
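To give a feel for what "run automated tests" might look like in practice, here is one possible sketch of squares.py using the doctest module. Note the assumption: this variant of squares returns a list instead of printing, because a return value is easier to test; that change is mine, not part of the example above.

```python
def squares(start, stop):
    """Return the squares of the integers from start up to (not including) stop.

    >>> squares(0, 4)
    [0, 1, 4, 9]
    """
    return [i**2 for i in range(start, stop)]


if __name__ == '__main__':
    # Runs only when the file is executed directly, not on import.
    import doctest
    doctest.testmod()
```

With this layout, python squares.py checks the examples in the docstrings, while import squares simply defines the function.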
A Python package is a directory full of Python modules containing a special file, __init__.py, that tells Python that the directory is a package. Packages are for collections of library code that are too big to fit into single files, or that have some logical substructure (e.g. a central library along with various utility functions that all interact with the central library).
For an example, look at this directory tree:
package/
    __init__.py     -- contains functions a(), b()
    other.py        -- contains function c()
    subdir/
        __init__.py -- contains function d()
From this directory tree, you would be able to access the functions like so:
import package
package.a()
package.b()

import package.other
package.other.c()

import package.subdir
package.subdir.d()
Note that __init__.py is just another Python file; there's nothing special about it except for the name, which tells Python that the directory is a package directory. __init__.py is the only code executed on import, so if you want names and symbols from other modules to be accessible at the package top level, you have to import or create them in __init__.py.
There are two ways to use packages: you can treat them as a convenient code organization technique, and make most of the functions or classes available at the top level; or you can use them as a library hierarchy. In the first case you would make all of the names above available at the top level:
package/__init__.py:
from other import c
from subdir import d
...
which would let you do this:
import package
package.a()
package.b()
package.c()
package.d()
That is, the names of the functions would all be immediately available at the top level of the package, but the implementations would be spread out among the different files and directories. I personally prefer this because I don't have to remember as much ;). The down side is that everything gets imported all at once, which (especially for large bodies of code) may be slow and memory intensive if you only need a few of the functions.
Alternatively, if you wanted to keep the library hierarchy, just leave out the top-level imports. The advantage here is that you only import the names you need; however, you need to remember more.
Some people are fond of package trees, but I've found that hierarchies of packages more than two deep are annoying to develop on: you spend a lot of your time browsing around between directories, trying to figure out exactly which function you need to use and what it's named. (Your mileage may vary.) I think this is one of the main reasons why the Python stdlib looks so big, because most of the packages are top-level.
One final note: you can restrict what objects are exported from a module or package by listing the names in the __all__ variable. So, if you had a module some_mod.py that contained this code:
some_mod.py:
__all__ = ['fn1']
def fn1(...):
    ...
def fn2(...):
    ...
then only fn1 would be imported by 'from some_mod import *'; fn2 is still reachable as some_mod.fn2 after a plain 'import some_mod'. This is a good way to cut down on "namespace pollution" -- the presence of "private" objects and code in imported modules -- which in turn makes introspection more useful.
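The effect of __all__ shows up with the star form of import. The sketch below demonstrates this without touching the filesystem by building a throwaway module in memory; the module name some_mod_demo and the use of types.ModuleType and exec are devices for this illustration only, and in real code you would simply have a some_mod.py file:

```python
import sys
import types

source = '''
__all__ = ['fn1']

def fn1():
    return 'fn1'

def fn2():
    return 'fn2'
'''

# Build an in-memory module and register it so that import can find it.
mod = types.ModuleType('some_mod_demo')
exec(source, mod.__dict__)
sys.modules['some_mod_demo'] = mod

# Run a star-import in a scratch namespace and see which names arrive.
namespace = {}
exec('from some_mod_demo import *', namespace)
star_names = sorted(name for name in namespace if not name.startswith('__'))

print(star_names)
print(mod.fn2())    # still reachable through the module object
```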
You may have noticed that a lot of Python code looks pretty similar -- this is because there's an "official" style guide for Python, called PEP 8. It's worth a quick skim, and an occasional deeper read for some sections.
Here are a few tips that will make your code look internally consistent, if you don't already have a coding style of your own:
- use four spaces (NOT a tab) for each indentation level;
- use lowercase, _-separated names for module and function names, e.g. my_module;
- use CapWords style to name classes, e.g. MySpecialClass;
- use '_'-prefixed names to indicate a "private" variable that should not be used outside this module, e.g. _some_private_variable.
Docstrings are strings of text attached to Python objects like modules, classes, and methods/functions. They can be used to provide human-readable help when building a library of code. "Good" docstring coding is used to provide additional information about functionality beyond what can be discovered automatically by introspection; compare
def is_prime(x):
    """
    is_prime(x) -> true/false. Determines whether or not x is prime,
    and return true or false.
    """
versus
def is_prime(x):
    """
    Returns true if x is prime, false otherwise.

    is_prime() uses the Bernoulli-Schmidt formalism for figuring out
    if x is prime. Because the BS form is stochastic and hysteretic,
    multiple calls to this function will be increasingly accurate.
    """
The top example is good (documentation is good!), but the bottom example is better, for a few reasons. First, it is not redundant: the arguments to is_prime are discoverable by introspection and don't need to be specified. Second, it's summarizable: the first line stands on its own, and people who are interested in more detail can read on. This enables certain document extraction tools to do a better job.
For more on docstrings, see PEP 257.
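Docstrings are ordinary attributes, so tools (and you) can get at them through __doc__ or the built-in help function. The sketch below shows how a tool might pull out just the summary line; the trial-division body of is_prime is a simple stand-in chosen for this example, not the tongue-in-cheek algorithm described above:

```python
def is_prime(x):
    """Return True if x is prime, False otherwise.

    Uses simple trial division, which is fine for small x.
    """
    if x < 2:
        return False
    for trial in range(2, int(x ** 0.5) + 1):
        if x % trial == 0:
            return False
    return True


# The first line of the docstring stands on its own as a summary.
summary = is_prime.__doc__.strip().splitlines()[0]
print(summary)
```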
There are three levels at which data can be shared between Python code: module globals, class attributes, and object attributes. You can also sneak data into functions by dynamically defining a function within another scope, and/or binding them to keyword arguments.
Just to make sure we're clear on scoping, here are a few simple examples. In this first example, f() gets x from the module namespace.
>>> x = 1
>>> def f():
...     print x
>>> f()
1
In this second example, f() overrides x, but only within the namespace in f().
>>> x = 1
>>> def f():
...     x = 2
...     print x
>>> f()
2
>>> print x
1
In this third example, outer() overrides x, and inner() obtains x from within outer(), because inner() was defined within outer():
>>> x = 1
>>> def outer():
...     x = 2
...
...     def inner():
...         print x
...
...     return inner
>>> inner = outer()
>>> inner()
2
In all cases, without a global declaration, assignments will simply create a new local variable of that name, and not modify the value in any other scope:
>>> x = 1
>>> def outer():
...     x = 2
...
...     def inner():
...         x = 3
...
...     inner()
...
...     print x
>>> outer()
2
However, with a global declaration, the outermost (module) scope is used:
>>> x = 1
>>> def outer():
...     x = 2
...
...     def inner():
...         global x
...         x = 3
...
...     inner()
...
...     print x
>>> outer()
2
>>> print x
3
I generally suggest avoiding scope trickery as much as possible, in the interests of readability. There are two common patterns that I use when I have to deal with scope issues.
First, module globals are sometimes necessary. For one such case, imagine that you have a centralized resource that you must initialize precisely once, and you have a number of functions that depend on that resource. Then you can use a module global to keep track of the initialization state. Here's a (contrived!) example for a random number generator that initializes the random number seed precisely once:
import random

_initialized = False

def init():
    global _initialized
    if not _initialized:
        import time
        random.seed(time.time())
        _initialized = True

def randint(start, stop):
    init()
    ...
This code ensures that the random number seed is initialized only once by making use of the _initialized module global. A few points, however:
- this code is not threadsafe. If it was really important that the resource be initialized precisely once, you'd need to use thread locking. Otherwise two functions could call randint() at the same time and both could get past the if statement.
- the module global code is very isolated and its use is very clear. Generally I recommend having only one or two functions that access the module global, so that if I need to change its use I don't have to understand a lot of code.
The other "scope trickery" that I sometimes engage in is passing data into dynamically generated functions. Consider a situation where you have to use a callback API: that is, someone has given you a library function that will call your own code in certain situations. For our example, let's look at the re.sub function that comes with Python, which takes a callback function to apply to each match.
Here's a callback function that uppercases words:
>>> def replace(m):
...     match = m.group()
...     print 'replace is processing:', match
...     return match.upper()
>>> s = "some string"
>>> import re
>>> print re.sub('\\S+', replace, s)
replace is processing: some
replace is processing: string
SOME STRING
What's happening here is that the replace function is called each time the regular expression '\S+' (a set of non-whitespace characters) is matched. The matching substring is replaced by whatever the function returns.
Now let's imagine a situation where we want to pass information into replace; for example, we want to process only words that match in a dictionary. (I told you it was contrived!) We could simply rely on scoping:
>>> d = { 'some' : True, 'string' : False }
>>> def replace(m):
...     match = m.group()
...     if match in d and d[match]:
...         return match.upper()
...     return match
>>> print re.sub('\\S+', replace, s)
SOME string
but I would argue against it on the grounds of readability: passing information implicitly between scopes is bad. (At this point advanced Pythoneers might sneer at me, because scoping is natural to Python, but nuts to them: readability and transparency are also very important.) You could also do it this way:
>>> d = { 'some' : True, 'string' : False }
>>> def replace(m, replace_dict=d): # <-- explicit declaration
...     match = m.group()
...     if match in replace_dict and replace_dict[match]:
...         return match.upper()
...     return match
>>> print re.sub('\\S+', replace, s)
SOME string
The idea is to use keyword arguments on the function to pass in required information, thus making the information passing explicit.
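Another way one might make the information passing explicit is to generate the callback dynamically, handing the dictionary to a small factory function. This is only a sketch: `make_replacer` is a name invented for this example, not something from the re module.

```python
import re

def make_replacer(replace_dict):
    """Build a re.sub callback that uppercases only the words flagged True."""
    def replace(m):
        match = m.group()
        if replace_dict.get(match):
            return match.upper()
        return match
    return replace

d = {'some': True, 'string': False}
result = re.sub(r'\S+', make_replacer(d), "some string")
print(result)
```

The dictionary still reaches `replace` through scoping, but the dependency is now declared in exactly one visible place: the argument to the factory.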
I started discussing scope in the context of sharing data, but we got a bit sidetracked from data sharing. Let's get back to that now.
The key to thinking about data sharing in the context of code reuse is to think about how that data will be used.
If you use a module global, then any code in that module has access to that global.
If you use a class attribute, then any object of that class type (including inherited classes) shares that data.
And, if you use an object attribute, then every object of that class type will have its own version of that data.
How do you choose which one to use? My ground rule is to minimize the use of more widely shared data. If it's possible to use an object variable, do so; otherwise, use either a module or class attribute. (In practice I almost never use class attributes, and infrequently use module globals.)
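The three levels of sharing can be seen side by side in a small sketch. The class and variable names here are invented purely for illustration.

```python
shared_by_module = []            # module global: visible to all code in this module

class Counter(object):
    shared_by_class = []         # class attribute: one list shared by every instance

    def __init__(self):
        self.own = []            # object attribute: a fresh list per instance

    def record(self, value):
        shared_by_module.append(value)
        self.shared_by_class.append(value)
        self.own.append(value)

a = Counter()
b = Counter()
a.record('from a')
b.record('from b')
```

After this runs, the module global and the class attribute each hold both values, while `a.own` and `b.own` each hold only their own.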
Something that has been implicit in the discussion of scope and data sharing, above, is the order in which module code is executed. There shouldn't be any surprises here if you've been using Python for a while, so I'll be brief: in general, the code at the top level of a module is executed at first import, and all other code is executed in the order you specify when you start calling functions or methods.
Note that because the top level of a module is executed precisely once, at first import, the following code prints "hello, world" only once:
mod_a.py:
def f():
    print 'hello, world'

f()
mod_b.py:
import mod_a
The reload function will reload the module and force re-execution at the top level:
import sys
reload(sys.modules['mod_a'])
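If you want to convince yourself of the execute-once behavior, something like the following sketch should do it. It is written for a current Python 3 interpreter, where reload lives in importlib rather than being a builtin; the module name `demo_once` and the log file are throwaway names made up for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Write a throwaway module whose top level appends a line to a log file,
# so that each execution of the top-level code leaves a visible trace.
tmpdir = tempfile.mkdtemp()
logfile = os.path.join(tmpdir, 'runs.log')
with open(os.path.join(tmpdir, 'demo_once.py'), 'w') as fp:
    fp.write("import os\n"
             "_log = os.path.join(os.path.dirname(__file__), 'runs.log')\n"
             "with open(_log, 'a') as fp:\n"
             "    fp.write('executed\\n')\n")

def count_runs():
    """Number of times the module's top-level code has executed so far."""
    with open(logfile) as fp:
        return len(fp.readlines())

sys.path.insert(0, tmpdir)
importlib.invalidate_caches()

mod = importlib.import_module('demo_once')   # first import: top level runs
after_first = count_runs()

importlib.import_module('demo_once')         # second import: served from sys.modules
after_second = count_runs()

importlib.reload(mod)                        # reload: top level runs again
after_reload = count_runs()
```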
It is also worth noting that the module name is bound to the local namespace prior to the execution of the code in the module, so not all symbols in the module are immediately available. This really only impacts you if you have interdependencies between modules: for example, this will work if mod_a is imported before mod_b:
mod_a.py:
import mod_b
mod_b.py:
import mod_a
while this will not:
mod_a.py:
import mod_b
x = 5
mod_b.py:
import mod_a
y = mod_a.x
To see why, let's put in some print statements:
mod_a.py:
print 'at top of mod_a'
import mod_b
print 'mod_a: defining x'
x = 5
mod_b.py:
print 'at top of mod_b'
import mod_a
print 'mod_b: defining y'
y = mod_a.x
Now try import mod_a and import mod_b, each time in a new interpreter:
>> import mod_a
at top of mod_a
at top of mod_b
mod_b: defining y
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "mod_a.py", line 2, in <module>
import mod_b
File "mod_b.py", line 4, in <module>
y = mod_a.x
AttributeError: 'module' object has no attribute 'x'
>> import mod_b
at top of mod_b
at top of mod_a
mod_a: defining x
mod_b: defining y
So, you've got your re-usable code nicely defined in modules, and now you want to ... use it. How can you import code from multiple locations?
The simplest way is to set the PYTHONPATH environment variable to contain a list of directories from which you want to import code; e.g. in UNIX bash,
% export PYTHONPATH=/path/to/directory/one:/path/to/directory/two
or in csh,
% setenv PYTHONPATH /path/to/directory/one:/path/to/directory/two
Under Windows,
> set PYTHONPATH=directory1;directory2
should work.
However, setting the PYTHONPATH explicitly can make your code less movable in practice, because you will forget (and fail to document) the modules and packages that your code depends on. I prefer to modify sys.path directly:
import sys
sys.path.insert(0, '/path/to/directory/one')
sys.path.insert(0, '/path/to/directory/two')
which has the advantage that you are explicitly specifying the location of packages that you depend upon in the dependent code.
Note also that you can put modules and packages in zip files and Python will be able to import directly from the zip file; just place the path to the zip file in either sys.path or your PYTHONPATH.
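Here is a small sketch of the zip file trick, written for a current Python 3 interpreter. The archive name `bundle.zip` and the module `zipped_demo` are invented for the example.

```python
import importlib
import os
import sys
import tempfile
import zipfile

# Build a zip archive containing one small module.
tmpdir = tempfile.mkdtemp()
archive = os.path.join(tmpdir, 'bundle.zip')
with zipfile.ZipFile(archive, 'w') as zf:
    zf.writestr('zipped_demo.py', "GREETING = 'hello from the zip file'\n")

# Put the archive itself (not a directory) on sys.path, then import as usual.
sys.path.insert(0, archive)
importlib.invalidate_caches()
zipped_demo = importlib.import_module('zipped_demo')
print(zipped_demo.GREETING)
```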
Now, I tend to organize my projects into several directories, with a bin/ directory that contains my scripts, and a lib/ directory that contains modules and packages. If I want to deploy this code in multiple locations, I can't rely on inserting absolute paths into sys.path; instead, I want to use relative paths. Here's the trick I use.
In my script directory, I write a file _mypath.py.
_mypath.py:
import os, sys
thisdir = os.path.dirname(__file__)
libdir = os.path.join(thisdir, '../relative/path/to/lib/from/bin')
if libdir not in sys.path:
    sys.path.insert(0, libdir)
Now, in each script I put import _mypath at the top of the script. When running scripts, Python automatically enters the script's directory into sys.path, so the script can import _mypath. Then _mypath uses the special attribute __file__ to calculate its own location, from which it can calculate the absolute path to the library directory and insert the library directory into sys.path.
While developing code, it's easy to simply work out of the development directory. However, if you want to pass the code onto others as a finished module, or provide it to systems admins, you might want to consider writing a setup.py file that can be used to install your code in a more standard way. setup.py lets you use distutils to install the software by running
python setup.py install
Writing a setup.py is simple, especially if your package is pure Python and doesn't include any extension files. A setup.py file for a pure Python install looks like this:
from distutils.core import setup
setup(name='your_package_name',
      py_modules = ['module1', 'module2'],
      packages = ['package1', 'package2'],
      scripts = ['script1', 'script2'])
Once this script is written, just drop it into the top-level directory and type python setup.py build. This will make sure that distutils can find all the files.
Once your setup.py works for building, you can package up the entire directory with tar or zip and anyone should be able to install it by unpacking the package and typing
% python setup.py install
This will copy the packages and modules into Python's site-packages directory, and install the scripts into Python's script directory.
A somewhat newer (and better) way of distributing Python software is to use easy_install, a system developed by Phillip Eby as part of the setuptools package. Many of the capabilities of easy_install/setuptools are probably unnecessary for scientific Python developers (although it's an excellent way to install Python packages from other sources), so I will focus on three capabilities that I think are most useful for "in-house" development: versioning, user installs, and binary eggs.
First, install easy_install/setuptools. You can do this by downloading
http://peak.telecommunity.com/dist/ez_setup.py
and running python ez_setup.py. (If you can't do this as the superuser, see the note below about user installs.) Once you've installed setuptools, you should be able to run the script easy_install.
The first thing this lets you do is easily install any software that is distutils-compatible. You can do this from a number of sources: from an unpackaged directory (as with python setup.py install); from a tar or zip file; from the project's URL or Web page; from an egg (see below); or from PyPI, the Python Package Index (see http://cheeseshop.python.org/pypi/).
Let's try installing nose, a unit test discovery package we'll be looking at in the testing section (below). Type:
easy_install --install-dir=~/.packages nose
This will go to the Python Package Index, find the URL for nose, download it, and install it in your ~/.packages directory. We're specifying an install-dir so that you can install it for your use only; if you were the superuser, you could install it for everyone by omitting '--install-dir'.
(Note that you need to add ~/.packages to your PATH and your PYTHONPATH, something I've already done for you.)
So, now, you can go do 'import nose' and it will work. Neat, eh? Moreover, the nose-related scripts (nosetests, in this case) have been installed for your use as well.
You can also install specific versions of software; right now, the latest version of nose is 0.9.3, but if you wanted 0.9.2, you could specify easy_install nose==0.9.2 and it would do its best to find it.
This leads to the next setuptools feature of note, pkg_resources.require. pkg_resources.require lets you specify that certain packages must be installed. Let's try it out by requiring that CherryPy 3.0 or later is installed:
>> import pkg_resources
>> pkg_resources.require('CherryPy >= 3.0')
Traceback (most recent call last):
...
DistributionNotFound: CherryPy >= 3.0
OK, so that failed... but now let's install CherryPy:
% easy_install --install-dir=~/.packages CherryPy
Now the require will work:
>> pkg_resources.require('CherryPy >= 3.0')
>> import cherrypy
This version requirement capability is quite powerful, because it lets you specify exactly the versions of the software you need for your own code to work. And, if you need multiple versions of something installed, setuptools lets you do that, too -- see the --multi-version flag for more information. While you still can't use different versions of the same package in the same program, at least you can have multiple versions of the same package installed!
Throughout this, we've been using another great feature of setuptools: user installs. By specifying the --install-dir, you can install most Python packages for yourself, which lets you take advantage of easy_install's capabilities without being the superuser on your development machine.
This brings us to the last feature of setuptools that I want to mention: eggs, and in particular binary eggs. We'll explore binary eggs later; for now let me just say that easy_install makes it possible for you to package up multiple binary versions of your software (with extension modules) so that people don't have to compile it themselves. This is an invaluable and somewhat underutilized feature of easy_install, but it can make life much easier for your users.
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." -- Brian W. Kernighan.
Everyone tests their software to some extent, if only by running it and trying it out (technically known as "smoke testing"). Most programmers do a certain amount of exploratory testing, which involves running through various functional paths in your code and seeing if they work.
Systematic testing, however, is a different matter. Systematic testing simply cannot be done properly without a certain (large!) amount of automation, because every change to the software means that the software needs to be tested all over again.
Below, I will introduce you to some lower level automated testing concepts, and show you how to use built-in Python constructs to start writing tests.
There are several types of tests that are particularly useful to research programmers. Unit tests are tests for fairly small and specific units of functionality. Functional tests test entire functional paths through your code. Regression tests make sure that (within the resolution of your records) your program's output has not changed.
All three types of tests are necessary in different ways.
Regression tests tell you when unexpected changes in behavior occur, and can reassure you that your basic data processing is still working. For scientists, this is particularly important if you are trying to link past research results to new research results: if you can no longer replicate your original results with your updated code, then you must regard your code with suspicion, unless the changes are intentional.
By contrast, both unit and functional tests tend to be expectation based. By this I mean that you use the tests to lay out what behavior you expect from your code, and write your tests so that they assert that those expectations are met.
The difference between unit and functional tests is blurry in most actual implementations; unit tests tend to be much shorter and require less setup and teardown, while functional tests can be quite long. I like Kumar McMillan's distinction: functional tests tell you when your code is broken, while unit tests tell you where your code is broken. That is, because of the finer granularity of unit tests, a broken unit test can identify a particular piece of code as the source of an error, while functional tests merely tell you that a feature is broken.
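To make the regression versus expectation distinction a little more concrete, here is a rough sketch. The `summarize` function and the recorded values are invented for illustration; in real use the record would come from an earlier, trusted run of your own code.

```python
def summarize(values):
    """Hypothetical stand-in for a data-processing step we want to pin down."""
    return {'n': len(values),
            'total': sum(values),
            # round to the resolution at which the record was kept
            'mean': round(sum(values) / float(len(values)), 6)}

# Output recorded from an earlier, trusted run (made up for this sketch).
RECORDED = {'n': 4, 'total': 10, 'mean': 2.5}

def test_regression():
    # Regression test: the output must not drift from the record.
    assert summarize([1, 2, 3, 4]) == RECORDED

def test_mean_of_constant():
    # Expectation-based unit test: states what we believe *should* happen.
    assert summarize([7, 7, 7])['mean'] == 7

test_regression()
test_mean_of_constant()
```

The regression test says nothing about whether 2.5 is right, only that it hasn't changed; the unit test asserts a belief about correct behavior.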
Let's start by looking at the doctest module. If you've been following along, you will be familiar with doctests, because I've been using them throughout this text! A doctest links code and behavior explicitly in a nice documentation format. Here's an example:
>>> print 'hello, world'
hello, world
When doctest sees this in a docstring or in a file, it knows that it should execute the code after the '>>>' and compare the actual output of the code to the strings immediately following the '>>>' line.
To execute doctests, you can use the doctest API that comes with Python. Just type
import doctest
doctest.testfile(textfile)
or
import doctest
doctest.testmod(modulefile)
The doctest docs contain complete documentation for the module, but in general there are only a few things you need to know.
First, for multi-line entries, use '...' instead of '>>>':
>>> def func():
...     print 'hello, world'
>>> func()
hello, world
Second, if you need to elide exception code, use '...':
>>> raise Exception("some error occurred")
Traceback (most recent call last):
...
Exception: some error occurred
More generally, you can use '...' to match random output, as long as you specify a doctest directive:
>>> import random
>>> print 'random number:', random.randint(0, 10) # doctest: +ELLIPSIS
random number: ...
Third, doctests are terminated with a blank line, so if you explicitly expect a blank line, you need to use a special construct:
>>> print ''
<BLANKLINE>
To test out some doctests of your own, try modifying these files and running them with doctest.testfile.
Doctests are useful in a number of ways. They encourage a kind of conversation with the user, in which you (the author) demonstrate how to actually use the code. And, because they're executable, they ensure that your code works as you expect. However, they can also result in quite long docstrings, so I recommend putting long doctests in files separate from the code files. Short doctests can go anywhere -- in module, class, or function docstrings.
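As an illustration of a short doctest living in a function docstring, here is a minimal sketch. The `add` function is invented for the example, and I'm driving doctest through its parser and runner classes only so that the snippet is self-contained; in a real module you would more likely just call doctest.testmod.

```python
import doctest

def add(a, b):
    """Return a + b.

    >>> add(2, 3)
    5
    >>> add('a', 'b')
    'ab'
    """
    return a + b

# Pull the examples out of the docstring and run them.
parser = doctest.DocTestParser()
test = parser.get_doctest(add.__doc__, {'add': add}, 'add', None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)       # reports how many examples failed / were tried
```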
If you've heard of automated testing, you've almost certainly heard of unit tests. The idea behind unit tests is that you can constrain the behavior of small units of code to be correct by testing the bejeezus out of them; and, if your smallest code units are broken, then how can code built on top of them be good?
The unittest module comes with Python. It provides a framework for writing and running unit tests that is at least convenient, if not as simple as it could be (see the 'nose' stuff, below, for something that is simpler).
Unit tests are almost always demonstrated with some sort of numerical process, and I will be no different. Here's a simple unit test, using the unittest module:
test_sort.py:
#! /usr/bin/env python
import unittest

class Test(unittest.TestCase):
    def test_me(self):
        seq = [ 5, 4, 1, 3, 2 ]
        seq.sort()
        self.assertEqual(seq, [1, 2, 3, 4, 5])

if __name__ == '__main__':
    unittest.main()
If you run this, you'll see the following output:
.
----------------------------------------------------------------------
Ran 1 test in 0.000s
OK
Here, unittest.main() is running through all of the symbols in the global module namespace and finding out which classes inherit from unittest.TestCase. Then, for each such class, it finds all methods starting with test, and for each one it instantiates a new object and runs the function: so, in this case, just:
Test().test_me()
If any method fails, then the failure output is recorded and presented at the end, but the rest of the test methods are run irrespective.
unittest also includes support for test fixtures, which are functions run before and after each test; the idea is to use them to set up and tear down the test environment. In the code below, setUp creates and shuffles the self.seq sequence, while tearDown deletes it.
test_sort2.py:
#! /usr/bin/env python
import unittest
import random

class Test(unittest.TestCase):
    def setUp(self):
        self.seq = range(0, 10)
        random.shuffle(self.seq)

    def tearDown(self):
        del self.seq

    def test_basic_sort(self):
        self.seq.sort()
        self.assertEqual(self.seq, range(0, 10))

    def test_reverse(self):
        self.seq.sort()
        self.seq.reverse()
        self.assertEqual(self.seq, [9, 8, 7, 6, 5, 4, 3, 2, 1, 0])

    def test_destruct(self):
        self.seq.sort()
        del self.seq[-1]
        self.assertEqual(self.seq, range(0, 9))

unittest.main()
In both of these examples, it's important to realize that an entirely new object is created, and the fixtures run, for each test function. This lets you write tests that alter or destroy test data without having to worry about interactions between the code in different tests.
nose is a unit test discovery system that makes writing and organizing unit tests very easy. I've actually written a whole separate article on it, so we should go check that out.
figleaf is a code coverage recording and analysis system that I wrote and maintain. It's published in PyPI, so you can install it with easy_install.
Basic use of figleaf is very easy. If you have a script program.py, rather than typing
% python program.py
to run the script, run
% figleaf program.py
This will transparently and invisibly record coverage to the file '.figleaf' in the current directory. If you run the program several times, the coverage will be aggregated.
To get a coverage report, run 'figleaf2html'. This will produce a subdirectory html/ that you can view with any Web browser; the index.html file will contain a summary of the code coverage, along with links to individual annotated files. In these annotated files, executed lines are colored green, while lines of code that are not executed are colored red. Lines that are not considered lines of code (e.g. docstrings, or comments) are colored black.
My main use for code coverage analysis is in testing (which is why I discuss it in this section!) I record the code coverage for my unit and functional tests, and then examine the output to figure out which files or libraries to focus on testing next. As I discuss below, it is relatively easy to achieve 70-80% code coverage by this method.
When is code coverage most useful? I think it's most useful in the early and middle stages of testing, when you need to track down code that is not touched by your tests. However, 100% code coverage by your tests doesn't guarantee bug free code: this is because figleaf only measures line coverage, not branch coverage. For example, consider this code:
if a.x or a.y:
    f()
If a.x is True in all your tests, then a.y will never be evaluated -- even though a may not have an attribute y, which would cause an AttributeError (which would in turn be a bug, if not properly caught). Python does not record which subclauses of the if statement are executed, so without analyzing the structure of the program there's no simple way to figure it out.
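The short-circuit problem is easy to reproduce. In this sketch, `OnlyX` is a hypothetical stand-in for `a` that has an x attribute but no y, and `check` simply wraps the if statement above so that it can be called twice.

```python
class OnlyX(object):
    """Hypothetical stand-in for `a`: it has an x attribute but no y."""
    def __init__(self, x):
        self.x = x

def f():
    return 'called'

def check(a):
    if a.x or a.y:
        return f()
    return None

# With a.x true, 'or' short-circuits: a.y is never looked up, the 'if' line
# counts as executed, and nothing goes wrong.
first = check(OnlyX(True))

# With a.x false, the very same line finally evaluates a.y and blows up.
try:
    check(OnlyX(False))
    error = None
except AttributeError as exc:
    error = exc
```

A line coverage tool would mark the if line as covered after the first call alone, even though half of its condition has never run.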
Here's another buggy example with 100% code coverage:
def f(a):
    if a:
        a = a.upper()
    return a.strip()

s = f("some string")
Here, there's an implicit else after the if statement; the function f() could be rewritten to this:
def f(a):
    if a:
        a = a.upper()
    else:
        pass
    return a.strip()

s = f("some string")
and the pass statement would show up as "not executed".
So, bottom line: 100% test coverage is necessary for a well-tested program, because code that is not executed by any test at all is simply not being tested. However, 100% test coverage is not sufficient to guarantee that your program is free of bugs, as you can see from some of the examples above.
This testing discussion should help to convince you that not only should you test, but that there are plenty of tools available to help you test in Python. It may even give you some ideas about how to start testing new projects. However, retrofitting an existing project with tests is a different, challenging problem -- where do you start? People are often overwhelmed by the amount of code they've written in the past.
I suggest the following approach.
First, start by writing a test for each bug as it is discovered. The procedure is fairly simple: isolate the cause of the bug; write a test that demonstrates the bug; fix the bug; verify that the test passes. This has several benefits in the short term: you are fixing bugs, you're discovering weak points in your software, you're becoming more familiar with the testing approach, and you can start to think about commonalities in the fixtures necessary to support the tests.
Next, take out some time -- half a day or so -- and write fixtures and functional tests for some small chunk of code; if you can, pick a piece of code that you're planning to clean up or extend. Don't worry about being exhaustive, but just write tests that target the main point of the code that you're working on.
Repeat this a few times. You should start to discover the benefits of testing at this point, as you increasingly prevent bugs from occurring in the code that's covered by the tests. You should also start to get some idea of what fixtures are necessary for your code base.
Now use code coverage analysis to analyze what code your tests cover, and what code isn't covered. At this point you can take a targeted approach and spend some time writing tests aimed directly at uncovered areas of code. There should now be tests that cover 30-50% of your code, at least (it's very easy to attain this level of code coverage!).
Once you've reached this point, you can either decide to focus on increasing your code coverage, or (my recommendation) you can simply continue incrementally constraining your code by writing tests for bugs and new features. Assuming you have a fairly normal code churn, you should get to the point of 70-80% coverage within a few months to a few years (depending on the size of the project!)
This approach is effective because at each stage you get immediate feedback from your efforts, and it's easier to justify to managers than a whole-team effort to add testing. Plus, if you're unfamiliar with testing or with parts of the code base, it gives you time to adjust and adapt your approach to the needs of the particular project.
Two articles that discuss similar approaches in some detail are available online: Strangling Legacy Code, and Growing Your Test Harness. I can also recommend the book Working Effectively with Legacy Code, by Michael Feathers.
Starting to do automated testing of your code can lead to immense savings in maintenance and can also increase productivity dramatically. There are a number of reasons why automated testing can help so much, including quick discovery of regressions, increased design awareness due to more interaction with the code, and early detection of simple bugs as well as unwanted epistatic interactions between code modules. The single biggest improvement for me has been the ability to refactor code without worrying as much about breakage. In my personal experience, automated testing is a 5-10x productivity booster when working alone, and it can save multi-person teams from potentially disastrous errors in communication.
Automated testing is not, of course, a silver bullet. There are several common worries.
One worry is that by increasing the total amount of code in a project, you increase both the development time and the potential for bugs and maintenance problems. This is certainly possible, but test code is very different from regular project code: it can be removed much more easily (which can be done whenever the code being tested undergoes revision), and it should be much simpler even if it is in fact bulkier.
Another worry is that too much of a focus on testing will decrease the drive for new functionality, because people will focus more on writing tests than they will on the new code. While this is partly a managerial issue, it is worth pointing out that the process of writing new code will be dramatically faster if you don't have to worry about old code breaking in unexpected ways as you add functionality.
A third worry is that by focusing on automation, you will miss bugs in code that is difficult to automate. There are two considerations here. First, it is possible to automate quite a bit of testing; the decision not to automate a particular test is almost always made because of financial or time considerations rather than technical limitations. And, second, automated testing is simply not a replacement for certain types of manual testing -- in particular, exploratory testing, in which the programmers or users interact with the program, will always turn up new bugs, and is worth doing independent of the automated tests.
How much to test, and what to test, are decisions that need to be made on an individual project basis; there are no hard and fast rules. However, I feel confident in saying that some automated testing will always improve the quality of your code and result in maintenance improvements.
Welcome! This is an introduction, with lots and lots of examples, to the nose unit test discovery & execution framework. If that's not what you want to read, I suggest you hit the Back button now.
The latest version of this document can be found at
http://ivory.idyll.org/articles/nose-intro.html
(Last modified October 2006.)
A unit test is an automated code-level test for a small "unit" of functionality. Unit tests are often designed to test a broad range of the expected functionality, including any weird corner cases and some tests that should not work. They tend to interact minimally with external resources like the disk, the network, and databases; testing code that accesses these resources is usually put under functional tests, regression tests, or integration tests.
(There's lots of discussion on whether unit tests should do things like access external resources, and whether or not they are still "unit" tests if they do. The arguments are fun to read, and I encourage you to read them. I'm going to stick with a fairly pragmatic and broad definition: anything that exercises a small, fairly isolated piece of functionality is a unit test.)
Unit tests are almost always pretty simple, by intent; for example, if you wanted to test an (intentionally naive) regular expression for validating the form of e-mail addresses, your test might look something like this:
import re

EMAIL_REGEXP = r'[\S.]+@[\S.]+'

def test_email_regexp():
    # a regular e-mail address should match
    assert re.match(EMAIL_REGEXP, 'test@nowhere.com')

    # no domain should fail
    assert not re.match(EMAIL_REGEXP, 'test@')
There are several ways to integrate unit tests into your development style. These include Test Driven Development, where unit tests are written prior to the functionality they're testing; during refactoring, where existing code -- sometimes code without any automated tests to start with -- is retrofitted with unit tests as part of the refactoring process; bug fix testing, where bugs are first pinpointed by a targeted test and then fixed; and straight test enhanced development, where tests are written organically as the code evolves. In the end, I think it matters more that you're writing unit tests than it does exactly how you write them.
For me, the most important part of having unit tests is that they can be run quickly, easily, and without any thought by developers. They serve as executable, enforceable documentation for function and API, and they also serve as an invaluable reminder of bugs you've fixed in the past. As such, they improve my ability to more quickly deliver functional code -- and that's really the bottom line.
It's pretty common to write tests for a library module like so:
def test_me():
    # ... many tests, which raise an Exception if they fail ...
    pass

if __name__ == '__main__':
    test_me()
The 'if' statement is a little hook that runs the tests when the module is executed as a script from the command line. This is great, and fulfills the goal of having automated tests that can be run easily. Unfortunately, they cannot be run without thought, which is an amazingly important and oft-overlooked requirement for automated tests! In practice, this means that they will only be run when that module is being worked on -- a big problem.
People use unit test discovery and execution frameworks so that they can add tests to existing code, execute those tests, and get a simple report, without thinking. Below, you'll see some of the advantages that using such a framework gives you: in addition to finding and running your tests, frameworks can let you selectively execute certain tests, capture and collate error output, and add coverage and profiling information. (You can always write your own framework -- but why not take advantage of someone else's, even if they're not as smart as you?)
"Why use nose in particular?" is a more difficult question. There are many unit test frameworks in Python, and more arise every day. I personally use nose, and it fits my needs fairly well. In particular, it's actively developed, by a guy (Jason Pellerin) who answers his e-mail pretty quickly; it's fairly stable (it's in beta at the time of this writing); it has a really fantastic plug-in architecture that lets me extend it in convenient ways; it integrates well with distutils; it can be adapted to mimic any other unit test discovery framework pretty easily; and it's being used by a number of big projects, which means it'll probably still be around in a few years.
I hope the best reason for you to use nose will be that I'm giving you this extended introduction ;).
First, install nose. Using setuptools, this is easy:
easy_install nose
Now let's start with a few examples. Here's the simplest nose test you can write:
def test_b():
    assert 'b' == 'b'
Put this in a file called test_me.py, and then run nosetests. You will see this output:
.
----------------------------------------------------------------------
Ran 1 test in 0.005s
OK
If you want to see exactly what test was run, you can use nosetests -v.
test_stuff.test_b ... ok
----------------------------------------------------------------------
Ran 1 test in 0.015s
OK
Here's a more complicated example.
class TestExampleTwo:
    def test_c(self):
        assert 'c' == 'c'
Here, nose will first create an object of type TestExampleTwo, and only then run test_c:
test_stuff.TestExampleTwo.test_c ... ok
Most new test functions you write should look like either of these tests -- a simple test function, or a class containing one or more test functions. But don't worry -- if you have some old tests that you ran with unittest, you can still run them. For example, this test:
class ExampleTest(unittest.TestCase):
    def test_a(self):
        self.assert_(1 == 1)
still works just fine:
test_a (test_stuff.ExampleTest) ... ok
A fairly common pattern for unit tests is something like this:
def test():
    setup_test()
    try:
        do_test()
        make_test_assertions()
    finally:
        cleanup_after_test()
Here, setup_test is a function that creates necessary objects, opens database connections, finds files, etc. -- anything that establishes necessary preconditions for the test. Then do_test and make_test_assertions actually run the test code and check to see that the test completed successfully. Finally -- and independently of whether or not the test succeeded -- the preconditions are cleaned up, or "torn down".
This is such a common pattern for unit tests that most unit test frameworks let you define setup and teardown "fixtures" for each test; these fixtures are run before and after the test, as in the code sample above. So, instead of the pattern above, you'd do:
def test():
    do_test()
    make_test_assertions()

test.setUp = setup_test
test.tearDown = cleanup_after_test
The unit test framework then examines each test function, class, and method for fixtures, and runs them appropriately.
Here's the canonical example of fixtures, used in classes rather than in functions:
class TestClass:
    def setUp(self):
        ...

    def tearDown(self):
        ...

    def test_case_1(self):
        ...

    def test_case_2(self):
        ...

    def test_case_3(self):
        ...
The code that's actually run by the unit test framework is then
for test_method in get_test_methods(TestClass):
    obj = TestClass()
    obj.setUp()
    try:
        test_method(obj)
    finally:
        obj.tearDown()
That is, for each test case, a new object is created, set up, and torn down -- thus approximating the Platonic ideal of running each test in a completely new, pristine environment.
(Fixture, incidentally, comes from the Latin "fixus", meaning "fixed". The origin of its use in unit testing is not clear to me, but you can think of fixtures as permanent appendages of a set of tests, "fixed" in place. The word "fixtures" makes more sense when considered as part of a test suite than when used on a single test -- one fixture for each set of tests.)
All of the example code in this article is available in a .tar.gz file. Just download the package at
http://darcs.idyll.org/~t/projects/nose-demo.tar.gz
and unpack it somewhere; information on running the examples is in each section, below.
To run the simple examples above, go to the top directory in the example distribution and type
nosetests -w simple/ -v
nose is a unit test discovery and execution package. Before it can execute any tests, it needs to discover them. nose has a set of rules for discovering tests, and then a fixed protocol for running them. While both can be modified by plugins, for the moment let's consider only the default rules.
nose only looks for tests under the working directory -- normally the current directory, unless you specify one with the -w command line option.
Within the working directory, it looks for any directories, files, modules, or packages that match the test pattern. [ ... ] In particular, note that packages are recursively scanned for test cases.
Once a test module or a package is found, it's loaded, the setup fixtures are run, and the modules are examined for test functions and classes -- again, anything that matches the test pattern. Any test functions are run -- along with associated fixtures -- and test classes are also executed. For each test method in test classes, a new object of that type is instantiated, the setup fixture (if any) is run, the test method is run, and (if there was a setup fixture) the teardown fixture is run.
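As a rough illustration of the "examine the module for anything that matches the test pattern" step, here is a sketch that filters a namespace by name. The regular expression is a simplified stand-in chosen for this example, not nose's real default pattern, and find_tests is a hypothetical helper.

```python
import re
import types

# Simplified stand-in for the test pattern; an assumption, not nose's default.
TEST_PATTERN = re.compile(r"(?:^|_)[Tt]est")

def find_tests(namespace):
    """Return sorted names of functions and classes whose names match the pattern."""
    found = []
    for name, obj in namespace.items():
        if not TEST_PATTERN.search(name):
            continue
        if isinstance(obj, (types.FunctionType, type)):
            found.append(name)
    return sorted(found)

def test_b():
    pass

def helper():
    pass

class TestExampleTwo(object):
    pass

test_data = [1, 2, 3]   # name matches, but it is neither a function nor a class

names = find_tests({
    "test_b": test_b,
    "helper": helper,
    "TestExampleTwo": TestExampleTwo,
    "test_data": test_data,
})
```

Here names ends up holding only TestExampleTwo and test_b; helper is skipped by name and test_data by type.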
Here's the basic logic of test running used by nose (in Python pseudocode):
if has_setup_fixture(test):
    run_setup(test)
try:
    run_test(test)
finally:
    if has_setup_fixture(test):
        run_teardown(test)
Unlike tests themselves, however, test fixtures on test modules and test packages are run only once. This extends the test logic above to this (again, pseudocode):
### run module setup fixture
if has_setup_fixture(test_module):
    run_setup(test_module)

### run all tests
try:
    for test in get_tests(test_module):
        try:                            ### allow individual tests to fail
            if has_setup_fixture(test):
                run_setup(test)
            try:
                run_test(test)
            finally:
                if has_setup_fixture(test):
                    run_teardown(test)
        except:
            report_error()
finally:
    ### run module teardown fixture
    if has_setup_fixture(test_module):
        run_teardown(test_module)
A few additional notes:
- if the setup fixture fails, no tests are run and the teardown fixture isn't run, either.
- if there is no setup fixture, then the teardown fixture is not run.
- whether or not the tests succeed, the teardown fixture is run.
- all tests are executed even if some of them fail.
nose can only execute tests that it finds. If you're creating a new test suite, it's relatively easy to make sure that nose finds all your tests -- just stick a few assert 0 statements in each new module, and if nose doesn't kick up an error it's not running those tests! It's more difficult when you're retrofitting an existing test suite to run inside of nose; in the extreme case, you may need to write a plugin or modify the top-level nose logic to find the existing tests.
The main problem I've run into is that nose will only find tests that are properly named and within directory or package hierarchies that it's actually traversing! So placing your test modules under the directory my_favorite_code won't work, because nose will not even enter that directory. However, if you make my_favorite_code a package, then nose will find your tests because it traverses over modules within packages.
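The traversal rule described here can be modeled in a few lines. This sketch assumes the rule is "descend into a directory only if its name matches the test pattern or it is a package"; the pattern is a simplified stand-in and find_test_files is a hypothetical helper, not part of nose.

```python
import os
import re

# Simplified stand-in for nose's test pattern (an assumption).
TEST_PATTERN = re.compile(r"(?:^|_)[Tt]est")

def is_package(path):
    """A directory counts as a package if it contains __init__.py."""
    return os.path.isfile(os.path.join(path, "__init__.py"))

def find_test_files(top):
    """Return relative paths of test modules reachable from top."""
    found = []
    for entry in sorted(os.listdir(top)):
        path = os.path.join(top, entry)
        if os.path.isdir(path):
            # only descend into test-named directories or packages
            if TEST_PATTERN.search(entry) or is_package(path):
                for sub in find_test_files(path):
                    found.append(os.path.join(entry, sub))
        elif entry.endswith(".py") and TEST_PATTERN.search(entry):
            found.append(entry)
    return found
```

Under this model, a test_me.py inside a plain my_favorite_code directory is invisible until an __init__.py is added to that directory.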
In any case, using the -vv flag gives you verbose output from nose's test discovery algorithm. This will tell you whether or not nose is even looking in the right place(s) to find your tests.
Apart from the plugins, there are only a few options that I use regularly.
nose only looks for tests in one place. The -w flag lets you specify that location; e.g.
nosetests -w simple/
will run only those tests in the directory ./simple/.
As of the latest development version (October 2006) you can specify multiple working directories on the command line:
nosetests -w simple/ -w basic/
See Running nose programmatically for an example of how to specify multiple working directories using Python, in nose 0.9.
By default, nose captures all output and only presents stdout from tests that fail. By specifying '-s', you can turn this behavior off.
nose is intentionally pretty terse. If you want to see what tests are being run, use '-v'.
nose lets you specify a set of tests on the command line; only tests that are both discovered and in this set of tests will be run. For example,
nosetests -w simple tests/test_stuff.py:test_b
only runs the function test_b found in simple/tests/test_stuff.py.
Doctests are a nice way to test individual Python functions in a convenient documentation format. For example, the docstring for the function multiply, below, contains a doctest:
def multiply(a, b):
    """
    'multiply' multiplies two numbers and returns the result.

    >>> multiply(5, 10) # doctest: +SKIP
    50
    >>> multiply(-1, 1) # doctest: +SKIP
    -1
    >>> multiply(0.5, 1.5) # doctest: +SKIP
    0.75
    """
    return a*b
(Ignore the SKIP pragmas; they're put in so that this file itself can be run through doctest without failing...)
The doctest module (part of the Python standard library) scans through all of the docstrings in a package or module, executes any line starting with >>>, and compares the actual output with the expected output contained in the docstring.
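If you want to watch that happen on a single function, the standard doctest module exposes the pieces directly. The sketch below repeats multiply without the SKIP pragmas, so the examples really run; run_docstring is a hypothetical helper name, while DocTestParser and DocTestRunner are part of the doctest module.

```python
import doctest

def multiply(a, b):
    """
    'multiply' multiplies two numbers and returns the result.

    >>> multiply(5, 10)
    50
    >>> multiply(-1, 1)
    -1
    >>> multiply(0.5, 1.5)
    0.75
    """
    return a * b

def run_docstring(func):
    """Run the doctest examples in func's docstring; return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__,               # text containing the >>> examples
        {func.__name__: func},      # names visible to the examples
        func.__name__, None, 0)     # test name, filename, line number
    runner = doctest.DocTestRunner(verbose=False)
    failed, attempted = runner.run(test)
    return failed, attempted

failed, attempted = run_docstring(multiply)
```

All three examples are attempted and none fail; change an expected value in the docstring and the runner prints a report for that example.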
Typically you run these directly on a module level, using the sort of __main__ hack I showed above. The doctest plug-in for nose adds doctest discovery into nose -- all non-test packages are scanned for doctests, and any doctests are executed along with the rest of the tests.
To use the doctest plug-in, go to the directory containing the modules and packages you want searched and do
nosetests --with-doctest
All of the doctests will be automatically found and executed. Some example doctests are included with the demo code, under basic; you can run them like so:
% nosetests -w basic/ --with-doctest -v
doctest of app_package.stuff.function_with_doctest ... ok
...
Note that by default nose only looks for doctests in non-test code. You can add --doctest-tests to the command line to search for doctests in your test code as well.
The doctest plugin gives you a nice way to combine your various low-level tests (e.g. both unit tests and doctests) within one single nose run; it also means that you're less likely to forget about running your doctests!
The attrib extension module lets you flexibly select subsets of tests based on test attributes -- literally, Python variables attached to individual tests.
Suppose you had the following code (in attr/test_attr.py):
def testme1():
    assert 1
testme1.will_fail = False

def testme2():
    assert 0
testme2.will_fail = True

def testme3():
    assert 1
Using the attrib extension, you can select a subset of these tests based on the attribute will_fail. For example, nosetests -a will_fail will run only testme2, while nosetests -a \!will_fail will run both testme1 and testme3. You can also specify precise values, e.g. nosetests -a will_fail=False will run only testme1, because testme3 doesn't have the attribute will_fail.
You can also tag tests with lists of attributes, as in attr/test_attr2.py:
def testme5():
    assert 1
testme5.tags = ['a', 'b']

def testme6():
    assert 1
testme6.tags = ['a', 'c']
Then nosetests -a tags=a will run both testme5 and testme6, while nosetests -a tags=b will run only testme5.
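The selection rules for function attributes can be summarized in a short sketch. The functions matches and select are hypothetical and only mimic the behavior described in this section (plain name, negation with !, exact value, and list membership); the real plugin handles more cases.

```python
def matches(test, expr):
    """Return True if test is selected by one attribute expression."""
    if "=" in expr:
        key, wanted = expr.split("=", 1)
        if not hasattr(test, key):
            return False                     # missing attribute never matches a value
        value = getattr(test, key)
        if isinstance(value, (list, tuple)):
            return wanted in [str(v) for v in value]
        return str(value) == wanted
    if expr.startswith("!"):
        return not getattr(test, expr[1:], False)
    return bool(getattr(test, expr, False))

def select(tests, expr):
    """Return the names of the tests selected by expr."""
    return [t.__name__ for t in tests if matches(t, expr)]

def testme1(): pass
testme1.will_fail = False

def testme2(): pass
testme2.will_fail = True

def testme3(): pass

def testme5(): pass
testme5.tags = ['a', 'b']

def testme6(): pass
testme6.tags = ['a', 'c']

plain = [testme1, testme2, testme3]
tagged = [testme5, testme6]
```

For example, select(plain, "will_fail=False") picks only testme1, because testme3 has no will_fail attribute at all.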
Attribute tags also work on classes and methods as you might expect. In attr/test_attr3.py, the following code
class TestMe:
    x = True

    def test_case1(self):
        assert 1

    def test_case2(self):
        assert 1
    test_case2.x = False
lets you run both test_case1 (with -a x) and test_case2 (with -a \!x); here, methods inherit the attributes of their parent class, but can override the class attributes with method-specific attributes.
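One way to picture the inherit-then-override lookup is the sketch below. effective_attr is a hypothetical helper written for this illustration; it checks the method's own attributes first and falls back to the class.

```python
class TestMe:
    x = True

    def test_case1(self):
        assert 1

    def test_case2(self):
        assert 1
    test_case2.x = False     # method-specific override

def effective_attr(cls, method_name, key):
    """Look up key on the method first, then fall back to the class."""
    func = cls.__dict__[method_name]
    if key in getattr(func, "__dict__", {}):
        return func.__dict__[key]
    return getattr(cls, key, None)

x_for_case1 = effective_attr(TestMe, "test_case1", "x")
x_for_case2 = effective_attr(TestMe, "test_case2", "x")
```

test_case1 picks up the class-level value, while test_case2 uses its own.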
nose has a friendly top-level API which makes it accessible to Python programs. You can run nose inside your own code by doing this:
import nose

### configure paths, etc here

nose.run()

### do other stuff here
By default nose will pick up on sys.argv; if you want to pass in your own arguments, use nose.run(argv=args). You can also override the default test collector, test runner, test loader, and environment settings at this level. This makes it convenient to add in certain types of new behavior; see multihome/multihome-nose for a script that lets you specify multiple "test home directories" by overriding the test collector.
There are a few caveats to mention about using the top-level nose commands. First, be sure to use nose.run, not nose.main -- nose.main will exit after running the tests (although you can wrap it in a 'try/finally' if you insist). Second, in the current version of nose (0.9b1), nose.run swipes sys.stdout, so print will not yield any output after nose.run completes. (This should be fixed soon.)
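A simple workaround for the stdout problem is to save sys.stdout before the call and restore it afterwards. The wrapper below is a sketch; run_preserving_stdout is a hypothetical helper, and greedy_runner is a stand-in that imitates a runner which replaces sys.stdout, so the example runs without nose installed.

```python
import sys

def run_preserving_stdout(run, *args, **kwargs):
    """Call run(*args, **kwargs), then put sys.stdout back as it was."""
    saved_stdout = sys.stdout
    try:
        return run(*args, **kwargs)
    finally:
        sys.stdout = saved_stdout

# Stand-in for a runner that replaces sys.stdout and forgets to restore it.
class _Swallow(object):
    def write(self, text):
        pass

    def flush(self):
        pass

def greedy_runner(argv=None):
    sys.stdout = _Swallow()
    return True

original = sys.stdout
result = run_preserving_stdout(greedy_runner, argv=["nosetests", "-w", "simple/"])
```

With nose available, the same wrapper would be called as run_preserving_stdout(nose.run, argv=args).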
As nice as nose already is, the plugin system is probably the best thing about it. nose uses the setuptools API to load all registered nose plugins, allowing you to install 3rd party plugins quickly and easily; plugins can modify or override output handling, test discovery, and test execution.
nose comes with a couple of plugins that demonstrate the power of the plugin API; I've discussed two (the attrib and doctest plugins) above. I've also written a few, as part of the pinocchio nose extensions package.
Here are a few tips and tricks for writing plugins.
- Read through the nose.plugins.IPluginInterface code a few times.
- For the want* functions (wantClass, wantMethod, etc.) you need to know:
  - a return value of True indicates that your plugin wants this item.
  - a return value of False indicates that your plugin doesn't want this item.
  - a return value of None indicates that your plugin doesn't care about this item.

  Also note that plugins aren't guaranteed to be run in any particular order, so you have to order them yourself if you need this. See the pinocchio.decorator module (part of pinocchio) for an example.
- Abuse stderr. As much as I like the logging package, it can confuse matters by capturing output in ways I don't fully understand (or at least don't want to have to configure for debugging purposes). While you're working on your plugin, put import sys; err = sys.stderr at the top of your plugin module, and then use err.write to produce debugging output.
- Notwithstanding the stderr advice, -vv is your friend -- it will tell you that your test file isn't even being examined for tests, and it will also tell you what order things are being run in.
- Write your initial plugin code by simply copying nose.plugins.attrib and deleting everything that's not generic. This greatly simplifies getting your plugin loaded & functioning.
- To register your plugin, you need this code in e.g. a file called 'setup.py':
from setuptools import setup

setup(
    name='my_nose_plugin',
    packages = ['my_nose_plugin'],
    entry_points = {
        'nose.plugins': [
            'pluginA = my_nose_plugin:pluginA',
        ]
    },
)

You can then install (and register) the plugin with easy_install ., run in the directory containing 'setup.py'.
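To see why the True/False/None convention for the want* functions matters, and why plugin order can change the outcome, consider this sketch of a loader polling its plugins. It assumes that the first plugin with a definite answer wins, which is my simplification for illustration; ask_plugins and both plugin classes are hypothetical.

```python
def ask_plugins(plugins, hook_name, item, default):
    """Return the first definite (non-None) answer from the plugins, else default."""
    for plugin in plugins:
        hook = getattr(plugin, hook_name, None)
        if hook is None:
            continue
        answer = hook(item)
        if answer is not None:
            return bool(answer)
    return default

class OnlySlowTests(object):          # hypothetical plugin
    def wantFunction(self, func):
        if getattr(func, "slow", False):
            return True
        return None                   # no opinion

class NoHelpers(object):              # hypothetical plugin
    def wantFunction(self, func):
        if func.__name__.startswith("helper"):
            return False
        return None                   # no opinion

def slow_thing(): pass
slow_thing.slow = True

def helper_x(): pass

def test_plain(): pass

plugins = [OnlySlowTests(), NoHelpers()]
wants_slow = ask_plugins(plugins, "wantFunction", slow_thing, False)
wants_helper = ask_plugins(plugins, "wantFunction", helper_x, True)
wants_plain = ask_plugins(plugins, "wantFunction", test_plain, True)
```

slow_thing is claimed by the first plugin, helper_x is vetoed by the second, and test_plain falls through to the default because neither plugin cares.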
I've been using nose fairly seriously for a while now, on multiple projects. The two most frustrating problems I've had are with the output capture (mentioned above, in Running nose programmatically) and a situation involving the logging module. The output capture problem is easily taken care of, once you're aware of it -- just be sure to save sys.stdout before running any nose code. The logging module problem cropped up when converting an existing unit test suite over to nose: the code tested an application that used the logging module, and reconfigured logging so that nose's output didn't show