Briefly, the idea is to add three new functions to subprocess:
    output = get_output(cmd, input=None, cwd=None, env=None)
    (status, output) = get_status_output(cmd, input=None, cwd=None, env=None)
    (status, output, errout) = get_status_output_errors(cmd, input=None, cwd=None, env=None)
with the goal of replacing commands.getstatusoutput and commands.getoutput. (commands.getstatus has already been removed from 2.6.)
This will provide a simple set of functions for some very common subprocess use-cases, and a cross-platform alternative to commands with better post-fork behavior and error trapping, adhering to PEP 8 coding standards. A win-win-win, I hope ;).
In addition to writing the basic code & some tests, I would like to:
- reorganize, correct, and expand the subprocess documentation: right now it's not as useful as it could be.
- put some warnings/error reporting into subprocess for bad class parameters; e.g. Popen.communicate should check to be sure both subprocess.stdout and stderr are PIPEs.
- anything else I should think about doing to subprocess?
- right now the functions take only the input, cwd, and env arguments to pass through to the Popen constructor. Any other favorite arguments out there?
- should language be added to the popen2 module pointing people at subprocess, and should popen2 be deprecated?
- GvR suggested that I reimplement commands in terms of these subprocess functions for 2.6, even though the commands module could be deprecated in 2.6 and probably removed in 2.7. I would rather simply amend the documentation to point people at subprocess.
p.s. The implementation of the above functions is dead simple:
    from subprocess import Popen, PIPE, STDOUT

    def get_status_output(cmd, input=None, cwd=None, env=None):
        # stderr is merged into stdout, so communicate() returns no separate errout
        pipe = Popen(cmd, shell=True, cwd=cwd, env=env, stdout=PIPE, stderr=STDOUT)
        (output, errout) = pipe.communicate(input=input)
        assert not errout
        status = pipe.returncode
        return (status, output)

    def get_status_output_errors(cmd, input=None, cwd=None, env=None):
        # stdout and stderr are captured separately
        pipe = Popen(cmd, shell=True, cwd=cwd, env=env, stdout=PIPE, stderr=PIPE)
        (output, errout) = pipe.communicate(input=input)
        status = pipe.returncode
        return (status, output, errout)

    def get_output(cmd, input=None, cwd=None, env=None):
        # pass the arguments through and keep only the output
        return get_status_output(cmd, input=input, cwd=cwd, env=env)[1]
Posted by Drew Perttula on 2007-03-21 at 18:40.
Do none of the proposals involve raising an exception when a command returns nonzero exit status? Returning status codes is so unpythonic (for the common case where commands use nonzero status to mean failure). I'd like it if get_output raised an exception for nonzero status. All the output **and** the particular nonzero status should be attributes on the exception object. get_status_output is now less relevant, but maybe it could stick around as the non-exception-raising version, the one you use if your command returns various success results through the exit status. Optionally, get_output could have an extra arg success_status=0 which you use if you actually want a different status.
Posted by Scott Tsai on 2007-03-21 at 19:15.
I think you mean to do pipe.wait() instead of just reading pipe.returncode. How about allowing subprocess.Popen's other keyword arguments to be passed down from get_status_X?
Posted by Titus Brown on 2007-03-21 at 19:31.
Drew, the idea is to replace commands.* with functions that do substantially the same thing, only "better". Given that returncode==0 on success is only a convention, and not a requirement, I don't know if raising an exception is the right thing to do. Scott, no, what I wrote works fine ;). communicate() automatically does a wait(). As for passing through other keyword args, I'd have to filter them to make sure they made sense. In particular, stdout, stderr, and bufsize don't make sense in this context; preexec_fn and close_fds are UNIX specific; and creationflags and startupinfo are Windows specific. That leaves only universal newlines, which isn't relevant if you're gathering all the output at once and returning it as a string. So I could allow all **but** universal_newlines, stdout, stderr, and bufsize to be passed through, but that complicates the function signatures. hmm. Not sure what to think about that. --titus
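The filtering described above could look roughly like the following. This is only a sketch written for current Python 3, not the code that was proposed to python-dev; the RESERVED set and the choice to raise TypeError are assumptions made for illustration. It also sets stdin=PIPE when input is given, since otherwise communicate() has no pipe to write the input to.

```python
import subprocess

# Keywords the wrapper must control itself. This set is an assumption drawn
# from the discussion above, not a settled API.
RESERVED = frozenset(["stdin", "stdout", "stderr", "bufsize", "universal_newlines"])


def get_status_output(cmd, input=None, **popen_kwargs):
    """Run cmd and return (status, combined stdout+stderr as bytes)."""
    clashes = RESERVED.intersection(popen_kwargs)
    if clashes:
        raise TypeError("unsupported keyword argument(s): "
                        + ", ".join(sorted(clashes)))
    # Default to running through the shell, as the original functions do.
    popen_kwargs.setdefault("shell", True)
    pipe = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE if input is not None else None,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        **popen_kwargs
    )
    output, _ = pipe.communicate(input=input)
    return pipe.returncode, output
```

With this shape, cwd, env, preexec_fn, close_fds and the Windows-only options pass straight through to Popen, while anything that would interfere with output collection is rejected up front.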
Posted by Scott Tsai on 2007-03-21 at 19:52.
Titus, thanks for the explanation. I still think platform-specific features, like using preexec_fn to set resource limits, are really useful though. "Raise an exception if return code is non zero" is also something I use almost daily. This is useful when integrating python into an existing build system. Stopping on the first non zero return code matches the convention of the unix 'make' utility. I write a lot of test code for embedded device drivers and circuit board hardware that executes external commands. I have a function 'run_cmd' that is just subprocess.call but raises exception objects with 'return_code' attributes on error.
Posted by Titus Brown on 2007-03-21 at 20:01.
Scott, OK, you've convinced me on the keyword args. I'll write up the functions that way and see if python-dev brutally rejects them. I understand you on the exception-raising behavior, but that would have to be a new function that acts in a style different from the functions already in the module. I can virtually guarantee that getting that past anyone will be tough ;). --titus
Posted by Kimutaku on 2007-03-21 at 21:31.
I don't get it. Subprocess has a generic API, so it can do (almost) anything that can be done with commands, os.system, os.popen*, os.spawn*, etc. One problem is that the subprocess docs aren't so nice. Another is that doing a simple process task seems to be a little verbose. So, for consistency, I would have expected just a bit more documentation, showing how you could do 'commands' tasks with subprocess (i.e., add another subsection to 17.1.3).

If you're going to add some utilities, maybe it'd be better to make a submodule called 'utils', 'shortcuts' or whatever, so you could put in it functions that resemble the 17.1.3 section from the docs:

    from subprocess.shortcuts import get_status_output
    from subprocess.shortcuts import system
    ...

OK, maybe a sub-namespace is overkill, but I think the key here is that there are some "utilities" living in the docs (section 17.1.3) and now there would be others explicitly coded. That's inconsistent from my POV.

About deprecating popen2: absolutely. Indeed, PEP 3108 proposes just that.

***Thank you*** for working on this. It would be nice if another python guru could make something similar for the httplib/httplib2(??)/urllib/urllib2/urlparse/whatever issue, for the mac-related modules, etc... Step by step, cleaning up the mess :)
Posted by Titus Brown on 2007-03-22 at 00:51.
Kimutaku, I think it's really valuable to implement very common use-cases. That's the main point -- not to reimplement stuff per se, but rather to encode in a few simple functions what 95% of people using subprocess need to do. In this particular case, we want to keep functions around that people use, but put them in a better place and implement them more nicely. The real reason to keep those functions, though, is not because they're already there, but because they're **used** a lot. --titus
Posted by Nathan LeZotte on 2007-03-22 at 01:01.
Just a few things:

1. I just noticed (after following the link to the subprocess documentation above) that a check_call function seems to have been added to the subprocess module in Python 2.5. It looks like it's exactly the same as the subprocess.call function except that it has the exception raising behavior discussed above.

2. Is the universal newlines argument really irrelevant for these functions? My impression was that it would determine whether you got just '\n' characters for newlines in the resulting output string or something else (like '\r\n' on Windows). Some quick testing shows this to be the case (at least on Windows XP with Python 2.5).

3. Any thoughts on adding some asynchronous output getting functions to the mix? Perhaps with an iterator interface like the following:

    for line in get_output_line_iter(['ls', '-l']):
        print process_line(line)

I can see two use cases for this functionality:

1. You have a subprocess that produces a lot of output (more than you want to keep in memory at the same time).

2. You have a long running subprocess and you want to process its output before it's finished (e.g. in realtime).
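An iterator interface of that kind can be sketched as a generator over Popen.stdout. The name get_output_line_iter comes from the comment above; the body is an assumption about how it might work, written for current Python 3 (lines come back as bytes), and is not part of any proposal in this thread.

```python
import subprocess
import sys


def get_output_line_iter(args, **popen_kwargs):
    """Yield the child's stdout one line at a time, as it is produced."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, **popen_kwargs)
    try:
        for line in proc.stdout:
            yield line
    finally:
        # Runs when the output is exhausted or the generator is closed early.
        proc.stdout.close()
        proc.wait()


# Usage: the child here is the current Python interpreter, so the example
# does not depend on 'ls' being available.
for line in get_output_line_iter(
        [sys.executable, "-c", "print('one'); print('two')"]):
    print(line.rstrip())
```

Only one line is held in memory at a time, which covers the large-output case. Whether lines arrive in real time also depends on how the child buffers its own output, which this sketch does not control.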
Posted by Mark Eichin on 2007-03-22 at 01:25.
0 == success may be a convention, but it's a strong posix one. I would convert basically all of my commands.getstatusoutput calls to an exception throwing near-alternative; most of them are already followed by "assert st == 0" or the equivalent anyway... it would preserve the readability advantages that commands has over subprocess now, while having the more pythonic "anything goes unexpected/wrong and you get a traceback" reliability that makes constructs like for line in file(...): cleaner and safer than their perl or C equivalents...
Posted by Drew Perttula on 2007-03-22 at 13:25.
Another variation to my proposal:

    def get_output(cmd, input=None, cwd=None, env=None, success=None)

That would be backwards compatible, but if I supply success=0, then it raises an exception if the return status is not zero. And it sounds like some of us would be using success=0 a whole lot. Writing these 3-line versions of a function call is so ridiculously unpythonic:

    (status, output) = get_status_output("tool")
    if status != 0:
        raise ValueError("tool failed")

and wrapping that in a library function (like many of us do) seems somewhat batteries-not-included.
Posted by Titus Brown on 2007-03-22 at 16:16.
Hmm, maybe I'll propose a "require_success" bool parameter on all these functions. By default it'll be False. However, if get_output(..., require_success=True) is called and the returncode is not zero, a CalledProcessError will be raised. --titus
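A rough sketch of the require_success idea follows. It is an illustration written for current Python 3, not the code that was proposed: attaching the captured output to the exception relies on the output argument of CalledProcessError, which the Python 2.5 version of the class did not have, and setting stdin=PIPE when input is given is an addition of this sketch.

```python
import subprocess


def get_output(cmd, input=None, cwd=None, env=None, require_success=False):
    """Return the combined stdout+stderr of cmd, run through the shell.

    With require_success=True, a nonzero exit status raises
    CalledProcessError instead of being silently dropped.
    """
    pipe = subprocess.Popen(
        cmd, shell=True, cwd=cwd, env=env,
        stdin=subprocess.PIPE if input is not None else None,
        stdout=subprocess.PIPE, stderr=subprocess.STDOUT,
    )
    output, _ = pipe.communicate(input=input)
    if require_success and pipe.returncode != 0:
        # The status and the output both travel on the exception object.
        raise subprocess.CalledProcessError(pipe.returncode, cmd, output=output)
    return output
```

The default of False keeps the commands.getoutput behavior, so existing callers would be unaffected; callers who treat nonzero as failure opt in per call.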
Posted by Titus Brown on 2007-03-22 at 16:23.
Nathan, I'll check on the universal newlines bit. My impression was that it changed behavior only if you were reading line-by-line from Popen.stdout/Popen.stderr. Re asynch, my impression is that Popen does a fine job of this already with stdout=PIPE. Am I wrong?
Posted by Titus Brown on 2007-03-22 at 16:24.
(sorry if I ignored something, I'm finding it difficult to take into account all the suggestions; so e-mail me if I forgot to answer something! And thanks for all the suggestions!) email@example.com
Posted by Titus Brown on 2007-03-22 at 16:40.
See http://mail.python.org/pipermail/python-dev/2007-March/072278.html
Posted by Nathan LeZotte on 2007-03-23 at 00:38.
Titus, you may be right about the asynchronous stuff. In the past, I've written some semi-complicated Win32 API code (using the win32all modules) in order to handle subprocess output asynchronously. However, it's entirely possible that I just overlooked the obvious solution:

    for line in popen_obj.stdout:
        process_line(line)

I'll have to play around with this a bit to see if it doesn't work the way I want it to.
Posted by Kumar McMillan on 2007-03-30 at 11:12.
@Nathan: what you want for realtime iteration is:

    while 1:
        line = popen_obj.stdout.readline()
        if not line:  # readline() returns an empty string on EOF
            break
        process_line(line)

@Titus: how about a function cmdline2list() to complement subprocess.list2cmdline()? I found that I had to implement this for a test recently when I was trying to simulate the way python (gnu readline?) turns a command line into a list (the creation of sys.argv). Is there already a function for this somewhere? There are some funny rules, like --query="title='Foo'" becomes ['--query=title=\'Foo\''] and optparse is unhappy unless it gets exactly that!

Hey, thanks for working on this module. I think it has been one of the most useful additions to stdlib but of course could still use some work. I agree with the above that urllib2 and urlparse desperately need some work as well.
Posted by John Reese on 2007-04-22 at 17:38.
Titus: commands.mkarg is occasionally useful in code building up large command-lines. Do you plan to move that to subprocess as well? subprocess.list2cmdline is similar but Windows-specific.

Kumar:

> the way python (gnu readline?) turns a
> command line into a list

It's neither Python nor readline but the shell that's responsible for that. You can simulate it with shlex.split.

    >>> shlex.split('--query="title=\'Foo\'"')
    ["--query=title='Foo'"]
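To make the shlex suggestion concrete, here is a small sketch for current Python 3. The command line itself is made up for illustration. shlex.quote is used as a rough stand-in for the quoting job commands.mkarg did; it is a different function with different output, not a drop-in replacement.

```python
import shlex

# Split a command line the way a POSIX shell would build argv.
argv = shlex.split('prog --query="title=\'Foo\'" -v')
print(argv)

# Going the other way: quote one awkward argument so that a POSIX shell
# (or shlex.split) hands it back unchanged.
awkward = "title='Foo' and more"
quoted = shlex.quote(awkward)
print(quoted)
assert shlex.split(quoted) == [awkward]
```

The double quotes are consumed by the split and the inner single quotes survive as literal characters, which is the behavior optparse was seeing in the case described above.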
Posted by Richard Philips on 2007-06-06 at 07:35.
Thanks for your work on an already excellent subprocess module. One of the things I would like to see in subprocess.py is fail-proof stdin, stdout, stderr redirection.
Posted by Titus Brown on 2007-06-06 at 11:40.
Richard, those are already in there... --titus