The new std.process is ready for review

Sat Feb 23 14:46:04 PST 2013

On Sat, Feb 23, 2013 at 03:15:26PM -0500, Steven Schveighoffer wrote:
> On Sat, 23 Feb 2013 11:42:26 -0500, H. S. Teoh
> <hsteoh at quickfur.ath.cx> wrote:
> 
> >- wait():
> >   - Some code examples would be nice.
> >
> >   - For the POSIX-specific version, I thought the Posix standard
> >     specifies that the actual return code / signal number should be
> >     extracted by means of system-specific macros (in C anyway)?
> >     Wouldn't it be better to encapsulate this in a POD struct or
> >     something instead of exposing the implementation-specific values
> >     to the user?
> 
> We handle the extraction as an implementation detail, the result
> should be cross-platform (at least on signal-using platforms).  I
> don't know what a POD struct would get you, maybe you could elaborate
> what you mean?

Oh, I thought the return value was just straight from the syscall, which
requires WIFEXITED, WEXITSTATUS, WCOREDUMP, etc., to interpret. If it
has already been suitably interpreted in std.process, then I guess it's
OK.

Otherwise, I was thinking of encapsulating these macros in some kind of
POD struct that provides methods like .ifExited, .exitStatus,
.coreDump, etc., so that user code doesn't have to deal directly with
the exact values returned by the specific OS.
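
To illustrate the POD-struct idea, here is a minimal C sketch. The names
exit_info, decode_status and run_child_exiting_with are hypothetical and
are not part of std.process. It decodes a raw waitpid() status with the
POSIX macros so that callers never touch the raw bits. WCOREDUMP is left
out because it is not available on every UNIX implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical POD wrapper; the field names are illustrative only. */
struct exit_info {
    bool exited;       /* child called exit()/_exit() or returned from main */
    int  exit_status;  /* meaningful only when exited is true */
    bool signaled;     /* child was terminated by a signal */
    int  term_signal;  /* meaningful only when signaled is true */
};

/* Interpret a raw status from wait()/waitpid() using the POSIX macros. */
static struct exit_info decode_status(int status)
{
    struct exit_info info = { false, 0, false, 0 };

    if (WIFEXITED(status)) {
        info.exited = true;
        info.exit_status = WEXITSTATUS(status);
    } else if (WIFSIGNALED(status)) {
        info.signaled = true;
        info.term_signal = WTERMSIG(status);
    }
    return info;
}

/* Fork a child that exits immediately with `code`, reap it, and return
   the decoded status. On fork/waitpid failure all fields stay zeroed. */
static struct exit_info run_child_exiting_with(int code)
{
    struct exit_info failed = { false, 0, false, 0 };
    int status = 0;
    pid_t pid;

    assert(code >= 0 && code <= 255);  /* only the low 8 bits survive */

    pid = fork();
    if (pid == 0)
        _exit(code);                   /* child: terminate right away */
    if (pid < 0 || waitpid(pid, &status, 0) != pid)
        return failed;
    return decode_status(status);
}
```

A caller would then test info.exited or info.signaled instead of
applying the macros itself.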

> >   - How do I wait for *any* child process to terminate, not just a
> >     specific Pid?
> 
> I don't think we have a method to do that.  It would be complex,
> especially if posix wait() returned a pid that we are not handling!
> 
> I suppose what you could do is call posix wait (I have a feeling we
> may need to change our global wait function, or eliminate it), and
> then map the result back to a Pid you are tracking.
> 
> You have any ideas how this could be implemented?  I'd prefer not to
> keep a global cache of child process objects...

Why not?  On Posix at least, you get SIGCHLD, etc., for all child
processes anyway, so a global cache doesn't seem out of place.

But you do have a point about pids that we aren't managing, e.g. if the
user code is doing some fork()s on its own. But the way I see it,
std.process is supposed to alleviate the need to do such things
directly, so in my mind, if everything is going through std.process
anyway, might as well just manage all child processes there. OTOH, this
may cause problems if the D program links in C/C++ libraries that manage
their own child processes.

Still, it would be nice to have some way of waiting for a set of child
Pids, not just a single one. It would be a pain if user code had to
manually manage child processes all the time when there's more than one
of them running at a time.

Hmm. The more I think about it, the more it makes sense to just have
std.process manage all child process related stuff. It's too painful to
deal with multiple child processes otherwise. Maybe provide an opt-out
in case you need to link in some C/C++ libraries that need their own
child process handling, but the default, IMO, should be to manage
everything through std.process.
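
The "call posix wait and map the result back" approach can be sketched
in C as follows. This is only an assumption about how such a helper
might look: wait_any and spawn_exiting_with are hypothetical names, and
nothing here reflects the actual std.process implementation. Note the
drawback raised above: waitpid(-1, ...) also reaps children that are not
in the tracked set, so their status is lost to whoever created them.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: block until one of the n tracked children
   terminates. Returns its index in `tracked` and stores the raw status
   in *status. Returns -1 if waitpid() fails, e.g. with ECHILD when no
   children remain. */
static int wait_any(const pid_t *tracked, size_t n, int *status)
{
    assert(tracked != NULL && status != NULL);

    for (;;) {
        pid_t done = waitpid(-1, status, 0);  /* any child of ours */
        size_t i;

        if (done < 0) {
            if (errno == EINTR)
                continue;                     /* interrupted: retry */
            return -1;
        }
        for (i = 0; i < n; i++)
            if (tracked[i] == done)
                return (int)i;
        /* An untracked child was reaped; its status is discarded here,
           which is exactly the problem with unmanaged pids. */
    }
}

/* Fork a child that exits immediately with `code`; returns its pid,
   or -1 if fork() failed. */
static pid_t spawn_exiting_with(int code)
{
    pid_t pid = fork();
    if (pid == 0)
        _exit(code);
    return pid;
}
```

Calling wait_any in a loop, once per tracked child, collects the
children in whatever order they terminate.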

> >- execute() and shell(): I'm a bit concerned about returning the
> >  *entire* output of a process as a string. What if the process
> >  generates too much output to store in a string? Would it be better
> >  to return a range instead (either a range of chars or range of
> >  lines maybe)? Or is this what pipeProcess was intended for? In any
> >  case, would it make sense to specify some kind of upper limit to
> >  the size of the output so that the program won't be vulnerable to
> >  bad subprocess behaviour (generate infinite output, etc.)?
> 
> Yes, pipeProcess gives you File objects for each of the streams for
> those cases where you expect lots of data to be returned, or want to
> process it as it comes.  This is the use case I expect most people
> will use.
> 
> There are no doubt good use cases for execute/shell; we have a lot of
> non-generic string processing functions in phobos, and a lot of
> command line tools on an OS produce a concise output that can be used.

True.

> In general, for input streams, ranges are not a good interface.
> Output ranges are good for output though, and I think File is a valid
> output range.

True.

> >- ProcessException: are there any specific methods to help user code
> >  extract information about the error? Or is the user expected to
> >  check errno himself (on Posix; or whatever it is on Windows)?
> 
> This is a good idea.  Right now, ProcessException converts the errno
> to a string message, but we could easily store the errno.
> 
> I say we, but I really mean Lars, he has done almost all the work :)
[...]

I never liked the design of errno in C... its being a global makes
keeping track of errors a pain. It would be nice if the value of errno
were saved in the Exception object at the moment the error is
encountered, rather than read an arbitrary amount of code later, by
which time its value may have changed.
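
A minimal C sketch of that pattern, with a plain struct standing in for
the exception object. The names process_error and open_checked are
hypothetical, and open() is used only as a convenient call that sets
errno. The point is that errno is copied immediately after the failing
call, before anything else can overwrite it.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for an exception object that carries errno. */
struct process_error {
    int  code;          /* errno as it was right after the failing call */
    char message[160];  /* human-readable description */
};

/* Open `path` read-only. On failure, return -1 and fill in *err. */
static int open_checked(const char *path, struct process_error *err)
{
    int fd;

    assert(path != NULL && err != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0) {
        /* Save errno first: snprintf or any later call may change it. */
        err->code = errno;
        snprintf(err->message, sizeof err->message, "%s: %s",
                 path, strerror(err->code));
    }
    return fd;
}
```

The caller can inspect err.code at any later point, regardless of what
has happened to the global errno in between.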

T

-- 
GEEK = Gatherer of Extremely Enlightening Knowledge