improving the join function
Steven Schveighoffer
schveiguy at yahoo.com
Wed Oct 13 12:03:37 PDT 2010
On Mon, 11 Oct 2010 20:33:27 -0400, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org> wrote:
> I'm looking at http://d.puremagic.com/issues/show_bug.cgi?id=3313 and
> that got me looking at std.string.join, which currently has the sig:
>
> string join(in string[] words, string sep);
>
> A narrow fix:
>
> Char[] join(Char)(in Char[][] words, in Char[] sep)
> if (isSomeChar!Char);
>
> I think it's reasonable to assume that people would want to join things
> that aren't necessarily arrays of characters, so T could be pretty much
> any type. An obvious step towards generalization is:
>
> T[] join(T)(in T[][] items, T[] sep);
This doesn't quite work if T is not a value type (actually, I think it
does, but only because there are bugs in the compiler).
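To make the value-type point concrete, here is a minimal sketch (not taken from Phobos). It assumes the concern is that `in T[][]` makes the elements const, so building a mutable `T[]` result is only safe when `const(T)` implicitly converts to `T`. The names `canDropConst` and `Node` are made up for illustration.

```d
// Hypothetical helper: true when a const T can be copied into a mutable T.
enum canDropConst(T) = is(const(T) : T);

// Hypothetical reference type, used only for this illustration.
class Node {}

// Value types without indirections: copying a const element yields an
// independent mutable value, so a T[] result is harmless.
static assert( canDropConst!char);
static assert( canDropConst!int);

// Types with indirections: dropping const would expose const data as
// mutable, so the conversion is rejected.
static assert(!canDropConst!(int*));
static assert(!canDropConst!(int[]));
static assert(!canDropConst!Node);
```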
>
> But join doesn't really need random access for words - really, an input
> range should suffice. So a generally useful join, almost worth putting
> in std.algorithm, would be:
>
> ElementType!R1[] join(R1, R2)(R1 items, R2 sep)
> if (isInputRange!R1 && isForwardRange!R2
> && is(ElementType!R2 : ElementType!R1));
>
> Notice how the separator must be a forward range because it gets spanned
> multiple times, whereas the items need only be an input range as they
> are spanned once. This is at the same time a very general and very
> precise interface.
I think this is fine. Note that this does not take into account the
constancy of items, meaning it is legal for this function to mess with the
original data in items.
Not that I think it's a bad thing, but it does lose some guarantees as
compared to the original join. inout can't be used here because it
doesn't work as a template parameter.
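For reference, a rough sketch of what such a range-based join could look like, written against current Phobos (`std.range.primitives`, `std.array.appender`). The name `joinSketch` is hypothetical, chosen to avoid clashing with the real join, and the constraint is my own reading of the proposed signature, treating the items as a range of ranges. Narrow strings are left out on purpose, since their range element type is `dchar`.

```d
import std.array : appender;
import std.range.primitives;

// Hypothetical name; a sketch of the generalized join, not the Phobos one.
auto joinSketch(R1, R2)(R1 items, R2 sep)
if (isInputRange!R1 && isInputRange!(ElementType!R1) && isForwardRange!R2
    && is(ElementType!R2 : ElementType!(ElementType!R1)))
{
    alias E = ElementType!(ElementType!R1);
    auto result = appender!(E[])();
    bool first = true;
    foreach (item; items)          // items are walked exactly once
    {
        if (!first)
            foreach (e; sep.save)  // the separator is re-walked each time
                result.put(e);
        first = false;
        foreach (e; item)
            result.put(e);
    }
    return result.data;
}

unittest
{
    assert(joinSketch([[1, 2], [3], [4, 5]], [0]) == [1, 2, 0, 3, 0, 4, 5]);
}
```

Note that nothing in this signature stops the body from writing through `items` when a mutable `int[][]` is passed in, which is the lost guarantee mentioned above.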
> One thing is still bothering me: the array output type. Why would the
> "default" output range be an array? What can be done to make join() at
> the same time a general function and also one that works for strings the
> way the old join did? For example, if I want to join things into an
> already-existing buffer, or if I want to write them straight to a file,
> there's no way to do so without having an array allocation in the loop.
> I have a couple of ideas but I wouldn't want to bias yours.
Well, one could have a version of join that takes an output range. It
would have to return the output range instead of the *result* of the
output range. And in that case, the standard join which returns an array
can be implemented:
ElementType!R1[] join(R1 items, R2 sep) ...
{
    // forward the arguments plus a fresh Appender, then take its array
    return join(items, sep, Appender!(ElementType!R1[])()).data;
}
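Spelled out a bit more, the idea could look like the sketch below. It is only an illustration under some assumptions: `joinInto` and `joinToArray` are hypothetical names, the constraints are my guess at what would be reasonable, and it uses `put` and `isOutputRange` from `std.range.primitives` as they exist in current Phobos.

```d
import std.array : appender;
import std.range.primitives;

// Hypothetical output-range flavour: writes into sink and returns the sink
// itself, so the caller decides how to extract the result.
Sink joinInto(R1, R2, Sink)(R1 items, R2 sep, Sink sink)
if (isInputRange!R1 && isForwardRange!R2
    && isOutputRange!(Sink, ElementType!R2))
{
    bool first = true;
    foreach (item; items)
    {
        if (!first)
            foreach (e; sep.save)
                put(sink, e);
        first = false;
        foreach (e; item)
            put(sink, e);
    }
    return sink;
}

// The array-returning version is then a thin wrapper around joinInto.
auto joinToArray(R1, R2)(R1 items, R2 sep)
{
    alias E = ElementType!(ElementType!R1);
    return joinInto(items, sep, appender!(E[])()).data;
}

unittest
{
    assert(joinToArray([[1, 2], [3]], [9, 9]) == [1, 2, 9, 9, 3]);

    // Joining into an existing buffer appends after what is already there.
    auto app = appender!(int[])();
    app.put(7);
    assert(joinInto([[1], [2]], [0], app).data == [7, 1, 0, 2]);
}
```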
> I also have a question from people who dislike Phobos. Was there a point
> in the changes of signature above where you threw your hands thinking,
> "do the darn string version already and cut all that crap!"?
It's not a problem with Phobos, it's a problem with documentation. There
is a fundamental issue with documenting complex templates: their function
signatures are very difficult to understand. The doc generator can
and should simplify things, and I think at some point we should address
this. In other words, the signature should be shown in a form that makes
it easy to see that it's the same as string[] join(string[][], string[]).
-Steve