Splitter quiz / survey
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Mon Apr 27 04:53:26 PDT 2009
Brad Roberts wrote:
> Without looking at the docs, code, or compiling and running a test, what will
> this do:
>
> foreach (x; splitter(",a,b,", ","))
>     writefln("x = %s", x);
>
> I'll make it multiple choice:
>
> choice 1)
> x = a
> x = b
>
> choice 2)
> x =
> x = a
> x = b
>
> choice 3)
> x = a
> x = b
> x =
>
> choice 4)
> x =
> x = a
> x = b
> x =
>
> Later,
> Brad
Thanks for bringing this to attention, Brad. Splitter does what Perl's
split does, which is choice 2. This means the comma acts as an item
terminator rather than an item separator. Why did I think this was a
good idea? Because in most cases I was thankful that Perl's split does
exactly the right thing.
Whenever I read text from linguistic corpora, I see that words (or other
word properties) are separated by spaces. There is never a space before
the first word on a line, but there is often a trailing space at the end
of the line. Why? Because the text was processed by a program that
output "word, ' '" or "tag, ' '" for each word or tag. Then if I split
the text by whitespace, I'd be annoyed to find that trailing spaces
matter.
For the same reason, C accepts enum X { a, b, } but not enum X { , a, b }.
Mechanically generating enum values is easier if each value has a
trailing comma.
Similarly, when you split a text by '\n', a leading empty line is
important, whereas you wouldn't expect a final '\n' to introduce an
empty line.
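
To make the terminator reading concrete, here is a minimal sketch in D. It
assumes a hypothetical eager helper named splitTerminated (not a Phobos
function) built on std.array.split. It deliberately does not rely on
splitter's own behavior, which may differ between Phobos versions.

```d
import std.array : split;

// Hypothetical helper, for illustration only: split eagerly, then drop one
// empty final item so that a trailing delimiter ends the last item instead
// of starting a new one.
string[] splitTerminated(string text, string delim)
{
    auto items = text.split(delim); // keeps empty items at both ends
    if (items.length > 0 && items[$ - 1].length == 0)
        items = items[0 .. $ - 1];  // discard the empty item after the final delimiter
    return items;
}

void main()
{
    // Corpus-style line: no leading space, one trailing space.
    assert(splitTerminated("the cat sat ", " ") == ["the", "cat", "sat"]);

    // A leading empty line is kept; the final '\n' adds no empty line.
    assert(splitTerminated("\nfirst\nsecond\n", "\n") == ["", "first", "second"]);
}
```

Only a single trailing empty item is dropped here. Whether several
consecutive trailing delimiters should collapse as well is a separate design
question that this sketch leaves open.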
Now clearly there are cases in which leading and trailing empty items
both matter. I'm just saying they are rarer. We could add an enumerated
parameter to Splitter:

    enum PleaseFindAGoodName { terminator, separator }

    foreach (line; splitter(",a,b,", ","))
        ... terminator is implicit ...
    foreach (line; splitter(",a,b,", ",", PleaseFindAGoodName.separator))
        ... separator ...

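
One way the enum parameter could behave is sketched below. The names
SplitMode and splitItems are placeholders invented for this illustration
(the enum name is still open), and the helper is an eager stand-in, not the
lazy Splitter itself.

```d
import std.array : split;
import std.stdio : writefln;

// Placeholder name; the proposal above has not settled on one.
enum SplitMode { terminator, separator }

// Hypothetical eager helper, not Phobos's splitter.
string[] splitItems(string text, string delim,
                    SplitMode mode = SplitMode.terminator)
{
    auto items = text.split(delim); // separator semantics: empty items kept at both ends
    if (mode == SplitMode.terminator
            && items.length > 0 && items[$ - 1].length == 0)
        items = items[0 .. $ - 1];  // a trailing delimiter closes the last item
    return items;
}

void main()
{
    // Terminator is implicit: corresponds to choice 2 in the quiz.
    assert(splitItems(",a,b,", ",") == ["", "a", "b"]);

    // Separator requested explicitly: corresponds to choice 4.
    assert(splitItems(",a,b,", ",", SplitMode.separator) == ["", "a", "b", ""]);

    foreach (x; splitItems(",a,b,", ","))
        writefln("x = %s", x);
}
```

Making terminator the default keeps the common case short, while callers
who care about a trailing empty item have to ask for it explicitly.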
We might just go with the terminator semantics and ask people who need
separator semantics to use a stripl() or a munch() prior to splitting.
I'd personally prefer having an enum there.
Andrei