phobos and splitting things... but not with whitespace.

Chad J chadjoan at __spam.is.bad__gmail.com
Sat Jun 23 11:41:29 PDT 2012


On 06/23/2012 02:17 PM, simendsjo wrote:
> On Sat, 23 Jun 2012 19:52:32 +0200, Chad J
> <chadjoan at __spam.is.bad__gmail.com> wrote:
>
>>
>> As an additional note: I could probably do this easily if I had a
>> function like findSplit where the predicate is used /instead/ of a
>> delimiter. So like this:
>> auto findSplit(alias pred = "a", R)(R haystack);
>> ...
>> auto tuple = findSplit!(`a == "\n" || a == "\r\n" || a == "\r"`)(text);
>> return tuple[2];
>
> I don't think it can match on ranges, but it's pretty trivial to
> implement something that would work for your case
>
> import std.array, std.algorithm, std.typecons;
>
> auto newlineSplit(string data) {
>     auto rest = data.findAmong("\r\n");
>     if(!rest.empty) { // found
>         auto pre = data[0..data.length-rest.length];
>         string match;
>         if(rest.front == '\r' && (rest.length > 1 && rest[1] == '\n')) { // \r\n
>             match = rest[0..2];
>             rest = rest[2..$];
>         } else { // \r or \n
>             match = rest[0..1];
>             rest = rest[1..$];
>         }
>         return tuple(pre, match, rest);
>     } else {
>         return tuple(data, "", "");
>     }
> }
> unittest {
>     auto text = "1\n2\r\n3\r4";
>     auto res = text.newlineSplit();
>     assert(res[0] == "1");
>     assert(res[1] == "\n");
>     assert(res[2] == "2\r\n3\r4");
>
>     res = res[2].newlineSplit();
>     assert(res[0] == "2");
>     assert(res[1] == "\r\n");
>     assert(res[2] == "3\r4");
>
>     res = res[2].newlineSplit();
>     assert(res[0] == "3");
>     assert(res[1] == "\r");
>     assert(res[2] == "4");
>
>     res = res[2].newlineSplit();
>     assert(res[0] == "4");
>     assert(res[1] == "");
>     assert(res[2] == "");
> }

Hey, thanks for doing all of that.  I didn't expect you to write the
whole thing out.

Once I've established that the issue isn't just a lack of learning on my
part, my next objective is filling in any missing functionality in
Phobos.  IMO the "take away a single line" operation should be
possible with a single concise expression.  Then there should be a
function in std.string that contains that single expression and wraps it
in easy-to-find documentation.  This kind of thing is a fairly common
operation.  As it stands, I find it odd that there is a function to split
up an arbitrary number of lines but no function to split off only one!
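
To make that concrete, here is a rough sketch of what such a helper
could look like.  splitFirstLine is a hypothetical name, not an existing
std.string function, and the (line, terminator, rest) tuple layout is an
assumption borrowed from how findSplit returns its pieces:

```d
import std.typecons : tuple;

// Hypothetical helper, NOT part of Phobos: splits off the first line.
// Returns tuple(line, terminator, rest); terminator and rest are empty
// when the text contains no line break.
auto splitFirstLine(string text)
{
    foreach (i, char c; text)
    {
        if (c == '\n' || c == '\r')
        {
            // Treat "\r\n" as one two-character terminator.
            size_t len = (c == '\r' && i + 1 < text.length
                          && text[i + 1] == '\n') ? 2 : 1;
            return tuple(text[0 .. i], text[i .. i + len],
                         text[i + len .. $]);
        }
    }
    return tuple(text, "", "");
}

unittest
{
    auto res = splitFirstLine("1\r\n2\n3");
    assert(res[0] == "1");
    assert(res[1] == "\r\n");
    assert(res[2] == "2\n3");
}
```

With something like this, "take away a single line" becomes
splitFirstLine(text)[2].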

Also, any function that works with whitespace should have 
versions/variants that work with arbitrary delimiters, unless it is 
impossible to generalize it that way for some reason.  If the variants 
are found in a separate module, then the documentation should reference 
them.
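
As an illustration of the arbitrary-delimiter idea, a variant could
take the delimiters as arguments instead of hard-coding whitespace or
newlines.  This is only a sketch: splitFirstOf is a made-up name, not a
Phobos function, and the rule that earlier-listed delimiters win when
several match at the same position is an assumption of this example:

```d
import std.typecons : tuple;

// Hypothetical helper, NOT part of Phobos: splits `text` at the first
// position where any of `delims` occurs.  When several delimiters match
// at the same position, the one listed first wins.
// Returns tuple(before, delimiter, after).
auto splitFirstOf(string text, string[] delims...)
{
    foreach (i; 0 .. text.length)
    {
        foreach (d; delims)
        {
            // Skip empty delimiters; compare code units directly.
            if (d.length > 0 && i + d.length <= text.length
                && text[i .. i + d.length] == d)
            {
                return tuple(text[0 .. i], text[i .. i + d.length],
                             text[i + d.length .. $]);
            }
        }
    }
    return tuple(text, "", "");
}

unittest
{
    // Newlines are just one possible set of delimiters.
    auto line = splitFirstOf("1\r\n2\r3", "\r\n", "\n", "\r");
    assert(line[0] == "1" && line[1] == "\r\n" && line[2] == "2\r3");

    auto field = splitFirstOf("a, b; c", "; ", ", ");
    assert(field[0] == "a" && field[1] == ", " && field[2] == "b; c");
}
```

Listing "\r\n" before "\r" is what keeps a Windows line ending from
being split in two.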

