A better way to write this function? (style question)
John Colvin
john.loughran.colvin at gmail.com
Mon Dec 30 14:30:01 PST 2013
On Monday, 30 December 2013 at 22:17:21 UTC, John Colvin wrote:
> On Monday, 30 December 2013 at 21:40:58 UTC, Thomas Gann wrote:
>> I've written a Markov bot in D, and I have function whose job
>> it is to take an input string, convert all newline characters
>> to spaces and all uppercase letters to lowercase, and then
>> return an array of words that are generated by splitting the
>> string up by whitespace. Here is the function is question:
>>
>> string[] split_sentence(string input)
>> {
>> string line;
>>
>> foreach(c; input)
>> {
>> if(c == '\n' || c == '\r')
>> line ~= ' ';
>>
>> else
>> line ~= c.toLower();
>> }
>>
>> return line.splitter(' ').filter!(a => a.length).array;
>> }
>>
>> Obviously, one issue is that because the string is immutable,
>> I can't modify it directly, and so I actually build an
>> entirely new string in place. I would have just made a mutable
>> duplicate of the input and modify that, but then I would get
>> errors returning, because it expects string[] and not
>> char[][]. Is there a more elegant way to do what I'm doing?
>
>
> A few points:
>
> by declaring a new string and appending to it you are risking a
> lot of allocations. Either use std.array.appender or allocate
> the array with the correct size at the beginning.
>
> using .array on the end of the ufcs chain is yet another
> allocation. It can be avoided using std.algorithm.copy to copy
> the result back in to 'line'
>
> In my opinion the whole API would be better as range-based:
>
> auto splitSentence(R)(R input)
> if(isInputRange!R)
> {
> return input
> .map!(c => (c == "\n"[0] || c == "\r"[0]) ? ' ' :
> c.toLower)
> .splitter!(' ')
> .filter!(a => !(a.empty));
> }
sorry, ignore that attempt, it's woefully broken...
More information about the Digitalmars-d-learn
mailing list