COW vs. in-place.
Dave
Dave_member at pathlink.com
Mon Jul 31 16:01:14 PDT 2006
Kirk McDonald wrote:
> Derek wrote:
>> On Mon, 31 Jul 2006 16:40:54 -0500, Dave wrote:
>>
>>
>>> Not a bad idea... The main prob. would be that there would be a lot
>>> of duplication of code.
>>
>>
>> void toUpper_inplace(char[] x)
>> {
>>     . . .
>> }
>>
>> char[] toUpper(char[] x)
>> {
>>     char[] y = x.dup;
>>     toUpper_inplace(y);
>>     return y;
>> }
>>
With this one you always .dup, instead of .dup'ing only when a change is
actually needed (the current implementation is more efficient in that
respect).
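
To make "only when needed" concrete, here is a rough sketch of the idea. It is
not the Phobos source: toUpperCow is a made-up name, it only handles ASCII,
and it is written against a current D compiler.

```d
// Sketch only: hypothetical name, ASCII-only, not the Phobos implementation.
char[] toUpperCow(char[] x)
{
    char[] y = x;          // start out aliasing the caller's data
    bool copied = false;

    foreach (i, c; x)
    {
        if (c >= 'a' && c <= 'z')
        {
            if (!copied)
            {
                y = x.dup; // first change found: copy now, and only now
                copied = true;
            }
            y[i] = cast(char)(c - ('a' - 'A'));
        }
    }
    return y;              // the very same slice as x if nothing changed
}
```

If the input is already upper case, nothing is allocated and the caller gets
its own slice back; the always-dup wrapper pays for a copy either way.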
>
> I've got one better. Say we have a whole bunch of inplace string
> functions, like the one above and this one:
>
> void toLower_inplace(char[] x) {
>     // ...
> }
>
> and others. Then we can:
>
> char[] cow_func(alias fn)(char[] x) {
>     char[] y = x.dup;
>     fn(y);
>     return y;
> }
>
> alias cow_func!(toUpper_inplace) toUpper;
> alias cow_func!(toLower_inplace) toLower;
>
> Etc. Obviously, you'd have to provide a different template for each
> function footprint, but the string library has a lot of repeated
> footprints.
>
I think to maximize code re-use you'd have to build the "COW or not to
COW" logic into the "base" function. And if you did that you'd have to
live with a little more function call overhead (passing a bool or small
enum around) in order to avoid the defensive copying like in cow_func above.
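
Roughly what I have in mind, as a sketch only. The Cow enum and the name
toUpperEx are invented here for illustration (nothing like them is in
Phobos), it is ASCII-only, and it assumes a current D compiler:

```d
// Hypothetical flag: lets the caller choose between COW and in-place.
enum Cow { copyOnWrite, inPlace }

// Sketch of a "base" function with the COW decision built in.
char[] toUpperEx(char[] x, Cow mode = Cow.copyOnWrite)
{
    char[] y = x;
    // In in-place mode the caller's buffer may be written to directly.
    bool writable = (mode == Cow.inPlace);

    foreach (i, c; x)
    {
        if (c >= 'a' && c <= 'z')
        {
            if (!writable)
            {
                y = x.dup;   // COW mode: copy on the first change only
                writable = true;
            }
            y[i] = cast(char)(c - ('a' - 'A'));
        }
    }
    return y;
}
```

The default keeps today's COW behaviour, so existing callers would not
change; code that owns its buffer can pass Cow.inPlace and skip the copy,
at the price of one extra argument per call.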

I'm wondering: if Phobos had been built that way (making it the 'D way'
of doing things), would the concerns about GC performance and "const"
have been so acute over the last year or so? (Hindsight is always closer
to 20/20, of course.)

The problem with all the dup'ing is that when you put something like this
in a tight loop, you get sloooowwwww code:

import std.file, std.string, std.stdio;

void main()
{
    char[][] formatted;
    char[][] text = split(cast(char[])read("largefile.txt"), ".");
    foreach(char[] sentence; text)
    {
        formatted ~= capitalize(tolower(strip(sentence))) ~ ".\r\n";
    }
    //...
    foreach(char[] sentence; formatted)
    {
        writefln(sentence);
    }
}

None of those functions (except for read()) would really have to do much
allocating because the input file for all intents and purposes is
read-only here (it won't get implicitly modified even if COW isn't used).
- Dave