V2 string (general issues)

Kristian Kilpi kjkilpi at gmail.com
Thu Jul 5 08:26:27 PDT 2007


On Thu, 05 Jul 2007 01:18:28 +0300, Derek Parnell <derek at psych.ward> wrote:
> I'm converting Bud to compile using V2 and so far its been a very hard
> thing to do. I'm finding that I'm now having to use '.dup' and '.idup' all
> over the place, which is exactly what I thought would happen. Bud does a
> lot of text manipulation so having 'string' as invariant means that calls
> to functions that return string need to often be .dup'ed because I need to
> assign the result to a malleable variable.
>
> I might have to rethink of the design of the application to avoid the
> performance hit of all these dups.
>

That got me thinking about string functions in general.

First, I am wondering why some functions are declared as follows
(I'm sure someone will (hopefully) enlighten me about that ;) ):

   string foo(string bar);

That is, if they return something other than 'bar' itself (because they do
some string manipulation), shouldn't they return char[] instead? For example:

   char[] foo(string bar) {
     return bar ~ "blah";
   }


And this brings us to the 'tolower()' function (for instance).

Sometimes it .dups and sometimes it doesn't. So, if I don't know whether the
input string contains uppercase chars, I have to .dup the return value, even
though it may already have been .dupped by 'tolower()'...

   char[] a = "abc".dup;
   char[] b = tolower(a).dup;  //.dupped once ('tolower()' returns plain 'a')

   char[] a = "ABC".dup;
   char[] b = tolower(a).dup;  //.dupped twice!

So 'tolower()' is a hybrid of two function groups:
(1) functions that modify the input string,
(2) functions that return a (modified) copy of the input string.

(If the input string doesn't contain uppercase chars, it behaves like (1),
even though it doesn't actually modify the input string; otherwise it
behaves like (2).)

I don't think this is a good thing.
There should be two different functions, one for each group:

   char[] tolower(char[] str);  //modifies and returns 'str'

   char[] getlower(string str);  //returns a copy
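
A rough sketch of that split, assuming ASCII-only input. The names
'lowerInPlace' and 'lowerCopy' are made up here (they stand in for the
'tolower'/'getlower' pair above and are not Phobos functions), and the
copying version takes const(char)[] so that it accepts both string and
char[]:

```d
// Hypothetical stand-in for the in-place variant:
// lowers ASCII letters in 'str' itself and returns the same slice.
char[] lowerInPlace(char[] str)
{
    foreach (ref char c; str)
    {
        if (c >= 'A' && c <= 'Z')
            c = cast(char)(c + ('a' - 'A'));
    }
    return str;
}

// Hypothetical stand-in for the copying variant:
// always allocates, so the caller gets a mutable result and never
// needs a second .dup.
char[] lowerCopy(const(char)[] str)
{
    return lowerInPlace(str.dup);
}
```

With this split the caller always knows whether an allocation happened:
'lowerInPlace' never allocates, 'lowerCopy' always allocates exactly once.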


If one likes the copy-on-write behaviour of 'tolower()', I think it would
work only by using reference counting.

For example (the 'String' class uses reference counting):

   String a, b;

   a = "abc";
   b = tolower(a);  //'b' points to 'a' ('tolower()' simply returns 'a')

   b[0] = 'x';  //'b' .dups its contents before modification, so 'a' is not changed
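
To make the idea a bit more concrete, here is a minimal sketch of such a
type. Everything in it is an assumption for illustration: 'CowString',
'lowered' and 'toSlice' are invented names, it handles ASCII only, it is a
struct rather than a class, and it ignores threading. Copies share one
buffer plus a counter, and a write detaches first if the buffer is shared:

```d
struct CowString
{
    private char[] data;   // buffer, possibly shared with other copies
    private size_t* refs;  // number of CowStrings sharing 'data'

    this(const(char)[] s)
    {
        data = s.dup;
        refs = new size_t;
        *refs = 1;
    }

    // Postblit: a copy shares the buffer and bumps the count.
    this(this)
    {
        if (refs !is null) ++(*refs);
    }

    ~this()
    {
        if (refs !is null) --(*refs);
    }

    char opIndex(size_t i) const { return data[i]; }

    // Writing detaches from the shared buffer first (copy-on-write).
    void opIndexAssign(char c, size_t i)
    {
        if (refs !is null && *refs > 1)
        {
            --(*refs);
            data = data.dup;
            refs = new size_t;
            *refs = 1;
        }
        data[i] = c;
    }

    const(char)[] toSlice() const { return data; }
}

// Returns 's' itself (sharing the buffer) when there is nothing to lower,
// otherwise a freshly allocated lowered string.
CowString lowered(CowString s)
{
    bool hasUpper = false;
    foreach (char c; s.data)
    {
        if (c >= 'A' && c <= 'Z') { hasUpper = true; break; }
    }
    if (!hasUpper)
        return s;

    auto r = CowString(s.data);
    foreach (ref char c; r.data)
    {
        if (c >= 'A' && c <= 'Z')
            c = cast(char)(c + ('a' - 'A'));
    }
    return r;
}
```

Here the caller never writes .dup at all: the single copy happens inside
the type, and only when a shared buffer is about to be modified.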


