V2 string
Bruno Medeiros
brunodomedeiros+spam at com.gmail
Thu Jul 5 07:27:13 PDT 2007
Frits van Bommel wrote:
> Bruno Medeiros wrote:
>> Regan Heath wrote:
>>> tolower is an interesting case. As a caller I expect it to modify
>>> the string, or perhaps give a modified copy back (both options are
>>> valid and should perhaps be supported?).
>>>
>>> So, the 'string tolower(string)' version has 2 cases, the first case
>>> where it doesn't need to modify the input and can simply return it,
>>> no problem. But case 2, where it does modify it should dup and return
>>> char[]. My reasoning being that after it has completed and returned
>>> the copy, the caller now 'owns' the string (as it's the only copy in
>>> existence and no-one else has a reference to it).
>>>
>>
>> Indeed, I think this illustrates that some standard library functions
>> may not have the correct signature, and tolower is likely one of them.
>> The most general case for tolower is:
>> char[] tolower(const(char)[] s);
>> Since tolower creates a new array, but does not keep it, it can give
>> away its ownership of the array (i.e., return a mutable array).
>
> Sorry, but you seem to have missed a bit above: if the string doesn't
> contain any uppercase characters tolower returns the input without
> .dup-ing it (aka copy-on-write).
>
Oops, sorry, that's right, I missed that part about tolower not
modifying the string if it wasn't necessary. :(
>> The second case, more specific, is simply syntactic sugar for making
>> that array invariant:
>>
>> invariant(char)[] tolowerinv(const(char)[] str) {
>> return cast(invariant) tolower(str);
>> }
>
> Yes, but only if it actually needs to modify the string.
>
> You seem to have missed that the two cases can't (in general) be
> distinguished at compile time; it's only at run time when a choice is
> made between a copy and no copy.
>
>> The current signature:
>> const(char)[] tolower(const(char)[] str)
>> is kinda incorrect, because it returns a const reference for an array
>> that has no mutable references, and that is the same as an invariant
>> reference, so tolower might as well return invariant(char)[].
>
> Again, that only holds if a copy was actually made at run time. If no
> copy was made the original input is returned, to which there may be
> mutable references.
You're right, if a copy is not made *every* time (which is the case
after all), then the above doesn't hold.
But then, I think Phobos' current tolower is suboptimal in terms of
usefulness, because the caller doesn't know whether a new copy was made
or not. I'm wondering now what would be the more useful form, or forms,
of tolower (and similar functions) to have.
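
To make the problem concrete, here is a minimal sketch of a copy-on-write
tolower, assuming ASCII-only input. The name tolowerCow is hypothetical and
this is not Phobos' actual implementation. The static return type is the
same in both cases, but only sometimes is the result a fresh array:

```d
// Hypothetical copy-on-write lowercase (ASCII only); a sketch, not Phobos' code.
const(char)[] tolowerCow(const(char)[] s)
{
    foreach (i, c; s)
    {
        if (c >= 'A' && c <= 'Z')
        {
            // First uppercase character found: copy once, then fix the rest.
            char[] r = s.dup;
            foreach (ref d; r[i .. $])
                if (d >= 'A' && d <= 'Z')
                    d = cast(char)(d + ('a' - 'A'));
            return r;
        }
    }
    // Nothing to change: the input itself is returned, no copy made.
    return s;
}

unittest
{
    char[] lower = "foo".dup;
    char[] mixed = "Foo".dup;

    // No uppercase: the result aliases the caller's (mutable) array.
    assert(tolowerCow(lower).ptr is lower.ptr);

    // Uppercase present: the result is a new array, the input is untouched.
    auto r = tolowerCow(mixed);
    assert(r == "foo");
    assert(r.ptr !is mixed.ptr);
    assert(mixed == "Foo");
}
```

In the first case the caller may still hold a mutable reference to the
result, which is why the return value cannot safely be treated as invariant.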
Now that I think of it again (admittedly I haven't got much experience
with string manipulation in C++ or D), perhaps the best form is an
in-place mutable version:
char[] tolower(char[] str);
And it's this one after all that is the most general form. If you want
to call tolower on a const or invariant array you dup it yourself on the
call:
char[] str = tolower("FOO".dup);
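
A minimal sketch of what that in-place form could look like, assuming
ASCII-only input (the name tolowerInPlace is hypothetical, chosen here to
avoid clashing with Phobos' own tolower):

```d
// Hypothetical in-place lowercase (ASCII only); a sketch, not Phobos' code.
char[] tolowerInPlace(char[] str)
{
    foreach (ref c; str)
    {
        if (c >= 'A' && c <= 'Z')
            c = cast(char)(c + ('a' - 'A'));
    }
    // Returns the same array that was passed in, now modified.
    return str;
}

unittest
{
    // The caller decides whether to copy: dup a literal before the call.
    char[] s = tolowerInPlace("FOO".dup);
    assert(s == "foo");

    // An existing mutable array is modified in place, with no allocation.
    char[] buf = "Hello".dup;
    char[] res = tolowerInPlace(buf);
    assert(res.ptr is buf.ptr);
    assert(buf == "hello");
}
```

With this form the copy is always explicit at the call site, so the
caller knows exactly who owns the result.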
--
Bruno Medeiros - MSc in CS/E student
http://www.prowiki.org/wiki4d/wiki.cgi?BrunoMedeiros#D