Why Strings as Classes?

Wed Aug 27 09:25:08 PDT 2008

On 2008-08-27 16:16:49 +0200, bearophile <bearophileHUGS at lycos.com> said:

> Fawzi Mohamed:
>> it also depend on
>> the language, if in a language it is clear that using the default
>> structure you are not supposed to append often, and if you really have
>> to do it you use a special method if you do, then the programs written
>> in that language will use that.
> 
> I agree. But then the D specs have to be updated to say that the append 
> to built-in D arrays is a slow or very slow operation (not amortized 
> O(1)), so people coming from the C++ STL, Python, Ruby, TCl, Lua, Lisp, 
> Clean, Oz, ecc will not receive a bite from it.
> 
> 
>> D tries to do different things to mitigate the fact that appending is
>> difficult, maybe it could do more, but it will *never* be as efficient
>> as a structure that gives up that fact that an array has to be just a
>> chunk of contiguous memory.
> 
> I agree, a deque will probably be always faster in appending than a 
> dynamic array.
> But I think D may do more here :-)
> 
> 
>> Now I find the choice of having the basic array being just a chunk of
>> contiguous memory, and that the overhead of the structure should be
>> minimal very reasonable for a system programming language that has to
>> interact with C
> 
> Note that both vector of the C++ STL and "list" of Python are generally 
> (or always) implemented with a chunk of contiguous memory.
> 
> 
>> Clearly someone coming from lisp, any functional language or even
>> python, might disagree about what is requested from the "default" array
>> container.
>> It isn't that one is right and the other wrong, it is just a question
>> of priorities of the language, the feel of it, its style...
> 
> I agree, that's why I have suggested to collect statistics from real D 
> code, and not from Lisp programs, to see what's the a good performance 
> profile compromise for D built-in dynamic arrays :-) This means that 
> I'll stop caring for a fast array append in D if most D programmers 
> don't need fast appends much.
> 
> 
>> This does not mean that if improvements can be done without
>> compromising too much it shouldn't be done, just that failing some
>> benchmarks might be ok :)
> 
> Well, in my opinion that's okay, but there's a limit. So I think a 
> built-in data structure has to be optimized for being flexible, so 
> while not being very good in anything, it has to be not terrible in any 
> commonly done operation.
> On the other hand, I can see what you say, that in a low level language 
> a too much flexible data structure may be not fit as built-in, while a 
> simpler and less flexible one may be fitter. You may be right and I may 
> be wrong on this point :-)

well it is funny because now I am not too sure anymore that adding an 
extra field (pointing to the end of the reserved data, if the actual 
array is the "owner" of it and to before the start if it isn't and one 
should reallocate (i.e. for slices) is such a bad idea.

I started out with the idea "slightly improved C characteristics", and 
so for me it was clear that appending would be bad, and I was actually 
surprised (using it) in seeing that it was less bad that I supposed.
The way it is has the advantage of allowing bound checks, and a very 
little overhead, but it has no concept of "extra grow space".
So for me it was clear that if I wanted to insert something I had to 
make place first, and then insert it (keeping in mind the old size).

The vector approach (do know about the capacity) can be useful in high 
level code where one wants that a.length always is the length of the 
array (not maybe longer because of some reserved memory).

So maybe it is worthwhile, even if it will for sure add some bloat, I 
do not know...
For most of my code it will just add bloat, but probably a tolerable one

Fawzi