Relaxing the definition of isSomeString and isNarrowString

Dmitry Olshansky via Digitalmars-d digitalmars-d at puremagic.com
Sun Aug 24 11:40:01 PDT 2014


24-Aug-2014 21:59, Andrei Alexandrescu wrote:
> On 8/24/14, 6:16 AM, Dmitry Olshansky wrote:
>> 24-Aug-2014 16:24, Andrei Alexandrescu wrote:
>>>
>> Speaking of data structures, I find just about the opposite. Most data
>> structures are small, which must be the fact so fondly used by C++
>> vector: small-string optimization. Only very few data structures are
>> large in a given program, and they usually correspond to some global
>> tables and repositories. Others are either short-lived byproducts of
>> input processing or small data sets attached to some global entity.
>
> I don't know of any std::vector that uses the small string optimization.

This time it's me who must be wrong.

Yet I see that this is a recognized need:
https://github.com/facebook/folly/blob/master/folly/small_vector.h
LLVM folks seem to do the same.

With that in mind, small containers might be better off as a special value type.
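
To make the idea concrete, here is a minimal sketch of a vector with inline storage. SmallVec and everything in it are hypothetical names made up for illustration; this is not the folly or LLVM implementation, and it assumes trivially copyable element types to stay short.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical SmallVec: keeps up to N elements inside the object itself
// and moves to the heap only when it grows past N.
template <typename T, std::size_t N>
class SmallVec {
    static_assert(N > 0, "inline capacity must be positive");
    static_assert(std::is_trivially_copyable<T>::value,
                  "this sketch only handles trivially copyable T");
public:
    SmallVec() = default;
    // Copying is left out to keep the sketch short.
    SmallVec(const SmallVec&) = delete;
    SmallVec& operator=(const SmallVec&) = delete;
    ~SmallVec() { if (on_heap()) delete[] data_; }

    void push_back(const T& v) {
        if (size_ == cap_) grow();
        data_[size_++] = v;
    }

    const T& operator[](std::size_t i) const {
        assert(i < size_);
        return data_[i];
    }

    std::size_t size() const { return size_; }

    // True once the elements no longer live in the inline buffer.
    bool on_heap() const { return data_ != inline_; }

private:
    void grow() {
        std::size_t new_cap = cap_ * 2;
        T* p = new T[new_cap];
        std::copy(data_, data_ + size_, p);
        if (on_heap()) delete[] data_;
        data_ = p;
        cap_ = new_cap;
    }

    T inline_[N];            // inline storage, no allocation needed
    T* data_ = inline_;      // points at inline_ or at a heap block
    std::size_t size_ = 0;
    std::size_t cap_ = N;
};
```

A SmallVec<int, 4> performs no allocation for the first four push_back calls; the fifth one copies the elements to a heap block of capacity 8.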

> std::string does ubiquitously because (a) strings are often handled as
> values, and (b) C++11 put refcounted strings into illegality (forced
> mistake) therefore robbing implementers of an important optimization.
>
> In a way both C++ and D got it "wrong". Arrays/containers are entity
> types - they have identity and should be manipulated most often by
> reference. Presence of pass-by-value of containers in C++ programs, save
> for rvalue optimization purposes, is suspicious.

Agreed.

> In contrast, strings
> are value types - they are handled most often as a unit and passed by
> value, just like e.g. numbers.

Indeed, just note that it would be really nice not to actually copy
strings. In fact, it seems that copy-on-write (with ref-counting) for big
strings plus the small-string optimization for small ones is an almost
ideal solution. In D we could have non-atomic (thread-local)
copy-on-write, which should be quite fast.
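
A rough sketch of that combination, written in C++ only for illustration. CowString, kInline and the 15-byte threshold are all hypothetical choices, not an existing library type. The reference count is a plain integer, so the sketch is only valid while all copies stay on one thread, which is the property thread-local data in D would give by default.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical CowString: short strings are stored inline and copied
// eagerly; long strings share one heap buffer until somebody writes.
class CowString {
    static constexpr std::size_t kInline = 15;  // assumed small-string limit
    struct Heap {
        std::size_t refs;  // plain counter: NOT safe across threads
        char* chars;
    };
public:
    explicit CowString(const char* s) : len_(std::strlen(s)) {
        if (len_ <= kInline) {
            std::memcpy(small_, s, len_ + 1);
        } else {
            heap_ = new Heap{1, new char[len_ + 1]};
            std::memcpy(heap_->chars, s, len_ + 1);
        }
    }

    // Copying a long string only bumps the counter.
    CowString(const CowString& o) : len_(o.len_), heap_(o.heap_) {
        if (heap_) ++heap_->refs;
        else std::memcpy(small_, o.small_, len_ + 1);
    }
    CowString& operator=(const CowString&) = delete;  // omitted for brevity

    ~CowString() {
        if (heap_ && --heap_->refs == 0) {
            delete[] heap_->chars;
            delete heap_;
        }
    }

    // Writing detaches from a shared buffer first (the "copy on write").
    void set(std::size_t i, char c) {
        assert(i < len_);
        if (!heap_) { small_[i] = c; return; }
        if (heap_->refs > 1) {
            Heap* h = new Heap{1, new char[len_ + 1]};
            std::memcpy(h->chars, heap_->chars, len_ + 1);
            --heap_->refs;
            heap_ = h;
        }
        heap_->chars[i] = c;
    }

    const char* c_str() const { return heap_ ? heap_->chars : small_; }
    std::size_t size() const { return len_; }
    bool shares_with(const CowString& o) const {
        return heap_ != nullptr && heap_ == o.heap_;
    }

private:
    std::size_t len_;
    Heap* heap_ = nullptr;     // null while the string is stored inline
    char small_[kInline + 1];  // a real design would overlap this with heap_
};
```

Copying a long CowString costs one increment, and the character data is duplicated only on the first set() made through a copy that still shares its buffer.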

> C++ made both containers and strings value types, so it needs forever to
> look over its shoulder about n00bs copying large containers unwittingly.
> It also does a fair amount of unneeded string copying, and optimizing
> string-based C++ code is nontrivial. D made both arrays and strings
> slices, a data structure made highly expressive by the garbage collector
> but that occasionally confuses people. With std.refcounted.RCString and
> std.container.Array we get both abstractions "right".
>

Good points.

-- 
Dmitry Olshansky

