Small Buffer Optimization for string and friends
Michel Fortin
michel.fortin at michelf.com
Sun Apr 8 07:59:15 PDT 2012
On 2012-04-08 05:56:38 +0000, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org> said:
> Walter and I discussed today about using the small string optimization
> in string and other arrays of immutable small objects.
>
> On 64 bit machines, string occupies 16 bytes. We could use the first
> byte as discriminator, which means that all strings under 16 chars need
> no memory allocation at all.
>
> It turns out statistically a lot of strings are small. According to a
> variety of systems we use at Facebook, the small buffer optimization is
> king - it just works great in all cases. In D that means better speed,
> better locality, and less garbage.
Small buffer optimization is a very good thing to have indeed. But… how
can you preserve existing semantics? For instance, let's say you have
this:
    string s = "abcd";
which is easily representable as a small string. Do you use the small
buffer optimization in the assignment? That seems like a definite yes.
But as soon as you take a pointer to that string, you break the
immutability guarantee:
    immutable(char)[] s = "abcd";
    immutable(char)* p = s.ptr;
    s = "defg"; // assigns to where?
There's also the issue of this code being legal currently:
    immutable(char)* getPtr(string s) {
        return s.ptr;
    }
If you pass a small string to getPtr, it'll be copied to the local
stack frame and you'll be returning a pointer to that local copy.
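Both problems come from the same fact: with an inline buffer, the characters live inside the string variable itself. Here is a minimal sketch of that, assuming a hypothetical SmallString struct (the name, the layout and the helper functions are mine, for illustration only; this is not how the built-in string is or would be implemented):

```d
// Hypothetical small-string layout, for illustration only.
struct SmallString
{
    enum maxInline = 15;          // 1 length byte + 15 chars = 16 bytes
    private ubyte len;
    private char[maxInline] buf;  // the characters live inside the struct

    this(string s)
    {
        assert(s.length <= maxInline, "too long for the inline buffer");
        len = cast(ubyte) s.length;
        buf[0 .. len] = s[];
    }

    // Stand-in for string.ptr: typed as immutable, but it points into `this`.
    immutable(char)* ptr()
    {
        return cast(immutable(char)*) buf.ptr;
    }
}

static assert(SmallString.sizeof == 16);

// True when p points into the bytes occupied by s itself.
bool pointsInto(immutable(char)* p, ref SmallString s)
{
    auto start = cast(size_t) &s;
    auto addr = cast(size_t) p;
    return addr >= start && addr < start + SmallString.sizeof;
}

// Same shape as getPtr: s is a by-value copy owned by this call.
// The address is returned as an integer so nothing dereferences it later.
size_t addressSeenByCallee(SmallString s)
{
    return cast(size_t) s.ptr;
}
```

For a SmallString s, pointsInto(s.ptr, s) holds, so any later assignment to s writes over memory that the pointer claims is immutable. And addressSeenByCallee(s) reports an address inside the callee's own copy rather than inside the caller's variable, which is why a real pointer returned that way would dangle.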
You could mitigate this by throwing an error when trying to get the
pointer to a small string, but then you have to disallow taking the
pointer of a const(char)[] pointing to it:
    const(char)* getPtr2(const(char)[] s) {
        return s.ptr;
    }
    const(char)* getAbcdPtr() {
        string s = "abcd";
        // s implicitly converted to regular const(char)[] pointing to
        // local stack frame
        const(char)* c = getPtr2(s);
        // c points to the storage of s, which is the local stack frame
        return c;
    }
So it's sad, but I am of the opinion that the only way to implement
small buffer optimization is to have a higher-level abstraction, a
distinct type for such small strings.
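To make that concrete, here is a rough sketch of what such a distinct type could look like. Everything in it is an assumption for illustration: the name SmallStr, its members and its API are hypothetical. To keep it short it also stores the inline buffer next to a regular string instead of overlaying the two in 16 bytes as a real implementation would. The point is only that the type never hands out a pointer into its own storage:

```d
// Hypothetical library type; the name and API are assumptions for this sketch.
struct SmallStr
{
    enum maxInline = 15;
    private char[maxInline] buf;  // inline storage for short text
    private ubyte len;            // number of chars used in buf
    private string heap;          // used only when the text does not fit inline

    this(string s)
    {
        if (s.length <= maxInline)
        {
            len = cast(ubyte) s.length;
            buf[0 .. len] = s[];
        }
        else
        {
            heap = s;
        }
    }

    bool isSmall() const
    {
        return heap is null;
    }

    size_t length() const
    {
        if (isSmall)
            return len;
        return heap.length;
    }

    // Characters are returned by value, so no reference to buf escapes.
    char opIndex(size_t i) const
    {
        assert(i < length, "index out of bounds");
        if (isSmall)
            return buf[i];
        return heap[i];
    }

    // Converting to a built-in string copies inline data to the GC heap,
    // so the result never points into this struct.
    string toString() const
    {
        if (isSmall)
            return buf[0 .. len].idup;
        return heap;
    }
}
```

The cost is an allocation whenever a small value is converted back to a built-in string, but in exchange the built-in string and its .ptr keep their current semantics.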
--
Michel Fortin
michel.fortin at michelf.com
http://michelf.com/