Why Strings as Classes?

Mon Aug 25 18:07:46 PDT 2008

superdan wrote:
> Benji Smith Wrote:
> 
>> BCS wrote:
>>> Ditto, D is a *systems language* It's *supposed* to have access to the 
>>> lowest level representation and build stuff on top of that
>> But in this "systems language", it's a O(n) operation to get the nth 
>> character from a string, to slice a string based on character offsets, 
>> or to determine the number of characters in the string.
>>
>> I'd gladly pay the price of a single interface vtable lookup to turn all 
>> of those into O(1) operations.
> 
> dood. i dunno where to start. allow me to answer from multiple angles.
> 
> 1. when was the last time looking up one char in a string or computing length was your bottleneck.
> 
> 2. you talk as if o(1) happens by magic that d currently disallows.
> 
> 3. maybe i don't want to blow the size of my string by a factor of 4 if i'm just interested in some occasional character search.
> 
> 4. implement all that nice stuff you wanna. nobody put a gun to yer head not to. understand you can't put a gun to my head to pay the price.

Geez, man, you just keep missing the point, over and over again.

Let me make one point blisteringly clear: I don't give a shit about the 
data format. You want the fastest strings in the universe, implemented 
with zero-byte magic beans and burned into the local ROM? Fantastic! 
I'm completely in favor of it.

Presumably, people will be so into those strings that they'll write a 
shitload of functionality for them. Parsing, searching, sorting, 
indexing... the mother lode.

One day, I come along, and I'd like to perform some text processing. But 
all of my string data comes from non-magic-beans data sources. I'd like 
to implement a new kind of string class that supports my data. I'm not 
going to push my super-slow string class on anybody else, because I know 
how concerned with performance you are.

But check this out... you can have your fast class, and I can have my 
slow class, and they can both implement the same interface. Like this:

interface CharSequence {
   int find(CharSequence needle);
   int rfind(CharSequence needle);
   // ...
}

class ZeroByteFastMagicString : CharSequence {
   // ...
}

class SuperSlowStoneTabletString : CharSequence {
   // ...
}

Now we can both use the same string functions. Just by implementing an 
interface, my class can use the same text-processing code as your 
hyper-compiler-optimized builtin arrays.

But only if the interface exists.

And only if library authors write their text-processing code against 
that interface.

That's the point.

A good API allows multiple implementations to make use of the same 
algorithms. Application authors can choose their own tradeoffs between 
speed, memory consumption, and functionality.
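
To make that concrete, here's a rough sketch of one algorithm serving two 
implementations. Everything in it is made up for illustration: the 
length() and charAt() methods aren't in the interface snippet above, 
find is written as a free function rather than an interface method, and 
ArrayString and ChunkedString are just stand-ins for the fast and slow 
classes. Treat it as one possible shape, not a proposal for the actual 
library API.

```d
// Hypothetical interface: length() and charAt() are assumptions added
// for this sketch; the original snippet only lists find and rfind.
interface CharSequence {
    size_t length();
    dchar charAt(size_t index);
}

// Stand-in for the fast class: wraps a UTF-32 array, so charAt is O(1).
class ArrayString : CharSequence {
    private dstring data;
    this(dstring s) { data = s; }
    size_t length() { return data.length; }
    dchar charAt(size_t index) { return data[index]; }
}

// Stand-in for the slow class: keeps the text in chunks and walks
// them on every lookup, so charAt is O(number of chunks).
class ChunkedString : CharSequence {
    private dstring[] chunks;
    this(dstring[] chunks) { this.chunks = chunks; }
    size_t length() {
        size_t total = 0;
        foreach (c; chunks) total += c.length;
        return total;
    }
    dchar charAt(size_t index) {
        foreach (c; chunks) {
            if (index < c.length) return c[index];
            index -= c.length;
        }
        assert(0, "index out of range");
    }
}

// Library code written once, against the interface only (naive search).
// Returns the index of the first match, or -1 if there is none.
int find(CharSequence haystack, CharSequence needle) {
    size_t n = haystack.length();
    size_t m = needle.length();
    if (m > n) return -1;
    for (size_t start = 0; start + m <= n; ++start) {
        size_t i = 0;
        while (i < m && haystack.charAt(start + i) == needle.charAt(i)) ++i;
        if (i == m) return cast(int) start;
    }
    return -1;
}

// Usage: the same find works on both implementations.
unittest {
    CharSequence needle = new ArrayString("cad"d);
    CharSequence fast = new ArrayString("abracadabra"d);
    CharSequence slow = new ChunkedString(["abr"d, "aca"d, "dabra"d]);
    assert(find(fast, needle) == 4);
    assert(find(slow, needle) == 4);
}
```

The fast class pays one virtual call per character and the slow class 
pays whatever its storage costs, but find itself never has to know 
which one it was handed.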

A rigid builtin implementation, with no interface definition, locks 
everybody into the same choices.

--benji