dstring - was Re: next version of DWT?
chris at dprogramming.com
Sun May 13 12:44:22 PDT 2007
I forgot to reply to this; comments embedded...
On Sun, 06 May 2007 05:36:23 -0400, Marcin Kuszczak <aarti at interia.pl> wrote:
> Chris Miller wrote:
>> Is this limitation really a problem for some of your code? I thought it
>> was still big enough, with room to spare. I don't recall ever having a
>> string of over 1_073_741_824 characters. Also, for a 64-bit program, the
>> limit is raised considerably, to thousands of terabytes (and
>> string.MAX_LENGTH will reflect this automatically).
> I did not run into this problem, mainly because I didn't use it. And I did
> not use it because your work still doesn't meet my criteria for
> what I would use in my development:
> 1. Because there will be *for sure* people who will hit this limit. If
> something can happen, it will happen. And then you are in trouble, because
> you can no longer easily interchange between your string class and D
> arrays, and you are back at the starting point... In fact, when assigning
> *any* char variable to your string, I should first check whether it will fit
> into it...
Well, I just wasn't sure. I'm still wondering what others think about this
limitation. I wrote dstring mainly to see how it would go.
A billion characters seems plenty to me; and this is just for 32-bit
binaries. I could be wrong. I also figured those who need incredibly large
strings will probably want to write special-purpose string handling code
anyway, and it would seem odd that they would pass such large strings to
functions that don't expect them to be so large (e.g. std.string.replace
on a 1.5 gig string? yikes).
You don't need to check if it fits, because it does that for you and throws
if it doesn't.
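
To make the limit concrete, here is a rough sketch in Python (not D, and not dstring's actual source). It assumes the 32-bit length word gives up its top two bits to a width tag, which would explain a limit of about 2^30 characters; the names pack, unpack, TAG_BITS and MAX_LENGTH are made up for illustration. The bounds check lives in one place, so callers don't have to test the length themselves.

```python
# Hypothetical layout, assumed for illustration only:
# a 32-bit word = [2-bit width tag][30-bit length].
WORD_BITS = 32
TAG_BITS = 2
LENGTH_BITS = WORD_BITS - TAG_BITS

# Largest length 30 bits can hold: one short of 1_073_741_824.
MAX_LENGTH = (1 << LENGTH_BITS) - 1


def pack(length, tag):
    """Combine a length and a width tag into one 32-bit word."""
    if not 0 <= tag < (1 << TAG_BITS):
        raise ValueError("tag out of range")
    if not 0 <= length <= MAX_LENGTH:
        # The library-side check: the caller never has to do this.
        raise OverflowError(f"length {length} exceeds {MAX_LENGTH}")
    return (tag << LENGTH_BITS) | length


def unpack(word):
    """Split a packed word back into (length, tag)."""
    return word & MAX_LENGTH, word >> LENGTH_BITS
```

With this layout, pack(5, 2) round-trips through unpack to (5, 2), and any length above MAX_LENGTH raises instead of silently corrupting the tag bits.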
> 2. I want string to do more than just normal character arrays, and I
> shouldn't accept something that is in some areas better, but in some
> areas worse.
> Higher abstraction usually has drawbacks - it needs more processing power
> and/or more memory. But I accept that, as I need the higher abstraction...
> 3. Allocating one additional byte in your struct probably will not be a
> big deal for anyone... And it shouldn't break anything, should it?
8 bytes, nicely aligned struct, vs. 9 bytes? Or maybe 12 bytes? It was
designed to be easy to pass to functions and pack into other structures,
like char[]. Adding to it will kill these benefits, especially the ability
to be returned in registers.
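
The size jump can be checked with a quick sketch using Python's ctypes. This only models the layout and is not dstring's real definition: the field names are invented, and a c_uint32 stands in for a 32-bit pointer so the result matches a 32-bit build regardless of the host.

```python
import ctypes


class Packed(ctypes.Structure):
    # Width tag assumed to be folded into the length word.
    _fields_ = [
        ("length_and_tag", ctypes.c_uint32),
        ("data", ctypes.c_uint32),  # stand-in for a 32-bit pointer
    ]


class WithTagByte(ctypes.Structure):
    # Same fields plus one separate byte for the tag.
    _fields_ = [
        ("length", ctypes.c_uint32),
        ("data", ctypes.c_uint32),
        ("tag", ctypes.c_uint8),
    ]


print(ctypes.sizeof(Packed))       # 8
print(ctypes.sizeof(WithTagByte))  # 12: padded up to the 4-byte alignment
```

So with natural alignment the one extra byte costs four, and the struct no longer matches the size of a plain 32-bit array reference.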
> 4. There is still a problem with optimization of memory consumption when
> adding a dchar to a string containing char. 4 times more memory
> than the original char is too much for me. I think that your string struct
> should by default optimize for lower memory consumption, and have static
> fields (methods) to set a policy for speed.
Adding a dchar doesn't by itself widen the string; it only widens if the
character can't fit into a single char or wchar. Getting all the way to
dchar storage requires characters outside the BMP, which are quite rare.
I believe Python is going to be using "dchar" for any Unicode strings
beyond ASCII. I think dstring's way at least saves more than this.
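
A small Python sketch of the widening rule described above, for comparison with a fixed 4 bytes per character. The thresholds are an assumption on my part: one char is taken to hold code points below 0x80, one wchar anything else in the BMP (surrogates ignored), and dchar the rest. The function names are made up.

```python
def unit_width(code_point):
    """Bytes per code unit needed to hold code_point in a single unit."""
    if code_point < 0x80:       # assumed: fits in one char
        return 1
    if code_point < 0x10000:    # inside the BMP: fits in one wchar
        return 2
    return 4                    # outside the BMP: needs a dchar


def storage_width(text):
    """Narrowest unit width that holds every character of text."""
    return max((unit_width(ord(c)) for c in text), default=1)


def storage_bytes(text):
    """Total bytes when the whole string uses one uniform unit width."""
    return storage_width(text) * len(text)


for sample in ["hello", "h\u00e9llo", "a\U0001F600"]:
    fixed = 4 * len(sample)  # always-dchar storage
    print(len(sample), storage_width(sample), storage_bytes(sample), fixed)
```

Under these assumptions a five-character string with one accented letter takes 10 bytes rather than 20, and only a character outside the BMP pushes the whole string to 4 bytes per character.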
> 5. It's not standard (not included in Phobos nor in Tango)
Agreed; I don't even use dstring at the moment.