dstring - was Re: next version of DWT?
Chris Miller
chris at dprogramming.com
Sun May 13 12:44:22 PDT 2007
I forgot to reply to this; comments embedded...
On Sun, 06 May 2007 05:36:23 -0400, Marcin Kuszczak <aarti at interia.pl>
wrote:
> Chris Miller wrote:
>> Is this limitation really a problem for some of your code? I thought it
>> was still big enough, with room to spare. I don't recall ever having a
>> string of over 1_073_741_824 characters. Also, for a 64-bit program, the
>> limit is raised considerably, to thousands of terabytes (and
>> string.MAX_LENGTH will reflect this automatically).
>
>
> I did not run into this problem, mainly because I didn't use it. And I did
> not use it because your work still doesn't meet my criteria for something
> that I would use in my development:
> 1. Because there will be *for sure* people who will get to this limit. If
> something can happen it will happen. And then you are in trouble, because
> you can no longer easily interchange between your string class and D
> character arrays, and you are back at the starting point again... In fact
> when assigning *any* char[] variable to your string, I should first check
> if it will fit into it...
Well, I just wasn't sure. I'm still wondering what others think about this
limitation. I wrote dstring mainly to see how it would go.
A billion characters seems plenty to me; and this is just for 32-bit
binaries. I could be wrong. I also figured those who need incredibly large
strings will probably want to write special-purpose string handling code
anyway, and it would seem odd that they would pass such large strings to
functions that don't expect them to be so large (e.g. std.string.replace
on a 1.5 gig string? yikes).
You don't need to check whether it fits; dstring does that check for you and
throws an exception if the array is too long.
> 2. I want string to do more than just normal character arrays, and I
> shouldn't accept something that is better in some areas but worse in
> others. Higher abstraction usually has drawbacks - it needs more
> processing power and/or more memory. But I accept it as I need higher
> abstraction...
> 3. Allocating one additional byte in your struct probably will not be a
> big deal for anyone... And it shouldn't break anything, should it?
8 bytes, nicely aligned struct, vs. 9 bytes? or maybe 12 bytes? It was
designed to be as easy to pass to functions and pack into other structures
as char[] is. Adding to it will kill these benefits, especially the ability
to be returned in registers.
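To make the size argument concrete, here is a minimal sketch in D. Both struct
layouts and their field names are hypothetical, made up for illustration; they
are not dstring's actual definition. The first assumes, as the 32-bit length
limit suggests, that the encoding information shares a word with the length;
the second adds a separate one-byte tag.

```d
// Hypothetical layouts for illustration only; not dstring's real definition.

// Assumption: the encoding tag is folded into the length word.
struct Packed
{
    void* ptr;          // pointer to the character data
    size_t lenAndTag;   // length plus encoding tag in one word
}

// Same data with the tag stored in its own byte.
struct WithExtraByte
{
    void* ptr;
    size_t length;
    ubyte tag;          // 1 byte of payload, but alignment adds padding
}

// Two words: 8 bytes in a 32-bit binary, the same size as a D array.
static assert(Packed.sizeof == 2 * size_t.sizeof);
static assert(Packed.sizeof == (char[]).sizeof);

// The extra byte is padded out to a full word: 12 bytes in a 32-bit binary.
static assert(WithExtraByte.sizeof == 3 * size_t.sizeof);

void main() {}
```

So under these assumptions the "one additional byte" costs a whole extra
word, not a byte.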
> 4. There is still a problem with optimization of memory consumption when
> adding a dchar to a string containing char[]. 4 times bigger memory
> consumption than the original char[] is too much for me. I think that your
> string struct should by default optimize for lower memory consumption, and
> have static fields (methods) to set a policy for speed.
Adding a dchar doesn't by itself do that; the string only widens if the
character can't fit into a single char or wchar. Getting all the way to
dchar storage requires characters outside the BMP, which are quite rare.
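A rough sketch of that rule in D, for illustration. The function name
unitWidth is hypothetical and not part of dstring's API, and the thresholds
are an assumption based on what a single UTF-8 or UTF-16 code unit can hold.

```d
// Hypothetical helper, not dstring's API: the narrowest code unit size
// (in bytes) that can hold the character c in a single unit.
size_t unitWidth(dchar c)
{
    if (c < 0x80)
        return char.sizeof;   // ASCII fits in one char
    if (c <= 0xFFFF)
        return wchar.sizeof;  // anything in the BMP fits in one wchar
    return dchar.sizeof;      // outside the BMP needs a full dchar
}

void main()
{
    assert(unitWidth('a') == 1);           // plain ASCII
    assert(unitWidth('\u00E9') == 2);      // e with acute accent, in the BMP
    assert(unitWidth('\U0001D11E') == 4);  // musical G clef, outside the BMP
}
```

Under this rule, a string would only need 4 bytes per character once it
contains at least one character from the last category.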
I believe Python is going to be using "dchar" for any Unicode strings
beyond ASCII. I think dstring's way at least saves more memory than that.
> 5. It's not standard (not included in Phobos nor in Tango)
Agreed; I don't even use dstring at the moment.