Proposal: clean up semantics of array literals vs string literals

Andrei Alexandrescu SeeWebsiteForEmail at erdani.org
Tue Oct 2 08:14:26 PDT 2012


On 10/2/12 7:11 AM, Don Clugston wrote:
> The problem
> -----------
>
> String literals in D are a little bit magical; they have a trailing \0.
[snip]

I don't mean to be Debbie Downer on this because I reckon it addresses 
an issue that some have, although I never do. With that warning, a few 
candid opinions follow.

First, I think zero-terminated strings shouldn't be needed frequently 
enough in D code to make this necessary.

Second, a simple and workable solution to this would be to address the
matter dynamically: make toStringz opportunistically check whether
there's a \0 just past the end of the string, EXCEPT when the string
happens to end exactly at a page boundary (in which case accessing
memory beyond the end of the string may produce a page fault). With this
simple dynamic test we don't need precise and stringent rules for the
implementation.
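
To make the dynamic test concrete, here is a minimal sketch of what such a check could look like. The helper name toStringzOpportunistic and the fixed 4096-byte page size are assumptions made for illustration (real code would ask the OS for the page size); only std.string.toStringz is an existing Phobos function. Note also that the byte past the end is not owned by the slice, so a \0 found there could in principle be overwritten later by whoever does own it.

```d
import std.string : toStringz;

// Assumption for illustration: 4 KiB pages. Real code should query the OS.
enum size_t pageSize = 4096;

/// Hypothetical helper (not part of Phobos): returns s.ptr unchanged when a
/// '\0' already sits right after the slice and reading it cannot cross a
/// page boundary; otherwise falls back to the copying toStringz.
immutable(char)* toStringzOpportunistic(string s) @system
{
    if (s.length == 0)
        return "".ptr;               // string literals are zero-terminated

    auto end = s.ptr + s.length;     // one past the last character

    // If `end` is page-aligned, it is the first byte of the next page,
    // which may be unmapped, so it must not be read.
    if ((cast(size_t) end) % pageSize != 0 && *end == '\0')
        return s.ptr;                // already terminated, no copy needed

    return toStringz(s);             // allocate a terminated copy
}
```

With this sketch a literal such as "hello" would normally be passed through without a copy, while a slice like "hello world"[0 .. 5] is followed by a space rather than a \0 and therefore goes through toStringz.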

Third, the complex set of rules proposed increases the number of cases
in which the \0 is guaranteed, but doesn't make for a clear,
easy-to-remember boundary. People will therefore need to remember yet
more rules just to be sure they can, well, avoid a call to toStringz.

On 10/2/12 10:55 AM, Regan Heath wrote:
> Recent discussions on the zero terminated string problems and
> inconsistency of string literals has me, again, wondering why D
> doesn't have a 'type' to represent C's zero terminated strings.  It
> seems to me that having a type, and typing C functions with it would
> solve a lot of problems.
[snip]
> I am probably missing something obvious, or I have forgotten one of
> the array/slice complexities which makes this a nightmare.

You're not missing anything and defining a zero-terminated type is 
something I considered doing and have been highly interested in. My 
interest is motivated by the fact that sentinel-terminated structures 
are a very interesting example of forward ranges that are also 
contiguous. That sets them apart from both singly-linked lists and 
simple arrays, and gives them interesting properties.

I'd be interested in defining the more general:

struct SentinelTerminatedSlice(T, T terminator)
{
     private T* data;
     ...
}

That would be a forward range and the instantiation 
SentinelTerminatedSlice!(char, 0) would be CString.
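
As a rough sketch of how the elided body might be filled in to satisfy the forward range interface: everything below apart from the struct name, its parameters, and the private data field is an assumption, not the actual design. The constructor trusts the caller to pass a pointer to a sequence that really does end in the terminator, and the terminator is spelled '\0' rather than 0 for clarity.

```d
import std.range.primitives : isForwardRange;

// Hypothetical completion; only `data` and the signature come from the post.
struct SentinelTerminatedSlice(T, T terminator)
{
    private T* data;

    // The caller promises that p points at a terminator-ended sequence.
    this(T* p) { data = p; }

    // The range is empty once the current element is the sentinel.
    @property bool empty() const { return *data == terminator; }

    // Current element; only valid while !empty.
    @property T front() const { return *data; }

    // Advance by one element. The length is never known up front.
    void popFront() { ++data; }

    // Copying the pointer yields an independent iteration position,
    // which is what makes this a forward range.
    @property SentinelTerminatedSlice save() { return this; }
}

alias CString = SentinelTerminatedSlice!(char, '\0');

// Compile-time check that the sketch really is a forward range.
static assert(isForwardRange!CString);

// Counts elements before the sentinel, i.e. what strlen does for CString.
size_t sentinelLength(R)(R r)
{
    size_t n;
    for (; !r.empty; r.popFront())
        ++n;
    return n;
}
```

For example, given char[] buf = "abc\0".dup, the call sentinelLength(CString(buf.ptr)) walks three elements before hitting the sentinel. The range carries a single pointer and no length, which is the property that sets it apart from an ordinary slice.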

However, so far I have held off on defining such a range because
C-strings are seldom useful in D code and there are not many other
compelling examples of sentinel-terminated ranges. Maybe it's time to
dust off that idea; I'd love it if we gathered enough motivation for it.


Andrei