Another prayer for invariant strings

Regan Heath regan at netmail.co.nz
Fri Jul 13 01:25:56 PDT 2007


(disclaimer, I have done only the testing shown at the end of this post)

Robert Fraser wrote:
> Invariant strings have been discussed before (briefly) in discussions
> of constness, however I wish to bring up the subject again more
> directly.
> 
> The "string" alias as it is now (in D 2.0) is an odd beast. The
> problem is that it is invariant(char)[] instead of invariant(char[]),
> so that while the characters themselves are invariant, the array is
> mutable. 

This makes sense if you think about it from the compiler's point of view.

It has placed the characters themselves in ROM, but the array reference 
is in RAM, so its pointer and length can change.  So, this is valid:

invariant(char)[] a = "foo";
invariant(char)[] b = "bar";
b = a;

But these are invalid:

char[] p;

a[0] = 'a'; //for any given rvalue
b[] = a[];  //and slicing variants
p = a;      //p cannot point to invariant(char)

If you want to prevent the reference from changing, make it 'final', e.g.

final invariant(char)[] a;

> This has two main problems:
> 
> 1. It's confusing. There have been quite a few topics both in this
> newsgroup and in digitalmars.D.learn about how exactly to use the 2.0
> string alias and where it's immutable/where it's not.

I won't argue as to whether it's confusing, but to me it seems the basic 
concept is:  "A 'string' reference isn't immutable (or rather 'final'), 
but its data is (immutable)".

> 2. Performance. While writing my own code, I can pretend "string" is
> invariant (or use my own invariant(char[]) alias), but when passing
> to, or receiving code from library functions, this is not possible.

When you pass a string to a function, that function gets a /copy/ of the 
reference.  So, there is technically no need for the copied reference to 
be invariant (or rather 'final').  Changes to the reference in the 
function *do not* propagate back to the caller.
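
A minimal sketch of the by-value case, assuming the 'string' alias 
discussed above; 'reassign' is a made-up name for illustration, not a 
library function:

```d
// Hypothetical helper: it receives a copy of the caller's array
// reference (pointer + length), not the caller's variable itself.
void reassign(string s)
{
	s = "changed"; // rebinds only the local copy
}

void main()
{
	string a = "foo";
	reassign(a);
	assert(a == "foo"); // the caller's reference is untouched
}
```

The character data is never written to here; only the callee's private 
copy of the reference is pointed elsewhere.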

The exception is a 'ref' parameter: changes to the reference then 
propagate back to the caller.  In that case, if your reference is final, 
DMD will error; see the test case below.

In short, if you use 'final' on your strings, then even if you call a 
library function which takes a 'ref', the compiler will protect you.
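
To illustrate the 'ref' case with a non-final string, a minimal sketch 
('rebind' is a made-up name for illustration, not a library function):

```d
// Hypothetical helper: 'ref' binds the parameter to the caller's own
// array reference, so re-assigning it is visible to the caller.
void rebind(ref string s)
{
	s = "test";
}

void main()
{
	string a = "foo";
	rebind(a);
	assert(a == "test"); // the caller's reference now points at "test"
}
```

Declaring the caller's variable 'final' is what turns the call into a 
compile error, as the foo(p3) line in the test cases at the end of this 
post shows.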

> This means that in each of these situations I must take two,
> performance-draining precautionary measures: i. Duplicate the string
> every time it's passed in or out of my code. ii.Synchronize
> multithreaded access to strings/acquire locks/etc.

You do not need to sync access to invariant data, but you may need to 
sync access to an array reference (if your code, or library code, might 
change it).  To prevent changes, make your strings final.

> Invariant strings have precedent: they're used in Java, .NET, Perl,
> Python, Ruby and quite a few other languages. And for when multiple
> string operations are going down, there's always char[] and .idup to
> fall back on, which are far better than Java's StringBuffer, etc.

Does Java prevent you re-assigning an invariant string reference?  If 
so, are they implicitly 'final' then?

> So, please, Walter... consider Andrei's proposal and make "string" an
> alias to invariant(char[]). It'll make a lot of happiness happen.

I think a greater understanding of the current system is required before 
we start opting for changes.

  - Regan Heath

Test cases:

void main()
{
	invariant(char)[] p1 = "one";
	invariant(char[]) p2 = "two";
	final invariant(char[]) p3 = "three";
	char[] p4 = "four".dup;
	const(char)[] p5 = "five";
	const(char[]) p6 = "six";

	//p1[0] = 'a'; //Error: p1[0] is not mutable
	//p2[0] = 'a'; //Error: p2[0] is not mutable
	//p3[0] = 'a'; //Error: p3[0] is not mutable
	p4[0] = 'a'; //ok
	//p5[0] = 'a'; //Error: p5[0] is not mutable
	//p6[0] = 'a'; //Error: p6[0] is not mutable
	
	//p1[] = p2[]; //Error: slice p1[] is not mutable
	//p2[] = p1[]; //Error: slice p2[] is not mutable
	//p3[] = p1[]; //Error: slice p3[] is not mutable
	p4[] = p1[]; //ok
	//p5[] = p1[]; //Error: slice p5[] is not mutable
	//p6[] = p1[]; //Error: slice p6[] is not mutable
	
	p1 = p2; //ok
	p2 = p1; //ok	
	//p3 = p1; //variable invariant.p3 cannot modify final/const/invariant variable 'p3'
	//p4 = p1;  //Error: cannot implicitly convert expression (p1) of type invariant(char)[] to char[]
	p5 = p1; //ok
	p6 = p1; //ok

	foo(p3); //variable invariant.main.p3 cannot modify final/const/invariant variable 'p3'
}

/*
void foo(final invariant(char)[] param)
{
	//param = "test";  //variable invariant.foo.param cannot modify final/const/invariant variable 'param'
}
*/

void foo(ref invariant(char)[] param)
{
	param = "test";  //variable invariant.foo.param cannot modify final/const/invariant variable 'param'
}
