dereferencing null

Chad J chadjoan at __spam.is.bad__gmail.com
Wed Mar 7 17:44:59 PST 2012


On 03/07/2012 10:21 AM, Steven Schveighoffer wrote:
> On Wed, 07 Mar 2012 10:10:32 -0500, Chad J
> <chadjoan at __spam.is.bad__gmail.com> wrote:
>
>> On Wednesday, 7 March 2012 at 14:23:18 UTC, Chad J wrote:
>>
>> I spoke too soon!
>> We missed one:
>>
>> 1. You forgot to initialize a variable.
>> 2. Your memory has been corrupted, and some corrupted pointer
>> now points into no-mem land.
>> 3. You are accessing memory that has been deallocated.
>> 4. null was being used as a sentinel value, and it snuck into
>> a place where the value should not be a sentinel anymore.
>>
>> I will now change what I said to reflect this:
>>
>> I think I see where the misunderstanding is coming from.
>>
>> I encounter (1) from time to time. It isn't a huge problem because
>> usually if I declare something the next thing on my mind is
>> initializing it. Even if I forget, I'll catch it in early testing. It
>> tends to never make it to anyone else's desk, unless it's a
>> regression. Regressions like this aren't terribly common though. If
>> you make my program crash from (1), I'll live.
>>
>> I didn't even consider (2) and (3) as possibilities. Those are far
>> from my mind. I think I'm used to VM languages at this point (C#,
>> Java, Actionscript 3, Haxe, Synergy/DE|DBL, etc). In the VM, (2) and
>> (3) can't happen. I never worry about those. Feel free to crash these
>> in D.
>>
>> I encounter (4) a lot. I really don't want my programs crashed when
>> (4) happens. Such crashes would be super annoying, and they can happen
>> at very bad times.
>
> You can use sentinels other than null.
>
> -Steve

Example?

Here, if you want, I'll start with a typical case.  Please make it right.

class UnreliableResource
{
	this(string sourceFile) {...}
	this(uint userId) {...}
	void doTheThing() {...}
}

void main()
{
	// Set this to a sentinel value for cases where the source does
	//   not exist, thus preventing proper initialization of res.
	UnreliableResource res = null;

	// The point here is that obtaining this unreliable resource
	//   is tricky business, and therefore complicated.
	//
	if ( std.file.exists("some_special_file") )
	{
		res = new UnreliableResource("some_special_file");
	}
	else
	{
		uint uid = getUserIdSomehow();
		if ( isValidUserId(uid) )
		{
			res = new UnreliableResource(uid);
		}
	}

	// Do some other stuff.
	...
	
	// Now use the resource.
	try
	{
		thisCouldBreakButItWont(res);
	}
	// Fairly safe if we were in a reasonable VM.
	catch ( NullDerefException e )
	{
		writefln("This shouldn't happen, but it did.");
	}
}

void thisCouldBreakButItWont(UnreliableResource res)
{
	if ( res !is null )
	{
		res.doTheThing();
	}
	else
	{
		doSomethingUsefulThatCanHappenWhenResIsNotAvailable();
		writefln("Couldn't find the resource thingy.");
		writefln("Resetting the m-rotor.  (NOOoooo!)");
	}
}

Please follow these constraints:

- Do not use a separate boolean variable for determining whether or not 
'res' could be created.  This violates a kind of SSOT 
(http://en.wikipedia.org/wiki/Single_Source_of_Truth) because it allows 
cases where the hypothetical "resIsInitialized" variable is true but res 
isn't actually initialized, or where "resIsInitialized" is false but res 
is actually initialized.  It also doesn't throw catchable exceptions 
when the uninitialized class has methods called on it.  In my pansy 
VM-based languages I always prefer to risk the null sentinel.

- Do not modify the implementation of UnreliableResource.  It's not 
always possible.

- Try to make the solution something that could, in principle, be placed 
into Phobos and reused without a lot of refactoring in the original code.

...

Now I will think about this a bit...

This reminds me a lot of algebraic data types.  I kind of want to say 
something like:
auto res = empty | UnreliableResource;

and then unwrap it:

	...
	thisCantBreakAnymore(res);
}

void thisCantBreakAnymore(UnreliableResource res)
{
	res.doTheThing();
}

void thisCantBreakAnymore(empty)
{
	doSomethingUsefulThatCanHappenWhenResIsNotAvailable();
	writefln("Couldn't find the resource thingy.");
	writefln("Resetting the m-rotor.  (NOOoooo!)");
}
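
For what it's worth, a library type can approximate this without language support. Below is a rough sketch, assuming a compiler recent enough to ship std.sumtype (it did not exist when this thread started). The Empty marker, the MaybeResource alias and the cut-down UnreliableResource are stand-ins made up for illustration, and the function returns a string only so the result is easy to check.

```d
import std.sumtype : SumType, match;

// Marker type for "no resource"; the name Empty is made up for this sketch.
struct Empty {}

// Stand-in for the real class, reduced to what the sketch needs.
class UnreliableResource
{
    int timesUsed;
    void doTheThing() { ++timesUsed; }
}

// Either nothing or a resource. Empty is listed first, so it is the default.
alias MaybeResource = SumType!(Empty, UnreliableResource);

// match requires a handler for every member type, so the "empty" case
// cannot be forgotten without a compile error.
string thisCantBreakAnymore(MaybeResource res)
{
    return res.match!(
        (UnreliableResource r) { r.doTheThing(); return "did the thing"; },
        (Empty _) => "Couldn't find the resource thingy."
    );
}
```

One caveat: nothing stops a null UnreliableResource reference from being stored in the SumType, so this moves the sentinel into the type but does not by itself rule out null.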


I'm not absolutely sure I'd want to go that path though, and since D is 
unlikely to do any of those things, I just want to be able to catch an 
exception if the sentinel value tries to have the "doTheThing()" method 
called on it.
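
The closest thing available today is probably an explicit check at the point of use. Here is a minimal sketch using std.exception.enforce, which throws a plain Exception when its argument is null. The helper name checked and the cut-down UnreliableResource are hypothetical, and this does require touching each call site, which is part of what I'd like to avoid.

```d
import std.exception : enforce;

// Stand-in for the real class, reduced to what the sketch needs.
class UnreliableResource
{
    int timesUsed;
    void doTheThing() { ++timesUsed; }
}

// Hypothetical helper: returns obj unchanged, or throws a catchable
// Exception if obj is null.
T checked(T)(T obj) if (is(T == class))
{
    return enforce(obj, "Tried to use a null " ~ T.stringof);
}
```

Usage would look like checked(res).doTheThing(); inside a try block, with catch ( Exception e ) standing in for the NullDerefException above.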

I can maybe see invariants being used for this:

class UnreliableResource
{
	bool initialized = false;

	invariant
	{
		if (!initialized)
			throw new Exception("Not initialized.");
	}

	void initialize(string sourceFile)
	{
		...
	}

	void initialize(uint userId)
	{
		...
	}

	void doTheThing() {...}
}

But as I think about it, this approach already has a lot of problems:

- It violates the condition that UnreliableResource shouldn't be 
modified to solve the problem.  Sometimes the class in question is 
upstream or otherwise not available for modification.

- I have to add this stupid boilerplate to every class.

- There could be a mixin template to ease the boilerplate, but the D 
spec states that there can be only one invariant in a class.  Using such 
a mixin would nix my ability to have an invariant for other things.

- Calling initialize(...) would violate the invariant, because the 
invariant is checked on entry to public methods and the object is still 
uninitialized at that point.  The initialization can't move into the 
constructor either: the instance needs to exist temporarily in an 
uninitialized state, built by a nullary do-nothing constructor, until a 
beneficial codepath initializes it properly.

- It will not be present in release mode.  This could be a deal-breaker 
in some cases.

- Using this means that instances of UnreliableResource should just 
never be null, and thus I am required to do an allocation even when the 
program will take codepaths that don't actually use the class.  I'm 
usually not concerned too much with premature optimization, but 
allocations are probably a nasty thing to sprinkle about unnecessarily.



Maybe a proxy struct with opDispatch and such could be used to get 
around these limitations?
Example usage: Initializable!(UnreliableResource) res;
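
A rough sketch of what such a proxy might look like follows. Everything here is an assumption for illustration rather than anything in Phobos: the NotInitializedException type, the isInitialized helper and the cut-down UnreliableResource are all made up. The proxy only forwards method calls (not field access), and it leaves the wrapped class unmodified.

```d
// Hypothetical exception type thrown when the proxy is used while empty.
class NotInitializedException : Exception
{
    this(string msg, string file = __FILE__, size_t line = __LINE__)
    {
        super(msg, file, line);
    }
}

// Stand-in for the real class, reduced to what the sketch needs.
class UnreliableResource
{
    int timesUsed;
    void doTheThing() { ++timesUsed; }
}

struct Initializable(T) if (is(T == class))
{
    private T payload; // null until something is assigned

    void opAssign(T value) { payload = value; }

    bool isInitialized() const { return payload !is null; }

    // Forward any other method call to the payload, throwing a
    // catchable exception instead of dereferencing null.
    auto opDispatch(string name, Args...)(auto ref Args args)
    {
        if (payload is null)
            throw new NotInitializedException(
                "Called " ~ name ~ " on an uninitialized " ~ T.stringof);
        return mixin("payload." ~ name ~ "(args)");
    }
}
```

No allocation happens until a real instance is assigned, and the null check lives in one place instead of in a separate boolean, so it stays consistent with the SSOT constraint above.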


More information about the Digitalmars-d mailing list