regexes, enforce and purity

Sun Sep 9 22:42:03 PDT 2012

On 09-Sep-12 23:04, monarch_dodra wrote:
 > Given this little program testing regexes, I decided to replace one of
 > the example's asserts with an enforce:
 >
 > --------
 > import std.regex;
 > import std.exception;
 > void main()
 > {
 >      auto m = match("hello world", regex("world"));
 >      assert(m);
 >      enforce(m); // <-- HERE
 >      enforce(cast(bool)m);
 >      enforce(!m.empty);
 > }
 > --------
 >
 > I get the compile errors:
 > src\phobos\std\exception.d(356): Error: pure function 'enforce' cannot
 > call impure function '~this'
 > src\phobos\std\exception.d(358): Error: pure function 'enforce' cannot
 > call impure function 'opCast'
 >
 > While I understand the problem at play, I have a few doubts:
 > 1) Why the difference between assert and enforce? Shouldn't both have the
 > same constraints?

Nope, assert is built in and thus is an enigma ;)

 > 2) What exactly does purity mean for a *member* function?
It means the 'this' parameter is passed implicitly; everything else is
the same as for an ordinary free function.
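
A minimal sketch of that point, using a made-up Counter struct (nothing
here comes from std.regex): a pure member function may still touch the
object through the hidden 'this', exactly as a pure free function may
touch an explicit ref parameter.

```d
struct Counter
{
    int n;

    // `pure` forbids access to global mutable state, not to `this`.
    // The hidden `this` is mutable here, so modifying `n` is allowed.
    void bump() pure { ++n; }

    // With a const `this` the result depends only on the object's state.
    int doubled() const pure { return n * 2; }
}

// Roughly the same function with `this` written out as a parameter.
void bumpFree(ref Counter self) pure { ++self.n; }

void main()
{
    Counter c;
    c.bump();
    bumpFree(c);
    assert(c.n == 2);
    assert(c.doubled() == 4);
}
```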

 > 3) And shouldn't RegexMatch's .opCast (and .empty) be qualified as
 > pure?

I've no idea. Pure/nothrow zealots might have made enforce pure, but it
really shouldn't always be. In fact it should be a template and rely on
attribute deduction; if it already is, then it's the deduction that is broken.
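
To illustrate what relying on deduction buys, here is a sketch with a
hypothetical check template (not Phobos's enforce; Impure, calls and
pureCaller are also made up for the example). The template carries no
explicit 'pure', so the compiler infers it per instantiation.

```d
int calls; // global mutable state; touching it makes a function impure

struct Impure
{
    bool ok;
    // Impure conversion to bool: it modifies the global counter.
    bool opCast(T : bool)() { ++calls; return ok; }
}

// Hypothetical stand-in for enforce, with no explicit attributes.
T check(T)(T value)
{
    if (!value) throw new Exception("check failed");
    return value;
}

int pureCaller(int x) pure
{
    return check(x); // fine: check!int is inferred to be pure
}

void main()
{
    assert(pureCaller(5) == 5);
    auto r = check(Impure(true)); // also fine: check!Impure is simply impure
    assert(r.ok && calls == 1);
}
```

Had check been declared 'pure' outright, the second call would be
rejected, which is the kind of error quoted above.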

Also, I see that enforce tries to copy the RegexMatch object. That involves
the destructor, which can't be pure: RegexMatch is a ref-counted entity
wrapped around a chunk of C-heap memory.
Last time I checked, destructors & postblits were mostly broken w.r.t.
pure/safe/immutable etc.
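
A rough sketch of why such a type and 'pure' don't mix. Handle below is a
hypothetical stand-in, not RegexMatch's actual layout; the only assumption
is that the count lives on the C heap and the destructor calls free().

```d
import core.stdc.stdlib : malloc, free;

struct Handle
{
    private int* count; // reference count stored on the C heap

    static Handle create()
    {
        Handle h;
        h.count = cast(int*) malloc(int.sizeof);
        if (h.count is null) assert(0, "out of memory");
        *h.count = 1;
        return h;
    }

    this(this) { if (count) ++*count; }                  // copy: one more owner
    ~this() { if (count && --*count == 0) free(count); } // free() is not pure

    int owners() const { return count ? *count : 0; }
}

// A pure function cannot even hold a local Handle: leaving the scope
// would have to run the impure destructor.
static assert(!__traits(compiles, () pure { Handle h; }));

void main()
{
    auto a = Handle.create();
    {
        auto b = a;            // postblit runs
        assert(a.owners == 2);
    }                          // b's destructor runs
    assert(a.owners == 1);
}
```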

 > ...
 >
 > I did some digging while typing, and was about to suggest that the
 > problem could be solved if enforce was required to take a boolean as an
 > argument (makes sense), forcing the cast *outside* of the enforce.
 > However, it would appear that enforce returns its value, the goal
 > (probably) being to make this legal:
 >

It's a case of tricky, cool stuff that sometimes isn't so cool. The idea
of enforce is that the passed-in object is tested with 'if', using whatever
implicit conversion is possible, and is returned as is if it passes the
'if' test.

The trick is that it enables some convenient things:

auto f = enforce(fopen("blah", "r"));

In your case this will probably work better:
m = enforce(move(m)); // move is from std.algorithm

as it technically shouldn't call m's destructor.
enforce is geared specifically toward rvalues & RVO.
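
For a self-contained taste of the test-and-return behaviour, here is a
small sketch using an associative array lookup instead of fopen (the
ports table and its contents are made up for the example):

```d
import std.exception : enforce, assertThrown;

void main()
{
    int[string] ports = ["http": 80];

    // `in` yields a pointer (null when the key is absent);
    // enforce tests it and hands the same pointer back.
    auto p = enforce("http" in ports, "no such service");
    assert(*p == 80);

    // A failing check throws instead of returning.
    assertThrown!Exception(enforce("gopher" in ports, "no such service"));
}
```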

 > auto bar = enforce(foo());
 >
 > The return value is enforced and passed to bar in a 1-liner.
 >
 > BUT... assert doesn't do that. THAT is the original source of the
 > difference in behavior.

They differ on many levels and serve different needs.
More than that, it's a library artifact vs. a language built-in.

 > So I'll rephrase my 1):
 > Why the difference in behavior regarding the return value? Is it just
 > historical/no real reason, or is there something for me to learn here?

Beyond that, assert shouldn't affect control flow in any way, so:
m = assert(m); // wouldn't make any sense, as assert gets stripped in release builds

enforce, by contrast, is a convenient way to check inputs and possible
states and throw if they are not good.
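
A small sketch of that division of labour; parsePort is a hypothetical
function, not something from the original program:

```d
import std.exception : enforce, assertThrown;

int parsePort(int raw)
{
    // Input validation: must still happen in -release builds, so enforce.
    enforce(raw > 0 && raw < 65_536, "port out of range");
    int port = raw;

    // Internal sanity check: a failure here means a bug in this function.
    // It is compiled out with -release, so nothing may depend on it.
    assert(port > 0);
    return port;
}

void main()
{
    assert(parsePort(8080) == 8080);
    assertThrown!Exception(parsePort(0)); // bad input throws an Exception
}
```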

-- 
Dmitry Olshansky