earthquake changes of std.regexp to come

Bill Baxter wbaxter at gmail.com
Tue Feb 17 14:15:27 PST 2009


On Wed, Feb 18, 2009 at 6:56 AM, Andrei Alexandrescu
<SeeWebsiteForEmail at erdani.org> wrote:
> Bill Baxter wrote:
>>
>> On Wed, Feb 18, 2009 at 3:36 AM, Andrei Alexandrescu
>> <SeeWebsiteForEmail at erdani.org> wrote:
>>
>>> Besides std.regexp only works with (narrow) strings and we want it to
>>> work on streams of all widths and structures. One pet complaint I have
>>> is that std.regexp puts a class around it all as if everybody's favorite
>>> pastime would be to inherit Regexp and override some random function in
>>> it.
>>
>> So what do you think it should be, a struct?
>
> Yes.
>
>> That would imply to me that everybody's favorite pastime is making
>> value copies of regex structures, when in fact nobody does that.
>
> Well you'd be surprised. The RegEx class saves the state of the last search,
> which is a sensible thing to do. But then consider a simple range Splitter
> that, when iterated, nicely gives you...
>
> string a = ",a,  bcd, def,gh,";
> foreach (e; splitter(a, pattern(", *")))
>    writeln("[", e, "]");
>
> writes
>
> []
> [a]
> [bcd]
> [def]
> [gh]
>
> This is similar to the function std.regex.split with the notable difference
> that no extra memory is allocated. Now Splitter is an input range. This
> means you wouldn't expect that you copy a Splitter and then have iterating
> the original value affect the copy. Well, that's exactly what happens when
> you use the "good" reference semantics of the RegEx stored inside splitter.
> Worse, RegExp has no cloning primitive, so I need to resort to storing the
> pattern and recompiling it from scratch at every copy of Splitter. So
> essentially the "good" semantics of RegEx are useless when it comes to
> composing it in larger objects.

So that sounds to me like RegEx should have a .dup, and then it would
be fine, no?  I agree it should have a dup for the odd occasion when
you do want to make a copy for some reason.
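
To make the aliasing concrete, here is a minimal sketch in D. Matcher is a
hypothetical stand-in for a regex object that remembers its last search (it
is not the std.regexp class), and dup is the cloning primitive proposed
above, not an existing method. Copying the reference shares the saved state,
while dup gives an independent object.

```d
import std.stdio;

// Hypothetical stand-in for a regex engine that saves the state of its
// last search. Not the real std.regexp.RegExp.
class Matcher
{
    string pattern;
    size_t lastPos;   // state kept between searches

    this(string pattern) { this.pattern = pattern; }

    // Hypothetical cloning primitive: a new object with the same state.
    Matcher dup()
    {
        auto copy = new Matcher(pattern);
        copy.lastPos = lastPos;
        return copy;
    }
}

void main()
{
    auto original = new Matcher(", *");
    auto aliased = original;        // copies the reference only
    auto cloned = original.dup();   // independent object

    original.lastPos = 5;           // advance the original

    assert(aliased.lastPos == 5);   // the alias sees the change
    assert(cloned.lastPos == 0);    // the duplicate does not
    writeln("alias: ", aliased.lastPos, ", dup: ", cloned.lastPos);
}
```

A range holding such an object would have to call dup in every copy it
makes of itself, which is the composition cost being argued about.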

>> Regex is a class in order to give it reference semantics and provide
>> encapsulation of some re-usable state.  Maybe it should be a final
>> class, but my impression is "final class" doesn't really work in D.

> Re-usable state is provided by structs too. In addition they can choose
> value vs. reference semantics with ease.

I think this choice is not so much available with D1, plus the
constructor situation with D1 is less than ideal.  Given that, I think
the choice of class for RegEx was appropriate.  But if the struct
problems are all going away in D2, then that's great.  Sounds like
you're saying we'll really be able to use D structs just like one uses
a non-polymorphic C++ class.  If so, then that's super.
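
As a rough illustration of the "structs can choose" point, the sketch below
uses two hypothetical types (neither is part of Phobos). ValueMatcher keeps
its state inline, so a copy is independent; SharedMatcher keeps the state
behind a pointer, so copies share it. It assumes D2-style struct
constructors.

```d
import std.stdio;

// Hypothetical matcher as a plain struct: state is copied with the value.
struct ValueMatcher
{
    string pattern;
    size_t lastPos;
}

// Hypothetical struct that opts into reference semantics by keeping its
// mutable state behind a pointer.
struct SharedMatcher
{
    string pattern;
    size_t* lastPos;

    this(string pattern)
    {
        this.pattern = pattern;
        this.lastPos = new size_t;  // heap-allocated, zero-initialised
    }
}

void main()
{
    auto v1 = ValueMatcher(", *");
    auto v2 = v1;                   // independent copy
    v1.lastPos = 5;
    assert(v2.lastPos == 0);

    auto s1 = SharedMatcher(", *");
    auto s2 = s1;                   // copies the pointer, state is shared
    *s1.lastPos = 5;
    assert(*s2.lastPos == 5);

    writeln("value copy: ", v2.lastPos, ", shared copy: ", *s2.lastPos);
}
```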

--bb


