[draft] New std.regex walkthrough

Dmitry Olshansky dmitry.olsh at gmail.com
Tue Mar 13 14:03:34 PDT 2012


On 14.03.2012 0:32, Brad Anderson wrote:
> On Tue, Mar 13, 2012 at 1:27 PM, Dmitry Olshansky <dmitry.olsh at gmail.com>
> wrote:
>
>     For a couple of releases now we have had a new, revamped std.regex
>     that, as far as I'm concerned, works nicely, thanks to my GSoC
>     commitment last summer. Yet there has been a certain dark trend
>     around std.regex/std.regexp, as both had severe bugs, missing
>     documentation and whatnot, enough to consider them unusable or
>     dismiss them prematurely.
>
>     It's about time to break this gloomy aura and show that std.regex
>     is actually easy to use, that it does the job and has some nice
>     extras.
>
>     Link: http://blackwhale.github.com/regular-expression.html
>
>     Comments are welcome from experts and newbies alike, in fact it
>     should encourage people to try out a few tricks ;)
>
>     This is intended as a replacement for an article on dlang.org
>     about the outdated (and soon to disappear) std.regexp:
>     http://dlang.org/regular-expression.html
>
>     [Spoiler] one example relies on a parser bug being fixed (blush):
>     https://github.com/D-Programming-Language/phobos/pull/481
>     Well, it was a specific lookahead inside lookaround, so that's not
>     a severe bug ;)
>
>     P.S. I've been following through a bunch of new bug reports
>     recently, thanks to everyone involved :)
>
>
>     --
>     Dmitry Olshansky
>
>
> Second paragraph:
> - "..,expressions, though one though one should..." has too many "though
> one"s
>
> Third paragraph:
> - "...keeping it's implementation..." should be "its"
> - "We'll see how close to built-ins one can get this way." was kind of
> confusing.  I'd consider just doing away with the distinction between
> built in and non-built in regex since it's an implementation detail most
> programmers who use it don't even need to know about.  Maybe say that it
> is not built in and explain why that is a neat thing to have (meaning,
> the language itself is powerful enough to express it in user code).
>
> Fourth paragraph:
> - "...article you'd have..." should probably be "you'll" or, preferably,
> "you will".
> - "...utilize it's API..." should be "its"
> - "yet it's not required to get an understanding of the API." I'd
> probably change this to "...yet it's not required to understand the API"
>
> Lost track of which paragraph:
> - "... that allows writing a regex pattern in it's natural notation"
> another "its"
> - "trying to match special characters like" I'd write "trying to match
> special regex characters like" for clarity
> - "over input like e.g. search or simillar" I'd remove the e.g., write
> search as "search()" to show it's a function in other languages and fix
> the spelling of similar :P
> - "An element type is Captures for the string type being used, it is a
> random access range." I just found this confusing.  Not sure what it's
> trying to say.
> - "I won't go into full detail of the range conception, suffice to say,"
> I'd change "conception" to "concept" and remove "suffice to say". (It's
> a shame we don't have a range article we can link to).
> - "At that time ancors like" misspelled "anchors"
> - "Needless to say, one need not" I'd remove the "Needless to say,"
> because I think it's actually important to say :P
> - "replace(text, regex(r"([0-9]{1,2})/([0-9]{1,2})/([0-9]{4})","g"),
> "--");" Is this code example correct?  It references $1, $2, etc. in the
> explanatory paragraph below but they are nowhere to be found.
> - When you are explaining named captures it sounds like you are about to
> show them in the subsequent code example but you are actually showing
> what it'd look like without them which was a bit confusing.
> - Maybe some more words on what lookaround/lookahead do as I was lost.
> - "Amdittedly, barrage of ? and ! makes regex rather obscure, more then
> it's actually is. However" should be "Admittedly, the barrage of ? and !
> makes the regex rather obscure, more than it actually is.".  Maybe
> change "obscure" to a different adjective. Perhaps "complex looking" or
> "complicated". (Note I've removed the "However" as the upcoming sentence
> isn't contradicting what you just said.)
> - "Needless to say it's", again, I think it's rather important to say :P
> - "Run-time version took around 10-20us on my machine, admittedly no
> statistics." here, borrow this "µ" :P.  Also, I'd get rid of "admittedly
> no statistics".
> - "meaningful tasks, it's features" another "its"
> - "together it's major" and another :P
> - "...flexible tools: match, replace, spliter" should be spelled "splitter"
>

Wow, thanks a lot, that sure was a thorough read. I'm going to carefully
work through this list tomorrow.

>
> Great article.  I didn't even know about the replacement delegate
> feature which is something I've often wished I could use in other regex
> systems.  D and Phobos need more articles like this.  We should have a
> link to it from the std.regex documentation once this is added to the
> website.
>
> Regards,
> Brad Anderson


-- 
Dmitry Olshansky

