DConf 2014 Day 1 Talk 4: Inside the Regular Expressions in D by Dmitry Olshansky

Dicebot via Digitalmars-d-announce digitalmars-d-announce at puremagic.com
Mon Jun 16 14:51:37 PDT 2014


On Sunday, 15 June 2014 at 21:38:18 UTC, Dmitry Olshansky wrote:
> 15-Jun-2014 20:21, Dicebot wrote:
>> On Saturday, 14 June 2014 at 16:34:35 UTC, Dmitry Olshansky wrote:
>>> But let's face it - it's a one-time job to get it right in your
>>> favorite build tool. Then you have fast and cached (re)build.
>>> Comparatively, the costs of CTFE generation are paid in full during
>>> _each_ build.
>>
>> There is no such thing as a one-time job in programming unless you
>> work alone and abandon any long-term maintenance. As time goes on,
>> any mistake that can possibly happen will inevitably happen.
>
> The frequency of such events is orders of magnitude smaller. 
> Let's not take arguments to the extreme, as then doing anything 
> is futile due to the potential for mistakes it introduces sooner 
> or later.

It is more likely to happen if you change your build scripts more 
often. And this is exactly what you propose.

I am not going to say it is impractical, just mentioning flaws 
that make me look for a better solution.

> Automation. Dumping the body of ctRegex is manual work after 
> all, including putting it under the right symbol. In the 
> proposed scheme it's just a matter of copy-pasting a pattern 
> after the initial setup has been done.

I think defining regexes in a separate module is even less of an 
effort than adding a few lines to the build script ;)
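
To make that concrete, here is a minimal sketch of what I mean. The 
function name and the pattern are made up for illustration, and 
everything is kept in one file so that it compiles standalone; in a 
real project the wrapper would sit in its own module.

```d
import std.regex : ctRegex, matchFirst;

// Hypothetical wrapper: in a real project this would live in a dedicated
// module (say, a made-up `patterns.d`) that the rest of the code imports.
// The pattern itself is only an example.
auto isoDate()
{
    // ctRegex generates the matcher at compile time.
    return ctRegex!(`(\d{4})-(\d{2})-(\d{2})`);
}

void main()
{
    // Call sites use the named pattern instead of repeating the literal.
    auto m = matchFirst("posted on 2014-06-16", isoDate());
    assert(!m.empty);
    assert(m[1] == "2014");
    assert(m[3] == "16");
}
```

Adding a new pattern is then one more function in that module, with 
no build script changes.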

>> It is somewhat worse because you don't routinely change external
>> libraries, as opposed to local sources.
>>
>
> But surely we have libraries that are built as separate projects 
> and are "external" dependencies, right? There is nothing new 
> here except that "d-->obj-->lib file" is changed to 
> "generator-->generated D file-->obj file".

Ok, I am probably convinced on this one. Incidentally, I always 
prefer full source builds over static library separation inside 
the application itself. When there is enough RAM for dmd, of 
course :)
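
For the record, the "generator-->generated D file" step can be tiny. 
A rough sketch, assuming a made-up module name, symbol name and 
pattern. It only emits the pattern as a constant; a real generator 
would emit the matcher code instead.

```d
import std.file : write;
import std.format : format;

/// Builds the source text of a D module that exposes `pattern` as an enum
/// named `symbol`. All three arguments are illustrative placeholders.
string generateModule(string moduleName, string symbol, string pattern)
{
    // The pattern is emitted as a backtick (WYSIWYG) string literal,
    // so it must not contain a backtick itself.
    return format("module %s;\n\nenum %s = `%s`;\n",
                  moduleName, symbol, pattern);
}

void main(string[] args)
{
    // Hypothetical command line: generator [output-file]
    auto outPath = args.length > 1 ? args[1] : "generated_patterns.d";
    write(outPath, generateModule("generated_patterns", "isoDate",
                                  `(\d{4})-(\d{2})-(\d{2})`));
}
```

The build would then compile the generated file like any other 
source, and rerun the generator only when its inputs change.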

>> Huge mess to maintain. According to my experience dub is terrible
>> at defining any complicated build models. Pretty much anything
>> that is not single step compile-them-all approach can only be done
>> via calling external shell script.
>
> I'm not going to like dub then ;)

It is primarily a source dependency manager, not a build tool. I 
remember Sönke mentioning it is intentionally kept simplistic to 
guarantee no platform-unique features are ever needed.

For anything complicated I'd probably wrap the dub call inside a 
makefile that prepares all the necessary extra files.

>> If using external generators is necessary I will take make over
>> anything else :)
>
> Then I understand your point about inevitable mistakes, it's 
> all in the tool.

make is actually pretty good if you don't care about platforms 
other than Linux. Well, apart from the stupid whitespace 
sensitivity. But it is incredibly good at defining build systems 
with chained dependencies.

> What I want to point out is that we should not confuse the goal 
> with the means to an end. No matter what we call it, CTFE code 
> generation is just a means to an end, with serious limitations 
> (especially as it stands today, in the real world).

I agree. What I do disagree about is the definition of the goal. 
It is not just "generating code", it is "generating code in a 
manner understood by the compiler".

> For instance, if the D compiler allowed external tools as plugins 
> (just an example to show the means vs ends distinction) with some 
> form of the following construct:
>
> mixin(call_external_tool("args", 3, 14, 15, .92));
>
> it would make any generation totally practical *today*.

But this is exactly the case where language integration gives you 
nothing over a build system solution :) If the compiler itself is 
not aware of how code gets generated from the arguments, there is 
no real advantage in putting the tool invocation inline.

> How long till the C preprocessor is working at CTFE? How long 
> till it's practical to do:
>
> mixin(htod(import("some_header.h")));
>
> and have it done optimally fast at CTFE?

Never, but it is not really about being fast or convenient. For 
htod you don't want just C grammar / preprocessor support, you 
want it to be as good as the one in real C compilers.

> My answer is - no amount of JITing CTFE and compiler 
> architecture improvements in the foreseeable future will make it 
> better than standalone tool(s), due to the mentioned 
> _fundamental_ limitations.
>
> There are real practical boundaries on where an internal 
> interpreter can stay competitive.

I don't see any fundamental practical boundaries. Quality of 
implementation ones - sure. Quite the contrary, I can easily see 
how a better compiler could outperform any external tool for most 
build tasks despite somewhat worse JIT codegen - it has the huge 
advantage of being able to work on semantic entities of the 
language and not just files. That allows much smarter caching and 
dependency tracking, something external tools will never be able 
to achieve.

