Migrating dmd to D?

Daniel Murphy yebblies at nospamgmail.com
Sat Mar 2 23:27:50 PST 2013


"Zach the Mystic" <reachBUTMINUSTHISzach at gOOGLYmail.com> wrote in message 
news:ewtgqpcvhmlaaibiaezc at forum.dlang.org...
>
> Since you're obviously way ahead of me on this, I'm going to go ahead and 
> say everything I've been thinking about this issue.
>
> My approach to translating the source would be more-or-less naive. That 
> is, I would be trying to do simple pattern-matching and replacement as 
> much as possible. I would try to go as far as I could without the scanner 
> knowing any context-sensitive information. When I added a piece of 
> context-sensitive information, I would do so by observing the failures of 
> the naive output, and adding pieces one by one, searching for the most 
> bang for my context-sensitive buck. It would be nice to see upwards of 50 
> percent of the code conquered by just a few such carefully 
> selected context-sensitive bucks.
>
> Eventually the point of diminishing returns would be met with these simple 
> additions. It would be of utility to have a language at that point, which, 
> instead of seeking direct gains in its ability to transform dmd code, saw 
> its gains in the ease and flexibility with which one could add the 
> increasingly obscure and detailed special cases to it. I don't know how to 
> set up that language or its data structures, but I can tell you what I'd 
> like to be able to do with it.
>
> I would like to be able to query which function I am in, which class I am 
> assembling, etc. I would like to be able to take a given piece of text and 
> say exactly what text should replace it, so that complex macros could be 
> rewritten to their equivalent static pure D functions. In other words, 
> when push comes to shove, I want to be able to brute-force a particularly 
> hard substitution through direct access to the context-sensitive data structure. 
> For example, suppose I know that some strange macro peculiarities of a 
> function add an extra '}' brace which is not read by C++ but is picked up 
> by the naive nesting '{}' tracker, which botches up its 'nestedBraceLevel' 
> variable. It would be necessary to be able to say:
>
> if (currentFunction == "oneIKnowToBeMessedUp" &&
>    currentLine >= funcList.oneIKnowToBeMessedUp.startingLine +50)
>    { --nestedBraceLevel; }
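
A minimal sketch of that kind of override hook, in Python purely for illustration. The names `BraceTracker`, `overrides`, `enter_function` and `feed` are hypothetical and not taken from any existing tool. The correction is stored as a signed value and applied once at a fixed offset from the function's starting line, which is an assumption about how such a rule would be used.

```python
from dataclasses import dataclass, field


@dataclass
class BraceTracker:
    # Hypothetical override table:
    #   function name -> (line offset from function start, depth correction)
    overrides: dict = field(default_factory=dict)
    depth: int = 0
    current_function: str = ""
    start_line: int = 0

    def enter_function(self, name: str, line_no: int) -> None:
        # Record which function we are in and where it started.
        self.current_function = name
        self.start_line = line_no

    def feed(self, line_no: int, text: str) -> int:
        # Naive count: deliberately ignores strings, comments and macros.
        self.depth += text.count("{") - text.count("}")
        # Hand-written special case for a function known to confuse the count.
        rule = self.overrides.get(self.current_function)
        if rule is not None:
            offset, correction = rule
            if line_no == self.start_line + offset:
                self.depth += correction
        return self.depth


tracker = BraceTracker(overrides={"oneIKnowToBeMessedUp": (1, +1)})
tracker.enter_function("oneIKnowToBeMessedUp", 10)
depths = [
    tracker.feed(10, "void oneIKnowToBeMessedUp() {"),
    tracker.feed(11, "    MACRO_WITH_STRAY_BRACE }"),  # stray brace, corrected
    tracker.feed(12, "    x = 1;"),
    tracker.feed(13, "}"),
]
```

Each such rule is cheap to add, but it is tied to one function and one line offset, so it has to be revisited whenever the source being translated changes.
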
>
> My founding principle is Keep It Simple Stupid. I don't know if it's the 
> best way to start, but barring expert advice steering me away from it, it 
> would be the best for someone like me who had no experience and needed to 
> learn from the ground up what worked and what didn't.
>
> Another advantage of the domain-specific language as described above would 
> be the reusability of whatever transformations are common in C++, say 
> transforming 'strcmp(a,b)' -> 'a == b', and its possible use for adding 
> special cases when translating from one language to another more 
> generally. I don't know the difference between what I'm describing and a 
> basic macro text processing language - they might be the same.
>
> My last thought is probably well-trodden ground, but the translation program 
> should have import dependency charts for its target program, and automate 
> imports on a per-symbol basis, so it lays out the total file in two steps.
>
> import std.array : front, array;
>
> One thing I'm specifically avoiding in this proposal is a sophisticated 
> awareness of the C++ grammar. I'm hoping special cases cover whatever 
> ground might be more perfectly trod by a totally grammar-aware conversion 
> mechanism.
>
> Now you're as up-to-date as I am on what I'm thinking.

I did something like that before (token-level pattern matching) and found 
the number of special cases to be much, much too high.  You need so much 
context information that you're better off just building an AST and 
operating on that.
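
To make the context problem concrete, here is a small sketch in Python, for illustration only, of a token-level rule for the 'strcmp(a,b)' case quoted above. The rule tables and the function names `naive_rewrite` and `contextual_rewrite` are made up for this example. It assumes the arguments are plain identifiers, and it says nothing about whether '==' means the same thing for the translated types.

```python
import re

# Context-free rule: rewrite every strcmp(x, y) call to "x == y".
NAIVE = re.compile(r"strcmp\(\s*(\w+)\s*,\s*(\w+)\s*\)")


def naive_rewrite(line: str) -> str:
    return NAIVE.sub(r"\1 == \2", line)


# Context-aware rules: the comparison wrapped around the call decides the
# output, so each surrounding form needs its own special case.
CALL = r"strcmp\(\s*(\w+)\s*,\s*(\w+)\s*\)"
CONTEXT_RULES = [
    (re.compile(CALL + r"\s*==\s*0"), r"\1 == \2"),
    (re.compile(CALL + r"\s*!=\s*0"), r"\1 != \2"),
    (re.compile(r"!\s*" + CALL), r"\1 == \2"),
]


def contextual_rewrite(line: str) -> str:
    for pattern, replacement in CONTEXT_RULES:
        line = pattern.sub(replacement, line)
    return line


broken = naive_rewrite("if (strcmp(a, b) == 0)")       # "if (a == b == 0)"
fixed = contextual_rewrite("if (strcmp(a, b) == 0)")   # "if (a == b)"
missed = contextual_rewrite("if (strcmp(a, b) < 0)")   # no rule yet: unchanged
```

Three rules already cover only three spellings of one call, and arguments that are themselves expressions or macro invocations break all of them. That is the kind of growth in special cases I mean.
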

For the nastier special cases, I'm modifying the compiler source to 
eliminate them.  This mostly means expanding macros and adding casts.

Many of the same ideas apply, although I'm not trying to, e.g., use native 
arrays and strings, just do a direct port.



