Migrating dmd to D?
Zach the Mystic
reachBUTMINUSTHISzach at gOOGLYmail.com
Sat Mar 2 21:48:02 PST 2013
On Sunday, 3 March 2013 at 03:06:15 UTC, Daniel Murphy wrote:
>> Every single one of these would have to be special-cased. If
>> you had a domain-specific language you could keep track of
>> whether you were mid-declaration, mid-statement, or
>> mid-string-literal. Half the stuff you special-case could
>> probably be applied to other C++ projects as well.
>>
>> If this works, the benefits are just enormous. In fact, I
>> would actually like to "waste" my time trying to make this
>> work, but I'm going to need to ask a lot of questions because
>> my current programming skills are nowhere near the average
>> level of posters at this forum.
>>
>> I would like a c++ lexer (with whitespace) to start with. Then
>> a discussion of parsers and emitters. Then a ton of questions
>> just on learning github and other basics.
>>
>> I would also like the sanction of some of the more experienced
>> people here, saying it's at least worth a go, even if other
>> strategies are simultaneously pursued.
>
> Something like this https://github.com/yebblies/magicport2 ?
Since you're obviously way ahead of me on this, I'm going to go
ahead and say everything I've been thinking about this issue.
My approach to translating the source would be more-or-less
naive. That is, I would be trying to do simple pattern-matching
and replacement as much as possible. I would try to go as far as
I could without the scanner knowing any context-sensitive
information. When I added a piece of context-sensitive
information, I would do so by observing the failures of the naive
output, and adding pieces one by one, searching for the most bang
for my context-sensitive buck. It would be nice to see upwards of
50 percent of the code conquered by just a few such carefully
selected context-sensitive bucks.
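A minimal sketch of that progression, in Python for brevity. The two rules, the sample line, and the function names are made up for illustration; they are not taken from dmd or magicport2. The first function replaces text blindly, and the second adds exactly one piece of context (skip string literals) after seeing the naive output fail.

```python
import re

# Hypothetical naive rules: (pattern, replacement) pairs applied everywhere.
NAIVE_RULES = [
    (re.compile(r"->"), "."),           # C++ member access through a pointer
    (re.compile(r"\bNULL\b"), "null"),  # C++ NULL becomes D null
]

# Matches a double-quoted string literal, including escaped characters.
STRING_LITERAL = re.compile(r'"(?:\\.|[^"\\])*"')


def translate_naive(source):
    """Apply every rule to the whole text, with no context at all."""
    for pattern, replacement in NAIVE_RULES:
        source = pattern.sub(replacement, source)
    return source


def translate_skipping_strings(source):
    """Same rules, plus one piece of context: leave string literals alone."""
    pieces = []
    last = 0
    for match in STRING_LITERAL.finditer(source):
        pieces.append(translate_naive(source[last:match.start()]))
        pieces.append(match.group(0))  # copy the literal through untouched
        last = match.end()
    pieces.append(translate_naive(source[last:]))
    return "".join(pieces)


line = 'if (s->next == NULL) error("use p->q, not NULL");'
print(translate_naive(line))             # also rewrites inside the string
print(translate_skipping_strings(line))  # string literal is preserved
```

The naive pass damages the error message, which is the kind of failure that would tell me which context-sensitive buck to spend next.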
Eventually these simple additions would hit the point of
diminishing returns. At that point it would be useful to have a
language which, instead of seeking direct gains in its ability to
transform dmd code, saw its gains in the ease and flexibility with
which one could add increasingly obscure and detailed special
cases to it. I don't know how to set up that language or its data
structures, but I can tell you what I'd like to be able to do with
it.
I would like to be able to query which function I am in, which
class I am assembling, etc. I would like to be able to take a
given piece of text and say exactly what text should replace it,
so that complex macros could be rewritten to their equivalent
static pure D functions. In other words, when push comes to
shove, I want to be able to brute-force a particularly hard
substitution through direct access to the context-sensitive data
structure. For example, suppose I know that some strange macro
peculiarities of a function add an extra '}' brace which is not
read by C++ but is picked up by the naive nesting '{}' tracker,
which botches up its 'nestedBraceLevel' variable. It would be
necessary to be able to say:
    if (currentFunction == "oneIKnowToBeMessedUp" &&
        currentLine >= funcList.oneIKnowToBeMessedUp.startingLine + 50)
    { --nestedBraceLevel; }
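A rough sketch of such a tracker, in Python for brevity. Everything here is an assumption made for illustration: the class and field names, the function-detection heuristic, the override format (function name, line offset, brace-level delta), and the sample input with its OPEN_SCOPE and CLOSE_SCOPE macros. Unlike the condition above, the override fires on one exact line so the level is adjusted only once.

```python
import re

# Hypothetical heuristic: at brace depth 0, a line ending in "name(...) {"
# starts a function. Real C++ needs far more care than this.
FUNC_START = re.compile(r"(\w+)\s*\([^;{}]*\)\s*\{\s*$")


class ContextTracker:
    def __init__(self, overrides=None):
        self.nested_brace_level = 0
        self.current_function = None
        self.starting_line = {}  # function name -> line where it starts
        self.overrides = overrides or []  # (function, offset, delta) tuples

    def feed(self, line_number, line):
        if self.nested_brace_level == 0:
            match = FUNC_START.search(line)
            if match:
                self.current_function = match.group(1)
                self.starting_line[self.current_function] = line_number
        # Naive counting: ignores strings, comments and macros.
        self.nested_brace_level += line.count("{") - line.count("}")
        # Brute-force corrections for functions known to confuse the counter.
        for function, offset, delta in self.overrides:
            if (self.current_function == function
                    and line_number == self.starting_line[function] + offset):
                self.nested_brace_level += delta
        if self.nested_brace_level == 0:
            self.current_function = None  # back at top level


# CLOSE_SCOPE hides a closing brace, so naive counting never returns to 0.
SOURCE = """\
void oneIKnowToBeMessedUp() {
    OPEN_SCOPE {
    work();
    CLOSE_SCOPE
}
int after() {
    return 0;
}
"""


def functions_seen(overrides):
    """Return the tracker's idea of the current function after each line."""
    tracker = ContextTracker(overrides)
    seen = []
    for number, line in enumerate(SOURCE.splitlines(), start=1):
        tracker.feed(number, line)
        seen.append(tracker.current_function)
    return seen


print(functions_seen([]))
print(functions_seen([("oneIKnowToBeMessedUp", 3, -1)]))
```

Without the override the tracker never notices that 'after' begins, because it still believes it is inside the first function. With the single hand-written correction it recovers.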
My founding principle is Keep It Simple, Stupid. I don't know if
it's the best way to start, but barring expert advice steering me
away from it, it would be the best for someone like me who has no
experience and needs to learn from the ground up what works and
what doesn't.
Another advantage of the domain-specific language described
above would be its reusability for whatever transformations are
common in C++, say transforming 'strcmp(a,b) == 0' -> 'a == b', and
its possible use for adding special cases when translating from
one language to another in general. I don't know the
difference between what I'm describing and a basic macro text
processing language - they might be the same.
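As an illustration of one such reusable rule, here is a sketch of a strcmp rewrite in Python. It only handles a comparison of the strcmp result against 0 with very simple arguments (an identifier or a plain string literal), and it assumes both operands end up as D strings rather than raw char pointers, since '==' on pointers would compare addresses. The names ARG, STRCMP_EQUALITY and rewrite_strcmp are hypothetical.

```python
import re

# A deliberately simple argument: an identifier or a plain string literal.
ARG = r'(\w+|"[^"]*")'

# strcmp(a, b) == 0   or   strcmp(a, b) != 0
STRCMP_EQUALITY = re.compile(
    r"\bstrcmp\(\s*" + ARG + r"\s*,\s*" + ARG + r"\s*\)\s*(==|!=)\s*0"
)


def rewrite_strcmp(source):
    """Turn a strcmp comparison against 0 into a direct == or != test."""
    return STRCMP_EQUALITY.sub(r"\1 \3 \2", source)


print(rewrite_strcmp('if (strcmp(name, "main") == 0) run();'))
print(rewrite_strcmp("while (strcmp(a,b) != 0) next();"))
```

A bare 'strcmp(a, b)' used for ordering is left untouched, which is the sort of case that would need its own rule.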
My last thought is probably well-trodden ground, but the
translation program should have import dependency charts for its
target program, and automate imports on a per-symbol basis, so it
lays out the whole file in two steps, emitting selective imports
such as:
import std.array : front, array;
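One way the two steps could look, sketched in Python. The symbol chart, the function names, and the sample D body are assumptions for illustration only, and the word scan is naive: it would also pick up symbol names that appear inside strings or comments.

```python
import re

# Hypothetical import dependency chart: symbol -> module that provides it.
SYMBOL_TO_MODULE = {
    "front": "std.array",
    "array": "std.array",
    "writeln": "std.stdio",
}


def collect_imports(body):
    """Step 1: find which charted symbols the translated body uses."""
    needed = {}
    for word in re.findall(r"\b\w+\b", body):
        module = SYMBOL_TO_MODULE.get(word)
        if module is not None:
            needed.setdefault(module, set()).add(word)
    return needed


def lay_out_file(body):
    """Step 2: emit one selective import per module, then the body."""
    needed = collect_imports(body)
    lines = [
        "import {} : {};".format(module, ", ".join(sorted(symbols)))
        for module, symbols in sorted(needed.items())
    ]
    return "\n".join(lines + ["", body])


body = "auto xs = array(range);\nwriteln(xs.front);"
print(lay_out_file(body))
```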
One thing I'm specifically avoiding in this proposal is a
sophisticated awareness of the C++ grammar. I'm hoping special
cases cover whatever ground might be more perfectly trod by a
totally grammar-aware conversion mechanism.
Now you're as up-to-date as I am on what I'm thinking.