Port of Python's difflib.SequenceMatcher class
Michael Butscher
mbutscher at gmx.de
Wed Dec 6 14:21:35 PST 2006
Pragma wrote:
> Michael Butscher wrote:
> > Hi,
> >
> > a D port (version 0.175) of Python's difflib.SequenceMatcher class to
> > generate diffs is available at
> >
> > http://www.mbutscher.de/snippets/difflib_d20061202.zip
> >
> > It might need some cleaning up yet but the translated doctests pass
> > (except one I couldn't make compile in D, but "in theory" it passes as
> > well).
> >
> > Comments, critique?
>
> I agree with Walter that you should throw this up on a page somewhere.
For now I have at least listed it on the page
http://www.mbutscher.de/software.html
as a "snippet" (it isn't much more than that, I think).
> I'm curious, but rarely have time to sift through source code unless I'm
> in need of something specific - I develop using SVN 99% of the time,
> which does .diff output for me already.
I will need it later for a project written in Python (a kind of personal
wiki without a server) so that it can store different versions of a wiki
page.
When the time comes, I will add a small C interface for a DLL whose main
job is to create some sort of binary diff of two arbitrary byte blocks
and to apply that diff to the first block to recreate the second.
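
The diff-and-apply round trip described above can be sketched in Python
with the original difflib.SequenceMatcher. This is only an illustration,
not the planned C interface: the names make_delta and apply_delta and the
("copy", ...) / ("data", ...) instruction format are made up for this
example.

```python
from difflib import SequenceMatcher


def make_delta(old: bytes, new: bytes) -> list:
    """Build a list of instructions that rebuild `new` from `old`.

    Hypothetical format: ("copy", start, end) reuses old[start:end],
    ("data", payload) carries bytes that only exist in `new`.
    """
    # autojunk=False switches off difflib's "popular element" heuristic,
    # which is tuned for text and can hurt matching on long byte blocks.
    matcher = SequenceMatcher(None, old, new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            delta.append(("copy", i1, i2))
        elif tag in ("replace", "insert"):
            delta.append(("data", new[j1:j2]))
        # "delete" needs no instruction: those bytes are simply not copied.
    return delta


def apply_delta(old: bytes, delta: list) -> bytes:
    """Apply a delta from make_delta() to `old`, returning the new block."""
    out = bytearray()
    for op in delta:
        if op[0] == "copy":
            out += old[op[1]:op[2]]
        else:
            out += op[1]
    return bytes(out)
```

Storing only the delta for each older version of a page would then be
enough to reconstruct it from the neighbouring version.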
> But I *am* curious about how the porting went, what the pitfalls were,
> and how you worked around Python idioms and tuple types.
- The frequently used "self" was simply translated to "this", so the
code looks a bit odd in D, e.g.:
    void set_seq2(ST b)
    {
        if (b is this.b)
            return;
        this.b = b;
        this.matching_blocks = null;
        this.opcodes = null;
        this.fullbcount = null;
        this.chain_b();
    }
- One thing I really missed in D was the get() method for Python
dictionaries with a default argument. Therefore I created inner
functions like
    IndexType j2lenget(IndexType i, IndexType def)
    {
        IndexType* result = i in j2len;
        if (result)
            return *result;
        else
            return def;
    }
Probably this can be done more elegantly, but I personally think that
get() should be a standard method of AAs.
- The class used only two types of tuples which had clear purposes, so
they were translated into structs without much harm.
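
For comparison, here is roughly what the get()-with-default idiom
mentioned above looks like on the Python side. This is a simplified
sketch modeled on the kind of loop where j2len is used, not the actual
difflib source and not the D port; the function name longest_common_run
is invented for the example.

```python
def longest_common_run(a, b):
    """Return (i, j, size) of the longest block with a[i:i+size] == b[j:j+size].

    Returns (0, 0, 0) if the sequences have nothing in common.
    """
    # Map each element of b to the list of positions where it occurs.
    b2j = {}
    for j, elt in enumerate(b):
        b2j.setdefault(elt, []).append(j)

    best_i = best_j = best_size = 0
    # j2len[j] = length of the common run ending at a[i-1] and b[j].
    j2len = {}
    for i, elt in enumerate(a):
        new_j2len = {}
        for j in b2j.get(elt, []):
            # get() with a default of 0 avoids a separate "is j-1 present?"
            # test; this is the call the D helper j2lenget() stands in for.
            k = j2len.get(j - 1, 0) + 1
            new_j2len[j] = k
            if k > best_size:
                best_i, best_j, best_size = i - k + 1, j - k + 1, k
        j2len = new_j2len
    return best_i, best_j, best_size
```

Without a default argument, each of the two lookups would need its own
membership check, which is what the inner function in the D code wraps up.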
> Also, I'm
> wondering if the D version brings any extra perks like better
> performance, or less/clearer code?
I have not run any benchmarks yet, so I can only assume that D is much
faster.
The D code is a bit longer and IMHO a bit less readable than Python,
but I'm much more used to Python than D.
Michael