Phobos Proposal: replace std.xml with kxml.
Andrei Alexandrescu
SeeWebsiteForEmail at erdani.org
Tue May 4 11:56:31 PDT 2010
Graham Fawcett wrote:
> On Tue, 04 May 2010 09:09:29 -0700, Andrei Alexandrescu wrote:
>
>> Graham Fawcett wrote:
>>> On Mon, 03 May 2010 16:01:30 -0700, Andrei Alexandrescu wrote:
>>>
>>>> Graham Fawcett wrote:
>>>>> The fact that libxml2/libxslt support not only XML parsing and DOM
>>>>> building, but also XSLT, XPath, XPointer, XInclude, RelaxNG, etc.,
>>>>> means that any homegrown library will be hard-pressed to cover the
>>>>> same range of tools and features.
>>>>>
>>>>> There are too many half-baked XML libraries in the world. No
>>>>> disrespect intended to opticron or anyone else; it just doesn't make
>>>>> a lot of sense to reinvent such a complex wheel (and believing that
>>>>> XML processing isn't complex is a sure sign that your homegrown
>>>>> library's design is incomplete!).
>>>>>
>>>>> Graham
>>>> I think what we need for the standard library is to take a solid XML
>>>> library licensed generously and adapt it to work with arbitrary
>>>> ranges.
>>> By "adapt" do you mean writing a wrapper for an existing library, or
>>> translating the source code of the library into D?
>>>
>>> What constitutes a "generous license" in this context? (For what it's
>>> worth, libxml2 is under the MIT License.)
>>>
>>> Graham
>> We'd need to modify the code. I haven't looked into available xml
>> libraries so I don't know which would be eligible.
>
> I think I understand your motivations: this is the standard library, and
> so you want to minimize dependencies. But from a maintenance
> perspective, it seems a bad idea to translate a complex library into D
> code that few people will actively maintain -- whereas writing a
> wrapper (and introducing a library dependency) would keep the codebase
> small, let you share maintenance costs with the third-party library's
> developers, and (arguably) increase the stability and quality of the
> stdlib?
>
> I am not pushing for libxml2 as The Answer. I'm just questioning the
> motivation to translate other people's code to D, when the D platform
> excels at library integration. (Although I agree with your suggestion
> to borrow inspiration/code from Boost for datetime and other features;
> that's different, since Boost cannot feasibly be wrapped.)
>
> Best,
> Graham

My concern is purely technical - a library we just link to would force a
number of choices, such as input representation (e.g. arrays of char).
Ideally we should be able to change the library to accept any compatible
range of any compatible characters.

As a simple example, consider std.algorithm.levenshteinDistance. There
are plenty of good implementations and initially I just wrote one almost
identical to the Web lore. However, later I needed to compute
Levenshtein distances between strings stored in lists (tries, actually).
Well, that didn't work because the implementation at that time used
random access s[i] and t[i] all over the place. But it wasn't difficult
to change the algorithm to work with forward ranges. So now we have one
of the few Levenshtein distance implementations that work with inputs
other than arrays. In particular, we work correctly with UTF inputs
without needing to copy the input, something that I haven't seen
anywhere else. If you google for ``levenshtein utf'' Google will even
think the query has a typo. Search results include an OCaml
implementation that copies the input
(http://en.wikibooks.org/wiki/Algorithm_Implementation/Strings/Levenshtein_distance#OCaml)
and a Ruby implementation that also copies the input
(http://rubyforge.org/frs/?group_id=2080&release_id=7389). By using the
range abstraction, we get to support UTF Levenshtein without significant
additional implementation effort - the code is very similar to the one
using indices throughout.
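
To make the point about forward traversal concrete, here is a minimal
sketch in Python. It is not the Phobos code, and the name
levenshtein_forward is hypothetical. It assumes the first input is any
iterable that is walked once, and the second input can be walked
repeatedly from its start (roughly what a forward range offers). Neither
input is ever indexed.

```python
from typing import Hashable, Iterable


def levenshtein_forward(s: Iterable[Hashable], t: Iterable[Hashable]) -> int:
    """Edit distance using only front-to-back traversal of s and t.

    s is consumed exactly once, so a generator is fine.
    t is traversed once per element of s, so it must be re-iterable
    (a list, tuple, str, or a linked structure with __iter__).
    """
    # Row for the empty prefix of s: turning "" into the first j items
    # of t costs j insertions.
    prev = [0]
    for _ in t:
        prev.append(len(prev))

    for i, a in enumerate(s, start=1):
        # Turning the first i items of s into "" costs i deletions.
        cur = [i]
        for j, b in enumerate(t, start=1):
            cur.append(min(
                prev[j] + 1,             # delete a
                cur[j - 1] + 1,          # insert b
                prev[j - 1] + (a != b),  # substitute, free on a match
            ))
        prev = cur

    return prev[-1]
```

For instance, levenshtein_forward(iter("cat"), "rat") walks a one-shot
iterator on the left and never asks for an element by position. The two
rows of costs are still arrays, but they belong to the algorithm, not to
the inputs, so nothing about the inputs has to be copied.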
Andrei