Proposal: Database Engine for D

Andrei Alexandrescu via Digitalmars-d digitalmars-d at puremagic.com
Sun Jan 3 15:50:00 PST 2016


On 1/2/16 3:47 PM, Chris Wright wrote:
> On Sat, 02 Jan 2016 16:40:16 +0000, Piotrek wrote:
>
>> On Friday, 1 January 2016 at 10:00:43 UTC, Kapps wrote:
>>
>>> This example shows the difficulty of doing this in D. You can't really
>>> have something like `p.Name == "James"`, or `p.Age < 21`
>>> translate to SQL properly without language changes, which I believe
>>> Walter or Andrei were against. This has been the key problem when
>>> things like Linq to Sql for D have been brought up before.
>>
>> Not really. There is no translation stage to SQL or any other DSL in the
>> proposal. So this problem doesn't exist and no language changes are
>> needed.
>
> So you want to create the following query:
>
>    people.filter!(x => x.surname == "Slughorn");
>
> And you've got ten million people in the collection, and you want this
> query to finish soonish. So you need to use an index. But a full index
> scan isn't so great; you want to do an index lookup if possible.
>
> That's simple enough; we generate proxy types to record what properties
> you're using and what operations you're performing. PersonProxy records
> that you're accessing a field 'surname', gives a StringFieldProxy, and
> that records that you're checking for equality with the string "Slughorn".
> The lambda returns true when opEquals returns true.
>
> But people write queries that are more complex than that, like:
>
>    people.filter!(x => x.surname == "Slughorn" || x.age <= 17);
>
> First time you run this, x.surname.opEquals("Slughorn") returns true and
> the expression as a whole returns true. You missed the second part of the
> expression. That's bad.
>
> So we need to evaluate this lambda twice per parameter. (Actually, thanks
> to opCmp, sometimes you'll have to evaluate it three times.) We use that
> to build up a giant truth table, and then we can execute your query.
>
> And that "twice per parameter" thing is exponential, and we build up a
> truth table that's exponentially large with respect to the complexity of
> the query. Some queries I've written for production systems would take a
> week for this system to prepare to execute and require a petabyte of
> storage space.
>
> This is, shall we say, less than ideal.

This may in fact be good signal that an approach based on expression 
templates is not the most appropriate for D. -- Andrei



More information about the Digitalmars-d mailing list