[GSOC] Database API draft proposal

Mon Apr 4 04:50:38 PDT 2011

Fawzi Mohamed wrote:
>
> On 4-apr-11, at 02:01, Piotr Szturmaj wrote:
>
>> Fawzi Mohamed wrote:
>>> [...]
>>> I think that your responses are very relevant, as it seems to me that
>>> your work is nice, and I find that if a GSoC is done in that direction
>>> it should definitely work together with the good work that is already
>>> done; let's not create multiple competing projects if people are
>>> willing to work together.
>>
>> I'm ready to cooperate :)
>
> great :)
>
>>>>> * support for static and dynamic types.
>>>>> Access to dynamic and static types should differ as little as
>>>>> possible, and the access one uses for dynamic types should
>>>>> definitely work without changes on static types.
>>>>
>>>> If you mean a statically or dynamically typed data row, then I can
>>>> say my DBRow supports both.
>>>
>>> yes but as I said I find the support for dynamic data rows weak.
>>
>> I've just added row["column"] bracket syntax for dynamic rows.
>
> excellent, ideally that should work also for untyped, because one wants
> to be able to switch to a typed Row without needing to change its code

I used to think the same, but currently this is technically impossible.
When I started working on this I wanted one common interface, but tuples
use static indexing for their fields: each field has its own type, so the
index must be known at compile time. You can't write code like this:

Tuple!(int, string) t;

int index = 1;
// try to access the string field with a run-time index:
t[index] = "abc"; // error: index must be known at compile time

but this works:
t[1] = "abc"; // ok: 1 is a compile-time constant

This problem also applies to structs (FieldTypeTuple). To overcome it
we need to split opIndex into a compile-time version and a run-time
version (i.e. add a static opIndex).
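
To illustrate the difference between the two kinds of access, here is a
minimal sketch. It is not the DBRow code; fieldAt is a hypothetical
helper name, and boxing the result in a Variant is just one assumed way
to give run-time access a single return type.

```d
import std.stdio : writeln;
import std.typecons : Tuple, isTuple;
import std.variant : Variant;

// Hypothetical helper (not part of DBRow): run-time access to a tuple field.
// The foreach below is unrolled at compile time, one branch per field, so
// each branch can box its own statically typed field into a Variant.
Variant fieldAt(R)(R row, size_t index)
    if (isTuple!R)
{
    foreach (i, field; row.expand)
    {
        if (i == index)
            return Variant(field);
    }
    throw new Exception("column index out of range");
}

void main()
{
    auto t = Tuple!(int, string)(42, "abc");

    t[1] = "xyz";                  // compile-time index: statically typed

    size_t index = 1;
    Variant v = fieldAt(t, index); // run-time index: boxed result
    writeln(v.get!string);
}
```

The run-time path loses the static type, which is why it cannot simply
share one opIndex with the compile-time path.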

 > (and it should work exactly the same, so the typed rows will need to
 > wrap things in Variants when using that interface).

Yes, I tried hard to do that. It worked, but it broke Tuple's own index
access, which ended up hidden by opIndex.

>>>>> * class or struct for row object
>>>>
>>>> I'm using struct, because I think row received from database is a
>>>> value type rather than reference. If one selects rows from one table
>>>> then yes, it is possible to do some referencing based on primary key,
>>>> but anyway I think updates should be done explicitly, because row
>>>> could be deleted in the meantime. In more complex queries, not all of
>>>> selected rows are materialized, i.e. they may be from computed
>>>> columns, view columns, aggregate functions and so on. Allocation
>>>> overhead is also lower for structs.
>>>>
>>>>> * support for table specific classes?
>>>>
>>>> Table specific classes may be written by user and somehow wrap
>>>> underlying row type.
>>>
>>> well with the current approach it is ugly because your calls would be
>>> another type, thus either you remove all typing or you can't have
>>> generic functions, accepting rows, everything has to be a template,
>>> looping on a table or a row you always need a template.
>>>
>>
>> Could you elaborate? I don't know what you mean.
>
> Well, I am not totally sure either. Having the row handle the dynamic
> case better is already a nice step forward. I still fear that we will
> have problems with the ORM level, but I am not 100% sure; that is the
> reason I would like to try to flesh out the ORM level a bit more.
> I would like one to be able to loop over all the tables and for each
> one get either the generic or the specialized object, depending on what
> is needed.
> If one wants to have business logic in the specialized objects, it
> should be difficult to bypass them.

Well, it should be possible right now:

struct MyData
{
     int a;
     int b;

     int multiply()
     {
         return a * b;
     }
}

auto cmd = new PGCommand(conn, "SELECT a, b FROM numbers");
auto result = cmd.executeQuery!MyData;

foreach (row; result)
     writeln(row.multiply);

 > Maybe I am asking too much and the ORM level should never expose the
 > rows directly, because if we use structs we cannot have a common type
 > representing a generic row of a DB which might be specialized or not
 > (without major hacking).

The ORM level may of course expose rows. It should be an additional level
of abstraction built on top of the SQL API, so one can mix the SQL and
ORM interfaces.
As regards a common type, it's currently impossible to wrap a Tuple or
struct and use [index] access with a run-time index to its fields,
whether we use a struct or not.

>>>>> * reference to description of the table (to be able to get also
>>>>> dynamic
>>>>> types by column name, but avoid using too much memory for the
>>>>> structure)
>>>>
>>>> My PostgreSQL client already supports that. Class PGCommand has a
>>>> member "fields", which contains information about returned columns.
>>>> You can even check what columns will be returned from a query before
>>>> actually executing it.
>>>
>>> ok that is nice, and my point is that the type that the user sees by
>>> default should automatically take advantage of that
>>>
>>>>> * Nice to define table structure, and what happens if the db has
>>>>> another
>>>>> structure.
>>>>
>>>> This is a problem for ORM, but at first, we need standard query API.
>>>
>>> I am not so sure about this. Yes, these (also classes for tables) are
>>> part of the ORM, but normal users will more often be at the ORM
>>> level I think, and how exactly we want things to look at the object
>>> level can influence the choice of the best low level interface.
>>
>> If a "defined" (static) DBRow is used on a result that has a different
>> number of columns, or whose column types aren't convertible to the row
>> fields, then it's an error. But if someone uses static fields, he
>> should also take care that the query result is consistent with those
>> fields.
>
> For example, do we want lazy loading of an object from the db? If yes,
> how do we represent it with the current Row objects?

Could you post an example of lazy loading of an object?

>>>>> * you want to support only access or also db creation and
>>>>> modification?
>>>>
>>>> First, I'm preparing a base "traditional" API. Then I want to write
>>>> a simple object-relational mapping. I've already written some code
>>>> that generates CREATE TABLE for structs at compile time. Static
>>>> typing of row fields is very helpful here.
>>>
>>> Very good I think that working on getting the API right there and having
>>> it nice to use is important.
>>> Maybe you are right and the current DBRow is indeed the best
>>> abstraction, but I am not yet 100% sure, to me it looks like it isn't
>>> the best end user abstraction (but it might be an excellent low level
>>> object)
>>>
>>
>> I should state here that end-user usability is very important to me.
>> I should also clarify that my code isn't completely finished and of
>> course it is subject to change. Any suggestions and criticism are
>> welcome :)
>
> very good :)
>
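
On generating CREATE TABLE for structs at compile time, mentioned in the
quoted text above: the idea can be sketched roughly like this. It is a
minimal illustration under assumptions, not the code referred to above;
sqlType, createTable and the Person struct are hypothetical names, and
the type mapping is deliberately tiny.

```d
import std.stdio : writeln;
import std.traits : FieldNameTuple, Fields;

// Hypothetical mapping from D field types to SQL column types.
template sqlType(T)
{
    static if (is(T == int))         enum sqlType = "integer";
    else static if (is(T == long))   enum sqlType = "bigint";
    else static if (is(T == string)) enum sqlType = "text";
    else static assert(0, "no SQL mapping for " ~ T.stringof);
}

// Builds a CREATE TABLE statement from the fields of struct T.
// The foreach runs over a compile-time sequence of field names.
string createTable(T)()
{
    string sql = "CREATE TABLE " ~ T.stringof ~ " (";
    foreach (i, name; FieldNameTuple!T)
    {
        static if (i > 0)
            sql ~= ", ";
        sql ~= name ~ " " ~ sqlType!(Fields!T[i]);
    }
    return sql ~ ")";
}

struct Person
{
    int id;
    string name;
}

// Assigning to an enum forces evaluation at compile time (CTFE).
enum personSql = createTable!Person();
static assert(personSql == "CREATE TABLE Person (id integer, name text)");

void main()
{
    writeln(personSql);
}
```

Because the field names and types are known statically, an unsupported
field type is rejected at compile time rather than at query time.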