Is it possible to collect object usage information during compilation?
DaveG via Digitalmars-d
digitalmars-d at puremagic.com
Sat Jan 10 12:53:46 PST 2015
On Saturday, 10 January 2015 at 18:31:18 UTC, Paolo Invernizzi
wrote:
> On Saturday, 10 January 2015 at 17:31:42 UTC, DaveG wrote:
>> On Saturday, 10 January 2015 at 13:19:19 UTC, Martin Nowak
>> wrote:
>>> Here is a sketch for an optimal solution. I'm actually
>>> eagerly waiting that someone finally implements it.
>>>
>>> http://dpaste.dzfl.pl/cd375ac594cf
>>
>> I would also have to sell the idea of writing an ORM which is
>> certainly not on the roadmap, but this will certainly help my
>> argument.
>
> Maybe not, something simpler than a full ORM should be
> compelling also.
>
> I guess you know about the ORM Vietnam [1], but also this [2]
> can be of some help in selling a simple D solution.
>
> I would like to see, someday, something in D that:
>
> - can check at compile time the syntax of SQL;
> - can check at compile time the SQL query statement against
> the current DB schema;
> - can read the output of a DB schema dump at CT, and parse it
> into what is needed for the previous points (more complicated);
>
> The first point should be easy today, the second and the last
> one involve more work...
>
> [1]
> http://blogs.tedneward.com/2006/06/26/The+Vietnam+Of+Computer+Science.aspx
> [2] http://wozniak.ca/what-orms-have-taught-me-just-learn-sql
> ---
> Paolo
I have no intention of writing anything as massive as Entity
Framework or Hibernate. We have been successful over the past 4
years with just a small collection of functions to reduce some of
the pain (and redundancy) in writing a lot of dynamic SQL. Now
that we have an opportunity to start fresh we have a chance to do
something better.
The traditional problems with ORMs in general are well known and
these are the reasons why I have never used one in production.
1. Complexity. You basically need to learn an entire new language
(sometimes literally). This is an investment which can be worth
it if the abstraction is successful. The following problems are
why I think the investment is not worth it.
2. Limitations. Unfortunately, too often you need to drop into
SQL to really get things done. This alone is a non-starter. If I
need to bypass the abstraction to do anything really interesting
or complex, it has failed. Sometimes (usually) this is for
performance, other times it's because there is simply no way (or
it's too complicated) to express what I want through the
abstraction.
3. Compilation/Translation. The time to translate commands to SQL
(or whatever backend) can be a high price. Most ORMs do some type
of caching now which is generally sufficient. In D most of the
work can be done at compile time which is even better.
4. Unnecessary Data. Greedy data retrieval is way too common; the
default is usually to get everything. For small queries and data
sets you can write it off as "not a problem", but when your model
gets large and interconnected, this can be catastrophic. Again,
thanks Martin for the clever basis for a solution in D.
5. DB Performance. The efficiency of the SQL that is actually
generated. People seem to focus on this because the generated SQL
is generally quite verbose. Interestingly, in my experience, this
is often the smallest performance problem because the query
optimizer (at least in SQL Server with good indexes and
statistics) will generate the same execution plan regardless.
This is also a code gen problem that can be tweaked without
breaking user code.
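
To make problem 4 concrete, here is a rough sketch of greedy versus
selective retrieval. It is written in Python only to keep it short; the
`select_sql` helper, the `User` table, and its columns are all made up
for illustration and are not taken from Martin's paste. In D the same
string could presumably be built at compile time.

```python
# Hypothetical table description; all names are illustrative only.
USER_COLUMNS = ("id", "name", "email", "bio", "avatar")


def select_sql(table, columns, used=None):
    """Build a SELECT that fetches only the columns the caller uses.

    With used=None this falls back to the greedy default (every column).
    """
    wanted = tuple(columns) if used is None else tuple(used)
    # Reject columns the table does not have instead of emitting bad SQL.
    unknown = [c for c in wanted if c not in columns]
    if unknown:
        raise ValueError("unknown column(s): " + ", ".join(unknown))
    return "SELECT " + ", ".join(wanted) + " FROM " + table


greedy = select_sql("User", USER_COLUMNS)
narrow = select_sql("User", USER_COLUMNS, used=["id", "name"])
```

Here `greedy` selects all five columns while `narrow` selects only `id`
and `name`. The interesting part, which this sketch does not attempt, is
working out the `used` list automatically from how the object is
accessed.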
You may have noticed that 4 of 5 problems are about performance.
That's because, at least in our case, it is that important and it
is that much of a problem. Current ORMs often look great, but in
my experience, the price is always too high. Some "micro-ORMs"
avoid the performance problems, but they do so by sacrificing
most of the features (you still have to write raw SQL for
example). Some of the problems are inherent to the solution and
cannot be "solved", but they can be reduced.
For a long time I thought some of these problems were
fundamental and had basically written off the concept of ORMs
[see: Vietnam of Computer Science]. The good news is most of the
problems appear to be solvable.
#1 is unavoidable: obviously there will be something new (whether
it's a DSL or just an API).
#2 is really dependent on the other problems and implementation.
#3 is "just" implementation.
#4 has a conceptual solution, now it's "just" implementation.
#5 does not have a solution because it will depend on the
backend, but I think it's reasonable to expect a solution that
works for almost all cases. It will be impossible to know without
testing.
One final note. You may have noticed I didn't mention the schema
syncing problem (keeping database and code in sync). There was a
time I would have said that was essential and while it would be
nice in a perfect world, I'm comfortable keeping them in sync
manually (or semi-manually with scripts). I can generate a bunch of
classes from an existing database fairly easily and when I change
a table I can manually update a class. If I were writing SQL
directly I would have to update my query; this is really no
different. Doing validation in unit tests is perfectly acceptable
to me.
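
As a minimal sketch of what "validation in unit tests" could look like:
compare the columns a class expects against what the database reports.
Everything here is an assumption for illustration. The `User` mapping
and the `missing_columns` helper are hypothetical, and an in-memory
SQLite database stands in for the real server (SQL Server would be
queried through its own catalog views instead of `PRAGMA table_info`).

```python
import sqlite3


class User:
    # Hypothetical hand-maintained mapping: attribute name -> column name.
    columns = {"id": "id", "name": "name", "email": "email"}


def missing_columns(conn, table, mapped):
    """Return the mapped column names that do not exist in the table."""
    rows = conn.execute("PRAGMA table_info(" + table + ")").fetchall()
    actual = {row[1] for row in rows}  # row[1] is the column name
    return sorted(set(mapped) - actual)


# In-memory database standing in for the real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# The class expects an "email" column that the table lacks.
drift = missing_columns(conn, "users", User.columns.values())
```

A unit test would simply assert that `drift` is empty for every mapped
class, so a table change that was not mirrored in the code fails the
build rather than failing in production.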
Sorry for the long post.
-Dave