Compile Time vs Run Time

Paolo Invernizzi paolo.invernizzi at gmail.com
Fri Jan 12 11:01:25 UTC 2024


On Friday, 12 January 2024 at 06:06:52 UTC, Walter Bright wrote:
> On 1/9/2024 3:49 PM, Paolo Invernizzi wrote:
>> You are underestimating what can be gained as value in 
>> catching SQL problems at compile time instead of runtime. And, 
>> believe me, it's not a matter of mocking the DB and relying on 
>> unittest and coverage.
>
> Please expand on that. This is a very important topic. I want 
> to know all the relevant facts.

As a preamble: we are _currently_ doing all the SQL validation 
against schemas at compile time: the semantics of the query, the 
correctness of the relations involved, the matching of types with 
D (and Elm) types, and the permissions granted to the roles 
performing the query.

That's not a problem at all, it's just something like:

    sql!`select foo from bar where baz > 1` [1]

In the same way we also check this:

    sql!`update foo set bag = ${d_variable_bag}`

But attaching sanitisation to whatever is inside 
`d_variable_bag`, checking its type, and actually binding the 
content for the SQL protocol is done by mixins, after the 
sql!string instantiation. As you can guess, that is by far the 
most common usage: the business logic is FULL of stuff like that.

The security aspect comes from the fact that you _always_ need 
to sanitise the data content of the D variable; the mixin takes 
care of that part, and you can't skip it.
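A hedged sketch of that binding step, with all names invented for illustration (this is not the poster's actual library): a helper that type-checks and escapes every interpolated D variable, emitted through a mixin so the sanitisation cannot be bypassed.

```d
import std.conv : to;
import std.string : replace;

// Hypothetical bind helper: accepts only types it knows how to
// sanitise; anything else fails at compile time.
string bind(T)(T value)
{
    static if (is(T : long) || is(T : double))
        return value.to!string;                       // numeric: inlined as-is
    else static if (is(T : string))
        return `'` ~ value.replace(`'`, `''`) ~ `'`;  // quoted and escaped
    else
        static assert(0, "cannot bind a " ~ T.stringof ~ " to SQL");
}

void main()
{
    int d_variable_bag = 7;
    // The sql! template would emit, via a string mixin, code
    // roughly equivalent to this concatenation:
    string query = `update foo set bag = ` ~ mixin(`bind(d_variable_bag)`);
    assert(query == `update foo set bag = 7`);
}
```

The point of routing every parameter through `bind` is that there is no code path where a raw D value reaches the query string unescaped.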

That said, unittesting at runtime can be done against a real DB, 
or by mocking it.

A real DB is onerous: sometimes you need additional licenses and 
resource management, and it's time consuming. Just imagine 
writing D code but getting errors back not during compilation, 
only when the "autotester" CI task completes!

Keep in mind that using a real DB is very common, for one simple 
reason: mocking a DB to the point of being useful for unit 
testing is a PITA. The common approach is simply to skip that, 
and mock the _results_ of the data retrieved by the query, in 
order to unittest the business logic. The queries themselves are 
not checked until they run against the dev DB.

The compile-time solution, instead, gives you immediate feedback 
on wrong queries and wrong type bindings, and that's invaluable 
especially for two fundamental things: refactoring of code, and 
schema changes.

If the DB schema is changed, the application simply does not 
compile anymore until you align it with the changed schema, and 
the compiler gently points you to the pieces of code you need to 
adjust. The same happens if you change a D type that will 
somewhere be bound to an SQL parameter. So you can refactor 
without fear, and if the application compiles, you are assured 
that everything is aligned.
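The mechanism behind that behaviour can be sketched with a toy `sql!` template. The schema, the naive "parser", and every identifier below are invented for illustration; a real implementation would parse the full statement against the actual DB schema.

```d
import std.algorithm.searching : canFind;
import std.array : split;

// Pretend compile-time schema for table `bar`.
enum string[] barColumns = ["foo", "baz"];

template sql(string query)
{
    // Toy check: the first identifier after `select` must be a
    // column of `bar`; otherwise compilation stops right here.
    enum selected = query.split[1];
    static assert(barColumns.canFind(selected),
        "unknown column `" ~ selected ~ "` in: " ~ query);
    enum sql = query;
}

void main()
{
    enum ok = sql!`select foo from bar where baz > 1`; // compiles
    // enum ko = sql!`select fuu from bar`;  // would fail with a
    //                                       // static assert message
}
```

Drop a column from `barColumns` and every query mentioning it stops compiling, which is exactly the schema-change feedback described above.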

It's like extending the correctness of the D type system down to 
the DB type system, and it's priceless.

So, long story short: we will be forced to keep using mixins if 
we can't rely on CT interpolation, but having it will simplify 
the codebase.

[1] well, queries can sometimes be things like this:

     with
         dsx as (select face_id, bounding_box_px, gaze_yaw_deg,
             gaze_pitch_deg from dev_eyes where eye = ${sx}),
         ddx as (select face_id, bounding_box_px, gaze_yaw_deg,
             gaze_pitch_deg from dev_eyes where eye = ${dx})
     select
         dfc.bounding_box_px as face, dfc.expression, dby.center_z_mm,
         dsx.bounding_box_px as eye_sx, dsx.gaze_pitch_deg, dsx.gaze_yaw_deg,
         ddx.bounding_box_px as eye_dx, ddx.gaze_pitch_deg, ddx.gaze_yaw_deg
     from dev_samples
         left join dev_bodies as dby using(sample_id)
         left join dev_faces as dfc using(body_id)
         left join dsx using(face_id)
         left join ddx using(face_id)
     where dev_samples.device_id = ${deviceId}
         and system_timestamp_ms = (select max(system_timestamp_ms)
             from dev_samples where dev_samples.device_id = ${deviceId})
         and dfc.bounding_box_px is not null
     order by dby.center_z_mm

More information about the Digitalmars-d mailing list